Dr Richard Hipp, Geek of the Week

Simple-Talk's Geek of the Week is Dr Richard Hipp. His code is probably running on your PC, and running completely reliably, for he almost single-handedly wrote SQLite, the most widely deployed SQL Database system in the world. Then he put it in the public domain for all of humanity to benefit from. We sent Richard Morris off to ask this remarkable man why he did it.

Hipp, hipp, Hooray!

SQLite is a phenomenon. It is probably already on your computer, since so many applications such as Firefox, Mac OS X and Skype use it. It could be in your pocket too, as it is used in the iPod Touch and iPhone. It is, of course, the most widely deployed SQL Database in the world.

What you get when you link the SQLite library into your application is your own private SQL92-compliant relational database. It is very lightweight and easy to install and use. It supports multithreaded reading of data, and behaves almost like a full RDBMS. It is remarkably robust, reliable, simple and fast.
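To make the idea of a private, in-process database concrete, here is a minimal sketch using the sqlite3 module from Python's standard library, which wraps the SQLite library. The table name, column names and rows are made up purely for illustration; nothing here is taken from SQLite's own documentation or test code.

```python
import sqlite3


def demo_private_database():
    """Create an in-memory database, store two rows, and read them back."""
    # ":memory:" keeps the whole database inside this process; passing a
    # file path instead would persist it to a single ordinary file on disk.
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical table, invented for this example.
        conn.execute("CREATE TABLE geek (name TEXT, project TEXT)")
        conn.executemany(
            "INSERT INTO geek (name, project) VALUES (?, ?)",
            [("Richard Hipp", "SQLite"), ("John Ousterhout", "Tcl")],
        )
        conn.commit()
        rows = conn.execute(
            "SELECT name, project FROM geek ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return rows


if __name__ == "__main__":
    for name, project in demo_private_database():
        print(name, "->", project)
```

There is no server to install or configure: the database lives and dies with the application that opened it.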

The most singular thing about SQLite is that it is in the public domain. This generous act has transformed the database market and brought relational database systems into the mainstream. None of this is quite as remarkable as the man who, almost single-handedly, wrote it.

In a world where people are motivated by money and celebrity, Dr Richard Hipp stands out for two significant reasons.

He has actively waived royalties worth possibly hundreds of thousands of dollars on his best-known software, SQLite (sqlite.org), and he is disarmingly modest about his achievements and, for a busy man, remarkably generous with his time.

Though best known as the architect and primary author of SQLite, he has also written the Lemon LALR parser generator and CVSTrac. He is also a member of the Tcl (Tool Command Language) team. Tcl is a scripting language created by Electric Cloud chairman John Ousterhout.



Born in Charlotte, North Carolina, in April 1961, he grew up in the suburbs of Atlanta, Georgia, and graduated from Stone Mountain High School in 1979. He then enrolled at the Georgia Institute of Technology (Georgia Tech), one of the oldest and most respected polytechnic universities in the United States.

Although he was formerly an atheist, he turned to Christianity as a freshman, a foundation in life that guides his work today. He graduated from Georgia Tech in 1984 with a Master of Science in Electrical Engineering and worked at AT&T for three years before returning to graduate school at Duke University in North Carolina to study under Alan W. Biermann in the Department of Computer Science.

Richard graduated as a Doctor of Philosophy and, finding (he says) the academic market for Ph.D.s saturated with better qualified candidates, started his own software development consulting company.

He married Ginger G. Wyrick, a musician and author, in 1994 and changed the name of his company to Hipp, Wyrick & Company Inc. Ginger and Richard moved to their present home in Charlotte, North Carolina in August 1995.

RM

Richard, when did SQLite begin?

RH

The first code check-in was on 29 May 2000. But SQLite contains component parts (such as the “Lemon” parser generator and the “printf()” implementation) that date back to the late 1980s.

RM

When multi-megabyte installations are the order of the day, SQLite has a self-imposed limit on its size of 250KB. Why is that, and how much spare space do you have left?

RH

We work hard to keep SQLite small because it is commonly used on embedded devices that simply do not have the space. Note that the 250KB size assumes that one compiles with size optimization turned on (-Os in GCC).

If you use the amalgamated source code with an optimizing compiler, the compiler will typically do lots and lots of loop unrolling and function inlining and the binary will end up being larger – perhaps twice as large.

Within the past year, I’ve been informally polling many of the embedded device manufacturers that use SQLite and what they are now telling me is that new features are becoming more important to them than small size. So in the next release, we will be busting the 250KB barrier.

Last time I checked (2 weeks ago) we were at 252KB. And since then, we’ve added a new query optimization so we are likely up to around 260KB by now. Our new limit is 300KB.

For developers who are still concerned about squeezing SQLite into the smallest space possible, there are compile-time options to remove optional features. You can still get SQLite down below 200KB if a small footprint is important to you and you do not need all of the bells and whistles.

RM

What size of databases can it support?

RH

The theoretical upper limit is 70368 Gigabytes (2^46 bytes). Though as far as I know, nobody has ever tested that limit. Most users deploy databases no larger than a few dozen Gigabytes.

With current technology, if you have more data than that, you likely want to be deploying a server anyhow.

Until recently, SQLite had a constraint that it had to allocate and zero N bytes of memory at the start of every transaction where N was the number of pages in the database file divided by 4. Assuming a 1KiB page size (the default) this gave you a practical limit of perhaps 100 gigabytes for the size of a database. But that big memory allocation was removed in version 3.5.7 (2008-03-17) so as far as we know, SQLite will now actually work with databases up to the maximum theoretical limit.
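As an aside for readers, the figures in this answer can be checked with a few lines of Python. This is an illustrative sketch rather than anything supplied by Richard: the function names are invented here, and it assumes that “gigabyte” means 10^9 bytes, which is the reading under which 2^46 bytes comes to 70368 gigabytes.

```python
PAGE_SIZE = 1024  # bytes; the 1KiB default page size quoted in the answer
GIGABYTE = 10**9  # assumption: decimal gigabytes


def theoretical_limit_gigabytes():
    """Whole gigabytes in the quoted theoretical limit of 2^46 bytes."""
    return 2**46 // GIGABYTE


def bytes_zeroed_per_transaction(database_bytes, page_size=PAGE_SIZE):
    """Memory the older design had to allocate and zero per transaction.

    Follows the rule described in the answer: N bytes, where N is the
    number of pages in the database file divided by 4.
    """
    pages = database_bytes // page_size
    return pages // 4


if __name__ == "__main__":
    print(theoretical_limit_gigabytes(), "gigabytes")
    print(bytes_zeroed_per_transaction(100 * GIGABYTE), "bytes for a 100 GB file")
```

Under these assumptions a 100 gigabyte database has 97,656,250 pages, so the old design would have zeroed roughly 24 megabytes at the start of every transaction, which is why that size was quoted as a practical ceiling.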

RM

SQLite is now part of Apple’s Mac OS X operating system and it’s used by Google, Adobe, Sun and others but you receive no royalty. Why did you choose to place it in the public domain? What was the reason behind it?

RH

SQLite version 1.0 used GDBM as its storage backend. So it was of necessity under the GPL since GDBM is GPL and the GPL is transitive.

When I was writing SQLite version 2.0, I considered all of the popular open-source licenses of that time, but really didn’t see the benefit of using any of them. So I just released the code to the public domain, thinking that would be the simplest approach. I have since learned that many legal jurisdictions do not recognize the public domain, and that even where it is recognized it is only recognized in common law and is thus on shaky legal ground.

Being in the public domain has caused concern among the lawyers for many of the prominent users of SQLite. They are accustomed to dealing with open-source, but public domain software was a new concept to many of them. Furthermore, public domain creates problems in attracting new developers, since in order to keep the code in the public domain I am forced to obtain an affidavit from the developer and their employer before I can put their code into the source tree.

If I had known as much about copyright in 2001 as I know now, I probably would have gone with something like the Apache license. Live and learn…

I had never intended to make any money off of SQLite. I had always made my living doing custom, proprietary software development contracts. I wrote SQLite because it was useful to me and I released it into the public domain with the hope that it would be useful to others as well. I never dreamed that it would catch on as it has or that I might someday be able to make my living by supporting it. But things did not play out as I originally planned. Beginning in about 2004, folks began paying me to support and enhance SQLite. This grew until, beginning in 2008, SQLite is all I do.

We do have a couple of extensions for SQLite that are proprietary: The SQLite Encryption Extension (SEE) and the Compressed and Encrypted Read-Only Database (CEROD) extension. Sales of source code licenses to those extensions help to support our continuing development efforts. We also sell support contracts of various kinds. And we are working on a version of SQLite for use in safety-critical systems.

RM

I like your note that reads

‘The author disclaims copyright to this source code. In place of a legal notice, here is a blessing: may you do good and not evil. May you find forgiveness for yourself and forgive others. May you share freely, never taking more than you give.’

Who or what inspired you to write that?

RH

People customarily put a copyright notice at the top of each source file. But SQLite version 2.0.0 had no copyright so I had to think of something else to go in that space.

The second sentence, “May you find forgiveness for yourself and forgive others”, is a loose interpretation of Matthew 6:12, part of what is commonly called ‘The Lord’s Prayer’ and more recognizable as ‘Forgive us our debts as we forgive our debtors’.

The third sentence tries to capture the concept of paying debts forward. The ‘never take more than you give’ part is a paraphrase of one of the lyrics from The Lion King. The first (hokey) sentence is there because it seemed like a good benediction needed three sentences.

RM

Any particular human influences in your work? Anyone who particularly inspired you?

RH

I have learned programming techniques and design ideas from countless people, many of whom do not even know how much I learned from them.

Probably the biggest influence on SQLite came from John Ousterhout: it is my life’s ambition to be able to write code that is as lucid and readable as the original Tcl/Tk source code written by “JO”.

RM

Being open-source, you must have a few contributors?

RH

There have been between one and two dozen contributors to SQLite, but most of the contributions have been relatively small.

Dan Kennedy, on the other hand, has been a major contributor to SQLite since 2004 and he deserves a lot of credit for SQLite’s success. The last time I checked, about 40% of the core code had been written by Dan. I was responsible for 59% and the remaining 1% was from about a dozen other people.

Peter Weilbacher does a great job of maintaining the OS/2 port for us. Recently, Shane Harrelson has come on board the team and has been a tremendous help with maintaining the Windows ports. And Mihai Lambasan has been making some great contributions to the documentation just within the past two weeks. I’m hoping to be able to convince Shane and Mihai to stick around long-term!

Most of the full-text search engine code comes from Scott Hess and his colleagues at Google. The project has received smaller, but no less valuable, contributions from many others.

RM

How have you made SQLite so stable?

RH

First, I have worked hard to resist feature creep. I have not been 100% successful at that, but my efforts have at least prevented the complexity of SQLite from growing exponentially.

Second, we have worked to keep the SQLite code easy to understand (as easy as an SQL database can be). The code commenting style strives to clearly state the “contract” each subroutine makes with the rest of the system, and to explain “why” the code is doing something, not how that something is being done. We also work to keep the code modular, so that changes to one part of the system have minimal impact.

Third, we have a very extensive test suite that does an excellent job of stressing the code. The test suite runs 98% of the code. (The remaining 2% is mostly unreachable code such as the “default:” case of switch statements.) Whenever a bug is found, we always create new test cases to exercise the bug so that after it is fixed it will not recur. The resulting test suite comprises two-thirds of the total source code (only 33% of the SQLite source code actually gets delivered into end products) and gives us a very good indication of whether or not the code is working. We can do major surgery on the SQLite internals, and as long as the test suite still passes, we have high confidence that nothing has broken.
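For readers unfamiliar with the practice of pinning every fixed bug with a test, the sketch below shows the general shape in Python. It is not taken from SQLite's own test suite: the helper function, the table and the “reported bug” (a caller mishandling the average of an empty table) are all hypothetical, invented only to illustrate the habit described above.

```python
import sqlite3


def average_score(conn):
    """Return the mean of the score column, or None when the table is empty."""
    # avg() over zero rows yields SQL NULL, which sqlite3 maps to None.
    (value,) = conn.execute("SELECT avg(score) FROM results").fetchone()
    return value


def make_results_table(scores):
    """Build a throwaway in-memory table holding the given scores."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE results (score REAL)")
    conn.executemany("INSERT INTO results VALUES (?)", [(s,) for s in scores])
    return conn


def test_average_of_empty_table_is_none():
    # Regression test for the hypothetical bug report: an empty table
    # must produce None rather than an error or a misleading zero.
    conn = make_results_table([])
    assert average_score(conn) is None
    conn.close()


def test_average_of_known_rows():
    # Ordinary behaviour must keep working after the fix.
    conn = make_results_table([1.0, 2.0, 3.0])
    assert average_score(conn) == 2.0
    conn.close()


if __name__ == "__main__":
    test_average_of_empty_table_is_none()
    test_average_of_known_rows()
    print("all regression tests passed")
```

Once a test like the first one is in the suite, any later change that reintroduces the fault fails immediately, which is what makes “major surgery” on the internals a reasonable risk.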

RM

You chose not to licence it. Why was that?

RH

Who knew that it would be so popular? I have done many open-source projects before SQLite but none of the others ever caught on. If I had known ahead of time that SQLite would be such a big hit, I might have been seduced by the love of money and tried to license it. But it is unclear that any such effort would have been successful since I have precious little business sense.

The history of SQLite might not have been the most profitable (to me) but it has worked well enough.

RM

Are you worried about the code fragmenting into many different versions, because it is public domain?

RH

I just assumed that SQLite would fork shortly after I released it to the public domain.

I intentionally removed my name from all of the code and “set the code free”, so to speak. I figured that others would fork it and I would lose control of the project completely. I was rather surprised that did not happen.

At this point, the code has become so complex that forking is less of a danger, I think. There are not many people who would want the task of maintaining it now, I suppose. But if it forks tomorrow and I lose control of the project – so be it. I have given the code away. It is no longer mine to control in the first place.

RM

I was speaking to Sir Tim Berners-Lee about his decision not to cash in on the web, and from his answers I sensed an inner calm about him. Do you have the same feelings about not commercially benefiting from SQLite?

RH

I am delighted and honoured that SQLite has found such success. Looking back, I understand now that I walked away from what could have been some very big deals. But there are no regrets.

RM

Do you regard your talents and skills as God-given or human-made?

RH

Everything that I am is by Grace alone. The longer I live and the more I see, the more obvious this fact becomes. There have been many times in the past (and, no doubt, there will be more occasions in the future) where I have looked upon “my” accomplishments with pride, thinking that “I” have done well. Such thinking is utter foolishness. I would be less than nothing but for the unmerited favour of God.