Taming the SQL Beast

The recent articles 10 Common Mistakes Java Developers make when Writing SQL and the follow-up, Ten More Mistakes… highlight some common crimes against SQL and offer sound advice. These mistakes aren’t restricted to Java programmers or Oracle. .NET programmers make them with SQL Server (see, for example, Plamen Ratchev’s Ten Common SQL Programming Mistakes from 2009). Why are they so persistent?

Some of this can be blamed on the classic OO – procedural divide. When intelligent developers find that standard and accepted techniques from their main language perform poorly in a database, then both SQL and the database itself get a reputation for being antiquated, quirky (functions don’t work properly in SQL Server, joins are slow, and so on). A few of these gripes are justified, but most are pure myth.

Perhaps it’s fear. Even for well-informed and experienced developers, a relational database can appear to be an unsettling creature of random savagery that is best kept at arm’s length, or locked, growling, behind an ORM. Development isn’t easy at any time. Phil Factor, in a recent editorial, used the analogy of lion taming for describing the developer’s craft.

Should database professionals shoulder some of the blame? Grant Fritchey, in an article featured in this newsletter, suggests that DBAs and database professionals, by being too eager to control and dominate on database matters, are ‘doing it wrong’ in the service they provide and aren’t effective in assisting development teams to gain insights into the best ways of using relational databases.

These insights can be sudden and far-reaching. In his chapter for our forthcoming Tribal SQL book (due in time for PASS, watch this space!), Mark Rasmussen recounts his early misadventures with SQL Server. He explains how and why he decided to split his statistical data across hundreds of databases to help SQL Server “cope with the data load”. He describes his moment of clarity, sitting in a SQL conference session, finally understanding how b-trees worked, and realizing that he’d been trying to replicate the very idea behind a clustered index, just in a horribly inefficient way. Mark is now a developer who knows more about how SQL Server works internally than most DBAs, and speaks regularly on the topic.

Maybe the way out of the repetitive cycle of database mistakes is to find better ways of helping developers to understand the basic nature of how SQL and the database storage structures really work? If you’ve experienced similar “moments of clarity” with regard to how to work effectively with SQL and SQL Server, please share them.