Normalization’s other little side effect…

It isn’t that I don’t like a challenge, really it isn’t.  I like puzzles, mazes, Suduko, video games with challenging levels where I have to really think about a problem to solve it.  So why don’t I like poorly designed databases, where finding a value that you want can be just as challenging, or maybe more?  Glad you asked (or maybe you didn’t, what do I care, this is my blog 🙂   And for any of you who might be asking: “Shouldn’t I be able to use SQL Server like I want to?”  It is a good thing you didn’t actually ask that out loud, or I would have to ban you 🙂

When someone creates a puzzle, maze, or video game, they design in the challenge purposefully. Itzik Ben-Gan frequently sets up SQL puzzles on SQL Server Magazine that can be kind of fun.  But when someone builds a poorly designed database, then don’t leave you breadcrumbs to find your way around.  Quite the opposite.  If they had spent time trying to think about how other people might get around in their database, they would have naturally started normalizing.   No, usually a poorly designed database is an act of “selfishness” or “ignorance” by a person who is not thinking at all.  Or even worse, was.

This all comes up because I spent a good amount of my week this week digging through a database. First I would identify some data, go back to some other people with some knowledge of the data, have a meeting, then go back and try again.   What should have been a quick, painless task to identify 20 columns from this database took much longer than necessary.  Not 100s of tables worth, no, just a few measly columns.  So what was wrong?

  • Columns that had meaning in one row, but not another
  • Columns named FieldName1, FieldName2…
  • Two related tables that were not related by key, but by a concatenation of columns
  • One database per day of activity, not a key that denotes a different day
  • Data with embedded values, and not always the same format. 
  • Domain value with no apparent meaning

I could go on, but I would just be making stuff up.  And frankly this is enough mess for a Sunday afternoon.  The problem is, had someone taken the time to design this database using proper normalization techniques, none of these problems would be evident.  So what is the side effect?

Documentation

Not that it is impossible to build an incomprehensible normalized database (lots of companies do stuff to make it hard for competitors to understand their data (like having column names be non-sensical without their data dictionary).  But the problem here is that yet another person spent time creating a “general purpose” database.  It never ceases to amaze me the lengths that people will go to never change tables in SQL.  So they do all of this nasty mapping in their code. 

But you know what.  It turns out that SQL Server has a lot of really cool stuff that lets you customize data storage.  For example:

ALTER TABLE allows you to add columns

CREATE TABLE allows you to create tables

sp_addextendedproperty allows you to add documentation to these properties

Note that I am not necessarily suggesting that the answer to all “open schema” type problems necessarily should (or even could) be solved by simply adding to the schema.  In many cases you would not want that at all.  But in this case, I am not talking about that situation.  Why?  Because the code had to change…

If the code that accesses the data needs to change based on the structure, extend the data structures.  It will save you code if your database is cleanly created and normalized.  Use SQL to do the job it was made for.  Then, it will be easy to go to one table that has a name that matches what I think I want, see what it is used for, then go to related tables, and to their related tables and so on.  Legal values for a column will be documented and checked.  Names given to columns will reflect their meaning.  You might even have descriptions stored in extended properties.

And then a job that took a week could have taken an hour.  (On the other hand, I get another anecdote for my presentation next weekend at devlink and for my next book.  And this blog.  Maybe I do like poorly normalized databases…Nah!)