An Agile House of Straw

The ideal Agile application developer welcomes changing requirements, even late in development. The DBA or Database Developer doesn’t. Why is that?

You can’t create complex databases in the Agile way, by breaking tasks into small increments, with minimal planning. Building a database that will perform quickly, reliably and securely over time, as it grows, is more like building a house: it involves architecture and has to be planned in detail beforehand. The tables, their data types, their key structure, constraints, and the relationships needed between the various tables, need to be thought out before you start the first cut.

The database structure then needs to be loaded with realistic transactional data and rigorously tested to make sure that the expected results are always returned in the various reports. It needs to be stress tested under load to ensure adequate performance, smooth scalability, and that there are no data integrity issues caused by predicted levels of concurrent access/modification of the data. (If you think SQL Server automatically protects you from such eventualities, it doesn’t, as Alex Kuznetsov proves in his Defensive Database Programming book).

When a database architect finally gets all this right, it’s only with the greatest reluctance that any changes to the database structure are allowed. Sure, you can make changes, but it is far trickier to do. Every change necessitates another round of regression, stress testing and so on.

This need to test data structures plus data is the critical difference between database and application testing. Data retrieval logic is not enshrined in the SQL code; the database engine decides how to do it based on a range of factors including table/index structure, the data that is stored, its volume, distribution, and so on. This means that, often, each round of testing must use not just “realistic” data but also consistent data, in terms of the data types, ranges of values, and distribution of data within those ranges. The test data must stay the same between builds so that you can compare aggregations to ensure that nothing is broken, and get relative performance figures.

All of this makes Agile database development hard. This becomes a severe problem when the database development becomes a full participant in the Agile process. There is no easy way to accommodate the evolutionary changes that some developers would like to freely make part of an Agile sprint, without destroying the resilience, maintainability and performance of the database. After all, you wouldn’t build a house by an evolutionary process, would you?