Software developers struggle to treat all the tiers of their application as equally important first-class citizens, because the interface and data-access layers have always enjoyed more sophisticated tooling than the database itself.
Productivity tools abound in IDEs such as Visual Studio, yet SQL Server Management Studio remains a niche application. Local, autonomous copies of web layers usually reside on developers' workstations, whereas databases all too often end up being shared on a server. Source control has been readily accessible and easily used for the tiers above the data layer, but it is rarely well implemented within it.
Now Red Gate’s SQL Source Control has evened up this disparity.
The problem with databases
It is a nightmare to maintain database objects in version control without a dedicated tool. The simplicity with which we version other application files – HTML, images, classes and so on – hasn't been readily achievable in the database world. Sure, there are various mechanisms for scripting objects out into the file system and versioning those, but it's a real pain to synchronise those scripts back into other environments.
To make database versioning practical, it needs to integrate seamlessly into the development process, aligning with the tools and practices of developers. In the database world, that means it has to play nice with SQL Server Management Studio (henceforth known only as SSMS), just as tools like VisualSVN or AnkhSVN play nice with Visual Studio.
When you consider the challenges, it's not surprising that databases have for so long remained source control's second-class citizens. Source control management systems like to work with discrete files on the file system. Products like Subversion don't care what type of file it is (although there are functional limitations on what can be done with binary files); they just need something they can pick up and associate with a repository.
SQL Server, of course, stores all objects and data within .mdf files, which are totally unsuitable for versioning. Firstly, they're binary files with no delineation of individual database objects, so there's no easy way of comparing them. Secondly, they hold data as well as objects, and data should never make its way into source control. Finally, .mdf files are not conducive to simply being checked out over an existing instance of the database.
Don't even get me started on putting full database backups into source control – it's just never ok! This will be painfully obvious to most people, but I've seen it done before, and more than once. Source control management exists to version, um, source code; it's not there to version content, and committing .mdf, .ldf or backup files is never ok.
The list of problems this creates is endless: versions that change continually with no functional difference, resulting in endless pointless updates; huge redundancy from binary files that can't be delta-compressed; no ability to diff between versions; and so on and so forth.
This is the realm of disaster recovery and retention strategies, not of source control management. Don’t do it!
For database versioning to be successful with a product like Subversion, the individual objects must be scripted out to the file system into discrete files. If you can’t apply a version to each independent object within the database, much of the value of source control is lost. In a multi-tiered application, those files would be versioned along with the other tiers in discrete revisions. This ensures that the repository always contains an overall picture of the entire application.
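As a sketch of what that looks like on disk – the folder structure and object names here are purely illustrative, and a dedicated tool would generate this layout automatically – each object type gets a folder and each object its own discrete file:

```shell
# One .sql file per database object, so each object can carry its own
# version history in the repository. All names below are illustrative only.
mkdir -p db/Tables db/StoredProcedures

cat > db/Tables/dbo.Customers.sql <<'SQL'
CREATE TABLE dbo.Customers (
    Id   int           NOT NULL PRIMARY KEY,
    Name nvarchar(100) NOT NULL
);
SQL

cat > db/StoredProcedures/dbo.GetCustomer.sql <<'SQL'
CREATE PROCEDURE dbo.GetCustomer @Id int AS
    SELECT Id, Name FROM dbo.Customers WHERE Id = @Id;
SQL

# Each of these files can now be added and versioned individually:
find db -type f -name '*.sql' | sort
```

With this layout, a change to a single stored procedure shows up as a change to a single file, which is exactly what a tool like Subversion is built to track.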
Defining an atomic application
One of the great features of Subversion is the enforcement of atomic commits. Change a dozen files, try to commit, and if any one file can't make it all the way through to the repository, the commit fails and the transaction rolls back. The beauty of this model for developers is that the repository shouldn't get into a state which could potentially "break the build" because of an incomplete commit.
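A minimal sketch of that atomicity, using a throwaway local repository (the paths and file names here are illustrative, not from a real project):

```shell
# Create a scratch Subversion repository and a working copy.
set -e
work="$(mktemp -d)"
cd "$work"
svnadmin create repo
svn checkout "file://$work/repo" wc
cd wc

# One change spanning two tiers: a procedure script and the code that calls it.
mkdir app db
echo "// data-access code calling dbo.GetCustomer" > app/CustomerRepository.cs
echo "-- CREATE PROCEDURE dbo.GetCustomer ..."     > db/GetCustomer.sql
svn add app db

# Both files land in the repository as a single revision (r1). If either
# failed, the whole transaction would roll back and neither would commit.
svn commit -m "Add GetCustomer procedure and its data-access caller"
svn log -q -r HEAD
```

The single revision number is the point: there is no state in which the repository holds the caller without the procedure.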
For an application to be truly atomic under source control – that is for all its components to be successfully versioned into a single revision – the data layer must coexist as an equal partner with the other application tiers. Developers can then have confidence in the integrity of the repository and know that any single revision has an entire picture of the application from that point in time.
If this can’t be achieved, the application becomes fragmented. For example, if a revision successfully commits an ADO.NET call for a stored procedure but does not commit the data layer containing that procedure, the application will break at runtime.
Models in which only selected components are versioned provide a very incomplete story about what the application truly looks like. Not only is this a risky proposition for keeping ongoing development synchronised across a team, it makes rolling back to previous revisions extremely tricky.
The problem with attempting to holistically version an application directly from the IDEs is that the data layer is committed independently of the application layer. Assuming this happens in sequence (perhaps from SSMS then Visual Studio), there is a window in which someone else may pull down the data layer only, and if that includes a dropped column on which the app layer has a dependency, things will start to break.
In this regard, commits become only semi-atomic. Sure, each SVN transaction is atomic, but the fact that you need two of them to holistically commit the application leaves a window where problems can occur.
Another typical problem arises when the database layer is not in source control and new tables are added to the database, then persisted into the data-access layer via an ORM such as LINQ to SQL. If dependencies on these persisted classes are committed without the underlying data layer, regeneration of the classes – such as by SqlMetal – on another developer's machine will produce build-time errors, because the tables those classes map to don't exist in that developer's database.
Exploratory processes and rollback
Software development involves exploration and experimentation. Developers will often pursue a particular path, and then elect to come at the problem from a different angle. It is enormously important for both productivity and creativity to be able to do this without being hindered by the fear of lengthy or irreversible changes.
With Red Gate SQL Source Control, the familiar “revert” functionality now makes its way into SSMS as “Undo changes”. This brings the ability to explore and make mistakes – which are now rectifiable – right into a very familiar development environment.
A typical use case for this functionality is found in data modelling exercises. Various data models are often trialled and discarded during the development process, with code changes occurring across both the database and application layers. It is invaluable to be able to simply right-click and undo changes.
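Under the hood this is the classic revert workflow. A minimal Subversion sketch of the same idea, with a throwaway repository and an illustrative table name:

```shell
# Scratch repository and working copy.
set -e
work="$(mktemp -d)"
cd "$work"
svnadmin create repo
svn checkout "file://$work/repo" wc
cd wc

# Commit a baseline version of a table script.
echo "CREATE TABLE dbo.Orders (Id int NOT NULL);" > Orders.sql
svn add Orders.sql
svn commit -m "Initial Orders table"

# An experimental change that didn't work out:
echo "ALTER TABLE dbo.Orders ADD Legacy bit;" >> Orders.sql
svn status      # shows Orders.sql as modified

# 'Undo changes' in SSMS is the equivalent of reverting to the last revision:
svn revert Orders.sql
svn status      # clean again; the experiment is gone without a trace
```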
Likewise with performance profiling: this is an exercise which is rarely perfected on the first attempt, and often involves numerous cycles of exploration and trials of various indexes before the correct balance is found. Obviously rollback is very handy here, but Red Gate SQL Source Control can also give you a complete summary view of exactly what has changed before code is committed.
The ‘commit changes’ window lists each modified object along with a ‘compare’ window that shows precisely what’s new since the last revision. For development work that requires extensive modifications of objects between commits, it is useful to be able to get a quick snapshot of the changes. Of course SQL Compare has always done a fantastic job of this, but only when comparing against another target database which may now be well and truly out of sync with the development stream.
Retrieving application tags
In addition to all the advantages of source control at the database level during development, there is another advantage that persists long after a version has gone live: the ability to retrieve a complete picture of a previous release.
This is very much a source control 101 concept: tags are simply a labelled instance of the code at a point in time. For example, each release of a product is often accompanied by a tag bearing the version number. If version 1.0 of a product is ever required at any time in the future, the developer can have confidence it's captured in perpetuity within the 1.0 tag.
A common use case for retrieval of a tagged version is where the development of a product has continued after a release but a bug has been identified in the live version. The development trunk is not necessarily the right place to resolve this bug as it often contains new features not yet ready for release. The only way to reliably fix the bug, and release nothing more, is to pull the code from the tag into a new branch of development.
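In Subversion terms, both the tag and the hotfix branch are cheap server-side copies. A minimal sketch using a throwaway repository (the conventional trunk/branches/tags layout and the file names are illustrative):

```shell
# Scratch repository with the conventional layout.
set -e
work="$(mktemp -d)"
cd "$work"
svnadmin create repo
REPO="file://$work/repo"
svn checkout "$REPO" wc
cd wc
mkdir trunk branches tags
echo "-- application schema, version 1.0" > trunk/Customers.sql
svn add trunk branches tags
svn commit -m "Release 1.0"

# Tagging a release is a cheap server-side copy:
svn copy "$REPO/trunk" "$REPO/tags/1.0" -m "Tag version 1.0"

# Later, when a bug surfaces in the live version, branch from the tag
# rather than from trunk, which may already contain unreleased features:
svn copy "$REPO/tags/1.0" "$REPO/branches/1.0-hotfix" -m "Hotfix branch for 1.0"
svn update
```

The fix is then made and released from the 1.0-hotfix branch, touching nothing that has landed on trunk since the release.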
This brings us back to the discussion about the need for an 'atomic unit'. If the 1.0 tag does not reliably represent the application in its entirety at version 1.0, then recreating the application state at that point in time is going to be a very risky proposition. Often the database is neglected, so when that version is pulled the developer gets a nicely versioned instance of the application but no database. By then the trunk has progressed and the database has quite likely changed with it, so the application simply can't be pieced back together again without some potentially arduous work.
Conclusion: Red Gate SQL Source Control reduces risk and increases productivity
Source control is often viewed as simply a means of versioning an application, and whilst that may be true from a purely functional perspective, the real story is the mitigation of risk and the gains in productivity.
The thing about Red Gate SQL Source Control is that it takes that, well and truly, to the next level. The advantages that developers have enjoyed for years in the application space now truly extend to the data layer and finally the application becomes a complete atomic unit in the repository.
To be able to do all this in an entirely integrated fashion with full support for distributed development databases across a team – without resorting to clunky scripting processes – is a very big leap forward indeed. Can you afford the risk and productivity loss of doing database source control management any other way?