26 September 2016
26 September 2016

Versioning an existing database using SQL Source Control

A while ago, maintaining a reliable database change history was an ongoing problem within our development team. It seemed we were forever asking questions such as “Who made this change?”, “When was the stored procedure last edited?”, “When was this view created?”, and so on. Our initial approach was simply to store a list of all database changes in a Word document! I cannot count the number of times we forgot to update that document and missed an important component. As a result, pushing changes to production was often a time-consuming and error-prone task.

None of this was a problem with our C# or JavaScript code because it was all under version control, so the answer seemed simple: get the database in version control too! Finding the right solution was harder. It had to support Git, it had to integrate with SQL Server Management Studio (SSMS), the tool almost every developer in the company uses to query SQL Server, and it had to be easy to set up.

After some evaluation, we chose Redgate’s SQL Source Control. It met all our criteria and the first time I used it, I had my database under version control in under five minutes, without resorting to the manual for help. Nevertheless, getting database version control set up and running on your own laptop is one thing. Reaching the point where every development team is ready to use it is another. I had to learn some hard lessons in order to move from evaluation to company-wide adoption at Farm Credit Services of America. Something new would come up each time a team started using the tool.

This article describes a few of the important planning considerations, and then provides a simple walkthrough of how our team brings an existing database under version control using SQL Source Control.

Planning for database version control

Before embarking on database version control, aka database versioning, the team of course needs to consider the practicalities: which version control system would suit them best, which databases need to come under version control and what that entails, whether each developer works on their own local, dedicated copy of the database, or all developers use one shared database, and so on.

In my experience, these are two of the most important initial steps to get right:

  • Reconciling schema differences in existing versions of a database, prior to version control
  • Making the right choice between the shared and dedicated database development models

I’ll only cover these two considerations in this article but the team also need to think about how they can coordinate their development efforts once the database is under version control, how they will build and deploy new databases from version control, how to automate these processes, which branching strategy will best support their database deployments, upgrades, bug fixes and so on. You can find a good discussion of broader database versioning strategies and challenges in the eBook, SQL Server Source Control Basics.

Reconciling schema differences

Before you even think of putting a database under version control, you need to reconcile all the schema differences between each environment.

Once a database is in version control, the build scripts that define the state of each database object, for a given database version, become the ‘truth center’ for that database. If you’ve not reconciled the environments properly, and a mission-critical table exists in one environment but its DDL script is not in version control, the next time you deploy to the environment, you risk that object being wiped out. Deleted. So, let me say this again, because this is the most important step:

Reconcile all schema differences between each environment before putting a database under version control.

When I first put a legacy application’s database under version control, I could not believe the differences I found between environments. Test tables existed in development. Column properties existed in development and test but not production. Views existed in production but not in development.

The hard part in this situation is determining which environment should be the environment of record. The production environment seems like the obvious choice, but we often had many database changes sitting in our final testing environment, fully tested but waiting to be deployed. Generally, therefore, I chose Test as the environment of record, and made sure all other environments matched it, in terms of schema and static data, using SQL Compare and SQL Data Compare. Having done this, you’re then ready to get that instance of the database put under version control.

Choosing between the shared and dedicated models

You have two choices, dedicated or shared. Shared means everyone uses the same database on the same server, while dedicated means each developer has their own copy of the database running locally. Shared is a great place to start because it lowers the barrier to entry due to the lower setup time, and the fact that a DBA can be responsible for all the configuration changes of which most developers are not even aware, such as security and file groups. That said, I recommend the dedicated model (as does Redgate) because it offers many benefits:

  1. Schema changes and code changes (C#, JavaScript) go hand-in-hand. This allows developers to get all the schema changes needed (table changes, stored procedures, views, functions, and so on) working locally before pushing them out.
  2. Branching strategies are supported. A shared model only supports one branch, the master. With shared, you better hope all new changes made to your database are non-breaking changes or you will have issues. In one case, one pair of developers were making a change to the UI code, while another pair were renaming a database column. The rename took five seconds, the code changes took a few hours. Without the new UI code in place, the column rename operation brought down anyone who was pointing their local code to that database. It should have all been done in a branch.
  3. Local data can be messed up. If I mess up my database, no one but me cares. I can easily recover by recopying data using SQL Data Compare or restoring from a backup. I don’t hurt anybody but myself or waste anyone else’s time.

Demo: version controlling AdventureWorks using SQL Source Control

Having taken care of the preliminaries, the actual process of version controlling an existing database, for a given team, involves the following steps

  1. Add the database to the master branch in version control
  2. Install SQL Server on each developer machine
  3. Set up the database on each developer machine (using database creation scripts)
  4. Bring changes down from version control and apply them to each development database

For this demo, I’ll assume you have SQL Source Control installed (I’m using version 5.2), and we’ll put the AdventureWorks2014 under version control. The version control system in this case is Git. I’ve put the project up on GitHub if you would like to see what everything looks like before getting started.

Step 1: Add database to version control

Let’s say you’ve established the AdventureWorks2014 database that you want to put into version control (such as the one in the test environment), and you’ve reconciled any schema differences in other environments.

Make one developer responsible for this initial step of getting the chosen database into the version control master branch. I saw one project where three developers had attempted to do this independently. The results were … interesting.

In step 4, later, I’ll show how all other developers can subsequently pull down the changes from master.

Link to version control system or to working folder?

If you right-click on the database you want to put under version control, AdventureWorks2014 in this case, and select Link database to source control…, SQL Source Control will open up and you’ll immediately need to choose between linking directly to the version control system, or linking instead to a working folder.

If you are using TFS or SVN then direct version control integration with SQL Source Control will work just fine. However, Git is rather unique because of the way it handles branches, and SQL Source Control doesn’t yet fully support some of the Git functionality like complex merge conflicts. Critically, for us, there is no way to see easily all pending changes (C# and database) not checked into Git. We store our application source code (C#, JavaScript) in the same repository, right alongside the database. This is key for us because it means we can bundle all related changes into one commit.

As such, I recommend developers on my team use the “working folder” option (see Figure 1). By checking into the working folder, the developers can then follow their typical workflow, using their Git tool of choice to commit, update, push and pull those changes. It is treating the database just like C# or JavaScript code.

Database version control 1

Figure 1

The next step is to specify the working folder location. The team should have agreed upfront on a folder structure for the version control repository (which of course the working folder will mirror). Our standard is as follows:

  • GIT or Mercurial: /[RootSourceControlFolder]/Database/[DatabaseName]/
  • All Others: /[RootSourceControlFolder]/[BranchName]/Database/[DatabaseName]/

Enter the working folder location, and below that select the Shared database option, since we are establishing the database in the master branch of version control, to which all developers will have access. Subsequently, in step 5, all developers will link to this shared location, pull changes down to their local machines, and sync their local, dedicated databases.

Database version control 2

Figure 2

Commit changes to the working folder

Click on the Commit button on the top menu, in order to commit all the changes to the working folder. SQL Source Control will present a list of all database objects that exist in the AdventureWorks2014 database but not in the working folder, or that exist in both but are different. In this case, since the working folder is currently empty, you’ll simply be reviewing the CREATE scripts that will be created for each object in the working folder.

Select all the objects by checking the very top checkbox and hit Save changes.

Database version control 3

Figure 3

The Commit screen should now register no changes to be committed to version control, and your working folder should look like that shown in Figure 4. SQL Source Control has generated the object creation script for all the tables and code objects in the database, automatically categorized according to object type.

Database version control 4

Figure 4

You can now use your version control tool of choice to commit the changes to the local repository, and then push them to the remote master branch, where the rest of the team can access them. In this example, I am using Git as my version control system and GitKraken as my GUI.

Database version control 5

Figure 5

Finally, if each developer will pull from here, in order to synchronize their own dedicated database, then you will want to unlink this database from source control.

Database version control 6

Figure 6

From a developer’s perspective, putting the tables and all database code objects into version control, along with any necessary static data (not covered in this article) is the major requirement. However, just bear in mind that SQL Server has a lot more components, such as ETL scripts, SQL Jobs, and so on, which you should also consider versioning. The ultimate objective is to be able to recreate a fully operational database system, at any version, just from what’s in version control.

Step 2: Script local SQL Server setup

If you use the dedicated database model, each developer will need to install SQL Server on his or her local machine. For uniformity, I highly recommend creating a .bat file that will perform the install. Microsoft provides some excellent documentation on how to do this. I recommend putting this in a shared location or in version control. The creation folder (/[RootSourceControlFolder]/Database/[DatabaseName]/creation) works fine for this.

Step 3: Script database creation

To establish a local working copy of the AdeventureWorks2014 database, synchronized with the master version control repository, each developer first needs to create an empty local copy of the database. SQL Source Control will not create the database creation scripts for you. It’s a good idea to generate a basic database creation script in SSMS and then place it somewhere where all developers can access it, such as:

/[RootSourceControlFolder]/Database/[DatabaseName]/creation

Each developer should open this script and run it.

Database version control 7

Figure 7

Step 4: Synchronize local database with version control master

The final step is for each developer to synchronize their local database to the version control master branch. Having pulled from master (remote) to their local repo, and then updated their local working folder, each developer can then link their local database to this working folder.

The steps are very similar to those outlined earlier. Each developer will choose to link to a working folder, specify its location, but will choose the dedicated development model.

Database version control 8

Figure 8

You will be prompted to get the latest changes and clicking on Get latest will reveal a list of changes that will synchronize the local database with the working folder.

Database version control 9

Figure 9

Unfortunately, you’ll be presented with a warning, shown in Figure 10. SQL Source Control (which uses the SQL Compare engine under the covers) wraps the synchronization script within a transaction, so that it can roll back all changes should something go wrong. AdventureWorks uses a Full-Text Catalog and one of the quirks is that a full-index cannot be created in a transaction.

In this case, the JobCandidate table has a full-text index to support full-text queries against the Resume column. As the warning indicates, SQL Compare will deploy the full text index changes after the transaction completes, but if changes to objects that reference that index are included in the transaction, then the synchronization script may fail.

Database version control 10

Figure 10

If you click Acknowledge, the synchronization will fail, because the uspSearchCandidateResumes stored procedure uses the full text index on this table to search within the candidate resumes.

There are a couple of ways around this sort of problem. If you have a lot of objects that reference a full-text index, you might just consider running the whole synchronization script outside a transaction. To do this, click Cancel instead of Acknowledge.

Go to the Setup menu, click Comparison options, check the Don’t use transactions in Deployment scripts option, and then hit Save.

When you rerun Get latest, you’ll receive a high-level warning about the lack of transactions, and advice to back up the database before you start (it’s good advice!). Once the synchronization is complete, remember to re-enable transactions for deployment scripts.

An alternative, if there are only a few dependencies, is to exclude from the initial synchronization any objects that reference that full text index, in this case uspSearchCandidateResumes.

Database version control 11

Figure 11

On applying the changes, you’ll receive the same warning, as shown in Figure 10, but this time the synchronization will succeed. When complete, the Get latest screen will report that uspSearchCandidateResumes is the only object that exists in the working folder but not in the database. Simply run Apply changes a second time to complete the synchronization.

Smoothing the path to team adoption

Schedule a meeting and have the developers bring their laptops. When it was time for my team to start using the tool, I took two different approaches. My first was to send an email with step-by-step instructions. That had a 40% success rate. I ended up at a lot of developer’s desks helping them set up. For the next database, I scheduled a meeting. The success rate shot up to 90%. When you schedule a meeting for the developers on the team, an ideal set of goals should be:

  • Install SQL Server
  • Install Redgate SQL Source Control
  • Create their local databases, using database creation scripts
  • Bring changes down from version control and apply them to the database
  • Do test commits.

It takes about an hour to do all of this and by the time the developers walk out the room they should be able to make changes and check them in.

Conclusion

Getting our databases in version control has solved a lot of problems for us. We no longer wonder when a change was made or who made it.  We know exactly what changes need to be deployed to production. And finally, it has allowed us to start making our database changes locally, which in turn has allowed us to adopt a modified version of the Gitflow branching strategy.

Redgate’s SQL Source Control makes this all very easy. The hardest part is getting all the prep-work done. I have put multiple databases under version control, all with varying complexity, some with a few tables, and others with 100s of tables. It now takes me about a couple of hours to a day, from start to finish, with most of that work being prep-work.

If you’d like to know more about database deployments or .NET development, Bob Walker’s blog is regularly updated with informative articles.

Tools in this post

SQL Source Control

Connect your databases to your source control system.

Find out more

Share this post.

Share on FacebookShare on Google+Share on LinkedInTweet about this on Twitter

Related posts

Also in Hub

Building reusable table build scripts using SQL Prompt

You've been working on a query, function or procedure and you want to store the results of executing it in a table; maybe a table variable, or a temporary table. These all require column definitions, ...

Also in Product learning

Using SQL Data Compare to Synchronize Custom Error Messages

One way to offer a better experience to our application users is to anticipate likely mistakes they could make, such as when filling in a web form, and provide them with a meaningful error message tha...

Also in SQL Source Control

A strategy for implementing database source control

Much has been written on the benefits of having a database under source control though many articles are clear on "why" but conspicuously vague on "how". Prior to our organization's decision to embrac...

Also about SQL Source Control

Moving from application automation to true DevOps by including the database

The recent State of Database DevOps Report revealed that within two years, 80% of companies will adopt DevOps. That’s an interesting finding in itself, but the report also showed that 75% of compani...