Once you’ve committed to changing your culture in order to automate your database deployments, what’s next? You’ve already done the hard part, making the decision to shift the culture. Now, what’s involved is just lots of labor. There are three things you can do to begin your Database DevOps journey:
- Get your database into source control
- Start using a tool to build databases from source control
- Automate those builds through a Continuous Integration process
Let’s discuss these in detail. It’s important to understand not just why these are your first three steps, but why they should occur in this precise order.
The first thing you have to do to get started on your DevOps journey is get your databases into source control. From a technical standpoint, this is probably the easiest of the three steps. You can use almost any tool you like to script your databases out to the code that defines them. Then, it’s a very simple matter to add those scripts to the source control management system of your choice. No, the challenge in getting your databases into source control comes after the fact. Developing the disciplines and habits to ensure that all changes are made first to the code that defines the database, rather than directly against your databases, can be a real challenge. This, not the physical act of getting your code into, and out of, source control, is where you’ll spend your effort. It really is simply a matter of developing a new habit though, so it should be achievable by most people.
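To make the "script it out, then commit it" step concrete, here is a minimal sketch in Python. The object definitions are hard-coded placeholders; in practice a scripting tool or a system catalog query would supply them. The folder layout (one .sql file per object, grouped by type) is a common convention, not a requirement.

```python
import tempfile
from pathlib import Path

# Hypothetical object definitions keyed by (object_type, object_name).
# In a real setup these come from your scripting tool or system catalogs.
definitions = {
    ("Tables", "Customer"): "CREATE TABLE Customer (Id INT PRIMARY KEY, Name VARCHAR(100));",
    ("Views", "ActiveCustomers"): "CREATE VIEW ActiveCustomers AS SELECT Id, Name FROM Customer;",
}

def script_out(root: str) -> list[Path]:
    """Write one .sql file per object, in a folder layout suited to source control."""
    written = []
    for (obj_type, name), ddl in definitions.items():
        path = Path(root) / obj_type / f"{name}.sql"
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(ddl + "\n")
        written.append(path)
    return sorted(written)

root = tempfile.mkdtemp()   # stand-in for your repository's working folder
files = script_out(root)
```

From there, adding the folder to your repository and committing it is all your source control system needs.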
Then, you immediately get benefits. Of the three steps, this is the one with a more or less instantaneous return on investment. For free, as soon as you start working on your database code through source control, you get change tracking and auditing abilities. Who made that change to the code? The source control system can tell you not only who, but when, and precisely what change occurred.
You also instantly receive a unique ability that’s normally not achievable with a database, the ability to undo certain changes. Yes, if you drop a table, you’ll still need to go to a backup. However, what about modifying a view or a stored procedure? If you make a change there, but then want to change it back, source control gives you the exact code you had before.
Finally, another benefit is that source control provides another level of backup for your system. It isn’t as thorough as a full backup with all the data. However, it is one more layer of protection that gives you the ability to at least recover the architecture of your system in the event of some dire occurrence.
In fact, I argue that getting your databases into source control is worth doing, even if you never plan to change how you develop and deploy your databases. The benefits are too many, and too easily achieved, to ignore.
With your database in source control and all your changes being managed there, it’s time to start to automate database builds.
Building Your Databases
The next step on your DevOps journey is ensuring that you can build your databases from the code in source control. Here, you are going to have to seriously consider tooling. While you don’t need any kind of tool to work with source control, beyond the source control system itself, automating database deployments demands some kind of tool. The issue is simple enough to define: what has changed in the code since the last deployment? Determining which pieces of your database have changed, and then ensuring that the scripts you have can be successfully deployed, is a real challenge.
I have seen this challenge met without tools; however, it’s a very tedious and slow process. You have to document every single change you make. Along with that documentation, you create change scripts by concatenating your changes or writing custom scripts. Then, you deploy that script and start the process over again. The slightest deviation or mistake leads to all sorts of rework. This is, frankly, the wrong way to go about it.
Instead, you should look to build or buy a tool. With a tool, you have one of two approaches to automating your database builds from source control: state-based or migrations-based. Let’s talk about what each of these entails.
State Based Deployments
A state-based deployment tool compares two known states and arrives at a list of changes. Either state can be a database, a backup, or your code in source control. All you need to do is compare your database to the current version of your source control, a given branch, or a previous version. The tool will then generate any scripting needed to make the database look exactly like what is in source control.
There are a number of strengths to this approach. First, the tool does a lot of the work for you. You don’t have to track your changes much at all; get everything into source control and the tool will figure out what needs to change. Further, these tools are extremely good at generating functional scripts that apply your changes efficiently and safely. Also, conceptually, the state-based approach is easy to understand and communicate to others.
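The core of a state-based comparison can be illustrated with a toy diff. Real comparison tools do vastly more (dependency ordering, safe column alters, transactions), but the idea is the same: compare the desired state from source control to the current state and emit whatever scripts close the gap. Everything below is illustrative, not any product's actual behavior.

```python
# Toy state comparison: each "state" maps object name -> its definition.
def diff_states(current: dict[str, str], desired: dict[str, str]) -> list[str]:
    scripts = []
    for name, ddl in desired.items():
        if name not in current:
            scripts.append(ddl)                        # new object: create it
        elif current[name] != ddl:
            scripts.append(f"-- alter {name}\n{ddl}")  # changed: rebuild or alter
    for name in current:
        if name not in desired:
            scripts.append(f"DROP TABLE {name};")      # gone from source: drop it
    return scripts

current = {"Customer": "CREATE TABLE Customer (Id INT);"}
desired = {
    "Customer": "CREATE TABLE Customer (Id INT, Name VARCHAR(100));",
    "Orders": "CREATE TABLE Orders (Id INT);",
}
changes = diff_states(current, desired)
```

Notice how a renamed column would show up here: the old name falls into the "drop" bucket and the new name into the "create" bucket, which is exactly the weakness described next.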
There are some weaknesses with this approach, however. Not all changes can be easily scripted by a tool. For example, renaming a column is very straightforward to you, but to a tool it looks as if you dropped an existing column and added a new one, and those are the kind of scripts you’ll see generated. Also, if you have any kind of custom data movement necessary to support a database change, not only will the tools not generate it for you, they can struggle to run such scripts in the right way when needed.
People will generally work around these issues by using pre- and/or post-deployment scripting. You can capture the unique script you need to run and then run it as needed, either manually or by using some automation. However, this does add some manual processing to your deployment automation.
Another way to address this is through the migrations approach.
Migrations Based Deployments
We’ve actually already discussed migrations-based deployments earlier. When we talked about documenting all your changes and getting them into the right order, that’s what a migrations approach is all about. However, you can put tools to work to make this process a whole lot easier. The tools can readily track which version of the database you’re on by marking the database, using a checksum, or updating a table. With that information in hand, the tool knows which set of changes is needed to move the database from one version to the next. Some tools will even help you generate your scripts. Others will count on you doing the scripting.
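A bare-bones migrations runner shows how the version-tracking table works. This sketch uses SQLite (from Python's standard library) purely as a stand-in for your real database engine, and the table name `schema_version` is a common convention, not any particular tool's choice.

```python
import sqlite3

# Numbered migration scripts, applied strictly in order.
migrations = {
    1: "CREATE TABLE Customer (Id INTEGER PRIMARY KEY, Name TEXT);",
    2: "ALTER TABLE Customer ADD COLUMN Email TEXT;",
}

def migrate(conn: sqlite3.Connection) -> int:
    """Apply any migrations newer than the recorded version; return the final version."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_version (version INTEGER)")
    row = conn.execute("SELECT MAX(version) FROM schema_version").fetchone()
    current = row[0] or 0                      # no rows yet means version 0
    for version in sorted(migrations):
        if version > current:                  # only run the pending scripts
            conn.execute(migrations[version])
            conn.execute("INSERT INTO schema_version VALUES (?)", (version,))
            current = version
    conn.commit()
    return current

conn = sqlite3.connect(":memory:")
applied = migrate(conn)
```

Running `migrate` a second time is a no-op, which is the whole point: the recorded version tells the tool exactly which changes are still outstanding.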
The strengths of migrations are almost the mirror image of the weaknesses of state-based deployments. Because you directly control the scripts and the order in which they run, any kind of customization or special handling is simply put into place. As long as the scripts are valid, the tool will run them for you. The migrations approach is also closer to how people actually work and do their deployments, so implementation is much simpler.
There is one glaring weakness when using migrations: the tendency for development churn to find its way into production. What I mean by this is that it’s common, indeed expected, to make lots of changes in development. You may add, and subsequently drop, a given column three or four times while you’re trying things out. If that set of changes is in source control, you could see that same series of actions played out in production.
The best way to work around this is through a disciplined approach to when changes get committed to source control. Keeping source control in order means your deployments will also be in order. Another way to work around it is to use a hybrid approach wherein, while you’re developing, everything is done in a state-based fashion, so only the net changes are dealt with. Then, when it comes time to deploy to testing, and then on to production, a migration script is generated and only migrations are used moving forward through the process. Redgate Deploy can be used in this hybrid fashion.
After Picking Your Tool
Once you’ve selected a tool and a deployment approach, your work here is straightforward enough: learn how to control the tool through the command line. Yes, something like Redgate Deploy or SQL Source Control gives you a graphical interface, and you can deploy from there. But our goal is to automate these processes, so we need to spend time learning that automation. Practice deployments using the tool and your code from source control until you can successfully deploy the database every time. Once this is completed, you’re ready for your final step: automating everything.
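The exact command line depends entirely on your tool, so as a sketch, here is the kind of invocation you end up assembling and handing to a scheduler or CI server. The tool name and every flag below are hypothetical placeholders, not any real product's syntax; consult your tool's own command-line reference for the real ones.

```python
def build_deploy_command(source_dir: str, target: str, dry_run: bool = True) -> list[str]:
    """Assemble a (hypothetical) deployment-tool invocation as an argument list."""
    cmd = ["deploy-tool", "build", "--source", source_dir, "--target", target]
    if dry_run:
        cmd.append("--dry-run")   # generate the change script without running it
    return cmd

# Practicing with a dry run first lets you inspect the generated script safely.
cmd = build_deploy_command("schema", "server=test;database=Sales")
```

Once an invocation like this works reliably by hand, wiring it into automation is a small step.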
Step 3 in your journey is to set up a continuous integration process. Continuous integration is simply the set of steps needed to take changes from your source control system and automatically deploy them to a testing location. You may also run automated tests after that successful deployment. You can read all about what defines Continuous Integration (CI) here.
However, I don’t want you to sweat all the additional testing just yet. Remember, we’re just getting started on our database deployment automation journey. You’ve successfully started using source control, where you’re already seeing benefits. You’ve figured out how to get your code out of source control and deploy it to a database in an automated fashion. Now, all I want you to do is set up a process that runs that automation. I want you to decide how it’s going to work. You could schedule the process so it runs once a day, or even several times a day; this wouldn’t require much in the way of tooling. Or, you could look at getting a tool involved.
Tools like Azure DevOps, AWS Developer Tools, Octopus Deploy, Jenkins, or a whole slew of others, are designed around automating deployments in support of Continuous Integration. These tools can plug directly into your source control system. With that, you can direct the tool to build your database every time code gets committed, every time a branch gets merged, or whatever works best for your situation.
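Underneath, what these tools implement for you is a simple trigger loop: build whenever a new commit appears. The sketch below simulates that loop; `get_latest_commit` is a stand-in for a real query against your source control system, and the commit hashes are invented for illustration.

```python
def make_ci_loop(get_latest_commit, run_build):
    """Return a poll() function that builds whenever a new commit appears."""
    last_built = None
    def poll() -> bool:
        nonlocal last_built
        commit = get_latest_commit()
        if commit != last_built:        # new work since the last build
            run_build(commit)
            last_built = commit
            return True
        return False                    # nothing new; skip the build
    return poll

builds = []
head = {"commit": "abc123"}                           # stand-in for the repo's HEAD
poll = make_ci_loop(lambda: head["commit"], builds.append)
first = poll()    # new commit seen, so a build runs
second = poll()   # same commit, so nothing happens
```

A scheduled version of this is just `poll()` on a timer; a webhook-driven tool inverts it so the source control system calls in when a commit lands.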
I want you to focus only on getting your database deployed automatically. In fact, I’d encourage you initially to deploy only to empty databases. Yes, ultimately we’re going to have to deal with data. However, for these starting steps, tossing data in is just going to muddy the waters. You’re going to be going through a bit of a learning curve, from source control, to the tool that does your deployments, to the flow control tool that’s going to get you going in CI; your plate is pretty full with learning. Toss in a lot of time spent troubleshooting data migrations, not to mention the time needed for moving data around, and it quickly becomes overwhelming. So I really encourage you, while you’re learning, to just leave data out of it.
Further, don’t bother putting a lot of tests into your CI process at the start. Ultimately, yes, just like with the data, you’re going to want to do all sorts of testing. However, as before, let’s focus our time and effort on just getting the CI process to run. Later, you can begin to add tests in an incremental fashion.
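When you do add that first test, a good candidate is a smoke test: after the automated build, assert that every object you expect actually exists. The sketch below uses SQLite as a stand-in for the target database; against SQL Server or another engine you would query its own catalog views instead.

```python
import sqlite3

def smoke_test(conn: sqlite3.Connection, expected_tables: set[str]) -> bool:
    """Pass only if every expected table was actually deployed."""
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type='table'")
    present = {r[0] for r in rows}
    return expected_tables <= present   # subset check: all expected tables present

# Simulate the freshly built (empty of data, but deployed) test database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (Id INTEGER PRIMARY KEY)")
ok = smoke_test(conn, {"Customer"})
```

A failing smoke test stops the pipeline early, before anything more expensive runs, which is exactly the incremental kind of safety net worth adding first.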
The rewards here are not as immediate, but they are present. Just having the ability to test the code used for deployments means that you’ve added a layer of protection to your production servers. You’re much less likely to introduce bad code to production now that you can validate that code as you work with it.
Now that you’re on the DevOps path, these three steps are the most important ones for moving on to all the incredible amounts of automation your system will ultimately need. Start easy. Get your database into source control. Once you have it in source control, figure out how to get it back out and build a database with it. Finally, automate that through a CI process. Now you see why I picked these three steps, and the order they’re in, to get you started with database DevOps.
After you get started, you’ll want to design a full workflow. Download this white paper for guidance.