I was so excited to host our 4th webinar in the DLM Coffee Break series last week, and to have the chance to talk to the authors of the new book from Redgate, “Database Lifecycle Management: Achieving Database Continuous Delivery.” The book itself is being written on a continuous delivery (iterative) publishing platform called Leanpub, which applies the principles of lean and agile to publishing: the authors can get sections of the book out early, and readers can give feedback and help refine later versions. The book will be available at http://thedlmbook.com, so sign up to be notified when new content is released!
A recording of the webinar will be available on YouTube soon (I’ll update this post as soon as it’s live), and if you watch the last 5 minutes, you’ll hear more about the book from the authors. If you want to check out some of our previous webinars, the recordings are available in our DLM Webinar playlist.
This webinar was incredibly interesting because DLM is more than just the database, and we had two experts with really different backgrounds on the panel discussion. Grant is the Scary DBA. He’s been a DBA for a long time, and is a SQL Server MVP. Matthew’s background, on the other hand, is more focused on development and operability, but he has a lot of experience with database development and is a Redgate Certified Partner, providing consultancy on DLM.
To summarize the webinar: DLM is more than just the database and the reference data. It involves metrics, auditing, infrastructure, configuration, ETL, data warehouses, and more. It sounded so complicated, with so many moving parts, that I asked Grant and Matthew for some help getting started.
There were a few crucial takeaways, such as the fact that DLM works alongside ALM, Agile, Continuous Delivery, and DevOps. The biggest one for me is that DLM is essentially about avoiding surprises: moving small batches of changes through at a time, keeping environments similar (we discussed production vs dev data a lot too), and testing well, so that there aren’t any surprises when it’s time to deploy to production. The deployment process should be so easy that bug fixes go through it as well.
In addition to the discussion, there were some really great questions throughout, so Grant and Matthew have answered them here and we wanted to share their replies with you.
If you have any more questions, please comment below or get in touch with us at firstname.lastname@example.org. I hope to see you at our next webinar in August with Dave Farley, co-author of the Continuous Delivery book. Now, on to the questions!
If the change script is based on something created by sqlcompare, then that change script is untested
Right, so, I don’t ever use compare to generate a script against production. Instead, I use it to generate a script against a copy of production, in staging, qa, pre-prod, testing, continuous integration, so that I can generate a script once, then test that script over and over. Then, I know it works.
Is anyone actually doing this? I want to, and I agree that it should be done, but is it actually being done?
Yes. I’ve seen it. Heck, about 15 years ago, I did it. Amusing story… in retrospect.
Grant, as a DBA, how do you handle somewhat legitimate requests from an ill-mannered, disrespectful developer?
That’s a very tough question to answer in this forum. When the requests are legitimate, do them. If the manner of the requests is disrespectful, do them anyway, but note the attitude. You’ll have to try working with the developer to understand what’s going on. If that doesn’t work, you may have to resort to management, though I’d try to avoid that if I could.
I would use the disrespect shown to you and turn it around as an opportunity to build dialogue with the developer and whole Dev team. Explain that you felt their approach was disrespectful, but say that you’d like to work with them to make the requirement behind legitimate requests happen. Show incredible patience and good humour, and where possible conduct all initial communication in person – certainly not email or JIRA tickets.
Database source control: latest database state vs database migrations – how do you manage these two approaches?
Huge topic. Here’s as short an overview as I can summarize:
Most people start working on database development by either manually generating their scripts, or by using a graphical user interface directly against the database management system. In order to start from source control, though, you have to begin working with scripts. An initial, primitive approach is simply to have a manifest of scripts. Each change is recorded in a new script, and that script is added to the manifest so that it runs at the correct moment – usually this just means adding the new script to the end of the manifest. The scripts are all added to source control, along with the manifest. Deploying any given version of the database involves obtaining the correct set of scripts and manifest from your source control system and then running the scripts in order.
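The manifest approach above can be sketched in a few lines. This is a minimal illustration, not any particular tool’s implementation: SQLite stands in for the real database server, the script names are hypothetical, and in a real project each entry would be a .sql file in source control rather than an inline string.

```python
import sqlite3

# The manifest lists change scripts in the exact order they must run.
manifest = ["001_create_customers.sql", "002_add_email_column.sql"]

# Inlined here to keep the sketch self-contained.
scripts = {
    "001_create_customers.sql":
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);",
    "002_add_email_column.sql":
        "ALTER TABLE customers ADD COLUMN email TEXT;",
}

def deploy(conn, manifest, scripts):
    """Run every change script, in manifest order, against the target."""
    for name in manifest:
        conn.executescript(scripts[name])

conn = sqlite3.connect(":memory:")
deploy(conn, manifest, scripts)

# The deployed schema reflects every script applied in sequence.
columns = [row[1] for row in conn.execute("PRAGMA table_info(customers)")]
print(columns)  # ['id', 'name', 'email']
```

The ordering problem Grant describes next falls directly out of this structure: two developers who each append a script to the manifest will conflict on that one line.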
This simple approach works fairly well for small teams and small databases. But, as the team size increases, you begin to see a lot of conflict in maintaining the correct order of scripts in the manifest. As the size of the database increases, your database builds take longer and longer. Further, you’ll find that you create and drop an object multiple times, as the changes inherent in a normal development process occur. Or, you’ll at least modify the same object over and over again. As you begin to become interested in maintaining your data, this makes the build process slower and slower over time.
Sooner or later this primitive approach breaks down, and people begin to develop and build their databases using scripts that only update the database from one known state to another. This state-based approach generally requires tooling that gives the person developing and deploying the database a mechanism for generating a script that migrates the database between states. You’ll still have the ability to manually edit scripts as necessary, but you’re no longer maintaining a manifest to order the objects manually. Also, if there is an accumulation of changes for a given object, they’ll be consolidated into one change. If a given object has been dropped and recreated, no action will be taken, since the object will already exist and won’t require updates.
Moving to this first step on the road to automation also begins the process of ensuring that data is retained by each deployment, since each deployment will only ever make incremental changes to the system. This can be done with the manifest style of deployment, but each and every script has to be written so that it’s idempotent: each change is only made if necessary. You’ll still have the problem of changes being individual instead of cumulative, and objects will still be dropped and recreated.
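An idempotent change script checks the current state before acting, so running it twice does no harm. A minimal sketch, again using SQLite as a stand-in (table and column names are hypothetical; SQLite has no `IF NOT EXISTS` for columns, so the script inspects the catalogue first):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def apply_change(conn):
    # Create the table only if it does not already exist.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders (id INTEGER PRIMARY KEY)"
    )
    # Add the column only if it is missing.
    cols = [row[1] for row in conn.execute("PRAGMA table_info(orders)")]
    if "status" not in cols:
        conn.execute("ALTER TABLE orders ADD COLUMN status TEXT")

apply_change(conn)
apply_change(conn)  # safe to run again: nothing to do the second time

cols = [row[1] for row in conn.execute("PRAGMA table_info(orders)")]
print(cols)  # ['id', 'status']
```

Without the guards, the second run would fail on both statements; with them, redeploying an already-deployed script is a no-op, which is exactly what the manifest style needs to retain data safely.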
It’s also worth considering using different approaches at different phases of development. For instance, using the state-based approach during very rapid development periods where there is lots of churn, and then flipping to a migrations approach once the early refactoring has settled down. Several teams I know of have used this approach to good effect.
Git is better than TFS? Been using TFS for a long LONG time.
It really comes down to preference, but the differences are profound. Check it out, but make sure you understand how Git works, because it can appear similar, but it isn’t.
Git enables whole capabilities that are either difficult or not likely to happen with TFS. I should clarify that I am referring to the version control part of TFS specifically, rather than TFS as a development environment. In fact, the TFS suite now offers Git as a version control option, which is brilliant. Using Git for version control rather than the TFS version control drives many good patterns: smaller, more focused repositories; separation of libraries from application code; better tooling support; easier history and log analysis; less ‘theatre’ around version control in general; and – perhaps most important – avoiding the file locking that TFS version control tends to lead to.
What are some tools for DLM and Continuous Delivery?
Here are links to some of the tools we mentioned:
Github for Git-based version control and issue tracking: https://github.com
GuardRail from ScriptRock for configuration drift detection: https://www.scriptrock.com/product
GoCD for Continuous Delivery deployment pipelines: http://www.go.cd
Redgate DLM Dashboard for database drift detection: http://www.red-gate.com/products/dlm/dlm-dashboard
It’s not necessary that dev data and production data are the same for bug fixing and development. You can’t always have dev be the same as production.
If prod is 20TB, it’s just going to take too long. So, no, you won’t always have that. You just have to make dev as close to production as you can, and have as many tests and simulations as you can, to ensure that any given change won’t break production. And, of course, that the change fixes the problem too. I should have said that.
MS: I think that it’s actually a good thing that regulatory constraints often mean we cannot take a copy of Production data for upstream environments, because this forces us to think more carefully about the specific data conditions under which certain query performance problems occur. For example, does a stored procedure really need the 10TB dataset in order to trigger a timeout, or is the timeout also present with (say) 10,000 cities in a reference data table? We should investigate more carefully the actual reasons for performance problems and isolate these as specific test cases with pre-generated data. Knowing more about the causes of behaviour in our Live system cannot be a bad thing!
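Matthew’s point about pre-generated data can be sketched briefly. The table name, row count, and generation logic below are hypothetical, chosen to mirror the “10,000 cities” example: the idea is that a specific test condition is reproduced from synthetic reference data rather than a copy of production.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cities (id INTEGER PRIMARY KEY, name TEXT)")

# Generate 10,000 synthetic reference rows -- enough to reproduce the
# data condition under test, with no production data involved.
conn.executemany(
    "INSERT INTO cities (name) VALUES (?)",
    [("city_%d" % i,) for i in range(10_000)],
)

count = conn.execute("SELECT COUNT(*) FROM cities").fetchone()[0]
print(count)  # 10000
```

A test case built on data like this pins down the actual cause (row count, data shape, distribution) rather than depending on whatever happens to be in Live this week.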