The Mindset of the Enterprise DBA: Harnessing the Power of Automation

After you have done the necessary groundwork of standardizing and centralizing your database administration processes, you are in a position to automate some of them effectively. Which ones do you choose, and how do you set about automating them?


This article is part of a series:

  • On Adopting the Mindset of an Enterprise DBA
  • The Mindset of the Enterprise DBA: Creating and Applying Standards to Our Work
  • The Mindset of the Enterprise DBA: 8 Ways To Centralize Your Work
  • The Mindset of the Enterprise DBA: Harnessing the Power of Automation
  • The Mindset of the Enterprise DBA: Delegating Work

Introduction

Once you have standardized and documented all your processes, and centralized them wherever possible, you can take full advantage of a powerful way to reduce the busywork you suffer every working day: automating many of those processes.

Automation, for our purposes, is defined as the deployment of software tools and scripts that reduce, for the working DBA, the amount of manual input and time required to complete our work. The type of tools we use will vary; they will range from sophisticated third-party tools to scripts of our own making. We may also use and modify community scripts. The extent to which they decrease our workload will vary too; as long as we’re making positive steps in that direction, we’re on the right path. Any time not spent being a MATM (Monkey At The Monitor), and instead spent on higher-value tasks, is time well spent.

I was, for example, able to decrease the amount of time I spent building and installing SQL Server instances from around a day of work down to under twenty minutes of human-interactive time (the process took longer, but it ran in the background). Being a development DBA, I did this task more often than I’d like (developers are demanding types; I know because I am one), so this reduction in time significantly lightened my workload, and had the added benefit of making my customers happier by providing them with faster deliveries. Automation for the win!

Choosing Candidates for Automation

So how do we go about choosing amongst the many candidates for automation in our typical work? Now that we have identified, documented and standardized these jobs, the task is a great deal easier. The best way to do this is to focus on two main questions.

How Much Time Do I Spend Doing This?

If you don’t spend a lot of time doing a task, then it is unlikely to warrant the effort of automating it. Even if it is a painful process, as long as it’s documented and you don’t have to do it very often, documentation is probably sufficient, and the task should be pushed down the list in favor of more suitable candidates.

Mind you, I’m talking here about the total amount of time spent on a particular activity, not the average. That is, if there’s a task you need to do that only takes you ten minutes, but you do it upwards of fifteen times a day, that’s still a pretty large amount of time overall. This is where tracking your time can prove very handy, as I’ve found that memory is an unreliable and inaccurate guide as to how we spend our time.

How Complex Is Automating This Going To Be?

Some tasks will be easy to automate, and others will prove difficult. In general, there are several factors involved in determining just how hard this is going to be.

How Complex Is The Logic Behind The Work?

In some cases the work of a DBA is a simple series of steps executed in the same order. Examples of this might include the job of setting up users on servers or creating new databases. Tasks with linear progressions are generally easier to automate.
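
To make that concrete, here is a minimal sketch of what one of those linear tasks might look like once scripted, in PowerShell. It assumes the SqlServer module is installed, and the server, database, and login names are purely hypothetical placeholders.

    # A linear task: grant an existing login read access to a database.
    # Placeholder names throughout; assumes the SqlServer module is installed.
    Import-Module SqlServer

    $serverInstance = 'SQLDEV01'          # hypothetical server
    $databaseName   = 'SalesDB'           # hypothetical database
    $loginName      = 'CONTOSO\jsmith'    # hypothetical login

    # Each step runs in the same order every time - no branching required.
    $steps = @(
        "USE [$databaseName]; CREATE USER [$loginName] FOR LOGIN [$loginName];",
        "USE [$databaseName]; ALTER ROLE db_datareader ADD MEMBER [$loginName];"
    )

    foreach ($step in $steps) {
        Invoke-Sqlcmd -ServerInstance $serverInstance -Query $step
    }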

On the other hand, there are tasks that require a long series of decisions based on external inputs and perhaps some subjective judgement. An example would be performance tuning through index creation. Even when no human decision-making is involved, tasks that require branched and complex logic are generally harder to automate, since that logic must be translated into some form of machine-executable code.

Are Existing Interfaces Available?

Perhaps some of the work we are doing involves things that can’t be automated because there is no way to write code that accomplishes what we must do manually through clicks of the mouse. Installing software comes to mind here, though in many cases vendors are getting better at providing mechanisms for passing inputs to installer processes (look no further than Microsoft’s good work on enabling command-line installation of SQL Server). Whatever the case, if there are no hooks or APIs to which your code can attach, the task of automation becomes much harder, if not impossible.
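
As a rough illustration, here is what an unattended, command-line driven SQL Server installation can look like when wrapped in PowerShell. The setup.exe switches shown are standard documented parameters, but the media path, feature list, instance name, and sysadmin account are placeholders you would adjust for your own environment and edition.

    # A sketch of an unattended SQL Server install driven entirely from the
    # command line. Adjust the path, features, and accounts for your environment.
    $setupPath = 'D:\setup.exe'   # path to the SQL Server installation media

    & $setupPath /Q `
        /ACTION=Install `
        /IACCEPTSQLSERVERLICENSETERMS `
        /FEATURES=SQLENGINE `
        /INSTANCENAME=MSSQLSERVER `
        /SQLSYSADMINACCOUNTS="CONTOSO\DBA-Team"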

The Two Forms of Automation

Let’s assume that, after going through the analysis described above, you’ve found some good candidates. At this point, it’s important to distinguish between what I call the two forms of automation. Both are valuable and should be pursued to the fullest extent possible, but it is equally crucial to know the difference.

Full Automation

The best possible outcome of any exercise in automation is that we eliminate the need for any effort whatsoever on our part. This means that either the tasks require no input to complete (think daily maintenance jobs, which always run at the same time every day and whose parameters are generally pretty static), or others can give the input directly to our automation itself (let people enter things into a form which our automation then reads and processes accordingly). These “set-it-and-forget-it” types of automation schemes are often those that give the highest value and should be tackled first, since, once you’ve put them in place, you can effectively let them be. Think of them as the work-equivalent of passive income: once you’ve put in the time to create them, you can sit back and enjoy the benefits without further effort.
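
As a sketch of the “no input required” idea, the following PowerShell backs up every online user database on an instance and could be scheduled via SQL Agent or the Windows Task Scheduler. The server name and backup share are hypothetical, and it assumes the SqlServer module is installed.

    # Nightly backup of every online user database - no runtime input needed.
    Import-Module SqlServer

    $serverInstance = 'SQLPROD01'           # hypothetical server
    $backupRoot     = '\\backupshare\sql'   # hypothetical backup location

    # Find the user databases (database_id > 4 skips the system databases).
    $databases = Invoke-Sqlcmd -ServerInstance $serverInstance -Query `
        "SELECT name FROM sys.databases WHERE database_id > 4 AND state_desc = 'ONLINE';"

    foreach ($db in $databases) {
        $file = Join-Path $backupRoot ("{0}_{1:yyyyMMdd}.bak" -f $db.name, (Get-Date))
        Backup-SqlDatabase -ServerInstance $serverInstance -Database $db.name -BackupFile $file
    }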

Partial Automation

If something is only partially automated, it still requires human intervention or action to complete, even though many individual steps or actions have been automated. To give an example, let’s say that, to restore a database into a test system, you still must run a script which executes the restore command, providing it with the database name and the path to the backup file. All other steps, however, such as resetting permissions or determining where to place log and data files, are handled by automated logic based on predefined rules.

Now, you might be tempted to say “But that’s not automated at all! You still have to do work.”

And you’d be correct, at least in part. There is still human interaction involved here, but in all likelihood you’ve reduced the amount of hands-on time significantly by using tools of automation; and if you haven’t, this probably wasn’t a good candidate for automation in the first place, or you haven’t automated enough of it. Continuing with the example I’ve already given, I was able to get the time for restoring databases from around a half hour each down to under five minutes of “touch time”. I did this by automating the parts of the work that would normally involve heavy manual effort, such as looking up the configuration of the drives on the server, running extra scripts to re-grant permissions, and other post-restore activities. For something that I usually did several times a day (and under pressure; those pesky developers want their databases now), this was a huge improvement. So just because you can’t get something down to zero effort on your part, don’t push it aside entirely.
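
To make that restore example concrete, here is a hedged sketch of what such a partially automated restore might look like in PowerShell. The test server name, the drive-letter convention, and the developer group are all hypothetical, and it assumes the SqlServer module (which also provides the SMO types) is installed.

    # The DBA still supplies the database name and backup file; file placement
    # and the post-restore permission re-grant are handled automatically.
    Import-Module SqlServer

    function Restore-TestDatabase {
        param(
            [Parameter(Mandatory)] [string] $Database,
            [Parameter(Mandatory)] [string] $BackupFile
        )

        $serverInstance = 'SQLTEST01'   # hypothetical test server

        # Read the logical file names out of the backup...
        $files = Invoke-Sqlcmd -ServerInstance $serverInstance `
            -Query "RESTORE FILELISTONLY FROM DISK = N'$BackupFile';"

        # ...and relocate them according to a simple drive convention.
        $relocate = foreach ($f in $files) {
            if ($f.Type -eq 'L') {
                $target = "L:\SQLLogs\$($f.LogicalName).ldf"    # log files
            } else {
                $target = "D:\SQLData\$($f.LogicalName).mdf"    # data files
            }
            New-Object -TypeName Microsoft.SqlServer.Management.Smo.RelocateFile `
                       -ArgumentList $f.LogicalName, $target
        }

        Restore-SqlDatabase -ServerInstance $serverInstance -Database $Database `
            -BackupFile $BackupFile -RelocateFile $relocate -ReplaceDatabase

        # Post-restore step: re-grant the standard developer permissions.
        Invoke-Sqlcmd -ServerInstance $serverInstance -Database $Database `
            -Query "ALTER ROLE db_datareader ADD MEMBER [CONTOSO\Developers];"   # hypothetical group
    }

    # Usage: Restore-TestDatabase -Database 'SalesDB' -BackupFile '\\backupshare\sql\SalesDB.bak'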

Tools For Automation

Now we get to the part where we actually get to work. Here, I’m going to give a few examples of tools that can be used for building automation into our daily work. People tend to have preferences in their choice of tool, and so should you; one of the more important criteria in selecting a tool is your familiarity with it. Sure, perhaps PowerShell is a great choice for automating tasks that touch multiple servers, but if you’re clueless about how it works, then perhaps you should try something else. At the same time, I’m going to challenge you to be open to trying new things and learning different methods or tools. As we’ll see in discussing the various options, several have clear advantages, which translate into more value and a greater ability to succeed with automation.

T-SQL

Ah, good old Transact SQL. The bread-and-butter language of our dear old SQL Server.

What’s that, you say? “Transact SQL is for writing queries, not automation!”

Well, yes, T-SQL is at its best when used in its declarative, set-based form. But it can also be a pretty powerful tool for automating work in SQL Server.

Pros

DBAs are, by nature, likely to be familiar with T-SQL, its syntax and its uses. We use it daily in our work (unless you’re still using the GUI exclusively, in which case I’d highly suggest you branch out) and won’t need as much time to learn it as we would other tools. Even in its more complex procedural forms (think lots of IF/ELSE-type statements or complex logical trees), you’re probably going to have less of a ramp-up time before understanding its full potential.

T-SQL is the “native” language of our primary platform, namely SQL Server. There’s really almost no task save for some (hopefully one-off) configuration that can’t be done with plain old SQL. Need to restore a database? You can do that. Create a user? Yep, that too. Parse some XML? Sure, you can do that too (but be careful, it can get expensive).

Cons

Interacting with elements outside of SQL Server is troublesome at best. Sure, you can use something like xp_cmdshell (properly, please; no granting it to the public role!) to run operating system tasks, but the interface in and out isn’t pretty (try parsing a returned result set for all of a program’s output). It’s true you could write CLR stored procedures for more complex interaction, but honestly, how many DBAs out there are going to be comfortable firing up Visual Studio and cranking out some C# or VB.NET code?

With T-SQL you have one point for inputs and one only: runtime parameters. If, during the course of executing a process, your automation is going to require some input, there’s just no way to do it within pure T-SQL code.

Personally, I find that T-SQL is great for automating a lot of work inside SQL Server. I have a large library of scripts that take some inputs and give back defined outputs after doing some work for me. That way, I know that the work is always done in the same way every time (standardization, anyone?) and my manual work is minimal. Things like scheduled maintenance can also make heavy use of T-SQL, since most of the operations needed (index maintenance, database backups, etc.) are wholly within SQL Server itself.

SQL Agent

SQL Agent is a great tool when it comes to automating DBA work, but in general its use cases are fairly narrow.

Pros

Again, thanks to our job as DBAs, we’re likely to be familiar with it and use it daily.

SQL Agent gives us multiple ways to execute work, such as T-SQL, OS commands (think DOS or CMD scripting), and Powershell. It also interacts well with other components of SQL Server, such as Analysis Services or Integration Services.

Agent jobs can be triggered by many different things, ranging from time-based schedules to WMI events or SQL Server error messages.

SQL Agent has a robust and highly configurable system for sending notifications on job success or failure.

Cons

Beyond statically defined parameters specified in job definitions, or having jobs look up information in SQL Server tables, there are very few ways to pass input into jobs.

Agent jobs are single-threaded. This means that you can’t have more than one instance of a job running at any given time, and there’s no way to parallelize work within a particular job, at least not without some coding trickery and use of outside systems. Note: I don’t mean to say that SQL Agent itself is single-threaded; you can have multiple jobs running at once, just not multiple instances of the same one.

Ultimately SQL Agent is good for fully automating work that needs to be done on a scheduled basis or in response to some event, and where the inputs are generally static. You would be wise to make use of SQL Agent Master and Target servers in cases where the same job needs to be in place on multiple servers.
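
For illustration, here is a hedged sketch of creating such a schedule-driven job using the standard msdb stored procedures, wrapped in PowerShell. The job name, the schedule, and the maintenance procedure it calls are hypothetical placeholders.

    # Create a nightly Agent job: static parameters, a time-based trigger, and
    # no runtime input needed. Assumes the SqlServer module is installed.
    Import-Module SqlServer

    $serverInstance = 'SQLPROD01'   # hypothetical server

    $createJobSql = "
        EXEC dbo.sp_add_job         @job_name = N'Nightly index maintenance';
        EXEC dbo.sp_add_jobstep     @job_name  = N'Nightly index maintenance',
                                    @step_name = N'Run maintenance',
                                    @subsystem = N'TSQL',
                                    @command   = N'EXEC dbo.IndexMaintenance;';  -- hypothetical procedure
        EXEC dbo.sp_add_jobschedule @job_name = N'Nightly index maintenance',
                                    @name = N'Nightly at 01:00',
                                    @freq_type = 4,               -- daily
                                    @freq_interval = 1,
                                    @active_start_time = 010000;  -- 01:00:00
        EXEC dbo.sp_add_jobserver   @job_name = N'Nightly index maintenance';
    "

    Invoke-Sqlcmd -ServerInstance $serverInstance -Database msdb -Query $createJobSql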

PowerShell

I’ll admit that I’m a bit of a PowerShell fanboy. It’s been a huge help to me in my career in terms of lightening my workload and I’m very happy that Microsoft provided it for administrators to use.

Pros

If T-SQL is a scalpel, then PowerShell is a Leatherman multi-tool. You can do pretty much anything with it, given enough time and knowledge. You can interact with the operating system through WMI or other interfaces, execute T-SQL, or use .NET libraries like SMO (SQL Server Management Objects) to perform work, and you can blend them all together seamlessly.

PowerShell has robust remote management capabilities, either through WMI or the much-touted PowerShell remoting. This means you can run things from one place against multiple targets and collect everything at once.

Between background jobs and runspaces, PowerShell has excellent multi-threading potential. The ability to process work in parallel can result in huge reductions in the time to complete work.
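
A minimal sketch of that fan-out pattern: Invoke-Command runs the same script block on every listed server (in parallel, up to its throttle limit) and returns all the results to one console. The server names are hypothetical, and it assumes PowerShell remoting is enabled on the targets.

    # Check free disk space on several servers at once via PowerShell remoting.
    $servers = 'SQLPROD01', 'SQLPROD02', 'SQLTEST01'   # hypothetical servers

    $results = Invoke-Command -ComputerName $servers -ScriptBlock {
        # Runs on each target machine; DriveType 3 = local fixed disks.
        Get-CimInstance -ClassName Win32_LogicalDisk -Filter 'DriveType = 3' |
            Select-Object @{ Name = 'Server'; Expression = { $env:COMPUTERNAME } },
                          DeviceID,
                          @{ Name = 'FreeGB'; Expression = { [math]::Round($_.FreeSpace / 1GB, 1) } }
    }

    $results | Sort-Object Server, DeviceID | Format-Table -AutoSize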

Cons

For those who haven’t had experience in programming, the task of learning to use PowerShell is probably going to involve departing from your comfort zone. It’s a true language, and you’re going to have to learn things like error handling, object instantiation, and .NET object models. That said, there are plenty of tutorials out there and this isn’t a huge mountain to climb.

It’s easy to get deep into using things like the SMO interface and end up with overly complex code, where using straight T-SQL would be better. I mean, have you ever tried to restore a database using SMO? It’s enough to make you want to crack open that emergency bottle of whiskey you keep under your desk.

Anything Else

The truth is, the best automation tool is the one you’ll use. If you’re already a Perl guru, then there’s no reason to switch gears just because PowerShell happens to be really popular right now. If you’re an ex-programmer and a C# whiz, then there’s no reason you can’t write compiled programs for automation, though at times that might be a bit excessive.

In general, your choice of tool should be dictated by how well it solves the problem at hand. Often, the best result can be achieved by combining different tools. For example, using PowerShell to build and execute SQL commands is a great way to combine the strengths of both. The biggest thing to avoid is using a particular tool for something it isn’t good at. As psychologist Abraham Maslow once said, “If you only have a hammer, you tend to see every problem as a nail.”
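
For instance, the sketch below (with hypothetical server names, and assuming the SqlServer module is installed) lets PowerShell handle the server list and looping while T-SQL does the set-based work of finding databases in the SIMPLE recovery model.

    # PowerShell supplies the loop and the server list; T-SQL does the querying.
    Import-Module SqlServer

    $servers = 'SQLPROD01', 'SQLPROD02', 'SQLTEST01'   # hypothetical servers

    $query = "
        SELECT @@SERVERNAME        AS ServerName,
               name                AS DatabaseName,
               recovery_model_desc AS RecoveryModel
        FROM   sys.databases
        WHERE  recovery_model_desc = 'SIMPLE';
    "

    $report = foreach ($server in $servers) {
        Invoke-Sqlcmd -ServerInstance $server -Query $query
    }

    $report | Format-Table -AutoSize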

The Dangers of Automation

I’d be remiss in my duties if I didn’t mention that automation does carry some risks with it. They’re minimal and can be mitigated, but we need to be aware of them.

First, it’s possible that you might do such a good job of automating things that you become very dependent on them and lose the sharp edge to your skills. This has happened to me before: I got so used to my one-line PowerShell restore script that I completely forgot how to restore a database manually. This isn’t necessarily a problem per se, as you should hopefully always have your toolkit available, but it’s worth putting some of your freed-up time into practicing your core DBA skills.

Second, when building automation we need to think carefully about ongoing support and maintenance. It’s easy to build a wonderful system of doohickeys and whiz-bangers that works great but whose workings only we understand. Thorough documentation of automation processes is a necessity, so that others can understand the moving pieces. The more complex the automation, the more this is true (and, unfortunately, the less fun it is to complete). We also need to consider the organization’s knowledge base when choosing tools for automation. If you’re in a shop with a heavy .NET developer focus, writing everything in Perl might not be the best approach. After all, we need to consider what might happen if we’re not immediately available and the automation fails.

I’d like to thoroughly quash the idea that we can somehow automate ourselves out of a job. For one thing, it’s highly unlikely we’re going to be able to fully automate all our work, so there will still be a need for some human interaction. It’s also true that the knowledge and experience of a good DBA is priceless and something that good organizations tend to value highly, and increasing that specialized knowledge will be easier to do when you get the silly busywork out of the way. So please don’t let that fear stop you from pursuing your automation goals with all the vigor you can.

Conclusion

In the end, the automation mindset is really about two things: seeing processes as a series of inputs and outputs governed by rules, and creating tools that implement those rules so that we do our work more efficiently. It is a goal worth investing time in; choose wisely, and you will see that investment returned to you in ways that are tangible and visible. Instead of being mired in simplistic tasks, you’ll be able to focus on higher-order activities like increasing your knowledge, being proactive about problems, and helping others. The more you put in, the more you’ll get out. Achieving one hundred percent automation just isn’t possible, but that shouldn’t stop you from putting every bit of effort you can towards it. And for those pesky remaining tasks that just can’t be handed off to machines, we can work towards the next best thing: letting other people do them. Stay tuned; we’ll cover that in the next installment of the series.

Useful Resources

  • SQLPSX – A PowerShell library for controlling and interacting with SQL Server.
  • SQLServerCentral Script Library – A vast array of community-authored scripts and utilities. If you need something, chances are it’s here.
  • SPADE – A PowerShell framework useful for automating installations of SQL Server (I used it for the automation mentioned in this article).
  • Ola Hallengren’s Maintenance Solution – Ola’s set of scripts is one of the best (if not the best) in the business when it comes to a flexible and robust solution for automating maintenance of SQL Server objects.
