Routine SQL DML Testing for the Unenthusiastic Tester

There are more exciting things in life than unit-testing SQL statements, checking the results, timings, and execution plans. Phil Factor demonstrates a PowerShell-based technique that takes the tedium out of testing SQL DML.

I find testing to be a very tedious business: all developers do. Testing SQL databases is something you like doing only if you find the process of testing applications too exciting. The answer for the rest of us is, of course, to mechanise as much of the test process as possible.

Here is a simple form of SQL DML test for SQL Select statements. I think of it as a type of integration test. Where you are doing Continuous Integration, it is a good way of ensuring that nothing serious is broken. By ‘serious’, I’d include poor performance as well as the wrong result.

The idea is that you just develop, in your favourite IDE such as SSMS or Visual Studio, a batch of SQL queries or EXECUTE statements. You check them as necessary to make sure they’re right. You then save the file in a directory. You do as many tests as you like, and use as many subdirectories as you like.

Here is an imaginary test. Don’t worry about the SQL statements themselves; they are just there to demonstrate the technique.
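Something like the following, for example, sketched as a PowerShell here-string holding the SQL (in practice you would save the SQL itself as a file in the test directory). The tables and columns are invented, and the PRINT lines are explained below:

```powershell
# A hypothetical test batch: a few queries, each labelled with a PRINT
# directive that the harness uses for naming and for switching features.
$imaginaryTest = @'
PRINT '(Customers)'    -- names the artefacts saved for the first query
SELECT CustomerID, CompanyName
  FROM dbo.Customers
  WHERE Region IS NOT NULL;

PRINT '-noMoreResult'  -- from here on, do not save results to disk
SELECT * FROM dbo.BigAuditTable;

PRINT '-SaveResult'    -- switch result-saving back on
PRINT '(final)'        -- names the artefacts saved for the final query
SELECT COUNT(*) AS TotalOrders FROM dbo.Orders;
'@
```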

You’ll notice some PRINT statements here. They’re only necessary if you want to do something different from the default behaviour. When you bang the button to run the test, every query in a batch will have its own timings, results, I/O information and execution plan, so when these are saved to disk, each query needs a unique name.

That PRINT '(final)' not only gives the query a name but also makes it easier to pick out the messages that belong to it. The test harness saves the result to disk, using that name as the filename and within the XML.

Sometimes, you’ll have a query with an enormous result that you don’t want saved to disk. That PRINT '-noMoreResult' deals with it: as the name suggests, no further results are saved from then on in the batch, unless you then do PRINT '-SaveResult'. Yes, there are ways of being more controlling if you need to be.

A lot of the time, you will want to make sure that the result that is returned is correct. This is why the result is automatically saved as an XML file. If you leave a file with the correct result in the directory, then the PowerShell script obligingly compares the result with that and produces a report that tells you what the differences are.

That’s it, basically.

Why do it?

Throughout development, you must continuously check that certain key queries produce the correct data, and do so quickly. They should run quickly not only on the test or QA server but also under realistic loads, and they must use effective execution plans.

This means testing a number of relevant SQL expressions or batches against maybe several databases, normally replicas of the same database at the same version, but stocked with different amounts of data. I do it to see whether the result is what it should be, how long it took, and what CPU and I/O resources were used. If a problem crops up, I will want the query plan that was used. These SQL queries could be simple or complex, and they are intended to check out the vital functions, to make sure you never pass on a release that is severely broken. I do it on every build, so it has to be able to work unattended. The benefit is that if something breaks, I can be pretty sure of what did it: whatever I changed since the previous test run. Because you have the execution plans, you can email an expert for help as well.

There are plenty of test frameworks that will do this sort of test, but I’ve come across nothing that does it all in a way that is unobtrusive and easy to use. Above all, I want it to be easy to change the SQL. (Actually, I like changing the PowerShell too, but that is another story.)

What does it do?

With PowerShell, you get some timing metrics for free: the Measure-Command cmdlet gives you the end-to-end time for a whole batch. That isn’t sufficient, though. We also need a rough idea of the server CPU and timing, the query execution plan, and maybe even a more accurate measure of the time taken to execute the command on the server. Basically, you don’t want to be bothered with the results of all this unless something goes wrong. You will also want to specify a number of server instances/databases to run the tests against.
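As a quick sketch of that free, end-to-end metric (the server, database and file path are invented, and Invoke-Sqlcmd from the SqlServer module is just one way of running the batch):

```powershell
# Measure-Command times whatever runs inside the scriptblock; here, a
# single test batch executed with Invoke-Sqlcmd. All names are placeholders.
$elapsed = Measure-Command {
    Invoke-Sqlcmd -ServerInstance 'MyServer' -Database 'MyTestDatabase' `
                  -InputFile 'C:\Tests\FirstTest.sql'
}
"End-to-end time: $($elapsed.TotalMilliseconds) ms"
```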

For me, the most important thing is to be able to add a number of queries to a file and have them executed against a particular database and server.

How does it do it?

Here is a simple test harness to show the guts of the script. The actual script does rather more, but this sample should work if you fill in the server, the database credentials and the path to the test. Here, I leave out the distracting routine stuff of iterating over each database and each batch, and just show the important parts. As you can see from the script, it simply executes the batch and checks the results, monitoring the messages that come back for errors and for the special statements that switch certain features on or off. For comparisons of results, I use the .NET library for comparing XML files. You will need to fill in the details (look for 'These below need to be filled in!') before it will work.
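A stripped-down sketch along those lines, for a single batch against a single database, might look like this; every name and path is a placeholder to fill in:

```powershell
# Minimal single-batch harness: run one saved test file, collect every
# message the server sends back, time the batch, and save the results
# and messages to disk. All names and paths are placeholders.
$ErrorActionPreference = 'Stop'

# These below need to be filled in!
$ServerName   = 'MyServer'
$DatabaseName = 'MyTestDatabase'
$TestFile     = 'C:\Tests\FirstTest.sql'
$ResultPath   = 'C:\Tests\Results'

$batch = Get-Content -Path $TestFile -Raw
$connectionString = "Server=$ServerName;Database=$DatabaseName;Integrated Security=True"
$connection = New-Object System.Data.SqlClient.SqlConnection $connectionString

# Collect every message (PRINT output, STATISTICS IO/TIME, errors) as it arrives
$messages = New-Object 'System.Collections.Generic.List[string]'
$connection.add_InfoMessage([System.Data.SqlClient.SqlInfoMessageEventHandler] {
        param($sender, $e)
        $messages.Add($e.Message)
    })
$connection.FireInfoMessageEventOnUserErrors = $true

$connection.Open()
try
{
    # Ask the server for its own I/O and timing figures
    $setup = $connection.CreateCommand()
    $setup.CommandText = 'SET STATISTICS IO ON; SET STATISTICS TIME ON;'
    [void]$setup.ExecuteNonQuery()

    $command = $connection.CreateCommand()
    $command.CommandText = $batch
    $command.CommandTimeout = 0
    $adapter = New-Object System.Data.SqlClient.SqlDataAdapter $command
    $dataSet = New-Object System.Data.DataSet

    # Measure-Command gives the end-to-end time for the whole batch
    $elapsed = Measure-Command { [void]$adapter.Fill($dataSet) }
}
finally { $connection.Close() }

# Save the results as XML, and the messages (with the timing) as text
$baseName = [IO.Path]::GetFileNameWithoutExtension($TestFile)
$dataSet.WriteXml((Join-Path $ResultPath "$baseName.xml"))
$messages.Add("Elapsed: $($elapsed.TotalMilliseconds) ms")
$messages | Set-Content -Path (Join-Path $ResultPath "$baseName.messages")
```

The full script does rather more than this, of course: among other things it captures the execution plans and splits the output per query, using the PRINT names.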

And an inspection of the directory shows that the files are there:

[Figure: a directory listing of the saved test output files, showing the execution plans, .io files, .messages files and XML results]

The execution plans can be opened and inspected if you have SQL Server Management Studio or Visual Studio installed. If not, you can drag the files and drop them on SQL Server Central’s SQL Tuneup page.

The .io and .messages files are plain text files, but this is a script, so you can call them whatever suits you.

That ‘final.io’ file has the I/O information for the query (the STATISTICS IO messages).

The all.Messages file has all the messages from all the individual SQL SELECT statements in the batch, together with the overall timings for the run.

If you want the PowerShell to detect differences in the result, you just put an XML file with the correct result (usually from a previous, verified run) into the directory. Here is a run with an error in it, showing the file with the correct result:

[Figure: the result directory after a failing run, including the correct-result XML file and the difference report]

The XML files will need an XML editor to view them. If there is a difference between what there should be and what there is, there will be an XML difference report (in this case, in the file Second.differences).
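A sketch of that comparison, assuming the Microsoft XML Diff and Patch library (Microsoft.XmlDiffPatch, one .NET library for comparing XML files), with an invented assembly path and file names:

```powershell
# Compare the saved 'correct' result with the latest one, and write an XML
# diffgram report if they differ. The DLL location and file paths are
# placeholders; XmlDiffPatch is one library that does this job.
Add-Type -Path 'C:\Libs\XmlDiffPatch.dll'

$correctFile = 'C:\Tests\Results\SecondCorrect.xml'
$actualFile  = 'C:\Tests\Results\Second.xml'
$reportFile  = 'C:\Tests\Results\Second.differences'

$xmlDiff = New-Object Microsoft.XmlDiffPatch.XmlDiff ([Microsoft.XmlDiffPatch.XmlDiffOptions]::IgnoreWhitespace)
$writer  = [System.Xml.XmlWriter]::Create($reportFile)
$identical = $xmlDiff.Compare($correctFile, $actualFile, $false, $writer)
$writer.Close()
if (-not $identical) { "Results differ: see $reportFile" }
```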

The Full Script

Here is the full script (it is too big to embed in the article). Anyone who reads my PowerShell articles will know that I prefer to use either a registered server group or the central management servers, rather than specify connections. I set up a test group, usually with all supported levels of SQL Server.

I have the practical difficulty that I sometimes have the same physical server registered under two different names, so I correct for that, just to get the unique physical servers along with their connection strings. For good measure, I have a list of servers to avoid (the ones at a version level I’m not interested in supporting).

For this version, I’ve set it to pull out all your registered servers, but I use a version that looks in a subgroup, so I change the path to get the list of test databases. You can use it either way, because you have the $AvoidList to prevent servers that you don’t want to test against from getting into the list.
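A sketch of that step, assuming the SQL Server PowerShell provider is available; the module name, group path and avoid-list entries are placeholders:

```powershell
# Pull the registered servers from the provider's SQLRegistration drive,
# skipping the group folders and anything on the avoid-list.
Import-Module SqlServer   # or SQLPS on older installations

$AvoidList = @('OldServer2005', 'Staging2008')
$RegisteredServers = Get-ChildItem 'SQLSERVER:\SQLRegistration\Database Engine Server Group' -Recurse |
    Where-Object { $_.Mode -ne 'd' } |                   # 'd' means a group, not a server
    Where-Object { $AvoidList -notcontains $_.Name }

$RegisteredServers | Select-Object Name, ConnectionString
```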

For each server, I then run all the tests, serially. I get these from whatever files I can find, including subdirectories. I then filter out any files I don’t want.
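Gathering the files is a one-liner; the directory and the ignore-list here are invented:

```powershell
# Collect every .sql test file, subdirectories included, then drop any
# files we don't want. The path and names are placeholders.
$TestDirectory = 'C:\Tests'
$FilesToIgnore = @('Scratch.sql', 'Setup.sql')
$TestFiles = Get-ChildItem -Path $TestDirectory -Filter *.sql -Recurse |
    Where-Object { $FilesToIgnore -notcontains $_.Name }
```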

The result files go into subdirectories named after the instance that ran the test, with each file named after the result.
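Something like this builds each output path (the sanitising rule is an invented example, since instance names can contain characters that are illegal in file names):

```powershell
# Build the output directory from the instance name and name the file after
# the result; $InstanceName, $ResultName and $ResultPath are placeholders.
$SafeInstance    = $InstanceName -replace '[\\\/:*?"<>|]', '-'
$OutputDirectory = Join-Path $ResultPath $SafeInstance
if (-not (Test-Path $OutputDirectory)) {
    New-Item -ItemType Directory -Path $OutputDirectory | Out-Null
}
$ResultFile = Join-Path $OutputDirectory "$ResultName.xml"
```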

Because it is so easy to get the registered servers using the SQL Server PowerShell provider, I use that; it is one more thing that you’ll need to have installed before you run the code.

Wrapup

This routine doesn’t replace any of the standard tests, but it can catch errors that would otherwise cause problems for those tests, or maybe break the CI build. I find it handy because it takes almost no effort to run, doesn’t leave artifacts in my code, and gives me, saved on file, everything I need to see what the problem might be (including any SQL error message). I can even archive the data and spot trends. Naturally, I’ve given you just enough to customise a script to your particular requirements, and I hope that I’ve kickstarted someone to use PowerShell to the max for database development.
