Performance Testing Databases with Flyway and PowerShell
Performance tests are central to the quality of the database changes we deliver because they ensure that any business process that accesses the database continues to return its results in an acceptable time. When Flyway creates a new version of the database, it is the ideal time to run these performance checks.
This article is part of a series on the requirements and practicalities of database testing, with Flyway:
- Testing Databases: What’s Required?
- Planning a Database Testing Strategy for Flyway
- Test-Driven Development for Flyway Using Transactions
- Basic Functional Testing for Databases using Flyway
- Running Unit and Integration Tests during Flyway Migrations
- Performance Testing Databases with Flyway and PowerShell
- Running Structured Database Tests in Flyway
- Running Database Assertion Tests with Flyway
Also relevant are the series of related articles on Test Data Management.
Why bother with performance tests?
Traditionally, the database developer finds it hard to get enthusiastic about developing code that performs well. In fact, it is often the case that they regard it more as a ‘production issue’, dealt with during staging, by a DBA with an inexplicable interest in indexing. This is understandable when it isn’t that easy to get even basic SQL timings from the IDE that you use. In truth, performance is fundamental to the quality of database design and the results should be easy to view and assess.
SQL databases can catch you out, especially if you are used to writing procedural code. The problem is that SQL code isn’t just executed. It merely tells the query optimiser what data you want. When faced with an unfamiliar query, the RDBMS will engage in ‘planning time’ (PostgreSQL). The query optimizer will whistle through its teeth, look at the quantity and distribution of data in the database, sees what indexes are available to it and gather whatever other information it can from constraints before working out a few strategies to get the data. It then picks an execution plan. This means that queries will vary in their performance based on the volume and distribution of data as well as the presence of indexes, constraints and existing execution plans. You can’t design part of a database, write a query and put a mental tick in the box when it seems to run OK on your test machine. Your design must be resilient in the face of increasing data (scaling) and changes in distribution.
Performance testing is essential for database development because it indicates where the design of table relationships, or the choice of keys and indexes aren’t good enough. The query optimizer will do its best with a badly designed database, but a good database design will be more resilient to scale and load. It never pays to try to avoid good relational data-modelling.
It is customary to test the part of the database development that you’re working on, but once it is in place, it is tedious to routinely run further checks, later, to ensure that other changes in the database haven’t adversely affected what you’ve done. Perhaps an index was changed or removed? A FOREIGN KEY or CHECK constraint became marked as untrusted? A NOT NULL constraint was removed? A foreign key was changed? Who knows, but it pays to check, if it can be automated.
What does Flyway do to help?
With a complex database design, you will often notice, suddenly, that there is a performance problem. The sooner you can be alerted to it and fix it, the better. The first question is usually, ‘when did this happen?‘ If you can keep records of performance tests, with each version of the database that Flyway deploys, this becomes much easier to track down, and you can then investigate what happened in that migration to change the performance.
It will be much easier to collect and archive this performance data if you provide scripts that Flyway will execute, at the point you specify in the migration run. Flyway becomes the means by which changes are made to the database, and so becomes the obvious agent for kicking off support processes such as performance testing.
When should one do performance tests?
Basic performance tests take little time to build up and tear down and can be done on every migration just to check that all is well. The problem is that a certain type of performance test, the scalability test, requires specially prepared datasets of various sizes. Whatever way this is done, there is always an overhead to replacing the data with datasets that will help ensure that the design of the database will allow performance to scale to predictable usage. As such, scalability testing is usually only done once the team have a release candidate.
However, scalability testing is crucial. This is where, for example, those table scans that lie unnoticed when there isn’t much data, will suddenly rear up and bite. Your test-automation processes should be able to deal with all types of performance test.
Getting accurate performance metrics for database performance tests
Database performance tests don’t aim to give the developer any hint as to why a process is slow, fast, or has slowed down after a change. They are designed merely to do the basic checks that end-to-end processes perform as expected, with the dataset provided. They must give good coverage of the timings for every type of data retrieval that is performed, by the applications using the database.
Every database developer hits the problem of getting accurate metrics for the time taken for a SQL query (one that produces a table-valued result set) or SQL statement (a generic term for any SQL expressions or batches).
The first instinct is usually to capture the timings for your SQL queries from a script or application merely by timing how long the ODBC/JDBC connection takes to execute the statement. This method is fine for very large datasets, where the execution time dominates the connection time, but isn’t accurate enough for smaller datasets. There is too much extra time absorbed, between the application and the database, to create and maintain the network connection or transfer the data to the client. Put another way, a statement executed via an ODBC or JDBC connection can take far longer in preparing and communicating than executing.
So how do we get just the time that the SQL Statement took to execute, without the overhead that is involved in creating and maintaining a connection to a database? Each relational database product will have its own non-standard way of ‘explaining’ the query and delivering to the user these timings, with minimal overhead (although there is still some involved, in capturing and transferring the extra information to the client).
The Command-line interface (CLI) of most RDBMSs will usually tell you the ‘real’ timing of queries you ask it to execute. For example, we can use the \timing switch in PSQL, the set statistics time on/off command in TSQL, the -p switch in SQLCMD, -vvv switch in MySQL and the .timer on option in SQLite.
If you’re using the latest version of my Flyway Teamwork PowerShell framework, which is designed to work with Flyway Teams edition, then it now includes a series of utility “GetdataFrom…” tasks that will return JSON-based results from executing any SQL command plus query execution timings. In each case, the task provides the most accurate basic timing, which is for each implicit transaction, usually a single query. Also, use of these techniques means we can retrieve the results, including the timings, without needing to maintain an extra connection to the database.
Getting query execution times from SQL Server
We’ll demonstrate this with SQL Server and SQLCMD, in a simple test harness. I’ll use the same Pubs database project that I used in the previous article, which explained Running Unit and Integration Tests during Flyway Migrations. To follow along you’ll need to migrate your sample database to v1.1.12.1, which creates the “Google-style” text search function (dbo.searchNotes).
Here’s the test harness that uses the SQLCMD -p switch to return the timings. The -p1 format provides the same information in a different format. Don’t panic about the details because, in the Teamwork framework, all this is wrapped into the $GetdataFromSQLCMD script task so that a ‘one-liner’ will execute every performance test in the project (as I’ll demonstrate shortly):
| 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 | $MakeMeasuredQuery = { <# a Scriptblock way of accessing SQL Server, purely to make timings #>     Param ($SQLQuery)     #create a path string for the query file and the Output file     $querysource = "$env:TEMP\DeleteMePlease.sql"     $Output = "$env:TEMP\DeleteMetooPlease.dat"     <#  write out the SQL Query to the file to make SQL input to SQLCMD more          reliable  #>     $SQLQuery >$querysource     $profile = '-p1';     <# execute the SQL Statement  #>     sqlcmd -S 'MyServer' -d 'PubsDevSearch' `            -i $querysource -U 'MyUserID' -P 'mycunningpassword' `            -o "$Output" -u -y0 -b $profile     if (Test-Path "$Output")     {         $result = [IO.File]::ReadAllText("$Output")         # if there was timing data appended         $regex = '(?<PacketSize>\d{1,6}):(?<NoTransactions>\d{1,6}):(?<TotalTime>[.\d]{1,20}):(?<AverageTime>[.\d]{1,20}):(?<AverageTPS>[.\d]+)'         if ($result -imatch $regex)         {             $timingData = [pscustomobject]$matches | convertto-json             [IO.File]::writeAllText("$Output", ($result -replace $regex, ''))         }         else         {             $timingData = ''         }         if ($timingData -ne '')         {             $TimingHashTable = $TimingData | convertfrom-JSON             if ($TimingHashTable.NoTransactions -gt 1)             {                 write-output "the $($TimingHashTable.NoTransactions) transactions in  '$SQLQuery' took a total of  $($TimingHashTable.TotalTime)ms "             }             elseif ($TimingHashTable.NoTransactions -eq 1)             {                 write-output "the $($TimingHashTable.NoTransactions) transaction in '$SQLQuery' took  $($TimingHashTable.TotalTime)ms "             }             else             {                 Write-Output "there were no transactions to time in $SQLQuery"             }         }         Write-Output "The result was $([IO.File]::ReadAllText("$Output"))"         @($querysource, $Output) | remove-item     }     else     { Write-Output "failed to execute $SQLQuery" } } $MakeMeasuredQuery.Invoke('select count(*) from people.note where note like ''%nurse%'' and note like ''%disgusting%''') $MakeMeasuredQuery.Invoke('SELECT count(*) FROM dbo.searchNotes (''nurse disgusting'')') $MakeMeasuredQuery.Invoke('select count(*) from people.note where note like ''%appalling customer service%''') $MakeMeasuredQuery.Invoke('SELECT COUNT(*) FROM dbo.searchNotes(''"appalling customer service"'')') | 
On my test server, this produces the following timings that test out how effective the ‘Google-style’ search can be on the sample data.
PS <myPathTo>\FlywayTeamwork\Pubs\Branches\develop\Branches\search> S:\work\programs\powershell\TimeYourQueries.ps1
The transaction in 'select count(*) from people.note where note like '%nurse%' and note like '%disgusting%'' took  1219ms 
The result was 12
The transaction in 'SELECT count(*) FROM dbo.searchNotes ('nurse disgusting')' took  47ms 
The result was 11
The transaction in 'select count(*) from people.note where note like '%appalling customer service%'' took  1172ms 
The result was 209
The transaction in 'SELECT COUNT(*) FROM dbo.searchNotes('"appalling customer service"')' took  266ms 
The result was 211
Getting timings using the PowerShell Teamwork framework for Flyway
With the Flyway Teamwork framework, running the tests is simple because all this work is done by built-in script tasks and functions. Currently, there is support for MySQL ($GetdataFromMySQL), PostgreSQL ($GetdataFromPsql), SQLite ($GetdataFromSqlite) and SQL Server ($GetdataFromSQLCMD). I’ve also written a Execute-SQLStatement cmdlet that detects what RDBMS you’re using automatically and then calls the appropriate GetDatafrom... script black task.
To get the timings for a statement, all you need to do is to use the current Execute-SQLStatement cmdlet with a few extra parameters. Set –simpleText to $true, as we don’t want this as a JSON result. We need -timing set to $true, of course, but we can set -muted to $true if we don’t want to see the resultset returned by the query. In the following example, I’ve set the -fileBasedQuery parameter to null since I supply the SQL string directly, but alternatively we can supply the path to a file (set the SQL string to dash (-), rather than blank, if you do this.)
| 1 | Execute-SQLStatement  $dbDetails 'SELECT  * FROM dbo.authors WHERE city=''Tacoma'';' -fileBasedQuery $null -simpleText $true -timing $true -muted $true | 
To make it even simpler, I’ve provided as special Execute-SQLTimedStatement cmdlet for timing just SQL statements as Strings.
| 1 | Execute-SQLTimedStatement $dbDetails 'SELECT  * FROM dbo.authors WHERE city=''Tacoma'';'-muted $true | 
Assuming you have executed the preliminary.ps1 file, you’ll have the $DbDetails hashtable stocked with your details and then you simply provide the queries for which you want timings and execute Execute-SQLTimedStatement:
| 1 2 3 4 5 6 7 8 9 | cd <MyPathTo…>\Pubs\Branches\develop\Branches\search . '.\preliminary.ps1' @('select count(*) from people.note where note like ''%online%'' and note like ''%disgusting%''',   'SELECT count(*) FROM dbo.searchNotes (''online disgusting'')',   'select count(*) from people.note where note like ''%appalling customer service%''',   'SELECT COUNT(*) FROM dbo.searchNotes(''"appalling customer service"'')')|   foreach{    Execute-SQLTimedStatement $dbDetails $_ -muted $true     }, | 
the transaction in 'select count(*) from people.note where note like '%online%' and note like '%disgusting%'' took  1188 ms.
the transaction in 'SELECT count(*) FROM dbo.searchNotes ('online disgusting')' took  172 ms.
the transaction in 'select count(*) from people.note where note like '%appalling customer service%'' took  1125 ms.
the transaction in 'SELECT COUNT(*) FROM dbo.searchNotes('"appalling customer service"')' took  328 ms.
Surely, you might think, one should be diving into execution plans at this point and producing more detail. No: the aim of performance testing is to get the big picture. To make the results comparable, it pays to do all performance testing on a single server. I have an old server that is slow but steady, perfect for doing timings.
It is then up to the individual developer to study the results and go on to tackle the detail of what has caused the performance problem within the database. This is outside the scope of what we’re describing here.
Running automated performance tests within the framework
Performance tests can simply execute SQL queries. They can also change data and simulate actual usage. This means that it must be possible to expand the system either with PowerShell/DOS/BASH scripts, or with SQL scripts. We’re happier with PowerShell because it has components that were specifically designed for running tests. This means that we need to provide a system that will execute SQL as well as PowerShell scripts. With SQL Scripts, it merely reports how long they took, but with PowerShell scripts, it must report the results returned as output by the script.
Flyway Teamwork framework allows a Tests directory that is associated with every branch or variant of the database. This allows special tests for a feature that is under development, as well as established tests for the development or production/main branch. Performance tests require rather different handling to unit or integration tests, so we have a ‘P’ prefix to the file. This means that the file P1.1.12.1-__Check_SearchNotes_Performance.ps1 is a PowerShell file that is suitable from 1.1.12.1 onwards.
Here are the contents of the Tests folder for the develop branch of the Pubs project. There is PowerShell file that runs an integration test (T), as described in Introduction to Testing a Flyway Development, and a set of performance tests beginning with P, which is this case are just SQL queries.

In a feature branch of the develop branch, called search, there is another Tests directory that contains a set of tests for the SearchNotes function in the database. The T test checks its functionality (see Running Unit and Integration Tests during Flyway Migrations), and the P tests check the performance for various search queries.

Now, if we are developing a feature in this branch, a way of searching the customer notes, we may or may not want to execute any other Tests folders in the project, such as the one in the parent develop branch.
Assuming you have a valid $DBDetails hashtable generated by preliminary.ps1, and have then got the current version using the $GetCurrentVersion script task, then we can run all our SQL or PowerShell tests, whether unit, integration or performance tests, as follows:
| 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 | cd <MyPathTo…>\Pubs\Branches\develop\Branches\search . '.\preliminary.ps1'  Process-FlywayTasks $DBDetails $GetCurrentVersion  #to run all the tests for this project $dbDetails.TestsLocations | foreach {     #for each test location     $TestLocation = $_     @('sql', 'ps1') | foreach {         #for each type (sql or powershell) of test         #run the unit and integration tests         Run-TestsForMigration -DatabaseDetails $DBDetails -ThePath $TestLocation -type 'T' -script $_         #run the performance tests         Run-TestsForMigration -DatabaseDetails $DBDetails -ThePath $TestLocation -type 'P' -script $_     } } | 
The Run-TestsForMigration cmdlet searches through the test locations, gathers up all the tests that, according to their filename, are valid for the database version under test, then calls Execute-SQLStatement cmdlet to get the timings. It saves the results in the Reports subdirectory of the current version. In the case of this example, it would be …\Versions\1.1.12.1\Reports\Tests.
The above example will run all the valid P and T tests found in the current branch and any parent branches of the project. If you want to only run the Tests folder in the current search branch, for example, then you simply supply the index [0] for TestLocations (i.e., $dbDetails.TestsLocations[0] (or [1] to run those tests plus those in the immediate parent, which might be useful when merging).
If you just want to run all the performance tests in the project, you can use:
| 1 2 | $dbDetails.TestsLocations|foreach {     Run-TestsForMigration -DatabaseDetails $DBDetails -ThePath $_ -type 'P' -script 'sql'} | 
executing P1.1.12.1-__phrase-based_Search_With SearchNotes.sql (phrase-based Search With SearchNotes) the transaction in 'P1.1.12.1-__phrase-based_Search_With SearchNotes.sql' took 266ms executing P1.1.12.1-__wildcard_search_of_Notes.sql (wildcard search of Notes) the transaction in 'P1.1.12.1-__wildcard_search_of_Notes.sql' took 1171ms executing P1.1.12.1-__wildcard_search_of_Phrase_in_Notes.sql (wildcard search of Phrase in Notes) the transaction in 'P1.1.12.1-__wildcard_search_of_Phrase_in_Notes.sql' took 1172ms executing P1.1.12.1-__Word-based_Search_With SearchNotes.sql (Word-based Search With SearchNotes) the transaction in 'P1.1.12.1-__Word-based_Search_With SearchNotes.sql' took 16ms executing P1.1.10-__Publishers_By_Publication_Type.sql (Publishers By Publication Type) the transaction in 'P1.1.10-__Publishers_By_Publication_Type.sql' took 110ms executing P1.1.10-__Titles_and_Editions_By_Publisher.sql (Titles and Editions By Publisher) the transaction in 'P1.1.10-__Titles_and_Editions_By_Publisher.sql' took 594ms executing P1.1.10-__Titles_And_Their_Authors.sql (Titles And Their Authors) the transaction in 'P1.1.10-__Titles_And_Their_Authors.sql' took 109ms executing P1.1.10-__Top_twenty_from_Livre_publications.sql (Top twenty from Livre publications) the transaction in 'P1.1.10-__Top_twenty_from_Livre_publications.sql' took 94ms executing P1.1.10-__Top_Twenty_Most_Expensive_Publications.sql (Top Twenty Most Expensive Publications) the transaction in 'P1.1.10-__Top_Twenty_Most_Expensive_Publications.sql' took 16ms
To run SQL integration tests, we simply use the same code but change the type to ‘T’.
In the above examples, I’m just running the code from the IDE, but in my GitHub project, I’ve provided a version of this code that Flyway will use as an afterVersioned callback script, meaning that we can run the tests after every successful migration run.

Reporting performance test results
The framework currently puts each report in the Reports directory for the current version. It would be impossible to conjure up a test report that would be generally acceptable. Some tests have a lot of information…

…whereas others that merely time a file with some SQL in it need just the name of the report and an integer in milliseconds.

To accommodate more sophisticated ways of generating reports, the output of each test is easily captured and parsed to give a format that can be stored in a structured format.
Sometimes, you will need to capture the detailed timings of a long batch of SQL statements to discover where the delays are. I’ve written a SQL Prompt snippet that is provided with the tool to allow you to do this. The result is returned from the SQL that is executed. In this case, you need to capture the data returned by the SQL batch because it will contain the timings. To do this, you will need to run them as integration tests, or customize the code for Run-TestsForMigration.
Conclusions
I surprised myself when writing this article because I’d always run a far more customised SQL-based approach to performance testing and hadn’t realized the hidden power within the CLI interfaces to the various RDBMSs that I use, where they can give you the timings for the SQL statements or transactions that you execute. By integrating this CLI-based approach with Flyway, it is possible to do much more testing with much less effort.
I’ve heard the argument that the database developer should not get too distracted by performance issues while developing code, in the expectation that any grindingly slow table-scans, data-jams or other gridlocks can be sorted out by the DBA. I like the idea of a database developer being free of distractions, but my own belief is that good query performance tells you that the database design is satisfactory. The DBA can’t easily change your bad design retrospectively, it is up to the database developers to get this right, and as early as possible in development too. Performance testing is important, and if one can automate away the tedium of it, it is more likely to get done.
 
                             
                             
                             
                         
	 
                                                                    