Improving ReadyRoll performance, one line at a time

In our latest release, we sped up ReadyRoll’s performance more than x10 on large databases in two key areas of the product: script generation and project packaging.

If you are new to a codebase and haven’t done any performance work on the application before, it is a good idea to get a big, bad test case and observe the application running. Look for points where you wait for over a second. Pause the debugger twice over a couple of seconds and drill down further if the method is the same. If that fails, it is time to get out your handy performance profiler.

Find the biggest baddest database ever

First of all we took the biggest database you have ever seen in your life. We are talking 80 MB of schema.

This database contains over twenty thousand objects:

  • 15,000+ stored procedures
  • 2,000+ tables
  • 1,000+ views

Try to import it

We tried to import it using ReadyRoll. It failed hard on initial import with an Out of Memory exception. By simply pausing the debugger and checking task manager, we found Visual Studio was taking an enormous amount of memory, and was stuck on a single line of parser code.

Before ReadyRoll switched to use Redgate’s SQL Compare to generate migration scripts, we had to run a cleanup operation. This was done on SQL parse tree in memory so that we could guarantee the changes were syntactically correct. After the switch, we found that many of these cleanup operations were no longer necessary. This meant we could remove parsing of the generated script altogether, and instead rely on simple text manipulations for the remaining cleanup operations.

By simply removing an obsolete line of code from the product, we not only prevented the memory issue from occurring, but also significantly reduced the amount of time it took to do the initial import of a database.

Try to refresh it

ReadyRoll shows you the list of changes when you refresh the DbSync tool. After your initial import this is used to indicate the objects that require updating in the project.

VS-RRDBSync-Refreshing

During the refresh operation, we sort all of the pending changes by object type and object name. Under the hood, we were notifying the rest of the application when the position of each element within the collection changed. The effect of this behavior was hardly noticeable on the small schemas we use for testing. However, on large schemas containing thousands of objects the sort would never finish, preventing further schemas from being imported.

The solution was to stop watching pending changes before sorting.

The result of this change was that the sorting of this enormous collection now completes in an instant.

Try to build it

When you press build, ReadyRoll packages all your migrations into a deployment script. Normally this is a fast process. On this database, we waited for 30 minutes until the build process finally died with an Out of Memory exception.

What was going on? We took our trusty debug pause button and stopped it. The culprit was the Sql110ScriptGenerator class (from the Microsoft.SqlServer.TransactSql.ScriptDom library shipped with Visual Studio). It was stuck generating a script with 80,000 batch statements.

Using Redgate Reflector, we investigated the Sql110ScriptGenerator class and found a huge amount of complexity. The loops were complex and the batches were processed together. In our case, SQL Compare generates scripts with the correct order and the batch statements did not require re-ordering or any complex analysis.

The fix is simple, we just had to replace processing the whole script:

With building the script in batches:

Instead of generating a script with 80,000 batch statements, we now generate 80,000 scripts. This improved performance so much, building the package now takes under a second for all 80,000 batches.

What’s next?

The OctoPack deployment package produced from 20,000 objects is ~100MB (uncompressed) and can take a number of minutes to deploy, even if there are no pending migrations. We could reduce that significantly if we could guarantee a minimum starting point across all environments. If you know all environments are already at a particular version, you don’t have to provide information about the entire schema to them. This would be a bonus for people who use ReadyRoll to continuously deploy larger databases as it would reduce the incremental cost of deploying through multiple Test/Staging/Production environments.

I hope you have enjoyed hearing about our latest work. The ReadyRoll team has expanded since its acquisition by Redgate and we look forward to telling you more about what we are working on.

Tools in this post

SQL Change Automation

Automate your database changes with CI and automated deployment

Find out more