Improving SQL Monitor’s Database Performance
Jeremiah Peschka, a tech lead for SQL Monitor development, explains the new SQL Server configuration options available for the data repository, and how enabling them might improve SQL Monitor performance, when reading and writing monitoring data.
SQL Monitor collects a huge amount of data from every monitored SQL Server instance and database, storing it all in a single database. As part of our drive to scale SQL Monitor across larger estates, we have been exploring how to configure SQL Server to get the best performance out of SQL Monitor. To understand how these configuration changes would affect SQL Server and the SQL Monitor data repository, we used a test system with five base monitors, each monitoring 400 fake SQL Servers. There was, of course, a sixth instance of SQL Monitor monitoring the five data repositories.
Customers with many SQL Servers per base monitor may observe a lot of
WRITELOG waits in the SQL Monitor data repository. This means that the SQL Monitor data repository log file can’t keep up with the volume of data that SQL Monitor is creating. SQL Server synchronously writes data to its transaction log (this ensures certain guarantees about the data are maintained). SQL Monitor can only operate as fast as your SQL Server transaction log can write data to disk.
One way to make the transaction log writes faster is to buy high-performance solid-state storage. Not all customers want to dedicate this faster, more expensive storage to their SQL Monitor data repository. But everyone wants SQL Monitor to work quickly. Thankfully, there is a way to improve performance without buying expensive solid-state storage.
SQL Server 2014 introduced delayed durability for transaction log files. Under normal operation, database changes are applied to the transaction log, on disk, before the commit is acknowledged and control returns to the client. Delayed durability writes data to a buffer in memory, acknowledges the commit to the client, and only writes that buffer to disk only when the buffer is full (or after some amount of time has passed).
Enabling delayed durability on the SQL Monitor data repository means that SQL Monitor doesn’t need to wait for each small write to commit to the log file before continuing. DBAs should see a reduction in
WRITELOG waits and should see SQL Monitor becoming more responsive. Although some data could be lost, the delayed durability buffer is small (around 60KB).
Even though our test system uses high-performance SSDs, we observed that the SQL Monitor data repository still had a lot of
WRITELOG waits. The volume of
WRITELOG waits increased as we added additional monitored SQL Servers. This makes sense: we’re writing additional data. As soon as we enabled delayed durability, we saw an immediate reduction in
WRITELOG waits. The important part is that we did not see an increase in other waits. In other words, the bottleneck was not moved to another part of SQL Server; we only saw the total waiting time decrease.
SQL Server provides several mechanisms to compress the data in a particular table or index. Compression does more than just save disk space: data remains compressed in memory allowing SQL Server to keep more data in memory. And, as you already know, keeping more data in memory reduces disk IO.
SQL Server 2016 SP1 makes data compression available across all editions of SQL Server. For new SQL Monitor installations, we enable data compression whenever it is possible to do so. For existing SQL Monitor installations, DBAs can compress the entire SQL Monitor repository on the Data settings page. This can take a while and needs to be scheduled during a period when SQL Monitor activity is low.
We enabled data compression during our performance testing and observed several benefits. First, free space increased in the SQL Monitor repository; this is the obvious and desirable effect of enabling compression. Second, I/O was dramatically reduced; by enabling data compression, SQL Server wrote data to the SQL Monitor repository significantly less frequently. And third, I/O stalls became more predictable; the amount of time spent performing I/O before compression was highly variable. After compression, I/O variability was reduced.
Query Optimizer hotfixes
Many DBAs are familiar with SQL Server’s trace flag 4199, affectionately referred to as the grab bag of optimizer tricks. This setting enables any query optimizer fixes that have been introduced since the current version of SQL Server was released. Trace flags are a blunt tool. A trace flag can either be applied to a single query or applied to all active sessions on the SQL Server (either at startup or using a DBCC command). If SQL Monitor’s data repository is co-located with other databases, it may not be advisable to enable trace flag 4199.
Starting with SQL Server 2016, DBAs can apply database-scoped configuration settings and by enabling the
QUERY_OPTIMIZER_HOTFIXES setting can get all the performance improvements that are introduced after a new major version of SQL Server is released, without resorting to trace flags. While this fix didn’t cause any noticeable performance improvements on our test systems (all our test systems are running SQL Server 2019), we anticipate that this will improve performance on SQL Server 2016 and beyond.
Where can I enable these settings?
All these settings can be enabled in the Data settings page of the SQL Monitor Configuration settings. Depending on your SQL Server version and edition, some options may not be available.
The Data settings page presents these new options:
- Compression: Users can estimate the savings enabled by data compression and then enable data compression. If, for some reason, data compression is stopped while some tables are not compressed, SQL Monitor will continue compressing tables when it comes back online.
- Performance options: Each option can be enabled or disabled individually.
How do we know that it worked?
The wait graph below is from one of the SQL Monitor data repositories. Prior to applying configuration changes, the waits graph is jagged and unpredictable. There is a correlation between high waits and poor disk IO. We applied configuration changes just after 18:00 and saw a dramatic decrease in overall waits and IO. Additionally, CPU usage became more consistent and had fewer peaks.
By optimizing the configuration of SQL Monitor’s SQL Server database, we hope that some users will see a noticeable improvement in SQL Monitor’s responsiveness, especially those users monitoring high numbers of SQL Server instances from a single base monitor.
Learn more about SQL Monitor, and download your 14-day free trial.
Was this article helpful?