{"id":1662,"date":"2013-07-02T00:00:00","date_gmt":"2013-07-02T00:00:00","guid":{"rendered":"https:\/\/test.simple-talk.com\/uncategorized\/baselining-with-sql-server-dynamic-management-views\/"},"modified":"2021-08-24T13:39:53","modified_gmt":"2021-08-24T13:39:53","slug":"baselining-with-sql-server-dynamic-management-views","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/databases\/sql-server\/database-administration-sql-server\/baselining-with-sql-server-dynamic-management-views\/","title":{"rendered":"Baselining with SQL Server Dynamic Management Views"},"content":{"rendered":"<div class=\"article-content\">\n<p class=\"start\">With the advent of the Dynamic Management Views (DMVs) in SQL Server 2005, Microsoft vastly expanded the range and depth of metadata that could be exposed regarding the connections, sessions, transactions, statements and processes that are, or have been, executing against a database instance. These DMOs provide insight into the resultant workload generated on the server, where the pressure points are, and so on, and are a significant and valuable addition to the DBA&#8217;s troubleshooting armory.<\/p>\n<p>In short, if you look hard enough, you will find an almost overwhelming amount of data regarding user activity on your SQL Server instances, and use and abuse of the available CPU, I\/O and memory. Unfortunately, this data is often useless without proper context. It is very difficult to pinpoint a problem just by looking at a single, point-in-time snapshot of the data, or to spot retrospectively what caused a problem by examining aggregated data for the server, collected over many months.<\/p>\n<p>This is why we need to capture <strong>baselines<\/strong> for this data, so that when attempting to diagnose a particular problem they know what &#8220;normal&#8221; looks like. It&#8217;s an activity many DBAs end up postponing indefinitely perhaps because it seems like a huge task to work out which data to collect, let alone to start collect it on a regular basis, manage its storage over time, and perform detailed comparative analysis.<\/p>\n<p>I hope that this article will prove that getting started with baselines isn&#8217;t as hard as you might think. You&#8217;ve probably already established some troubleshooting queries of the type <code>SELECT Value FROM SomeSystemTable<\/code>. Capturing a set of baseline values for such a query can be as easy as changing it as follows:<\/p>\n<p><code>INSERT into BaseLine.SomeSystemTable (value, captureTime) SELECT Value, SYSDATETIME() FROM SomeSystemTable;<\/code><\/p>\n<p>I&#8217;ll demonstrate this with a few examples of how to capture baselines from DMV data, specifically how to capture baselines to:<\/p>\n<ul>\n<li>Measure I\/O distribution across your database files<\/li>\n<li>Track &#8220;per second average&#8221; PerfMon counter values, such as &#8220;Page Lookups per second&#8221;<\/li>\n<\/ul>\n<h2>Point in time versus cumulative DMV data<\/h2>\n<p>We can query data held on the DMVs just as we would any other table, view or function. However, always remember that the data returned is &#8220;dynamic&#8221; in nature. SQL Server collects it from a range of different structures in the database engine and it represents, in the main, a point-in-time &#8220;snapshot&#8221; of the activity that was occurring on the server at the time we executed the DMV query.<\/p>\n<p>Sometimes, this is exactly what is required; we have a performance issue right now, and want to find out which sessions and queries are running, right now. Bear in mind, though, that this point-in-time data can and likely will change each time you query it, as the state of the server changes. Occasionally, you should expect to see anomalous or non-representative results and you may need to run a script many times to get a true picture of the activity on your instance.<\/p>\n<p>We need to capture baselines for this data, as it is quite difficult to query the data in these point-in-time DMOs in the hope that the problem will simply reveal itself. How many DBAs, for example, can run a query of the form <em>select [columns] from [locking DMV]<\/em>and know instantly whether the locking pattern revealed is normal?<\/p>\n<p>In other cases, the data in a DMV will be cumulative. In other words, the data in a given column is accumulative and incremented every time a certain event occurs. In most cases, the data resets only when SQL Server restarts. If you have regular maintenance windows, for example to apply patches, then you can gauge from this the collection period.<\/p>\n<p>In two cases (<code>sys.dm_os_latch_stats<\/code> and <code>sys.dm_os_wait_stats<\/code>), the data accumulates since server restart but we can also clear it out manually. For example, every time a session waits a period for a resource to become available, SQL Server records this in a column of the <code>sys.dm_os_wait_stats<\/code> DMV. When querying such a DMV, you will see the total amount of time spent waiting for various resources, across all sessions, since the server last restarted or someone cleared out the statistics manually, using <code>DBCC<\/code><code>SQLPERF(\"sys.dm_os_wait_stats\",CLEAR);<\/code>.<\/p>\n<p>In all cases, when analyzing data accumulated over a long period, it will be hard to see the smaller details. You will want to capture a baseline (for example, a suitable period after a server restart, in order to capture the full workload), and then recapture the data at regular intervals thereafter. Likewise, if you want to measure the impact of a certain change to the database (a new index for example), you&#8217;ll need to take a baseline measurement, make the change, and then measure the difference.<\/p>\n<h2>Baselining Physical I\/O statistics<\/h2>\n<p>For each database file that SQL Server uses, which includes not only the data files, but also the log and full text files, the <code>sys.dm_io_virtual_file_stats<\/code> Dynamic Management Function provides cumulative physical I\/O statistics, indicating how frequently the file has been used by the database for reads and writes, since the server was last rebooted. It also provides a very useful metric in the form of the &#8220;I\/O stall&#8221; time, which indicates the total amount of time that user processes have waited for I\/O to complete, on the file in question. Note that this object measures physical I\/O only. Logical IO operations that read from cached data will not show up here.<\/p>\n<p>To get an accurate view of the data, you&#8217;d need to reboot the server at same time every day, and take a snapshot at the same time so you could compare day-to-day activity and trends. Since this is neither practical nor a very good idea, we can, instead, take a &#8216;baseline&#8217; measurement followed by the actual measurement and then subtract the two, so that you can see where I\/O is &#8220;accumulating&#8221;. In order to measure a &#8220;time slice&#8221; of activity, we take a baseline measurement, inserting data into a temporary table, as shown in Listing 1.<\/p>\n<pre class=\"lang:tsql theme:ssms2012\">SELECT DB_NAME(mf.database_id) AS databaseName , \r\n    mf.physical_name , \r\n    divfs.num_of_reads , \r\n    divfs.num_of_bytes_read , \r\n    divfs.io_stall_read_ms , \r\n    divfs.num_of_writes , \r\n    divfs.num_of_bytes_written , \r\n    divfs.io_stall_write_ms , \r\n    divfs.io_stall , \r\n    size_on_disk_bytes , \r\n    GETDATE() AS baselineDate \r\nINTO #baseline \r\nFROM sys.dm_io_virtual_file_stats(NULL, NULL) AS divfs \r\n    JOIN sys.master_files AS mf ON mf.database_id = divfs.database_id \r\n          AND mf.file_id = divfs.file_id<\/pre>\n<p class=\"caption\">Listing 1: Capturing baseline disk I\/O statistics from <code>sys.dm_io_virtual_file_stats<\/code> in a temporary table<\/p>\n<p>Listing 2 shows a query against the <code>#baseline<\/code> table, returning read statistics for a particular database.<\/p>\n<pre class=\"lang:tsql theme:ssms2012\">SELECT physical_name , \r\n    num_of_reads , \r\n    num_of_bytes_read , \r\n    io_stall_read_ms \r\nFROM #baseline \r\nWHERE databaseName = 'DatabaseName'<\/pre>\n<p class=\"caption\">Listing 2: Querying the <code>#baseline<\/code> temporary table<\/p>\n<p>This returns the following data for the read statistics:<\/p>\n<pre class=\"lang:tsql theme:ssms2012\">physical_name num_of_reads num_of_bytes_read \r\nio_stall_read_ms \r\n---------------------- -------------------- -------------------- -------- \r\nF:\\MSSQ...DATABASE.mdf 1560418               381784449024        176090340 \r\nE:\\MSSQ...BASE_log.LDF 925                   592683008           7000 \r\nI:\\MSSQ...SE_index.ndf 398504                310491209728        39664904 \r\nk:\\mssq...TABASE2A.mdf 540176                155267350528        319640508<\/pre>\n<p>This data, taken on a server that restarted about 12 hours previously, is not especially interesting or meaningful, in its own right. However, the next step is where turn this broadly interesting data into specific information that can pinpoint a problem that occurred at a specific time.<\/p>\n<p>Having captured the baseline, wait a set amount of time, or for some process to complete, and then take a second measurement. On a very busy server, you may wait only 10 seconds before taking the second measurement, as is the case in this example. We use a CTE to capture the current values, then join it to the temporary table to subtract the baseline values and calcualte the difference in the readings.<\/p>\n<pre class=\"lang:tsql theme:ssms2012\">WITH currentLine \r\n     AS ( SELECT DB_NAME(mf.database_id) AS databaseName ,\r\n           mf.physical_name , \r\n           num_of_reads , \r\n           num_of_bytes_read , \r\n           io_stall_read_ms , \r\n           num_of_writes , \r\n           num_of_bytes_written , \r\n           io_stall_write_ms , \r\n           io_stall , \r\n           size_on_disk_bytes , \r\n           GETDATE() AS currentlineDate \r\n     FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS divfs \r\n           JOIN sys.master_files AS mf \r\n              ON mf.database_id = divfs.database_id \r\n                 AND mf.file_id = divfs.file_id \r\n     ) \r\n  SELECT currentLine.databaseName , \r\n     LEFT(currentLine.physical_name, 1) AS drive ,  \r\n     currentLine.physical_name , \r\n     --gets the time diference in milliseconds since \r\n     -- the baseline was taken \r\n     DATEDIFF(millisecond,baseLineDate,currentLineDate) AS elapsed_ms,\r\n       currentLine.io_stall - #baseline.io_stall AS io_stall_ms ,\r\n       currentLine.io_stall_read_ms - #baseline.io_stall_read_ms \r\n                                       AS io_stall_read_ms ,\r\n       currentLine.io_stall_write_ms - #baseline.io_stall_write_ms \r\n                                       AS io_stall_write_ms ,\r\n       currentLine.num_of_reads - #baseline.num_of_reads \r\n                                       AS num_of_reads ,\r\n       currentLine.num_of_bytes_read - #baseline.num_of_bytes_read \r\n                                       AS num_of_bytes_read , \r\n       currentLine.num_of_writes - #baseline.num_of_writes \r\n                                       AS num_of_writes , \r\n       currentLine.num_of_bytes_written - #baseline.num_of_bytes_written \r\n                                       AS num_of_bytes_written \r\n  FROM currentLine \r\n     INNER JOIN #baseline \r\n        ON #baseLine.databaseName = currentLine.databaseName \r\n     AND #baseLine.physical_name = currentLine.physical_name \r\n  WHERE #baseline.databaseName = 'DatabaseName'<\/pre>\n<p class=\"caption\">Listing 3: Capturing 10 seconds of disk I\/O statistics, since the baseline measurement<\/p>\n<p>Following are a sampling of the result, again focusing only on the read statistics:<\/p>\n<pre class=\"lang:tsql theme:ssms2012\">physical_name  elapsed_ms num_of_reads num_of_bytes_read  io_stall_read_ms \r\n-------------- ---------- ------------ ------------------ ------------------ \r\nF:\\MSSQ.SE.mdf   10016        915             128311296        34612  \r\nE:\\MSSQ.og.LDF   10016         0                  0              0 \r\nI:\\MSSQ.ex.ndf   10016        344             172933120         8000 \r\nk:\\mssq.2A.mdf   10016         0                  0              0<\/pre>\n<p>These results show that over the 10-second sampling period, processes using the data file on the F: drive waited a combined total of 34 seconds. Of course, this data would have to be assessed in light of how many processes ran during the sampling period; if it was four, then the result would be very worrying, if it was 100 then perhaps less so.<\/p>\n<p>It is interesting that user processes read more data from the I: drive, with fewer I\/O stalls, which is most likely explained by different usage patterns. The I: drive is subject to a smaller number of mostly sequential reads, whereas the F: drive is subject to many more reads that are mainly random in nature. Obviously, one can only know for sure with some with some knowledge of the activity that was occurring during the sampling. In any event, it is certainly worrying to see that, on the F: drive, the stall times are substantially greater than the elapsed time, and I&#8217;d want to investigate further to find out why.<\/p>\n<p>I hope that this simple example illustrates how, by taking baselines, we can analyze this data to identify file I\/O bottlenecks. By comparing this data to that obtained routinely from performance counters, profiler traces and other DMV snapshots, we can really start to get an idea of how our disks are being utilized for a given server workload.<\/p>\n<p>We can expand this technique to a more permanent solution by storing the rows in a permanent table, giving you the ability to do the math on rows to see what is happening, and what has happened &#8211; see my <a href=\"http:\/\/www.red-gate.com\/products\/dba\/sql-monitor\/resources\/#webinars\">What Counts for a DBA webinar<\/a> for more detailed coverage of this. The only caveat is that you have to deal with restarts and resets of the DMV data, which is a small problem at most.<\/p>\n<h2>Baselining PerfMon Data using the DMVs<\/h2>\n<p>SQL Server provides a number of database-level and instance-level objects and associated counters, which we can use to monitor various aspects of SQL Server performance. SQL Server exposes these counters in the <code>sys.dm_os_performance_counters<\/code> DMV. These counters indicate the &#8220;queues&#8221; in your system; the places where there is a lot of demand for a given resource, and the reasons for the excessive demand, via specific resource measurements such as disk writes\/sec, processor queue lengths, available memory, and so on.<\/p>\n<p>Generally, these performance counters are investigated using Performance Monitor (PerfMon), a Windows OS monitoring tool that provides a vast range of counters for monitoring memory, disk, CPU and network usage on a server (for example, see <a href=\"http:\/\/technet.microsoft.com\/en-us\/library\/cc768048.aspx\">http:\/\/technet.microsoft.com\/en-us\/library\/cc768048.aspx<\/a>), and also exposes the counters maintained by SQL Server. Most DBAs will be familiar with setting up PerfMon to record statistics from various counters at regular intervals, storing the data in a file and then importing it into Excel for analysis.<\/p>\n<p>However if, like me, you prefer to save the statistics in a database table and interrogate them using SQL, the <code>sys.dm_os_performance_counters<\/code> DMV is a very useful tool. Just write the query to retrieve the data from the DMV, add <code>INSERT<\/code><code>INTO CounterTrendingTableName...<\/code> and you have a rudimentary monitoring system! In addition, it&#8217;s not always possible to get direct access to PerfMon, and accessing it from a different machine can be slow.<\/p>\n<p>Unfortunately, using this DMV is far from plain sailing. With that warning in mind, let&#8217;s take a look at the columns that the <code>sys.dm_os_performance_counters<\/code> DMV provides.<\/p>\n<ul>\n<li><strong><code>object_name<\/code><\/strong> &#8211; name of the object to which the counter refers. This is usually a two-part name, starting with <code>SQL Server:<\/code>. For example, <code>SQL Server:Databases<\/code> or <code>SQL Server:Locks<\/code>.<\/li>\n<li><strong><code>counter_name<\/code><\/strong> &#8211; name of the counter. For example, the <code>SQL Server:Databases<\/code> object exposes the <code>Log Shrinks<\/code> counter, to monitor transaction log shrink events.<\/li>\n<li><strong><code>instance_name<\/code><\/strong> &#8211; specific instance of a counter, such as the database name for <code>SQLServer:Databases:LogShrinks<\/code> or users errors for <code>SQLServer:SQL Errors:Errors\/sec<\/code>.<\/li>\n<li><strong><code>cntr_value<\/code><\/strong> &#8211; most recent value of the counter.<\/li>\n<li><strong><code>cntr_type<\/code><\/strong> &#8211; type of counter.<\/li>\n<\/ul>\n<p>Note that only SQL Server counters are represented in the DMV, not any Windows or other counters.<\/p>\n<p>Most of these columns look innocuous enough, but don&#8217;t be deceived &#8211; the <code>cntr_value<\/code> and the <code>cntr_type<\/code> values, in particular, are a nest of vipers. The <code>cntr_type<\/code> column exposes WMI Performance Counter Types and, for the most part, the values provided for each type of counter in <code>cntr_value<\/code> will need to be decoded before we can use them. To get the list of counter types, we can execute the query shown in Listing 4.<\/p>\n<pre class=\"lang:tsql theme:ssms2012\">SELECT DISTINCT \r\n   cntr_type \r\nFROM sys.dm_os_performance_counters \r\nORDER BY cntr_type\r\n\r\ncntr_type\r\n-----------\r\n65792\r\n272696576\r\n537003264\r\n1073874176\r\n1073939712<\/pre>\n<p class=\"caption\">Listing 4: Returning a list of PerfMon counter types.<\/p>\n<p>Books Online does a poor job of documenting the <code>cntr_type<\/code> values, but my research revealed the following:<\/p>\n<ul>\n<li><strong><code>65792 = PERF_COUNTER_LARGE_RAWCOUNT<\/code><\/strong> &#8211; provides the last observed value for the counter; for this type of counter, the values in <code>cntr_value<\/code> can be used directly, making this the most easily usable type<\/li>\n<li><strong><code>272696576 = PERF_COUNTER_BULK_COUNT<\/code><\/strong> &#8211; provides the average number of operations per second. Two readings of <code>cntr_value<\/code> will be required for this counter type, in order to get the per second averages<\/li>\n<li><strong><code>537003264 = PERF_LARGE_RAW_FRACTION<\/code><\/strong> &#8211; used in conjunction with <code>PERF_LARGE_RAW_BASE<\/code> to calculate ratio values, such as the cache hit ratio.<\/li>\n<li><strong><code>1073874176 = PERF_AVERAGE_BULK<\/code><\/strong> &#8211; used to calculate an average number of operations completed during a time interval; like <code>PERF_LARG_RAW_FRACTION<\/code>, it uses <code>PERF_LARGE_RAW_BASE<\/code> to do the calculation<\/li>\n<li><strong><code>1073939712 = PERF_LARGE_RAW_BASE<\/code><\/strong>, used in the translation of <code>PERF_LARGE_RAW_FRACTION<\/code> and <code>PERF_AVERAGE_BULK<\/code> values to readable output; should not be displayed alone.<\/li>\n<\/ul>\n<p>Here, we&#8217;ll focus only on the &#8220;Per second average&#8221; (<code>PERF_COUNTER_BULK_COUNT<\/code>) category of Perfmon Counters. Some interesting values to look out for in this category include:<\/p>\n<ul>\n<li><strong><code>Server:Buffer Manager-Page lookups\/sec<\/code><\/strong><br \/>\ngives an indication of cache activity; higher numbers indicate a more active buffer pool (the definition of &#8220;higher&#8221; is largely dependent on your hardware and usage)<\/li>\n<li><strong><code>Server:Databases-&lt;databaseName&gt;- Log Bytes Flushed\/sec<\/code><\/strong><br \/>\ngives an indication of how much data has been written to the log for the database<\/li>\n<li><strong><code>SQLServer:Locks-_Total - Lock Requests\/sec<\/code><\/strong><br \/>\nprovides the number of locks being taken on a server per second, usually to compare to other time periods, to see when the server is being inundated with queries that might block other users.<\/li>\n<\/ul>\n<p>As &#8220;point-in-time&#8221; values, a single snapshot does not necessarily tell us very much. However, when tracked over time, they can help us identify worrying events, trends, or changes. For example, let&#8217;s say that users start experiencing performance problems at a given time of day, and that you notice that this coincides with spikes in the number of lock requests per second, which, in turn, coincides with the time that Joe Doofus, the manager with more rights than brains, issues a major query in <code>SERIALIZABLE<\/code> isolation level.<\/p>\n<p>Nearly all of the counters in this group are named with a suffix of &#8216;\/sec&#8217;, but a few are actually prefixed &#8216;(ms)&#8217;. However, we can interpret the latter as &#8220;number of milliseconds waited per second.&#8221; So, for example, <code>Total Latch Wait Time (ms)<\/code>&#8216; is the average amount of time per second that a certain process had to wait to acquire a latch, over the time the sample was taken. These counters are constantly incrementing values, though if they ever hit the maximum value, they would reset to zero.<\/p>\n<p>The way to deal with these counters is the same as for our example with an accumulating counter DMV. Take a baseline value, wait some number of seconds, then sample again. For ad-hoc monitoring of a given operation, you would set the delay such that you could capture a suitable number of samples over the period of time the operation is taking place.<\/p>\n<p>The example in Listing 5 uses a simple <code>WAITFOR<\/code> statement to implement the delay, in this case, 5 seconds. It uses a<code> datetime<\/code> column, with a default of <code>getdate()<\/code>, in order to capture the exact time the values were sampled (since the delay may not be exactly 5 seconds each time; for example, it might actually take 5020 milliseconds to execute the query).<\/p>\n<pre class=\"lang:tsql theme:ssms2012\">DECLARE @PERF_COUNTER_BULK_COUNT INT \r\nSELECT @PERF_COUNTER_BULK_COUNT = 272696576 \r\n\r\n--Holds initial state \r\nDECLARE @baseline TABLE \r\n   ( \r\n      object_name NVARCHAR(256) , \r\n      counter_name NVARCHAR(256) , \r\n      instance_name NVARCHAR(256) , \r\n      cntr_value BIGINT , \r\n      cntr_type INT , \r\n      time DATETIME DEFAULT ( GETDATE() ) \r\n   ) \r\n   \r\nDECLARE @current TABLE \r\n   ( \r\n      object_name NVARCHAR(256) , \r\n      counter_name NVARCHAR(256) , \r\n      instance_name NVARCHAR(256) , \r\n      cntr_value BIGINT , \r\n      cntr_type INT , \r\n      time DATETIME DEFAULT ( GETDATE() ) \r\n   ) \r\n\r\n--capture the initial state of bulk counters \r\nINSERT INTO @baseline \r\n   ( object_name , \r\n     counter_name , \r\n     instance_name , \r\n     cntr_value , \r\n     cntr_type \r\n   ) \r\n   SELECT object_name , \r\n          counter_name , \r\n          instance_name , \r\n          cntr_value , \r\n          cntr_type \r\n   FROM sys.dm_os_performance_counters AS dopc \r\n   WHERE cntr_type = @PERF_COUNTER_BULK_COUNT \r\n\r\nWAITFOR DELAY '00:00:05' --the code will work regardless of delay chosen\r\n\r\n--get the followon state of the counters \r\nINSERT INTO @current \r\n   ( object_name , \r\n     counter_name , \r\n     instance_name , \r\n     cntr_value , \r\n     cntr_type \r\n   ) \r\n   SELECT object_name , \r\n          counter_name , \r\n          instance_name , \r\n          cntr_value , \r\n          cntr_type \r\n   FROM sys.dm_os_performance_counters AS dopc \r\n   WHERE cntr_type = @PERF_COUNTER_BULK_COUNT \r\n\r\nSELECT dopc.object_name , \r\n       dopc.instance_name , \r\n       dopc.counter_name , \r\n       --ms to second conversion factor \r\n       1000 * \r\n       --current value less the previous value \r\n   ( ( dopc.cntr_value - prev_dopc.cntr_value ) \r\n       --divided by the number of milliseconds that pass \r\n       --casted as float to get fractional results. Float \r\n       --lets really big or really small numbers to work \r\n       \/ CAST(DATEDIFF(ms, prev_dopc.time, dopc.time) AS FLOAT) ) \r\n                                                 AS cntr_value \r\n       --simply join on the names of the counters \r\nFROM @current AS dopc \r\n     JOIN @baseline AS prev_dopc ON prev_dopc. object_name = \r\ndopc. object_name \r\n                       AND prev_dopc.instance_name = dopc.instance_name\r\n                       AND prev_dopc.counter_name = dopc.counter_name \r\nWHERE dopc.cntr_type = @PERF_COUNTER_BULK_COUNT \r\n      AND 1000 * ( ( dopc.cntr_value - prev_dopc.cntr_value ) \r\n                   \/  CAST( DATEDIFF(ms, prev_dopc. time, dopc. time)  AS FLOAT) ) \r\n \/* default to only showing non-zero values *\/ &lt;&gt; 0 \r\nORDER BY dopc. object_name , \r\n         dopc.instance_name , \r\n         dopc.counter_name<\/pre>\n<p class=\"caption\">Listing 5: Returning the values of &#8220;per second average&#8221; PerfMon counters.<\/p>\n<p>It is easy to adapt this code to another specific counter type, such as one of those listed previously but I won&#8217;t show it here. Notice that we default to only showing non-zero values, since that is usually what you will be interested in when using this set of data.<\/p>\n<h2>Summary<\/h2>\n<p>Any monitoring tool, whether it takes the form of a set of hand-rolled scripts and some Agent jobs, or a proper third-party monitoring tool, will track and interrogate the data in various system tables and views over time. Armed with this monitoring data, performance tuning, and resource provisioning, starts to become more proactive than reactive. We can see problems coming and prevent escalation, rather than simply react after the event. We can create some Agent jobs to warn us as soon as a metric strays significantly from a value that represents &#8220;normal&#8221; for our system, as established by our baseline measurements.<\/p>\n<p>Of course, there is quite a bit of data to capture, manage and monitor over time and what I&#8217;ve showed here is really just the start. For more ideas, check out Erin Stellato&#8217;s series over on SQLServerCentral (<a href=\"https:\/\/www.sqlservercentral.com\/author\/Erin-Stellato\">https:\/\/www.sqlservercentral.com\/author\/Erin-Stellato<\/a>).<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>When you&#8217;re monitoring SQL Server, it is better to capture a baseline for those aspects that you&#8217;re checking, such as workload, Physical I\/O or performance. Once you know what is normal, then performance tuning and resource provisioning can be done in a timely manner before any problem becomes apparent. We can prevent problems by being able to predict them. Louis shows how to get started.&hellip;<\/p>\n","protected":false},"author":56085,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[143527],"tags":[4156,5875,4170,4364,4179,4150,4151,5876],"coauthors":[19684],"class_list":["post-1662","post","type-post","status-publish","format-standard","hentry","category-database-administration-sql-server","tag-asp","tag-baselining","tag-database-administration","tag-monitoring","tag-source-control","tag-sql","tag-sql-server","tag-sql-server-statistics"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/1662","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/56085"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=1662"}],"version-history":[{"count":5,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/1662\/revisions"}],"predecessor-version":[{"id":84255,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/1662\/revisions\/84255"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=1662"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=1662"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=1662"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=1662"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}