{"id":3497,"date":"2012-02-22T05:00:00","date_gmt":"2012-02-22T05:00:00","guid":{"rendered":"https:\/\/test.simple-talk.com\/uncategorized\/tuning-red-gate-3-of-lots\/"},"modified":"2016-07-28T10:50:43","modified_gmt":"2016-07-28T10:50:43","slug":"tuning-red-gate-3-of-lots","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/blogs\/tuning-red-gate-3-of-lots\/","title":{"rendered":"Tuning Red Gate: #3 of Lots"},"content":{"rendered":"<p>I&#8217;m drilling down into the metrics about SQL Server itself available to me in the Analysis tab of SQL Monitor to see what&#8217;s up with our two problematic servers. <\/p>\n<p>In the previous post I&#8217;d noticed that rg-sql01 had quite a few CPU spikes. So one of the first things I want to check there is how much CPU is getting used by SQL Server itself. It&#8217;s possible we&#8217;re looking at some other process using up all the CPU<\/p>\n<p><a href=\"http:\/\/www.simple-talk.com\/blogbits\/grantfritchey\/Tuning-Red-Gate-3-of-Lots_8D73\/image.png\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/blogbits\/grantfritchey\/Tuning-Red-Gate-3-of-Lots_8D73\/image_thumb.png\" width=\"244\" height=\"132\" alt=\"image_thumb.png\" \/><\/a><\/p>\n<p>Nope, It&#8217;s SQL Server. I compared this to the rg-sql02 server:<\/p>\n<p><a href=\"http:\/\/www.simple-talk.com\/blogbits\/grantfritchey\/Tuning-Red-Gate-3-of-Lots_8D73\/image_3.png\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/blogbits\/grantfritchey\/Tuning-Red-Gate-3-of-Lots_8D73\/image_thumb_3.png\" width=\"244\" height=\"131\" alt=\"image_thumb_3.png\" \/><\/a><\/p>\n<p>You can see that there is a more, consistently low set of CPU counters there. I clearly need to look at rg-sql01 and capture more specific data around the queries running on it to identify which ones are causing these CPU spikes. <\/p>\n<p>I always like to look at the Batch Requests\/sec on a server, not because it&#8217;s an indication of a problem, but because it gives you some idea of the load. Just how much is this server getting hit? Here are rg-sql01 and rg-sql02:<\/p>\n<p><a href=\"http:\/\/www.simple-talk.com\/blogbits\/grantfritchey\/Tuning-Red-Gate-3-of-Lots_8D73\/image_4.png\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/blogbits\/grantfritchey\/Tuning-Red-Gate-3-of-Lots_8D73\/image_thumb_4.png\" width=\"244\" height=\"127\" alt=\"image_thumb_4.png\" \/><\/a><\/p>\n<p><a href=\"http:\/\/www.simple-talk.com\/blogbits\/grantfritchey\/Tuning-Red-Gate-3-of-Lots_8D73\/image_5.png\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/blogbits\/grantfritchey\/Tuning-Red-Gate-3-of-Lots_8D73\/image_thumb_5.png\" width=\"244\" height=\"127\" alt=\"image_thumb_5.png\" \/><\/a><\/p>\n<p>Of the two, clearly rg-sql01 has a lot of activity. Remember though, that&#8217;s all this is a measure of, activity. It doesn&#8217;t suggest anything other than what it says, the number of requests coming in. But it&#8217;s the kind of thing you want to know in order to understand how the system is used. Are you seeing a correlation between the number of requests and the CPU usage, or a reverse correlation, the number of requests drops as the CPU spikes? See, it&#8217;s useful. 
Some of the details you can look at are Compilations/sec, Compilations/Batch, and Recompilations/sec. These give you some idea of how the plan cache is getting used within the system. None of these showed anything interesting on either server.

One metric that I like (even though I know it can be controversial) is Page Life Expectancy. On the average server I expect to see a series of mountains as the PLE climbs and then drops due to a data load or something along those lines. That's not the case here:

![Page Life Expectancy on rg-sql01](http://www.simple-talk.com/blogbits/grantfritchey/Tuning-Red-Gate-3-of-Lots_8D73/image_6.png)

![Page Life Expectancy on rg-sql02](http://www.simple-talk.com/blogbits/grantfritchey/Tuning-Red-Gate-3-of-Lots_8D73/image_7.png)

Those spikes back in January suggest that the servers weren't really being used much. The PLE on rg-sql01 seems to be somewhat consistent, growing to three hours or so and then dropping, but the rg-sql02 PLE looks like it might be all over the map. Instead of continuing to look at this high-level data-gathering view, I'm going to drill down on rg-sql02 and see what it's done for the last week:

![Page Life Expectancy on rg-sql02 over the last week](http://www.simple-talk.com/blogbits/grantfritchey/Tuning-Red-Gate-3-of-Lots_8D73/image_8.png)

And now we begin to see where we might have an issue. Memory on this system is getting flushed every half hour or so. I'm going to check another metric, scans:

![Scans on rg-sql02](http://www.simple-talk.com/blogbits/grantfritchey/Tuning-Red-Gate-3-of-Lots_8D73/image_9.png)

Whoa! I'm going back to the system real quick to look at some disk information again for rg-sql02.
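Before the disk numbers, a quick aside: all of the counters from this stretch, the compile counters, PLE, and the scans, live in sys.dm_os_performance_counters if you'd rather have raw numbers than graphs. Here's a sketch for the compile counters, expressed as a ratio against batch requests. They're all cumulative since the last restart, so this is an average over the server's uptime, not a point-in-time value like the Compilations/Batch graph:

```sql
-- Compile and recompile counters compared against batch requests.
-- All three store cumulative values, so the ratio is a lifetime average.
SELECT MAX(CASE WHEN counter_name = N'Batch Requests/sec' THEN cntr_value END) AS batch_requests,
       MAX(CASE WHEN counter_name = N'SQL Compilations/sec' THEN cntr_value END) AS compilations,
       MAX(CASE WHEN counter_name = N'SQL Re-Compilations/sec' THEN cntr_value END) AS recompilations,
       100.0 * MAX(CASE WHEN counter_name = N'SQL Compilations/sec' THEN cntr_value END)
             / NULLIF(MAX(CASE WHEN counter_name = N'Batch Requests/sec' THEN cntr_value END), 0)
           AS compiles_per_100_batches
FROM sys.dm_os_performance_counters
WHERE object_name LIKE N'%SQL Statistics%'
  AND counter_name IN (N'Batch Requests/sec', N'SQL Compilations/sec', N'SQL Re-Compilations/sec');
```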
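And a spot check for PLE and the scans. Keep in mind they behave differently: Page life expectancy is a point-in-time gauge you can read directly, while Full Scans/sec is cumulative like Batch Requests/sec above, so you'd sample it twice to get a rate:

```sql
-- PLE (a gauge, read as-is) and Full Scans/sec (cumulative; sample twice for a rate).
SELECT object_name, counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE (counter_name = N'Page life expectancy' AND object_name LIKE N'%Buffer Manager%')
   OR (counter_name = N'Full Scans/sec' AND object_name LIKE N'%Access Methods%');
```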
Here is the average disk queue length on the server:

![Avg. Disk Queue Length on rg-sql02](http://www.simple-talk.com/blogbits/grantfritchey/Tuning-Red-Gate-3-of-Lots_8D73/image_10.png)

and the transfers:

![Disk transfers on rg-sql02](http://www.simple-talk.com/blogbits/grantfritchey/Tuning-Red-Gate-3-of-Lots_8D73/image_11.png)

Right, I think I have a guess as to what's up here. We're seeing memory get flushed constantly and we're seeing lots of scans. The disks are queuing, especially that F: drive, and there are lots of requests that correspond to the scans and the memory flushes. In short, we've got queries that are scanning the data, a lot, so we either have bad queries or bad indexes. I'm going back to the server overview for rg-sql02 to check the Top 10 expensive queries. I'm modifying it to show me the last three days and the totals, so I'm not looking at some maintenance routine that ran 10 minutes ago and is skewing the results:

![Top 10 expensive queries on rg-sql02, last three days, by totals](http://www.simple-talk.com/blogbits/grantfritchey/Tuning-Red-Gate-3-of-Lots_8D73/image_12.png)

OK. I need to look into these queries that are getting executed this much. They're generating a lot of reads, but which queries are generating the most reads?

![Top 10 expensive queries on rg-sql02, sorted by reads](http://www.simple-talk.com/blogbits/grantfritchey/Tuning-Red-Gate-3-of-Lots_8D73/image_13.png)

Ow, all still going against the same database. This is where I'm going to temporarily leave SQL Monitor. What I want to do is connect up to the server, validate that the Warehouse database is using the F: drive (which I'll put money down it is) and then start seeing what's up with these queries.
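The placement check itself is quick once I'm connected; something like:

```sql
-- Where do the Warehouse database's files actually live?
SELECT name AS logical_name,
       physical_name,              -- the drive letter I'm betting on is F:
       type_desc,                  -- ROWS vs LOG
       size * 8 / 1024 AS size_mb  -- size is stored in 8 KB pages
FROM sys.master_files
WHERE database_id = DB_ID(N'Warehouse');
```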
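While I'm there, sys.dm_io_virtual_file_stats will show whether those queue lengths translate into stalls on the F: drive specifically, per database file rather than per disk. It's cumulative since the last restart, so it's a long-run picture:

```sql
-- Per-file I/O counts and average latency, worst total stalls first.
SELECT DB_NAME(vfs.database_id) AS database_name,
       mf.physical_name,
       vfs.num_of_reads,
       vfs.num_of_writes,
       vfs.io_stall_read_ms / NULLIF(vfs.num_of_reads, 0)   AS avg_read_ms,
       vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS avg_write_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf
  ON mf.database_id = vfs.database_id
 AND mf.file_id = vfs.file_id
ORDER BY vfs.io_stall DESC;
```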
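As for seeing what's up with the queries, the plan cache gives a first cut at the same list SQL Monitor was showing, the biggest cumulative readers. It only covers plans still in cache, so it won't line up exactly with the monitor's three-day totals, but it's a start:

```sql
-- Top statements by cumulative logical reads, for plans currently in cache.
SELECT TOP (10)
       qs.execution_count,
       qs.total_logical_reads,
       qs.total_logical_reads / qs.execution_count AS avg_logical_reads,
       SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1,
                 ((CASE qs.statement_end_offset
                       WHEN -1 THEN DATALENGTH(st.text)
                       ELSE qs.statement_end_offset
                   END - qs.statement_start_offset) / 2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_logical_reads DESC;
```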
[Part 1 of the Series](http://www.simple-talk.com/community/blogs/scary/archive/2012/02/09/105948.aspx)

[Part 2 of the Series](http://www.simple-talk.com/community/blogs/scary/archive/2012/02/13/105995.aspx)