{"id":89215,"date":"2020-12-03T10:14:52","date_gmt":"2020-12-03T10:14:52","guid":{"rendered":"https:\/\/www.red-gate.com\/simple-talk\/?p=89215"},"modified":"2020-12-03T11:51:28","modified_gmt":"2020-12-03T11:51:28","slug":"why-it-makes-sense-to-monitor-sql-server-deadlocks-in-their-own-extended-events-trace","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/blogs\/why-it-makes-sense-to-monitor-sql-server-deadlocks-in-their-own-extended-events-trace\/","title":{"rendered":"Why it makes sense to monitor SQL Server deadlocks in their own Extended Events trace"},"content":{"rendered":"<p>We recently had customer ask why <a href=\"https:\/\/www.red-gate.com\/products\/dba\/sql-monitor\/overview\" target=\"_blank\" rel=\"noopener\">SQL Monitor<\/a> creates an Extended Events session to capture deadlock graphs, when SQL Server has a built-in system_health Extended Events trace which also captures deadlock information?<\/p>\n<p>There are a couple of reasons why a dedicated trace is desirable for capturing deadlock graphs, whether you are rolling your own monitoring scripts or building a monitoring application. I like this question a lot because I feel it gets at an interesting tension\/balance at the heart of monitoring itself.<\/p>\n<h2>Segmentation is helpful for users<\/h2>\n<p>One reason that SQL Monitor uses a separate Extended Events trace is to segment off what is used by SQL Monitor. This helps administrators understand what is impacted if they stop or modify an Extended Events session.\u00a0<\/p>\n<p>While Microsoft recommends that <a href=\"https:\/\/docs.microsoft.com\/en-us\/sql\/relational-databases\/extended-events\/use-the-system-health-session\" target=\"_blank\" rel=\"noopener\">administrators don&#8217;t stop, alter, or delete the system_health session<\/a>, in practice it&#8217;s quite easy to do any of these things. An administrator might assume if they have installed other monitoring software that they could stop, delete, or modify the definition of system_health without impacting the alternate monitoring.<\/p>\n<h2>Extended Events sessions have retention policies<\/h2>\n<p>The current implementation of the system_health session is that it writes data both to an asynchronous ring buffer target and to an event_file. The event files currently have a maximum file size of 5MB and can roll over to 4 files.<\/p>\n<p>While this is a sensible configuration for most systems, the system_health session collects multiple events. In some scenarios, it&#8217;s possible for these events to generate a significant amount of data quickly, which could plausibly use up a lot of the event file space and roll off other events.\u00a0<\/p>\n<p>It&#8217;s also quite possible for Microsoft to add additional events or change the amount of data retained for these logs at any time.<\/p>\n<p>These factors make it desirable to use a separate trace for distinct events which you care about for monitoring purposes. This way you control your own retention policies and can isolate events to their own traces as needed.\u00a0<\/p>\n<h2>Deadlock reports are lightweight to collect in Extended Events<\/h2>\n<p>When creating any Extended Events trace against a production instance, it&#8217;s important to evaluate the performance impact of the events you&#8217;re collecting.\u00a0<\/p>\n<p>The xml_deadlock_report event, for example, is a lightweight event to collect.\u00a0<\/p>\n<p>Other events have a greater impact on the instance. The most famous example of this: starting an Extended Events trace that collects &#8216;actual&#8217; execution plans using the query_post_execution_showplan event can very quickly slow down a SQL Server instance. This is true even if you have applied a filter to only collect plans for a very specific query! This event unfortunately has a very high overhead which filtering does not reduce. (There are <a href=\"https:\/\/techcommunity.microsoft.com\/t5\/sql-server\/using-xevents-to-capture-an-actual-execution-plan\/ba-p\/392136\" target=\"_blank\" rel=\"noopener\">some alternatives<\/a>, but it gets complex pretty fast.)<\/p>\n<h2>Monitoring is an art of balancing between observation and impact<\/h2>\n<p>I like this example because it gets at a core challenge of monitoring: we always need to balance the impact of observation with the benefits of the data we collect. This is always a tough problem as you build monitoring software, as monitoring queries are also subject to variances in query optimization and performance in a database, just like any other activities.<\/p>\n<p>In the case of deadlock graphs, the impact of collecting these in a dedicated Extended Events session is low enough that the benefits of segmenting this out are persuasive, in my view.<\/p>\n<h2>Want to learn more about deadlocks?<\/h2>\n<ul>\n<li>I have a free online SQL Server course, <a href=\"https:\/\/littlekendra.com\/course\/troubleshooting-blocking-and-deadlocks-for-beginners\/\" target=\"_blank\" rel=\"noopener\">Troubleshooting Blocking and Deadlocks for Beginners<\/a>, which features a section dedicated to deadlocks.<\/li>\n<li>Redgate has a Product Learning article on <a href=\"https:\/\/www.red-gate.com\/hub\/product-learning\/sql-monitor\/monitoring-troubleshooting-deadlocks-sql-monitor\" target=\"_blank\" rel=\"noopener\">Monitoring and Troubleshooting Deadlocks with SQL Monitor<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>We recently had customer ask why SQL Monitor creates an Extended Events session to capture deadlock graphs, when SQL Server has a built-in system_health Extended Events trace which also captures deadlock information? There are a couple of reasons why a dedicated trace is desirable for capturing deadlock graphs, whether you are rolling your own monitoring&#8230;&hellip;<\/p>\n","protected":false},"author":321187,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[2],"tags":[],"coauthors":[61458],"class_list":["post-89215","post","type-post","status-publish","format-standard","hentry","category-blogs"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/89215","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/321187"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=89215"}],"version-history":[{"count":15,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/89215\/revisions"}],"predecessor-version":[{"id":89231,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/89215\/revisions\/89231"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=89215"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=89215"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=89215"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=89215"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}