Extended Events Workbench

The Extended Events (or XEvents) feature has been part of SQL Server since 2008, but many database professionals struggle to get started using it. In this article, Phil Factor demonstrates several useful Extended Event sessions that measure just one thing in each. He then provides the code necessary to parse the resulting XML into something you can use.

A lot of the information about the way that SQL Server is working can only be provided by Extended Events (XEvents). It is such a versatile system that it can provide a lot more than would otherwise require SQL Trace. If done well, Extended Events are so economical to use that they take very little CPU. For many tasks, they require less effort on the part of the user than SQL Trace.

Extended Events (XEvents) aren’t particularly easy to use, and nothing that involves having to use XML is likely to be intuitive: in fact, many DBAs compare XML unfavourably, in terms of its friendliness, with a cornered rat. SSMS’s user interface for Extended Events, thankfully, removes a lot of the bite. For the beginner to XEvents with a background in SQL, it is probably best to collect a few useful ‘recipes’ and grow from that point. Templates and snippets are invaluable.

SSMS has a very good GUI that is perfectly usable for getting an XEvents session up and running in order to see what the current queries are doing. Its use is well covered by various articles, such as Advanced Viewing of Target Data from Extended Events in SQL Server and Robert Sheldon’s Getting Started with Extended Events in SQL Server 2012. I particularly like the ‘New Session’/’Session Properties’ screen, which allows you to create a new session, and to view and change the properties of a session.

The reason that XEvents is more like a box of Lego bricks than like a complete model of the Starship Enterprise is that there is a huge range of requirements. It is difficult, for example, to get developers to agree on what constitutes a poor-performing query that is hogging resources or to get DBAs to come up with a consensus on the definition of a system resource contention. We want specific questions answered, and for this, XEvents is ideal because we can frame the question exactly.

The Anatomy of Extended Events

The data that is collected by Extended Events is stored in data documents; each individual session has a corresponding document. These documents are an array of events. If you specify more than one event in your session, then you will have entries for each event, and these are likely to have different attributes. As well as the fields that are special to each event, you might also want to add common attributes (actions), and filter the events with predicates.


To deal with this, it makes sense to use a data document language such as XML for storing the data, because each event is likely to have its own special data schema, containing both its own fields and any common attributes that you add.

Most Extended Events sessions consist of two main parts: firstly, the session is set up and started, and secondly, the results are queried. The session creation is done in SQL, and this SQL is easily available even if you use the GUI. As part of the setup, the session needs to be defined along with the actions that are to be used, any filtering, the target, and the settings.

The target of the event is the destination for all of the information that is captured by the session, and the choice depends on what your requirements are. Targets can write to a file, store event data in a memory buffer, or aggregate the event data in a series of ‘buckets.’ Probably the simplest is the asynchronous Ring Buffer, which is a memory-based FIFO. More familiar to users of SQL Trace is the asynchronous Event File, which writes from memory to disk. The Histogram target increments bucket counters to count the number of times that a specified event occurs, based on an attribute of the event. Event Pairing is designed for events that occur in pairs, such as sql_statement_starting and sql_statement_completed, and is useful to detect when a specified paired event does not occur in a matched set.

The built-in sessions aren’t always of immediate use. Why not? For a start, they include a lot of different events, so they cannot be viewed as tables because each event will have different values. Also, they can have bewildering characteristics such as those occasions when data comes and goes from the ring buffer, which seems to be due to a syntax error in the NT Service/SQLTelemetry.

You can see what is currently active in the way of sessions by means of the query
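A sketch of such a query, using the Extended Events dynamic management views (the column aliases are my own):

```sql
-- List the running XEvents sessions and their targets, along with the
-- current contents of any in-memory target, cast to XML
SELECT xs.name AS session_name,
       xst.target_name,
       CAST(xst.target_data AS XML) AS targetdata
  FROM sys.dm_xe_sessions AS xs
    INNER JOIN sys.dm_xe_session_targets AS xst
      ON xst.event_session_address = xs.address;
```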

If you are using the grid for results, you can click on the targetdata column to view the XML file.

To start with, I suggest keeping things simple, with a session that has just a single event.

We’ll try that with sqlserver.error_reported.

Catching Those Pesky Errors

For our first Extended Events session, we will try something with just one event. This records all errors, including those from scheduled tasks and client applications. To do this, I used the SSMS Create Session wizard. Although it is the error messages with higher severity numbers that are the interesting ones, I added a filter so that we don’t see anything with a severity of less than eleven.

In order to do our best to track where the error happened, and who caused it, I’ve added some fields from the general pool of event parameters to give you more information in the ACTION clause.
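A sketch of the sort of session this produces is shown below; the particular list of actions is my own choice of common fields, and a ring buffer target is assumed:

```sql
-- An 'AllErrors' session: one event, filtered to severity 11 and
-- above, with extra fields collected via the ACTION clause
CREATE EVENT SESSION AllErrors ON SERVER
ADD EVENT sqlserver.error_reported (
    ACTION (sqlserver.client_app_name,
            sqlserver.client_hostname,
            sqlserver.database_name,
            sqlserver.username,
            sqlserver.sql_text)
    WHERE severity >= 11)  -- ignore informational messages
ADD TARGET package0.ring_buffer
WITH (STARTUP_STATE = ON);
GO
ALTER EVENT SESSION AllErrors ON SERVER STATE = START;
```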

Now we can test it very easily. In Object Explorer, right-click the Extended Events Sessions node and hit Refresh. Make sure that the AllErrors session is there and started. Then start the live data window by right-clicking the started session and choosing Watch Live Data.

Depending on which version of SSMS you are using, you may or may not see the data show up in the Live Data pane right away. If you don’t see it, try running the batch again.
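Any batch that raises an error of severity eleven or more will do as a test. This deliberately failing batch is my own illustration:

```sql
-- Deliberately raise errors so that the AllErrors session has
-- something to catch
SELECT 1 / 0 AS DivideByZero;  -- divide-by-zero, severity 16
RAISERROR ('This is a deliberate test error', 16, 1);
```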


You can see that a number of event fields are shown in the details tab. You can add more columns to the trace window by clicking the Choose Columns button on the Extended Events toolbar.

With the Choose Columns screen, you can then decide which columns you see in the list.

You can also right-click on the data in the list in the lower pane to add it to, or remove it from, the table in the upper pane.

You can also view the target data as an XML document. To do this, expand the session to see the targets. Right-click on the target and select View Target Data. This display is updated occasionally. If you right-click the target data and then select Refresh Interval from the context menu that appears, you can select the refresh interval from the interval list. Likewise, from this context menu, you can pause and resume the automatic refresh.

As well as using the event’s properties window, you can find out what is available as a built-in field and predicate source for any event via a query like this, along with (where the description column has been filled in) an explanation of what each item is about. This query is set to tell you what is available for error_reported.

Predicate Source Query
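A sketch of such a query, combining the event’s own fields with the global predicate sources (both come from the Extended Events DMVs):

```sql
-- The fields that error_reported provides, plus the global
-- predicate sources that can be used in any event's filter
SELECT oc.name, oc.type_name, oc.description
  FROM sys.dm_xe_object_columns AS oc
  WHERE oc.object_name = 'error_reported'
    AND oc.column_type = 'data'
UNION ALL
SELECT o.name, o.type_name, o.description
  FROM sys.dm_xe_objects AS o
  WHERE o.object_type = 'pred_source'
ORDER BY name;
```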

All we need now is some SQL code so that we can see the errors in a relational way and maybe do some reporting and investigating. It can be surprising to see the errors that crop up.

We now have an effective solution. It is possible to create a view from the SQL if you forgo the creation of the temporary variable (though it is quicker to run with it), or you might create a stored procedure. You will probably need to add various WHERE clauses, the most obvious being one that determines how long ago you want to see errors. Here is an example that lets you see the last twenty minutes’ worth of errors.
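A sketch of such a query, shredding the AllErrors ring buffer (the column choices are my own; XEvents timestamps are UTC):

```sql
-- Read the AllErrors ring buffer into a temporary XML variable,
-- then shred it into rows, keeping the last twenty minutes only
DECLARE @Target_Data XML =
  (SELECT CAST(xst.target_data AS XML)
     FROM sys.dm_xe_sessions AS xs
       INNER JOIN sys.dm_xe_session_targets AS xst
         ON xst.event_session_address = xs.address
     WHERE xs.name = 'AllErrors'
       AND xst.target_name = 'ring_buffer');

SELECT the.event_data.value('(@timestamp)[1]', 'datetime2') AS error_time,
       the.event_data.value('(data[@name="severity"]/value)[1]', 'int') AS severity,
       the.event_data.value('(data[@name="error_number"]/value)[1]', 'int') AS error_number,
       the.event_data.value('(data[@name="message"]/value)[1]', 'nvarchar(max)') AS message,
       the.event_data.value('(action[@name="username"]/value)[1]', 'nvarchar(128)') AS username,
       the.event_data.value('(action[@name="sql_text"]/value)[1]', 'nvarchar(max)') AS sql_text
  FROM @Target_Data.nodes('//RingBufferTarget/event') AS the(event_data)
  WHERE the.event_data.value('(@timestamp)[1]', 'datetime2')
          > DATEADD(MINUTE, -20, SYSUTCDATETIME());
```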

A Performance Test Harness

Very often, we want to understand in some detail where the slow SQL code is in a particular batch or set of batches, and why. To do this, you need information about the duration and I/O. You also need to be able to access the execution plan for the batch. An obvious candidate event for this is the sql_statement_completed event. Here, we set up a session, start it, run our test, and immediately stop it again. We identify the batch we are interested in by putting a unique string at the beginning of the batch and filtering the event to fire only if the statement is part of this batch. Because we know which batch the statement belongs to, we can get the execution plan and examine it in as much detail as we need. In this example, I’m only doing one batch, but if you run several batches as part of the test, it will catch them all, as long as each batch starts with the same unique string.

The sql_statement_completed event has some very useful fields/predicates. (These come from executing the Predicate Query I showed earlier.)

  • sqlserver.cpu_time (Indicates the CPU time in microseconds that is consumed by the statement.)
  • sqlserver.duration (The time in microseconds that it took to execute the statement.)
  • sqlserver.last_row_count (The last row count for this statement.)
  • sqlserver.line_number (The statement line number, in relation to the beginning of the batch.)
  • sqlserver.logical_reads (The number of logical page reads that were issued by the statement.)
  • sqlserver.offset (The statement start offset, in relation to the beginning of the batch.)
  • sqlserver.offset_end (The statement end offset, in relation to the beginning of the batch. The value will be -1 for the last statement.)
  • sqlserver.parameterized_plan_handle (The plan handle of the cache entry of the parameterized query plan.)
  • sqlserver.physical_reads (The number of physical page reads that were issued by the statement.)
  • sqlserver.row_count (The number of rows that were touched by the statement.)
  • sqlserver.statement (The text of the statement that triggered the event.)
  • sqlserver.writes (The number of page writes that were issued by the statement.)

What we are doing here is to test whether the hints for GROUP BY are at all useful. To test it, we run the same GROUP BY query against AdventureWorks, using hints, and not using hints. We then create an index on the column being aggregated and run the queries again. We end up with a report that tells us the resources used every time, and how long the query takes. We can then drill into the detail. Note that you need to change the AdventureWorks2016 database to whatever version you have and also change the name of your test database, here artfully called YourTestDatabase.
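The session-creation part of the harness might look something like this sketch. The session name TestHarness is my own choice; the filter uses like_i_sql_unicode_string to fire only for batches that begin with the marker comment:

```sql
-- Catch every completed statement, but only in batches that start
-- with our unique marker string
CREATE EVENT SESSION TestHarness ON SERVER
ADD EVENT sqlserver.sql_statement_completed (
    ACTION (sqlserver.sql_text, sqlserver.plan_handle)
    WHERE sqlserver.like_i_sql_unicode_string (
            sqlserver.sql_text,
            N'--Test the value of GROUP BY options%'))
ADD TARGET package0.ring_buffer;
GO
ALTER EVENT SESSION TestHarness ON SERVER STATE = START;
```

The plan_handle collected in the ACTION clause can then be passed to sys.dm_exec_query_plan to retrieve the execution plan for each statement.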

This gives a result in SSMS that I’ll show in full and then divide it in half so you can read it.

Although I’ve converted the duration from microseconds to seconds, the time is a bit misleading because it isn’t the same as the elapsed time measured from the application, and the batch duration is longer than the individual durations of the queries in the batch. However, it is proportional to the elapsed time. We can, of course, click on the query plan to see the details of execution. In this case, we have monitored only one batch, so there is only one query plan.

To get the explanations for each query visible, I put a comment into the batch as close as possible to the beginning of the query, so it showed up in the result grid.

One point that needs emphasizing is that the batch or batches that we want to be traced are marked by a string at the start of the batch. --Test the value of GROUP BY options.

If you leave that out, the test won’t work.

In order to filter just those batches that we wanted to report on, we had to search the start of the batch. To make the comparison, we used the rather obscure LIKE comparison sqlserver.like_i_sql_unicode_string. You can see a list of all these predicate comparisons in the GUI, or by a query like this.

List the Predicate Options
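A sketch of such a query, listing every predicate comparator from the Extended Events DMVs:

```sql
-- List all the predicate comparators available to session filters,
-- such as like_i_sql_unicode_string
SELECT p.name AS package_name, o.name, o.description
  FROM sys.dm_xe_objects AS o
    INNER JOIN sys.dm_xe_packages AS p
      ON o.package_guid = p.guid
  WHERE o.object_type = 'pred_compare'
  ORDER BY o.name;
```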

In the first example, we did just one batch. You can turn the example into several batches all with their own execution plan simply by adding

Here is the same test, with each query that you want to measure in its own batch.

Notice the string --Test the value of GROUP BY options at the start of each batch. It doesn’t have to be that particular string, of course, as long as the string you choose matches the one in the session’s event filter. Notice, too, that we’ve replaced the table variable with a temporary table, because a temporary table is visible across batches.
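The structure of the batches can be sketched like this (assuming an AdventureWorks database; the queries shown are my own illustration, not the original test):

```sql
--Test the value of GROUP BY options
SELECT ProductID, COUNT(*) AS Orders
  FROM Sales.SalesOrderDetail
  GROUP BY ProductID;
GO
--Test the value of GROUP BY options
SELECT ProductID, COUNT(*) AS Orders
  FROM Sales.SalesOrderDetail
  GROUP BY ProductID
  OPTION (ORDER GROUP);  -- the hint being tested in this batch
GO
```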

I’ve written an article about using this technique with SQL Prompt snippets. This gives more details of the technique.

Tracking Troublesome Statements

Using the same SQL report, you can alter this predicate to look for a comment within a SQL statement, rather than looking at the start of a batch. I use the comment --troublesome.

To get a report of the execution of any query that has the comment you specify, you need to change the query that fetches the XML of the target data so that the name of the session is altered.
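The session definition only needs its filter changed so that the comment can appear anywhere within the statement. A sketch, with a session name (TroublesomeStatements) of my own choosing:

```sql
-- Fire only for statements containing the --troublesome comment,
-- wherever it appears in the statement text
CREATE EVENT SESSION TroublesomeStatements ON SERVER
ADD EVENT sqlserver.sql_statement_completed (
    ACTION (sqlserver.sql_text, sqlserver.plan_handle)
    WHERE sqlserver.like_i_sql_unicode_string (
            statement, N'%--troublesome%'))
ADD TARGET package0.ring_buffer;
GO
ALTER EVENT SESSION TroublesomeStatements ON SERVER STATE = START;
```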

The query that then shreds the XML will work without change.

Long-running Queries

These filters on the event can become addictive. Do you fancy tracking queries that took a long time to complete? Well, all you need to do is to change the filter to…

… or whatever you consider a long-running query, and you can track them too. You can see that all you need to alter is the name of the session and the query that gets the XML of the target data, just as we did with the troublesome queries. You might have your own notion of what constitutes an expensive query. There is likely to be a predicate that will suit you.
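As a sketch, assuming that anything over five seconds counts as long-running (the duration field is reported in microseconds, and the session name is my own):

```sql
-- Catch statements that took more than five seconds to complete
CREATE EVENT SESSION LongRunningQueries ON SERVER
ADD EVENT sqlserver.sql_statement_completed (
    ACTION (sqlserver.sql_text)
    WHERE duration > 5000000)  -- five seconds, in microseconds
ADD TARGET package0.ring_buffer;
GO
ALTER EVENT SESSION LongRunningQueries ON SERVER STATE = START;
```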

We can also change the event on the fly, while the session is running. This, obviously, could lead to all sorts of odd things happening if the session data was being consumed by other processes, or if other processes were stopping or starting the session. The advantage, though, is that you can create simple views to shred the XML and display the results. It also means that you don’t suffer from session bloat.

Queries That Haven’t Ended

Queries can fail to end as they should for a variety of reasons. It could be, for example, that they are being blocked, that they have been timed out by the client, or that they have exceptionally poor performance. It could just be an error. It is useful to find out what these queries are and how long they’ve hung there.

It is also a useful introduction to the pair matching target. We’ll match sql_statement_starting and sql_statement_completed. There are a number of obvious pairs that tell us about execution, such as sp_statement, sql_batch, begin_tran, commit_tran, and sql_statement.
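A sketch of such a session is shown below. The pair_matching target keeps only the ‘starting’ events that never found their matching ‘completed’ event; the session name, and the choice of matching on session_id and tsql_stack, are my own assumptions:

```sql
-- Keep only statements that started but never completed
CREATE EVENT SESSION UncompletedQueries ON SERVER
ADD EVENT sqlserver.sql_statement_starting (
    ACTION (sqlserver.session_id, sqlserver.tsql_stack,
            sqlserver.client_connection_id, sqlserver.sql_text)),
ADD EVENT sqlserver.sql_statement_completed (
    ACTION (sqlserver.session_id, sqlserver.tsql_stack,
            sqlserver.client_connection_id, sqlserver.sql_text))
ADD TARGET package0.pair_matching (
    SET begin_event = 'sqlserver.sql_statement_starting',
        begin_matching_actions = 'sqlserver.session_id, sqlserver.tsql_stack',
        end_event = 'sqlserver.sql_statement_completed',
        end_matching_actions = 'sqlserver.session_id, sqlserver.tsql_stack');
GO
ALTER EVENT SESSION UncompletedQueries ON SERVER STATE = START;
```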

Now you can see what queries there are that are still executing or have died: basically, all orphaned queries for which there is no matching sql_statement_completed event.

… so you can see them appear and disappear.

Finding Queries That Failed to Complete Due to an Error

A lot of the orphaned queries from this event are simply those which have terminated with errors. If you still have your AllErrors session running at the same time as the UncompletedQueries session, you can relate the two with a SQL inner join, because both of them track the client connection ID and the SQL text.

In this case, we can see that there is a SQL Telemetry task that is repeatedly retrying, and failing, to create the telemetry_XEvents session, because the event name sqlserver.graph_match_query_count is invalid, or the object could not be found. This now tells me why it is repeatedly vanishing.

Finding Queries That Failed to Complete but Had No Error

In fact, it is probably just as interesting to see those that aren’t simple errors. I’ve left out most of the columns from the table to save space.

Blocking

My final example of an event isn’t that new. It simply checks for blocked and blocking processes, and allows you to see each unique blocking event, to spot what was blocked and what did the blocking.

It is very lightly edited from the code generated by the wizard. I just selected the sqlserver.blocked_process_report and sqlserver.xml_deadlock_report events.

The objective is to show each blockage, together with its blocked-process report and how long the blockage has lasted.

I’ve created two blocked processes artificially by starting a query in one window

But not executing the…

Then, in another query window, I execute…

…which is blocked by the uncommitted transaction
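The pattern is simply an open transaction holding a lock. Assuming an AdventureWorks database, something along these lines (my own illustrative code, not the original):

```sql
-- Window one: take out a lock and leave the transaction open
BEGIN TRANSACTION;
UPDATE Person.Person
  SET Suffix = Suffix
  WHERE BusinessEntityID = 1;
-- (no COMMIT or ROLLBACK yet)

-- Window two: this query is blocked until window one commits
SELECT Suffix
  FROM Person.Person
  WHERE BusinessEntityID = 1;
```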

The events session is set up by this code
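A sketch of what the wizard generates is below. Note that SQL Server only produces blocked-process reports when the ‘blocked process threshold’ server option is set; the session name Blocking and the ten-second threshold are my own choices:

```sql
-- The blocked-process report is only produced if a threshold is set
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'blocked process threshold (s)', 10;
RECONFIGURE;
GO
CREATE EVENT SESSION Blocking ON SERVER
ADD EVENT sqlserver.blocked_process_report,
ADD EVENT sqlserver.xml_deadlock_report
ADD TARGET package0.ring_buffer;
GO
ALTER EVENT SESSION Blocking ON SERVER STATE = START;
```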

And this can then be queried by code such as this
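For example, a query along these lines (assuming a session named Blocking with a ring buffer target) will pull out each report and the duration of the blockage:

```sql
-- Shred the Blocking session's ring buffer: one row per blockage,
-- with its duration and the full blocked-process report as XML
DECLARE @Target_Data XML =
  (SELECT CAST(xst.target_data AS XML)
     FROM sys.dm_xe_sessions AS xs
       INNER JOIN sys.dm_xe_session_targets AS xst
         ON xst.event_session_address = xs.address
     WHERE xs.name = 'Blocking'
       AND xst.target_name = 'ring_buffer');

SELECT the.event_data.value('(@timestamp)[1]', 'datetime2') AS event_time,
       the.event_data.value('(data[@name="duration"]/value)[1]', 'bigint')
         / 1000000 AS duration_s,  -- duration is in microseconds
       the.event_data.query('(data[@name="blocked_process"]/value/blocked-process-report)[1]')
         AS blocked_process_report
  FROM @Target_Data.nodes('//RingBufferTarget/event') AS the(event_data);
```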

The blocked process report, when clicked on in the results pane, gives you all the information you need about the block.

Conclusion

Extended Events can unravel and analyse complex database development problems. The problem with many of the examples available is that they are designed for people making the transition from SQL Trace, or for people who are very familiar with the SQL Server dialect of XML. This shouldn’t put you off using XEvents. I’ve tried to keep the event sessions relatively simple and focused on a particular purpose, and to reuse as much as possible from successful reports. A collection of useful XEvent session scripts should be part of the toolkit of every developer.