For many SQL Server DBA’s, mastering Extended Events (XE) involves a learning curve. First they must learn the syntax for creating XE sessions to gather system data. Then they will need to learn something about XML and XPATH to query the data gathered by the session if they do not already have those skills.
However it is not necessary to absolutely master XE in order to use it for monitoring the health of your SQL Server. By the time you finish this short article you will have the knowledge and the code to retrieve health check data like deadlocks, errors and other serious things that can compromise your server’s health and performance. Best of all, you can gather all this information without putting any increased load on your server as would happen if you tried to do this using more traditional methods like traces or trace flags.
The system_health Session
You don’t have to know how to create an XE session in order to use XE to monitor SQL Server. All versions later than SQL 2008 contain a pre-defined XE session named system_health that starts each time SQL Server is started. It contains a lot of useful data like:
- The sql_text and session_id for any sessions that encounter an error that has a severity >=20.
- The sql_text and session_id for any sessions that encounter a memory-related error. The errors include a record of any non-yielding scheduler problems. (These appear in the SQL Server error log as error 17883.)
- Any deadlocks that occur
- The callstack, sql_text, and session_id for any sessions that have waited on locks for > 30 seconds.
- The callstack, sql_text, and session_id for any sessions that have waited on latches for > 15 seconds.
- The callstack, sql_text, and session_id for any sessions that have waited for a long time for preemptive waits. The duration varies by wait type. (A preemptive wait is where SQL Server is waiting for external API calls).
- The callstack and session_id for CLR allocation and virtual allocation failures.
- Events for the memory broker, scheduler monitor, out of memory errors, security, and connectivity.
- System component results from sp_server_diagnostics.
- Instance health.
- CLR Allocation failures.
- Connectivity errors
- Security errors
System_health data is stored in a 4 megabyte ring buffer. The older data drops off as new data is acquired, so you will always have access to 4 mb of the most recent system health data.
Querying System Health Data.
As I mentioned, querying an XE session can be complicated. There are two ways to approach it:
- Study long and hard and become an expert at querying the XML output using XPATH queries.
- Borrow the code from dedicated experts who stay up all night learning this stuff and then plaster it all over the internet for free.
I prefer the latter.
This is not as lazy as it sounds. It is one way to achieve practical results quickly. Imitation leads to further knowledge as you tweak and extend the example code to do more useful things.
In the following paragraphs we will look at four example scripts that retrieve different data from the system_health XE session. You can run them on any version of SQL Server from SQL 2008 forward. You can copy and paste the code from this article if you like or you can Email Me.
The Full Monte
This script pulls out all the data collected by system_health session into one big XML document. It is also the basis for the more specific queries that follow and the ones you will create for yourself later.
- SELECT CAST(xet.target_data as xml) FROM sys.dm_xe_session_targets xet
- JOIN sys.dm_xe_sessions xe
- ON (xe.address = xet.event_session_address)
- WHERE xe.name = ‘system_health’
In itself it is useful for those who have some familiarity with xml and those who just want to get familiar with everything that’s available in the session. For those who have little experience with xml, I recommend downloading XMLNotePad free from Microsoft.
Paste the xml output from this query into XMLNotepad, select Expand All from the View menu and scroll normally through the text.
Four megabytes of data is quite a bit to digest, so users will usually prefer to look at specific items within the entire result set. The following queries return subsets of the data in the familiar row and column format.
Summarize the Results
This bit of code will show us the count of specific issues captured by the session.
- SET NOCOUNT ON
- IF (SUBSTRING(CAST(SERVERPROPERTY (‘ProductVersion’) AS varchar(50)),1,CHARINDEX(‘.’,CAST(SERVERPROPERTY (‘ProductVersion’) AS varchar(50)))-1) = 10)
- BEGIN
- DECLARE @UTDDateDiff int
- SET @UTDDateDiff = DATEDIFF(mi,GETUTCDATE(),GETDATE())
- SELECT CAST(xet.target_data as xml) as XMLDATA
- INTO #SystemHealthSessionData
- FROM sys.dm_xe_session_targets xet
- JOIN sys.dm_xe_sessions xe
- ON (xe.address = xet.event_session_address)
- WHERE xe.name = ‘system_health’
- ;WITH CTE_HealthSession AS
- (
- SELECT C.query(‘.’).value(‘(/event/@name)[1]’, ‘varchar(255)’) as EventName,
- C.query(‘.’).value(‘(/event/@timestamp)[1]’, ‘datetime’) as EventTime
- FROM #SystemHealthSessionData a
- CROSS APPLY a.XMLDATA.nodes(‘/RingBufferTarget/event’) as T(C))
- SELECT EventName,
- COUNT(*) as Occurrences,
- DATEADD(mi,@UTDDateDiff,MAX(EventTime)) as LastReportedEventTime,
- DATEADD(mi,@UTDDateDiff,MIN(EventTime)) as OldestRecordedEventTime
- FROM CTE_HealthSession
- GROUP BY EventName
- ORDER BY 2 DESC
- DROP TABLE #SystemHealthSessionData
- END
Deadlock Graph
This query returns the details of all deadlocks that have been captured by the session. Each row of this data is an XML fragment describing the properties of each deadlock captured by the session.
- select XEventData.XEvent.value(‘(data/value)[1]’, ‘varchar(max)’) as DeadlockGraph
- FROM
- (select CAST(target_data as xml) as TargetData
- from sys.dm_xe_session_targets st
- join sys.dm_xe_sessions s on s.address = st.event_session_address
- where name = ‘system_health’) AS Data
- CROSS APPLY TargetData.nodes (‘//RingBufferTarget/event’) AS XEventData (XEvent)
- where XEventData.XEvent.value(‘@name’, ‘varchar(4000)’) = ‘xml_deadlock_report’
Error Summary
This query will show us all of the errors of severity 20 or above that have been captured. Each row contains the error type and the time the error occurred.
- SET NOCOUNT ON
- IF (SUBSTRING(CAST(SERVERPROPERTY(‘ProductVersion’) AS varchar(50)), 1, CHARINDEX(‘.’, CAST(SERVERPROPERTY(‘ProductVersion’) AS varchar(50))) – 1) >= 10)
- BEGIN
- SELECT CAST(xet.target_data AS XML) AS XMLDATA
- INTO #SystemHealthSessionData
- FROM sys.dm_xe_session_targets xet JOIN
- sys.dm_xe_sessions xe ON (xe.address = xet.event_session_address)
- WHERE xe.name = ‘system_health’
- ;WITH CTE_HealthSession(EventXML) AS
- (SELECT C.query(‘.’) EventXML
- FROM #SystemHealthSessionData a
- CROSS APPLY a.XMLDATA.nodes(‘/RingBufferTarget/event’) AS T (C))
- SELECT
- EventXML.value(‘(/event/@timestamp)[1]’, ‘datetime’) AS EventTime,
- EventXML.value(‘(/event/data/value)[1]’, ‘int’) AS ErrorNum,
- EventXML.value(‘(/event/data/value)[2]’, ‘int’) AS ErrSeverity,
- EventXML.value(‘(/event/data/value)[3]’, ‘int’) AS ErrState,
- EventXML.value(‘(/event/data/value)[8]’, ‘varchar(max)’) AS ErrText,
- EventXML.value(‘(/event/action/value)[2]’, ‘varchar(10)’) AS Session_ID,
- EventXML.value(‘(/event/data/text)[1]’, ‘varchar(10)’) AS Error_Category
- FROM CTE_HealthSession
- WHERE EventXML.value(‘(/event/@name)[1]’, ‘varchar(255)’) = ‘error_reported’
- DROP TABLE #SystemHealthSessionData
- END
- SET NOCOUNT OFF
Conclusion
Hopefully these examples these examples will ease you into a familiarity with Extended Events. Your next step in learning might be to extend the functionality of these scripts to retrieve additional information.
By the way, these examples come from Microsoft and were written by Amit Banerjee.
A Correction
I would like to correct what might be an error in last month’s newsletter where I said that SQL Profiler will be removed in the next new version of SQL Server. I have had a couple of people challenge that statement.
I have to admit that I did not double check my source before writing that. However I could not confirm that data when I tried later. It seems that the people who know exactly when it will be removed are not talking and no one can say anything for sure except that SQL Profiler will be removed in some future edition.
Still, it is sure that Extended Events are the future for monitoring SQL Server. There is a lot of incentive to start using them now. Compared to profiler traces, there are many more events available and there is a great deal more flexibility in the ways we can handle those events. Extended Events also imposes very little impact on server resources.