{"id":105047,"date":"2025-01-08T06:56:48","date_gmt":"2025-01-08T06:56:48","guid":{"rendered":"https:\/\/www.red-gate.com\/simple-talk\/?p=105047"},"modified":"2025-01-08T06:56:49","modified_gmt":"2025-01-08T06:56:49","slug":"a-use-case-for-memory-optimized-tempdb-metadata","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/blogs\/a-use-case-for-memory-optimized-tempdb-metadata\/","title":{"rendered":"A Use Case for Memory-Optimized Tempdb Metadata"},"content":{"rendered":"<p>I recently had an interesting production SQL Server issue that turned out to be very easy to fix. The fix doesn\u2019t fit every workload or server, but for the limited use cases described below, it\u2019s a simple configuration change. The general best practice for server level configurations is to leave things at default, unless there is a specific reason for changing the settings. This will outline a use case for using memory-optimized tempdb metadata.<\/p>\n<p>This covers a very specific scenario and workload. You won\u2019t use this on a typical server, which is why it isn\u2019t a default setting. You will see this for very specific server workloads, those with many transactions and high temp table usage. Most systems can better use the memory for the regular workload instead of optimizing tempdb metadata, so don\u2019t use this as a default setting for your servers.<\/p>\n<h2>Symptoms<\/h2>\n<p>Symptoms for the issue started with reports of a very poor performing system. Digging into the details showed 100% CPU utilization on the SQL Server and queries performing much slower than normal. After eliminating deadlocks and excessive blocking, current queries were examined and their wait types. Wait types are an important health indicator and will be one of the first things you want to check when troubleshooting generalized issues.<\/p>\n<p>Changing server configuration options presupposes that a lot of additional database and server tuning has already happened. You want to be sure the general database health is good, indexes are getting rebuilt on a regular basis, statistics are updated, recommended \/ necessary indexes are in place, and queries are optimized. You also need to check for deadlocks and blocking.<\/p>\n<p>If you\u2019ve done your standard troubleshooting and notice a high number of page latch waits, your server might be a candidate for memory-optimized tempdb metadata. Latches are light weight locks used internally by the system. You don\u2019t need to know all of the details to diagnose this issue, but they are described in the first link in the references. The confirmation steps to diagnose this issue will follow the general pattern:<\/p>\n<ol>\n<li>Performance is impacted and standard troubleshooting is performed<\/li>\n<li>Large number of page latch waits are found<\/li>\n<li>The page latch waits are in tempdb<\/li>\n<li>Other page latch wait types are ruled out \u2013 leaving only metadata latches\n<ol>\n<li>GAM<\/li>\n<li>SGAM<\/li>\n<li>PFS<\/li>\n<\/ol>\n<\/li>\n<li>Enable server setting &#8216;tempdb metadata memory-optimized&#8217; if appropriate and confirm improvement\n<ol>\n<li>If you know this is a high transaction system that extensively uses temp tables, it\u2019s an even stronger indication to make the change<\/li>\n<\/ol>\n<\/li>\n<\/ol>\n<h3>Performance<\/h3>\n<p>You might be lucky enough to catch this before you start having performance issues, but you will likely notice this first due to poor performance. It may be high CPU usage, or just very poor SQL Server performance. Conduct your standard troubleshooting here, including server diagnostics. Once you have confirmed the server is performing poorly and you don\u2019t see an obvious cause, examine current queries and wait types on the server.<\/p>\n<h3>Wait types for current queries<\/h3>\n<p>Part of performance checks include looking at current queries and their wait types. The following shows current processes and their associated wait types. If there are a significant number of page latch waits, it\u2019s an indication to continue investigating possible latch contention. Other wait types don\u2019t immediately rule out latch contention, but it\u2019s a strong sign that it\u2019s a different issue. Keep an eye on wait types as you continue troubleshooting and monitoring, but it may not be the cause of your performance problems if you don\u2019t see it here.<\/p>\n<pre class=\"lang:none theme:none\">SELECT\n\tSP.spid\n\t,SP.lastwaittype\n\t,SP.status\n\t,SP.cpu\n\t,SP.physical_io\n\t,SP.memusage\nFROM sys.sysprocesses SP\nWHERE SP.spid &gt; 50\n\tAND SP.status &lt;&gt; 'background'\n\tAND SP.open_tran &gt; 0<\/pre>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"378\" height=\"196\" class=\"wp-image-105048\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2025\/01\/word-image-105047-1.png\" \/><\/p>\n<p>If you see page latch wait types when looking at current processes, you need to get additional details. You can look at waiting tasks and the current sessions to get more information. The resource description in the query shows the details for the resource, including the database ID. The number in the resource_description column before the colon indicates the database ID. Tempdb is always 2, as shown in the example below. If you see latch waits concentrated in tempdb, continue investigating. If the page latches are not in tempdb, you can use the Microsoft troubleshooting guide listed in the references.<\/p>\n<pre class=\"lang:none theme:none\">SELECT\n\twt.session_id\n\t,wt.wait_type\n\t,er.last_wait_type AS last_wait_type\n\t,wt.wait_duration_ms\n\t,wt.blocking_session_id\n\t,wt.blocking_exec_context_id\n\t,resource_description\nFROM sys.dm_os_waiting_tasks wt\n\tINNER JOIN sys.dm_exec_sessions es \n\t\tON wt.session_id = es.session_id\n\tINNER JOIN sys.dm_exec_requests er \n\t\tON wt.session_id = er.session_id\nWHERE es.is_user_process = 1\n\tAND wt.wait_type &lt;&gt; 'SLEEP_TASK'\nORDER BY wt.wait_duration_ms desc<\/pre>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"894\" height=\"101\" class=\"wp-image-105049\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2025\/01\/word-image-105047-2.png\" \/><\/p>\n<h3>Page latch waits in tempdb<\/h3>\n<p>After you have confirmed you are experiencing latch contention in tempdb, you need to confirm that the latches are not GAM, SGAM or PFS latches. Metadata latches don\u2019t show up directly, instead the other latch types are eliminated.<\/p>\n<pre class=\"lang:none theme:none\">SELECT \n\tsession_id\n\t,wait_type\n\t,wait_duration_ms\n\t,blocking_session_id\n\t,resource_description\n\t,ResourceType = Case\n\t\t\tWhen TRY_CONVERT(int,Right(resource_description, Len(resource_description) - Charindex(':', resource_description, 3))) - 1 % 8088 = 0 Then 'Is PFS Page'\n\t\t\tWhen TRY_CONVERT(int,(Right(resource_description, Len(resource_description) - Charindex(':', resource_description, 3)))) - 2 % 511232 = 0 Then 'Is GAM Page'\n\t\t\tWhen TRY_CONVERT(int,(Right(resource_description, Len(resource_description) - Charindex(':', resource_description, 3)))) - 3 % 511232 = 0 Then 'Is SGAM Page'\n\t\t\tElse 'Is Not PFS, GAM, or SGAM page'\n\t\tEnd\nFROM sys.dm_os_waiting_tasks WT\nWHERE wait_type Like 'PAGE%LATCH_%'\n\tAnd resource_description Like '2:%'\nORDER BY WT.session_id<\/pre>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"841\" height=\"159\" class=\"wp-image-105050\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2025\/01\/word-image-105047-3.png\" \/><\/p>\n<p>After you have confirmed you have tempdb metadata latch contention, and the issue persists, consider enabling the memory optimized configuration for tempdb. If you want additional confirmation, support can be engaged at this time. Be sure to include your troubleshooting steps to expedite your ticket.<\/p>\n<h2>Implementation<\/h2>\n<p>Since this impacts tempdb, the change is implemented at the server level using sp_configure. The setting is \u2018tempdb metadata memory-optimized\u2019. If you haven\u2019t already enabled viewing advanced options, you\u2019ll need to do it for this. If it returns a config value of 0, it needs to be changed to 1.<\/p>\n<pre class=\"lang:none theme:none\">EXEC sp_configure 'show advanced options'<\/pre>\n<p>If you see the following, change the value to 1. It doesn\u2019t require a service restart and is safe to change even under load.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"371\" height=\"37\" class=\"wp-image-105051\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2025\/01\/word-image-105047-4.png\" \/><\/p>\n<pre class=\"lang:none theme:none\">EXEC sp_configure 'show advanced options',1\nGO\nRECONFIGURE\nGO<\/pre>\n<p>Advanced options are now available, but this is just the preamble to making the actual change. First, confirm the current setting.<\/p>\n<pre class=\"lang:none theme:none\">EXEC sp_configure 'tempdb metadata memory-optimized'<\/pre>\n<p>If your server configuration is using the default, tempdb is not memory optimized, as shown here.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"437\" height=\"44\" class=\"wp-image-105052\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2025\/01\/word-image-105047-5.png\" \/><\/p>\n<p>To enable tempdb to be metadata memory-optimized, change the value to 1, run the reconfigure command, and restart the service.<\/p>\n<pre class=\"lang:none theme:none\">EXEC sp_configure 'tempdb metadata memory-optimized',1\nGO\nRECONFIGURE\nGO<\/pre>\n<p>The SQL Server service now needs to be restarted. Ideally, changes like this happen during your regular maintenance window, but you may not realize you need this change until it is a production issue and the server is already struggling. You don\u2019t have to restart the server, but the SQL Server service does need to restart using your usual method. The restart will result in a disruption of service. The disruption will be limited if you have a high-availability setup, but it will impact current connections.<\/p>\n<p><strong>Reminder &#8211; This will disrupt your service and drop existing connections. If possible, do this during a maintenance window.<\/strong><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"205\" height=\"78\" class=\"wp-image-105053\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2025\/01\/word-image-105047-6.png\" \/><\/p>\n<p>Once the service is restarted, confirm the setting is changed.<\/p>\n<pre class=\"lang:none theme:none\">--Restart the SQL Service\n--Restarting the physical server isn't necessary - just the service\nEXEC sp_configure 'tempdb metadata memory-optimized'<\/pre>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"434\" height=\"42\" class=\"wp-image-105054\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2025\/01\/word-image-105047-7.png\" \/><\/p>\n<p>You now need to check your server health. Look at your wait types to confirm that you no longer have latch contention in tempdb metadata using the same methods described above. You also need to confirm that query and server performance have improved. It\u2019s a simple change that can have a big impact on systems experiencing this particular issue.<\/p>\n<h2>Reverting the Configuration<\/h2>\n<p>If server performance doesn\u2019t improve or you have additional issues with this setting change, the fix is to revert the setting back to 0 and restart the service. If you confirmed that the server is having metadata latch contention in tempdb (and you did the other troubleshooting steps), I wouldn\u2019t expect that you would need to revert this change, but it\u2019s always good to have a contingency plan.<\/p>\n<pre class=\"lang:none theme:none\">--REVERT SETTING\nEXEC sp_configure 'tempdb metadata memory-optimized',0\nGO\nRECONFIGURE\nGO\n--Remember to restart the SQL service for this to take effect<\/pre>\n<h2>Reproducing the use case<\/h2>\n<p>This use case can be demonstrated by creating as many temp tables as possible, as quickly as possible. If you run a script like the following, in multiple connections in SSMS simultaneously, you will be able to see the problem. I was able to produce the issue locally with 5 scripts running. As noted in the script, don\u2019t run this in a production environment. I wouldn\u2019t even run it in a shared environment \u2013 it will cause performance problems, including high CPU usage. The good news is, there is no cleanup needed when you are done testing. Performance will return to normal once the scripts are stopped or once they finish creating temp tables.<\/p>\n<pre class=\"lang:none theme:none\">--This would normally be a bad example - don't do this!\n--Used to simulate a high-volume \/ high concurrency\n--systems using many temp tables\n--Note that the tables are never used - this is a demonstration of metadata contention\nDECLARE @Counter\tint = 0\n\t,@SQL\t\t\tvarchar(max) = ''\nWHILE @Counter &lt; 100000\nBEGIN\n\tSELECT @SQL = 'CREATE TABLE #' + CONVERT(varchar(10),@Counter) + '(\n\tTestTableId\t\tint\t\tNOT NULL PRIMARY KEY CLUSTERED identity\n\t,Column1\t\tvarchar(100)\tDEFAULT (' + '''' + 'Test table column default' + '''' + ')\n\t,Column2\t\tint\t\t\t\tDEFAULT(' + CoNVERT(varchar(10),@Counter) + ')\n\t)\n\t\t\n\tDROP TABLE #' + CONVERT(varchar(10),@Counter) + '\n\t'\n\tEXEC (@SQL)\n\tSELECT @Counter += 1\nEND<\/pre>\n<p>CPU spikes when the script is run in multiple connections. If you run the troubleshooting queries from the previous sections, you will see page latch contention and you will also see that the contention is in tempdb. This query is how I reproduced the issue for my examples.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"582\" height=\"142\" class=\"wp-image-105055\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2025\/01\/word-image-105047-8.png\" \/><\/p>\n<p>If you enable the setting \u2018tempdb metadata memory-optimized\u2019 on your test server and run the test scripts again, you will see improved performance and that the page latch contention is gone.<\/p>\n<h2>Summary<\/h2>\n<p>This is one use case for enabling the memory optimized tempdb metadata server setting. If your system has a high number of concurrent transactions and uses temp tables extensively, it may be a good fit. It worked well in the scenario I have described above, but remember to always test changes to the system settings in SQL Server and confirm improvement after making changes. The general best practice is to leave things at the installation default unless you have a specific reason to make the change. Check your usage patterns, look at your wait types, and be ready to revert this setting if it degrades performance. Opening a support ticket is always an option if you need additional confirmation before making a server configuration change in a high value production environment.<\/p>\n<h2>References<\/h2>\n<ul>\n<li><a href=\"https:\/\/learn.microsoft.com\/en-us\/sql\/relational-databases\/diagnose-resolve-latch-contention?view=sql-server-ver16\">https:\/\/learn.microsoft.com\/en-us\/sql\/relational-databases\/diagnose-resolve-latch-contention?view=sql-server-ver16<\/a><\/li>\n<li><a href=\"https:\/\/learn.microsoft.com\/en-us\/sql\/relational-databases\/system-dynamic-management-views\/sys-dm-os-wait-stats-transact-sql\">https:\/\/learn.microsoft.com\/en-us\/sql\/relational-databases\/system-dynamic-management-views\/sys-dm-os-wait-stats-transact-sql<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>I recently had an interesting production SQL Server issue that turned out to be very easy to fix. The fix doesn\u2019t fit every workload or server, but for the limited use cases described below, it\u2019s a simple configuration change. The general best practice for server level configurations is to leave things at default, unless there&#8230;&hellip;<\/p>\n","protected":false},"author":19670,"featured_media":103086,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[2],"tags":[4693,4151,4255],"coauthors":[98702],"class_list":["post-105047","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-blogs","tag-configuration","tag-sql-server","tag-tempdb"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/105047","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/19670"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=105047"}],"version-history":[{"count":1,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/105047\/revisions"}],"predecessor-version":[{"id":105056,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/105047\/revisions\/105056"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media\/103086"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=105047"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=105047"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=105047"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=105047"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}