{"id":1765,"date":"2014-02-20T00:00:00","date_gmt":"2014-02-20T00:00:00","guid":{"rendered":"https:\/\/test.simple-talk.com\/uncategorized\/fixing-cache-bloat-problems-with-guide-plans-and-forced-parameterization\/"},"modified":"2021-06-03T16:44:08","modified_gmt":"2021-06-03T16:44:08","slug":"fixing-cache-bloat-problems-with-guide-plans-and-forced-parameterization","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/databases\/sql-server\/performance-sql-server\/fixing-cache-bloat-problems-with-guide-plans-and-forced-parameterization\/","title":{"rendered":"Fixing Cache Bloat Problems With Guide Plans and Forced Parameterization"},"content":{"rendered":"<div id=\"pretty\">\n<p>&#8216;Cache bloat&#8217; is a problem \u00a0that I&#8217;ve found \u00a0in the vast majority of those SQL Server instances that I have been asked to work on in order to tune their performance. Despite all the recommendations published on the web that warn people to be careful about this, it is very common to find that developers still run ad-hoc queries in such a way that they bloat the cache with lots of query plans. \u00a0It is frustrating to see because it is so easily avoided. Sometimes, however, you&#8217;re not in a position where you can change the code.<\/p>\n<p>Why is it important to avoid Cache Bloat? How can the DBA avoid the consequences of this problem without requiring the developer to change the application? \u00a0We&#8217;ll answer these questions in this article.<\/p>\n<h2>So, what is cache bloat<\/h2>\n<p>&#8216;Cache bloat&#8217; is a term used to describe a growth in the size of \u00a0SQL Server&#8217;s procedure cache, using memory &#8216;stolen&#8217; from the buffer cache. \u00a0There is a memory area that is used to store execution plans that are created by SQL Server. It is also used for a variety of other volatile storage such as locks, connections and cached data. SQL Server will take space from this shared buffer pool for the extra plans that need to be stored, and at a certain point this will start to restrict the memory available for other processes. This could cause a very high CPU usage without any blocking, slow performance, and finally errors in executing queries. \u00a0<\/p>\n<p>The procedure cache is used by the relational engine to avoid having to compile an execution plan every time a particular query is executed. Before a plan is created for a query, the cache is searched to see if there is already a plan for the same query. If so, that plan is reused.<\/p>\n<p>Sometimes this plan cache area of memory becomes filled with queries that the query optimizer doesn&#8217;t reckon to be the same, but which seem obviously the same to the programmer. \u00a0To reuse a plan, a query need to be equal to what&#8217;s in the cache: for instance, \u00a0if a whitespace or an upper\/lower letter is different it will not reuse the plan, and it will trigger a plan compilation. Although the algorithm for determining whether the queries are the same is very efficient, certain things, such as unqualified object references, will lead it to consider them different.<\/p>\n<p>The most obvious &#8216;surface&#8217; difference between two queries that actually disguises the fact that they are the same is the &#8216;literal&#8217; or value in a SQL query. \u00a0So the query..<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">SELECT * FROM Sales.SalesOrderHeaderWHERE OrderDate ='12 Mar 2004'<\/pre>\n<p>&#8230; should really be considered the same as..<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">SELECT * FROM Sales.SalesOrderHeaderWHERE OrderDate ='11 Apr 2004'<\/pre>\n<p>The literal date &#8217;12 Mar 2004&#8242; is different but the query plan should be the same. The Relational engine uses a technique called &#8216;auto-parameterization&#8217; to turn the literal value into a parameter \u00a0so that subsequent calls of the same SQL with a different date literal will be matched, but it only does so if it judges that they should be handled by the same query plan.<\/p>\n<p>Normally, if you are using procedures, functions or parameterized queries\/batches, you&#8217;re providing a strong hint to the query engine about where the parameters are and whether it is safe to reuse a plan. Ad-hoc queries don&#8217;t provide many clues, and SQL Server likes to play safe and assume that each one requires a new plan. \u00a0Unless SQL Server determines that it can automatically parameterize a query, or it determines that it is the same query, it is forced to generate a new execution plan. Apart from wasting resources to compile the query (which consumes a lot of CPU), it will store each new plan in cache. If a server receives a large number of ad-hoc queries, it will take pages of memory from the buffer cache pool \u00a0to store them and this can lead to a more general memory stress.<\/p>\n<h2>What&#8217;s the problem with ad-hoc queries?<\/h2>\n<p>Here is a sample of an ad-hoc query:<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">\tDECLARE @Var VarChar(10), @SQL VarChar(200)\r\n\tSET @Var = CONVERT(VarChar(10), GETDATE() - (CheckSUM(NEWID()) \/ 1000000), 112)\r\n\t\r\n\tSET @SQL = 'SELECT * FROM OrdersBig WHERE OrderDate = ' + '''' + @Var + ''''\r\n\tEXEC (@SQL)\r\n\tGO\r\n<\/pre>\n<p>You&#8217;ll see that the user is mistakenly concatenating the user input to the SQL code &#8216;on the fly&#8217; to create the query. If you are forced to use dynamic sql for some reason, sp_ex<a href=\"https:\/\/www.simple-talk.com\/blogs\/2009\/08\/03\/stolen-pages-ad-hoc-queries-and-the-sins-of-dynamic-sql-in-the-application\/\">ecuteSQL can be used, with parameters , to allow plan reuse<\/a> and minimise \u00a0<a href=\"http:\/\/en.wikipedia.org\/wiki\/SQL_injection\">SQL Injection<\/a> risks. \u00a0Even better would be a stored procedure, whose execution plan only needs to be created once, if a plan for it is not found in the procedure cache. \u00a0Every time it is called, the relational engine reuses the same query plan.<\/p>\n<p>In order to show how Ad-hoc queries \u00a0can be a problem, I&#8217;ll use <a href=\"http:\/\/dataeducation.com\/sqlquerystress-the-source-code\/\">SQL Query Stress<\/a> to simulate many users running a query at the same time and let&#8217;s see what happens with CPU and memory. The query uses the SQL Server demo database <a href=\"http:\/\/msftdbprodsamples.codeplex.com\/releases\/view\/55330\">AdventureWorks2012<\/a>, and is as follows:<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">\tSELECT SalesOrderHeader.SalesPersonID,\r\n\t\u00a0\u00a0\u00a0\u00a0\u00a0 COUNT(DISTINCT SalesOrderHeader.CustomerID), \r\n\t\u00a0\u00a0\u00a0\u00a0\u00a0 SUM(SalesOrderDetail.OrderQty)\r\n\t FROM Sales.SalesOrderHeader\r\n\t\u00a0INNER JOIN Sales.SalesOrderDetail\r\n\t\u00a0\u00a0 ON SalesOrderDetail.SalesOrderID = SalesOrderHeader.SalesOrderID\r\n\t\u00a0INNER JOIN Production.Product\r\n\t\u00a0\u00a0 ON Product.ProductID = SalesOrderDetail.ProductID\r\n\t\u00a0WHERE Product.Name = 'C1DCB640-A394-4C33-95B0-D8EFF1DE6895'\r\n\t\u00a0GROUP BY SalesOrderHeader.SalesPersonID\r\n<\/pre>\n<p>The result and logic of the query don&#8217;t matter here, I only want to show a query that receives a parameter filter dynamically, to do this lets run it using EXEC and concatenating the filter on Product.Name. \u00a0Here is the dynamic query:<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">\tDECLARE @Var VarChar(250), @SQL VarChar(MAX)\r\n\tSET @Var = NEWID()\r\n\t\r\n\tSET @SQL = \r\n\t'SELECT SalesOrderHeader.SalesPersonID,\r\n\t\u00a0\u00a0\u00a0\u00a0\u00a0 COUNT(DISTINCT SalesOrderHeader.CustomerID), \r\n\t\u00a0\u00a0\u00a0\u00a0\u00a0 SUM(SalesOrderDetail.OrderQty)\r\n\t FROM Sales.SalesOrderHeader\r\n\t\u00a0INNER JOIN Sales.SalesOrderDetail\r\n\t\u00a0\u00a0 ON SalesOrderDetail.SalesOrderID = SalesOrderHeader.SalesOrderID\r\n\t\u00a0INNER JOIN Production.Product\r\n\t\u00a0\u00a0 ON Product.ProductID = SalesOrderDetail.ProductID\r\n\t\u00a0WHERE Product.Name = ' + '''' + @Var + '''\r\n\t\u00a0GROUP BY SalesOrderHeader.SalesPersonID'\r\n\t\r\n\tEXEC (@SQL)\r\n<\/pre>\n<p>Every time the query above is executed, SQL Server will run a different query (because of the NEWID that creates a GUID), and consequently will compile a new plan for the query.<\/p>\n<p>If I run the code on <b>SQLQueryStress<\/b>using 5 iterations on 200 threads looks what happens:<\/p>\n<p class=\"illustration\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/1943-clip_image001-620x370.jpg\" alt=\"1943-clip_image001-620x370.jpg\" width=\"620\" height=\"370\" \/><\/p>\n<p>Since SQL Server has to create an execution plan for each query, it is taking lot of time (more than 23 seconds) and resource (note CPU on 100%) to run the queries. In addition, if you look at the waits for the queries we can see that it is waiting &#8220;for the query to compile&#8221; (wait <code> resource_semaphore_query_compile<\/code>), following is the result of <code> sp_whoisactive<\/code> when the queries were running.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/1943-clip_image002-620x393.jpg\" alt=\"1943-clip_image002-620x393.jpg\" width=\"620\" height=\"393\" \/><\/p>\n<p>The main points here are that you&#8217;re going to have to avoid<\/p>\n<ul>\n<li>compiling the query every time you execute it since they should be considered to be the same. Because you are only changing \u00a0the filter predicate, the existing plan is still usable<\/li>\n<li>\u00a0wasting CPU on compilation<\/li>\n<li>\u00a0the wait for the queries to compile<\/li>\n<li>wasting memory used to store 1k &#8220;different&#8221; plans on plan cache<\/li>\n<\/ul>\n<h2>So how do you solve the problem without changing the code?<\/h2>\n<p>There are some solutions for this problem, <a href=\"http:\/\/www.sqlskills.com\/blogs\/kimberly\/plan-cache-and-optimizing-for-adhoc-workloads\/\"> &#8220;optimizing for adhoc workloads&#8221;<\/a>, changing the database parameterization to &#8216;forced&#8217;, or just putting the query in either a stored procedure or parameterized batch, and receiving the predicate filter in an input parameter.<\/p>\n<p>The solution I would like to present you is to use guide plans to force the parameterization only for specific queries. The advantage of this technique is that it can be used in cases where you don&#8217;t have the option of fixing the code, as with third-party software. You can, of course, <a href=\"http:\/\/sqlmag.com\/blog\/forced-parameterization-when-should-i-use-it\">set parameterization to &#8216;forced&#8217;<\/a> at database level to solve this problem, but it is a blunt instrument to fix a problem that may be due to just a handful of queries. It will force the parameterization for all queries running on the database, whether they are suitable or not, and this may lead in turn to other problems; see a sample <a href=\"http:\/\/blogs.msdn.com\/b\/sql_pfe_blog\/archive\/2013\/09\/03\/forced-parameterization-can-lead-to-poor-performance.aspx\"> here<\/a> from PFE Thomas Stringer. If you specify forced parameterization you may force plan reuse even where it would result in a poorly-performing plan being selected. You&#8217;d be fixing one problem but making others (with other queries).That is why I like to fix the problem using guide plans. With guide plans you can specify the forced parameterization within the scope of individual query<\/p>\n<p>The creation of \u00a0a template guide plan is straightforward, \u00a0and to illustrate this, here is a script to create a guide plans to the query used on our tests:<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">\tDECLARE @stmt nvarchar(max);\r\n\tDECLARE @params nvarchar(max);\r\n\tEXEC sp_get_query_template \r\n\tN'SELECT SalesOrderHeader.SalesPersonID,\r\n\t\u00a0\u00a0\u00a0\u00a0\u00a0 COUNT(DISTINCT SalesOrderHeader.CustomerID), \r\n\t\u00a0\u00a0\u00a0\u00a0\u00a0 SUM(SalesOrderDetail.OrderQty)\r\n\t FROM Sales.SalesOrderHeader\r\n\t\u00a0INNER JOIN Sales.SalesOrderDetail\r\n\t\u00a0\u00a0 ON SalesOrderDetail.SalesOrderID = SalesOrderHeader.SalesOrderID\r\n\t\u00a0INNER JOIN Production.Product\r\n\t\u00a0\u00a0 ON Product.ProductID = SalesOrderDetail.ProductID\r\n\t\u00a0WHERE Product.Name = ''3514C79D-12C2-4673-A5E3-8961F392B396''\r\n\t\u00a0GROUP BY SalesOrderHeader.SalesPersonID',\r\n\t\u00a0@stmt OUTPUT, @params OUTPUT;\r\n\t\r\n\tEXEC sp_create_plan_guide \r\n\t\u00a0\u00a0 N'TemplateGuide1', \r\n\t\u00a0\u00a0 @stmt, \r\n\t\u00a0\u00a0 N'TEMPLATE', \r\n\t\u00a0\u00a0 NULL, \r\n\t\u00a0\u00a0 @params, \r\n\t\u00a0\u00a0 N'OPTION(PARAMETERIZATION FORCED)';\r\n\tGO\r\n<\/pre>\n<p>First I&#8217;m using the procedure <code>sp_get_query_template<\/code> to get a parameterized version of the query and then I&#8217;m using the query (variable <code>@stmt<\/code>) and parameters (variable <code>@params<\/code>) to create the guide plan using the procedure <code>sp_create_plan_guide<\/code>.<\/p>\n<p>The result is that, next time you run the query independent of the predicate value specified SQL Server will identify the guide plan and use the hint &#8220;parameterization forced&#8221; to parameterize the query. You can see that it is using the guide plan and the parameters on the estimated graph execution plan, for instance:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/1943-clip_image003-620x248.jpg\" alt=\"1943-clip_image003-620x248.jpg\" width=\"620\" height=\"248\" \/><\/p>\n<p>Now, if I run the same test I did on <b>SQLQueryStress<\/b>, look at the result:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/1943-clip_image004-620x428.jpg\" alt=\"1943-clip_image004-620x428.jpg\" width=\"620\" height=\"428\" \/><\/p>\n<p>All 1000 executions finish on less than a second.<\/p>\n<p>In case you are wondering about the stored procedures and how it would perform on this query, here is the test. First let&#8217;s create the procedure:<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">IF OBJECT_ID('st_CacheBloat_1') IS NOT NULL\r\n DROP PROCEDURE st_CacheBloat_1\r\nGO\r\nCREATE PROCEDURE st_CacheBloat_1 @Var VarChar(250)\r\nAS\r\nBEGIN\r\n SELECT SalesOrderHeader.SalesPersonID,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 COUNT(DISTINCT SalesOrderHeader.CustomerID), \r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 SUM(SalesOrderDetail.OrderQty)\r\n\u00a0\u00a0 FROM Sales.SalesOrderHeader\r\n\u00a0 INNER JOIN Sales.SalesOrderDetail\r\n\u00a0\u00a0\u00a0\u00a0 ON SalesOrderDetail.SalesOrderID = SalesOrderHeader.SalesOrderID\r\n\u00a0 INNER JOIN Production.Product\r\n\u00a0\u00a0\u00a0\u00a0 ON Product.ProductID = SalesOrderDetail.ProductID\r\n\u00a0 WHERE Product.Name = @Var\r\n\u00a0 GROUP BY SalesOrderHeader.SalesPersonID\r\nEND\r\n\t<\/pre>\n<p>And then executing on SQL Query Stress:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/1943-clip_image005-620x419.jpg\" alt=\"1943-clip_image005-620x419.jpg\" width=\"620\" height=\"419\" \/><\/p>\n<p>Again, less than a second. if you are not using Stored procedures, or parameterized queries, then think again.<\/p>\n<h2>Conclusion<\/h2>\n<p>In general, ad-hoc queries are bad for performance and you should instead use stored procedures or parameterized queries\/batches. This is because, to get performance from any database, you should give the relational engine all the help you can. It needs to save the time and resources spent in creating query plans, by only doing it when it&#8217;s necessary. There is no magical way it can detect that repeated queries are actually the same barring a change in parameter, and that the parameter is not going to change sufficiently to render the existing plan a poor one. You need to tell it. The problem is that sometimes you can&#8217;t change the code and in this case guide plans, ad-hoc workloads parameter or even forced parameterization is a good alternative.<\/p>\n<p>It is important to understand that may cause other problems related to the parameter sniffing problem, but it is a topic for another article and this is issue already well covered in articles on the internet.<\/p>\n<p>That&#8217;s all folks.<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Imagine it. You&#8217;ve been asked to fix a dire performance problem with a SQL Server database. You find a severe case of &#8216;Cache Bloat&#8217; due to ad-hoc queries, but you can&#8217;t fix the code itself. What should you do? Specify forced parameterization? Perhaps a better idea would be to use guide plans.&hellip;<\/p>\n","protected":false},"author":65554,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[143529],"tags":[4168,4206,4150,4151],"coauthors":[6809],"class_list":["post-1765","post","type-post","status-publish","format-standard","hentry","category-performance-sql-server","tag-database","tag-performance","tag-sql","tag-sql-server"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/1765","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/65554"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=1765"}],"version-history":[{"count":8,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/1765\/revisions"}],"predecessor-version":[{"id":76585,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/1765\/revisions\/76585"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=1765"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=1765"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=1765"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=1765"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}