{"id":1435,"date":"2012-11-01T00:00:00","date_gmt":"2012-11-01T00:00:00","guid":{"rendered":"https:\/\/test.simple-talk.com\/uncategorized\/fixing-gatekeeper-row-cardinality-estimate-issues\/"},"modified":"2021-08-24T13:40:04","modified_gmt":"2021-08-24T13:40:04","slug":"fixing-gatekeeper-row-cardinality-estimate-issues","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/databases\/sql-server\/database-administration-sql-server\/fixing-gatekeeper-row-cardinality-estimate-issues\/","title":{"rendered":"Fixing Gatekeeper Row Cardinality Estimate Issues"},"content":{"rendered":"<div id=\"pretty\">\n<p class=\"start\">The SQL Server Query Optimizer needs to estimate the number of rows that are processed for each physical operator in a query plan. These row estimates are referred to as <em>cardinality estimates<\/em> and are required in order to then calculate the cost models of the query plan operators. If these cardinality estimates are inaccurate, the Query Optimizer is less likely to choose the most appropriate execution plan for the query, and so the performance of the query&#8217;s execution is likely to be slower.<\/p>\n<p>There are a variety of reasons for a poor cardinality estimate, and this article will focus on just one of the more complex reasons, affecting the &#8220;gatekeeper row&#8221; cardinality estimate.<\/p>\n<h2>Gatekeeper Row Scenario<\/h2>\n<p>We will illustrate this by using a query executed against the AdventureWorksDW2012 database:<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">SELECT\u00a0\u00a0[od].[CalendarYear] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[p].[ProductLine] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0SUM([fis].[SalesAmount]) AS [SalesAMT]\r\nFROM\u00a0\u00a0\u00a0\u00a0[dbo].[FactInternetSales] AS [fis]\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0INNER JOIN [dbo].[DimProduct] AS [p]\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0ON [fis].[ProductKey] = [p].[ProductKey]\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0INNER JOIN [dbo].[DimCurrency] AS [c]\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0ON [fis].[CurrencyKey] = [c].[currencykey]\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0INNER JOIN [dbo].[DimDate] AS [od]\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0ON [fis].[OrderDateKey] = [od].[DateKey]\r\nWHERE\u00a0\u00a0 [c].[CurrencyName] = N'United Kingdom Pound'\r\nGROUP BY [od].[CalendarYear] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[p].[ProductLine]\r\nORDER BY [od].[CalendarYear] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[p].[ProductLine];\r\nGO <\/pre>\n<p>The query references one fact table and three separate dimension tables. The abridged relationship diagram is below (click for full size):<\/p>\n<p class=\"illustration\"><a href=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/1592-image1large.png\"><img decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/1592-1-d345c092-8d98-40d4-8030-e720c9276d42.png\" alt=\"1592-1-d345c092-8d98-40d4-8030-e720c9276\" \/><\/a><\/p>\n<p>What kind of cardinality estimates do we see and how do they compare to the actual rows? For this article, I was using SQL Server 2012 version 11.0.2316 and I saw the following estimated and actual row counts:<\/p>\n<div>\n<table class=\"MsoTableLightShadingAccent1\">\n<tbody>\n<tr>\n<td valign=\"top\">\n<p><b>Rows<\/b><\/p>\n<\/td>\n<td valign=\"top\">\n<p><b>Estimated Rows<\/b><\/p>\n<\/td>\n<td valign=\"top\">\n<p><b>StmtText<\/b><\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p>12<\/p>\n<\/td>\n<td valign=\"top\">\n<p>22.17362<\/p>\n<\/td>\n<td valign=\"top\">\n<p>\u00a0 |&#8211;Sort(ORDER BY:([od].[CalendarYear] ASC, [p].[ProductLine] ASC))<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p>12<\/p>\n<\/td>\n<td valign=\"top\">\n<p>22.17362<\/p>\n<\/td>\n<td valign=\"top\">\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |&#8211;Hash Match(Aggregate, HASH:([od].[CalendarYear],<br \/>\n[p].[ProductLine]), <br \/>\nRESIDUAL:([AdventureWorksDW2012].[dbo].[DimDate].[CalendarYear] as <br \/>\n[od].[CalendarYear] = <br \/>\n[AdventureWorksDW2012].[dbo].[DimDate].[CalendarYear] as <br \/>\n[od].[CalendarYear] AND <br \/>\n[AdventureWorksDW2012].[dbo].[DimProduct].[ProductLine] as <br \/>\n[p].[ProductLine] = <br \/>\n[AdventureWorksDW2012].[dbo].[DimProduct].[ProductLine] as <br \/>\n[p].[ProductLine]) <br \/>\nDEFINE:([Expr1008]=SUM([AdventureWorksDW2012].[dbo].[FactInternetSales].[SalesAmount] <br \/>\nas [fis].[SalesAmount])))<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p>6740<\/p>\n<\/td>\n<td valign=\"top\">\n<p>7936.583<\/p>\n<\/td>\n<td valign=\"top\">\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |&#8211;Hash Match(Inner Join, <br \/>\nHASH:([p].[ProductKey])=([fis].[ProductKey]))<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p>606<\/p>\n<\/td>\n<td valign=\"top\">\n<p>606<\/p>\n<\/td>\n<td valign=\"top\">\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |&#8211;Clustered Index <br \/>\nScan(OBJECT:([AdventureWorksDW2012].[dbo].[DimProduct].[PK_DimProduct_ProductKey] <br \/>\nAS [p]))<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p>6740<\/p>\n<\/td>\n<td valign=\"top\">\n<p>8466.041<\/p>\n<\/td>\n<td valign=\"top\">\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |&#8211;Hash Match(Inner Join, <br \/>\nHASH:([od].[DateKey])=([fis].[OrderDateKey]))<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p>2191<\/p>\n<\/td>\n<td valign=\"top\">\n<p>2191<\/p>\n<\/td>\n<td valign=\"top\">\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |&#8211;Clustered Index <br \/>\nScan(OBJECT:([AdventureWorksDW2012].[dbo].[DimDate].[PK_DimDate_DateKey] <br \/>\nAS [od]))<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p>6740<\/p>\n<\/td>\n<td valign=\"top\">\n<p>10066.33<\/p>\n<\/td>\n<td valign=\"top\">\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |&#8211;Hash Match(Inner Join, <br \/>\nHASH:([c].[CurrencyKey])=([fis].[CurrencyKey]))<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p>1<\/p>\n<\/td>\n<td valign=\"top\">\n<p>1<\/p>\n<\/td>\n<td valign=\"top\">\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |&#8211;Clustered Index <br \/>\nScan(OBJECT:([AdventureWorksDW2012].[dbo].[DimCurrency].[PK_DimCurrency_CurrencyKey] <br \/>\nAS [c]), <br \/>\nWHERE:([AdventureWorksDW2012].[dbo].[DimCurrency].[CurrencyName] as <br \/>\n[c].[CurrencyName]=N&#8217;United Kingdom Pound&#8217;))<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p>60398<\/p>\n<\/td>\n<td valign=\"top\">\n<p>60398<\/p>\n<\/td>\n<td valign=\"top\">\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |&#8211;Clustered Index <br \/>\nScan(OBJECT:([AdventureWorksDW2012].[dbo].[FactInternetSales].<br \/>\n[PK_FactInternetSales_SalesOrderNumber_SalesOrderLineNumber] <br \/>\nAS [fis]))<\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<p>The clustered index scan of <strong>FactInternetSales<\/strong> shows that the estimated rows are equal to actual rows. This represents the total row count from that table, which was 60,398 rows. The estimated and actual row counts are also identical for the clustered index scan of the <strong>DimCurrency<\/strong> table, which makes sense given the &#8220;United Kingdom Pound&#8221; search condition for the <strong>CurrencyName<\/strong> column.<\/p>\n<p>The <strong>DimCurrency<\/strong> has 105 rows in it. But that doesn&#8217;t mean that <strong>FactInternetSales<\/strong> has rows for every possible <strong>CurrencyKey<\/strong>. In reality, the <strong>FactInternetSales<\/strong> has the following distribution of rows by currency, as we can determine by using this query:<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">SELECT\u00a0\u00a0[c].[CurrencyName],\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 COUNT(*) AS [RowCount]\r\nFROM\u00a0\u00a0\u00a0\u00a0[dbo].[FactInternetSales] AS [fis]\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0INNER JOIN [dbo].[DimCurrency] AS [c]\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0ON [fis].[CurrencyKey] = [c].[currencykey]\r\nGROUP BY [c].[CurrencyName]\r\nORDER BY [c].[CurrencyName];\r\nGO <\/pre>\n<p class=\"illustration\"><img decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/1592-1-b2c97887-3a62-4f28-808e-ef10fad41582.png\" alt=\"1592-1-b2c97887-3a62-4f28-808e-ef10fad41\" width=\"300\" \/><\/p>\n<p>As you can see, most currencies are <em>not<\/em> represented in <strong>FactInternetSales<\/strong>.<\/p>\n<p>Given this, what happens if we change the previous query to reference a currency that has no associated rows? For example, switching to the Pakistan Rupee currency?<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">SELECT\u00a0\u00a0[od].[CalendarYear] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[p].[ProductLine] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0SUM([fis].[SalesAmount]) AS [SalesAMT]\r\nFROM\u00a0\u00a0\u00a0\u00a0[dbo].[FactInternetSales] AS [fis]\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0INNER JOIN [dbo].[DimProduct] AS [p]\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0ON [fis].[ProductKey] = [p].[ProductKey]\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0INNER JOIN [dbo].[DimCurrency] AS [c]\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0ON [fis].[CurrencyKey] = [c].[currencykey]\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0INNER JOIN [dbo].[DimDate] AS [od]\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0ON [fis].[OrderDateKey] = [od].[DateKey]\r\nWHERE\u00a0\u00a0 [c].[CurrencyName] = N'Pakistan Rupee'\r\nGROUP BY [od].[CalendarYear] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[p].[ProductLine]\r\nORDER BY [od].[CalendarYear] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[p].[ProductLine];\r\nGO <\/pre>\n<p>The cardinality estimates versus actual now appear as follows:<\/p>\n<div>\n<table>\n<tbody>\n<tr>\n<td valign=\"top\">\n<p><b>Rows<\/b><\/p>\n<\/td>\n<td valign=\"top\">\n<p><b>Estimated<\/b><\/p>\n<p><b>Rows<\/b><\/p>\n<\/td>\n<td valign=\"top\">\n<p><b>StmtText<\/b><\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p>0<\/p>\n<\/td>\n<td valign=\"top\">\n<p>22.17362<\/p>\n<\/td>\n<td valign=\"top\">\n<p>\u00a0 |&#8211;Sort(ORDER BY:([od].[CalendarYear] ASC, [p].[ProductLine] ASC))<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p>0<\/p>\n<\/td>\n<td valign=\"top\">\n<p>22.17362<\/p>\n<\/td>\n<td valign=\"top\">\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |&#8211;Hash Match(Aggregate, HASH:([od].[CalendarYear], <br \/>\n[p].[ProductLine]), <br \/>\nRESIDUAL:([AdventureWorksDW2012].[dbo].[DimDate].[CalendarYear] as <br \/>\n[od].[CalendarYear] = <br \/>\n[AdventureWorksDW2012].[dbo].[DimDate].[CalendarYear] as <br \/>\n[od].[CalendarYear] AND <br \/>\n[AdventureWorksDW2012].[dbo].[DimProduct].[ProductLine] as <br \/>\n[p].[ProductLine] = <br \/>\n[AdventureWorksDW2012].[dbo].[DimProduct].[ProductLine] as <br \/>\n[p].[ProductLine]) <br \/>\nDEFINE:([Expr1008]=SUM([AdventureWorksDW2012].[dbo].[FactInternetSales].[SalesAmount] <br \/>\nas [fis].[SalesAmount])))<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p>0<\/p>\n<\/td>\n<td valign=\"top\">\n<p>7936.583<\/p>\n<\/td>\n<td valign=\"top\">\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |&#8211;Hash Match(Inner Join, <br \/>\nHASH:([p].[ProductKey])=([fis].[ProductKey]))<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p>606<\/p>\n<\/td>\n<td valign=\"top\">\n<p>606<\/p>\n<\/td>\n<td valign=\"top\">\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |&#8211;Clustered Index <br \/>\nScan(OBJECT:([AdventureWorksDW2012].[dbo].[DimProduct].[PK_DimProduct_ProductKey] <br \/>\nAS [p]))<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p>0<\/p>\n<\/td>\n<td valign=\"top\">\n<p>8466.041<\/p>\n<\/td>\n<td valign=\"top\">\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |&#8211;Hash Match(Inner Join, <br \/>\nHASH:([od].[DateKey])=([fis].[OrderDateKey]))<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p>2191<\/p>\n<\/td>\n<td valign=\"top\">\n<p>2191<\/p>\n<\/td>\n<td valign=\"top\">\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |&#8211;Clustered Index <br \/>\nScan(OBJECT:([AdventureWorksDW2012].[dbo].[DimDate].[PK_DimDate_DateKey] <br \/>\nAS [od]))<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p>0<\/p>\n<\/td>\n<td valign=\"top\">\n<p>10066.33<\/p>\n<\/td>\n<td valign=\"top\">\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |&#8211;Hash Match(Inner Join, <br \/>\nHASH:([c].[CurrencyKey])=([fis].[CurrencyKey]))<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p>1<\/p>\n<\/td>\n<td valign=\"top\">\n<p>1<\/p>\n<\/td>\n<td valign=\"top\">\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |&#8211;Clustered Index <br \/>\nScan(OBJECT:([AdventureWorksDW2012].[dbo].[DimCurrency].[PK_DimCurrency_CurrencyKey] <br \/>\nAS [c]), <br \/>\nWHERE:([AdventureWorksDW2012].[dbo].[DimCurrency].[CurrencyName] as <br \/>\n[c].[CurrencyName]=N&#8217;Pakistan Rupee&#8217;))<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p>60398<\/p>\n<\/td>\n<td valign=\"top\">\n<p>60398<\/p>\n<\/td>\n<td valign=\"top\">\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |&#8211;Clustered Index <br \/>\nScan(OBJECT:([AdventureWorksDW2012].[dbo].[FactInternetSales].<br \/>\n[PK_FactInternetSales_SalesOrderNumber_SalesOrderLineNumber] <br \/>\nAS [fis]))<\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<p>As with the original query, the clustered index scan of <strong>FactInternetSales<\/strong> shows that the estimated rows are equal to the actual rows. This represents the total row count from that table, which was 60,398 rows. The estimated and actual row counts are also identical for the clustered index scan of the <strong>DimCurrency table<\/strong>, showing a value of &#8220;1&#8221; representing the search on Pakistan Rupee in the <strong>CurrencyName <\/strong>column.<\/p>\n<p>What about the Hash Match operation between the two tables? Our previous example had 10,066 (estimated) versus 6,740 (actual). But with our new query using a currency that does <em>not <\/em>exist in the fact table, our estimate is still 10,066 and actual number of rows this time is 0.<\/p>\n<p>So a gatekeeper row, in this context, represents a dimension row that will let several fact table rows through to the parent operators when selected via the presence or absence of a search condition. When the gatekeeper dimension row is <em>not<\/em> present, the fact table rows are ultimately not passed through to the final result set, but yet the cardinality estimate issue remains.<\/p>\n<p>I can exacerbate the problem with this cardinality estimate by modifying all the rows in the fact table to the same currency, the Yen in our case, with this simple statement (and if you&#8217;re trying this out yourself, be sure to back up your AdventureWorksDW2012 database beforehand so you can revert back afterwards):<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">-- Setting all Fact table rows to the Yen currency\r\nUPDATE [dbo].[FactInternetSales]\r\nSET [CurrencyKey] = 102;\r\nGO<\/pre>\n<p>Once I have done this and we now have only Yen, I see the following estimated rows vs. actual rows if I execute the previous query that references the Pakistan Rupee.<\/p>\n<div>\n<table>\n<tbody>\n<tr>\n<td valign=\"top\">\n<p><b>Rows<\/b><\/p>\n<\/td>\n<td valign=\"top\">\n<p><b>Estimated<\/b><\/p>\n<p><b>Rows<\/b><\/p>\n<\/td>\n<td valign=\"top\">\n<p><b>StmtText<\/b><\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p>0<\/p>\n<\/td>\n<td valign=\"top\">\n<p>22.04758<\/p>\n<\/td>\n<td valign=\"top\">\n<p>\u00a0 |&#8211;Stream Aggregate(GROUP BY:([od].[CalendarYear], <br \/>\n[p].[ProductLine]) DEFINE:([Expr1008]=SUM([partialagg1015])))<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p>0<\/p>\n<\/td>\n<td valign=\"top\">\n<p>23.5184<\/p>\n<\/td>\n<td valign=\"top\">\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |&#8211;Sort(ORDER BY:([od].[CalendarYear] ASC, [p].[ProductLine] <br \/>\nASC))<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p>0<\/p>\n<\/td>\n<td valign=\"top\">\n<p>23.5184<\/p>\n<\/td>\n<td valign=\"top\">\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |&#8211;Nested Loops(Inner Join, OUTER <br \/>\nREFERENCES:([fis].[CurrencyKey]))<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p>12<\/p>\n<\/td>\n<td valign=\"top\">\n<p>88.19033<\/p>\n<\/td>\n<td valign=\"top\">\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |&#8211;Hash Match(Aggregate, HASH:([fis].[CurrencyKey], <br \/>\n[p].[ProductLine], [od].[CalendarYear]), <br \/>\nRESIDUAL:([AdventureWorksDW2012].[dbo].[FactInternetSales].[CurrencyKey] <br \/>\nas [fis].[CurrencyKey] = <br \/>\n[AdventureWorksDW2012].[dbo].[FactInternetSales].[CurrencyKey] as <br \/>\n[fis].[CurrencyKey] AND <br \/>\n[AdventureWorksDW2012].[dbo].[DimProduct].[ProductLine] as <br \/>\n[p].[ProductLine] = <br \/>\n[AdventureWorksDW2012].[dbo].[DimProduct].[ProductLine] as <br \/>\n[p].[ProductLine] AND <br \/>\n[AdventureWorksDW2012].[dbo].[DimDate].[CalendarYear] as <br \/>\n[od].[CalendarYear] = <br \/>\n[AdventureWorksDW2012].[dbo].[DimDate].[CalendarYear] as <br \/>\n[od].[CalendarYear]) <br \/>\nDEFINE:([partialagg1015]=SUM([AdventureWorksDW2012].[dbo].[FactInternetSales].[SalesAmount] <br \/>\nas [fis].[SalesAmount])))<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p>60398<\/p>\n<\/td>\n<td valign=\"top\">\n<p>47348.81<\/p>\n<\/td>\n<td valign=\"top\">\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 | \u00a0\u00a0\u00a0|&#8211;Hash <br \/>\nMatch(Inner Join, HASH:([p].[ProductKey])=([fis].[ProductKey]))<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p>606<\/p>\n<\/td>\n<td valign=\"top\">\n<p>606<\/p>\n<\/td>\n<td valign=\"top\">\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |&#8211;Clustered Index <br \/>\nScan(OBJECT:([AdventureWorksDW2012].[dbo].[DimProduct].[PK_DimProduct_ProductKey] <br \/>\nAS [p]))<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p>60398<\/p>\n<\/td>\n<td valign=\"top\">\n<p>50507.5<\/p>\n<\/td>\n<td valign=\"top\">\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |&#8211;Hash Match(Inner Join, <br \/>\nHASH:([od].[DateKey])=([fis].[OrderDateKey]))<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p>2191<\/p>\n<\/td>\n<td valign=\"top\">\n<p>2191<\/p>\n<\/td>\n<td valign=\"top\">\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |&#8211;Clustered Index <br \/>\nScan(OBJECT:([AdventureWorksDW2012].[dbo].[DimDate].[PK_DimDate_DateKey] <br \/>\nAS [od]))<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p>60398<\/p>\n<\/td>\n<td valign=\"top\">\n<p>60398<\/p>\n<\/td>\n<td valign=\"top\">\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |&#8211;Clustered Index <br \/>\nScan(OBJECT:([AdventureWorksDW2012].[dbo].[FactInternetSales].<br \/>\n[PK_FactInternetSales_SalesOrderNumber_SalesOrderLineNumber] <br \/>\nAS [fis]))<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p>0<\/p>\n<\/td>\n<td valign=\"top\">\n<p>1<\/p>\n<\/td>\n<td valign=\"top\">\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |&#8211;Clustered Index <br \/>\nSeek(OBJECT:([AdventureWorksDW2012].[dbo].[DimCurrency].[PK_DimCurrency_CurrencyKey] <br \/>\nAS [c]), <br \/>\nSEEK:([c].[CurrencyKey]=[AdventureWorksDW2012].[dbo].[FactInternetSales].[CurrencyKey] <br \/>\nas [fis].[CurrencyKey]),\u00a0 <br \/>\nWHERE:([AdventureWorksDW2012].[dbo].[DimCurrency].[CurrencyName] as <br \/>\n[c].[CurrencyName]=N&#8217;Pakistan Rupee&#8217;) ORDERED FORWARD)<\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<p>This time the Query Optimizer re-arranged the order in which the tables were joined. This new plan started with a hash join between <strong>DimDate<\/strong> and <strong>FactInternetSales<\/strong>, followed by a hash join to <strong>DimProduct<\/strong>, and then finally a nested loop join to <strong>DimCurrency<\/strong>. The estimated row count of the nested loop operation is 24 vs. the actual 0 row count. So we see that we are <em>not<\/em> filtering out the <strong>FactInternetSales<\/strong> rows early in the plan, even though no rows are ultimately returned.<\/p>\n<h2>Increasing the Scale of the Problem<\/h2>\n<p>Why is all of this important? It is because, with a real relational data warehouse database of any scale, the performance will be noticeable impacted. The problem is that the Query Optimizer isn&#8217;t aware that there are zero qualifying rows in the Fact table. The query plan generates unnecessary reads in the leaf level of the execution plan and then passes the pre-filtered rows to the intermediate levels of the plan tree and is not filtered down until later steps. Our current database is small, but the consequences of these cardinality estimate issues become more pronounced with larger tables.<\/p>\n<p>To demonstrate the impact at a larger scale, I&#8217;m going to increase the size of the FactInternetSales table to 3,865,472 rows:<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">-- Increase size of table to 3,865,472 rows\r\nINSERT\u00a0\u00a0[dbo].[FactInternetSales]\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0( [ProductKey] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[OrderDateKey] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[DueDateKey] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[ShipDateKey] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[CustomerKey] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[PromotionKey] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[CurrencyKey] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[SalesTerritoryKey] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[SalesOrderNumber] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[SalesOrderLineNumber] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[RevisionNumber] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[OrderQuantity] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[UnitPrice] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[ExtendedAmount] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[UnitPriceDiscountPct] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[DiscountAmount] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[ProductStandardCost] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[TotalProductCost] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[SalesAmount] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[TaxAmt] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[Freight] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[CarrierTrackingNumber] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[CustomerPONumber] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[OrderDate] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[DueDate] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[ShipDate]\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0)\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0SELECT\u00a0\u00a0[FactInternetSales].[ProductKey] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[FactInternetSales].[OrderDateKey] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[FactInternetSales].[DueDateKey] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[FactInternetSales].[ShipDateKey] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[FactInternetSales].[CustomerKey] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[FactInternetSales].[PromotionKey] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[FactInternetSales].[CurrencyKey] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[FactInternetSales].[SalesTerritoryKey] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0LEFT(CAST(NEWID() AS NVARCHAR(36)), 20) ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[FactInternetSales].[SalesOrderLineNumber] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[FactInternetSales].[RevisionNumber] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[FactInternetSales].[OrderQuantity] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[FactInternetSales].[UnitPrice] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[FactInternetSales].[ExtendedAmount] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[FactInternetSales].[UnitPriceDiscountPct] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[FactInternetSales].[DiscountAmount] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[FactInternetSales].[ProductStandardCost] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[FactInternetSales].[TotalProductCost] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[FactInternetSales].[SalesAmount] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[FactInternetSales].[TaxAmt] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[FactInternetSales].[Freight] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[FactInternetSales].[CarrierTrackingNumber] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[FactInternetSales].[CustomerPONumber] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[FactInternetSales].[OrderDate] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[FactInternetSales].[DueDate] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[FactInternetSales].[ShipDate]\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0FROM\u00a0\u00a0\u00a0\u00a0[dbo].[FactInternetSales];\r\nGO 6 -- Executes this INSERT 6 separate times<\/pre>\n<p>I&#8217;ve inflated the size based on the modified data which had all rows associated with just one currency key. My gatekeeper dimension row is CurrencyKey = 102. If I reference this specific key, then millions of fact table rows will need to be accessed, whereas any other value of CurrencyKey, assuming equality search conditions, will cause no rows to be returned from the Fact table.<\/p>\n<p>Re-executing the query that referenced the Pakistan Rupee, the following plan is generated (show via SQL Server Plan Explorer &#8211; click for full size):<\/p>\n<p class=\"illustration\"><a href=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/1592-image2large.png\"><img decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/1592-1-860f3fba-ac26-4198-9447-517a66a99ca7.png\" alt=\"1592-1-860f3fba-ac26-4198-9447-517a66a99\" \/><\/a><\/p>\n<p>The first observation is that the plan is now executed with parallel operations. This is an artifact of the increase in the number of fact table rows and associated cost.<\/p>\n<p>The second observation is that millions of rows are flowing through the plan tree and only narrowing down until it is roughly 2\/3rds through the plan. The associated estimated vs. actual row counts continue to be skewed and the Query Optimizer doesn&#8217;t know that all rows in <strong>FactInternet<\/strong><strong>Sales<\/strong> only have one <strong>CurrencyKey<\/strong> value. The <strong>DimCurrency<\/strong> table is not evaluated until very late in the plan:<\/p>\n<div>\n<table>\n<tbody>\n<tr>\n<td valign=\"top\">\n<p><b>Rows<\/b><\/p>\n<\/td>\n<td valign=\"top\">\n<p><b>Estimate<\/b><\/p>\n<p><b>Rows<\/b><\/p>\n<\/td>\n<td valign=\"top\">\n<p><b>StmtText<\/b><\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p>0<\/p>\n<\/td>\n<td valign=\"top\">\n<p>21.11082<\/p>\n<\/td>\n<td valign=\"top\">\n<p>\u00a0 |&#8211;Stream Aggregate(GROUP BY:([od].[CalendarYear], <br \/>\n[p].[ProductLine]) DEFINE:([Expr1008]=SUM([partialagg1015])))<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p>0<\/p>\n<\/td>\n<td valign=\"top\">\n<p>22.51457<\/p>\n<\/td>\n<td valign=\"top\">\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |&#8211;Parallelism(Gather Streams, ORDER BY:([od].[CalendarYear] <br \/>\nASC, [p].[ProductLine] ASC))<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p>0<\/p>\n<\/td>\n<td valign=\"top\">\n<p>22.51457<\/p>\n<\/td>\n<td valign=\"top\">\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |&#8211;Nested Loops(Inner Join, OUTER <br \/>\nREFERENCES:([fis].[CurrencyKey]))<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p>12<\/p>\n<\/td>\n<td valign=\"top\">\n<p>84.44327<\/p>\n<\/td>\n<td valign=\"top\">\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |&#8211;Sort(ORDER BY:([od].[CalendarYear] ASC, [p].[ProductLine] <br \/>\nASC))<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p>12<\/p>\n<\/td>\n<td valign=\"top\">\n<p>84.44327<\/p>\n<\/td>\n<td valign=\"top\">\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |\u00a0\u00a0\u00a0 |&#8211;Hash Match(Aggregate, HASH:([fis].[CurrencyKey], <br \/>\n[p].[ProductLine], [od].[CalendarYear]), <br \/>\nRESIDUAL:([AdventureWorksDW2012].[dbo].[FactInternetSales].[CurrencyKey] <br \/>\nas [fis].[CurrencyKey] = <br \/>\n[AdventureWorksDW2012].[dbo].[FactInternetSales].[CurrencyKey] as <br \/>\n[fis].[CurrencyKey] AND <br \/>\n[AdventureWorksDW2012].[dbo].[DimProduct].[ProductLine] as <br \/>\n[p].[ProductLine] = <br \/>\n[AdventureWorksDW2012].[dbo].[DimProduct].[ProductLine] as <br \/>\n[p].[ProductLine] AND <br \/>\n[AdventureWorksDW2012].[dbo].[DimDate].[CalendarYear] as <br \/>\n[od].[CalendarYear] = <br \/>\n[AdventureWorksDW2012].[dbo].[DimDate].[CalendarYear] as <br \/>\n[od].[CalendarYear]) <br \/>\nDEFINE:([partialagg1015]=SUM([AdventureWorksDW2012].<br \/>\n[dbo].[FactInternetSales].[SalesAmount] <br \/>\nas [fis].[SalesAmount])))<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p>3865472<\/p>\n<\/td>\n<td valign=\"top\">\n<p>2900982<\/p>\n<\/td>\n<td valign=\"top\">\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 \u00a0\u00a0\u00a0\u00a0|\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |&#8211;Parallelism(Repartition Streams, Hash Partitioning, <br \/>\nPARTITION COLUMNS:([fis].[CurrencyKey], [p].[ProductLine], <br \/>\n[od].[CalendarYear]))<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p>3865472<\/p>\n<\/td>\n<td valign=\"top\">\n<p>2900982<\/p>\n<\/td>\n<td valign=\"top\">\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |&#8211;Hash Match(Inner Join, <br \/>\nHASH:([p].[ProductKey])=([fis].[ProductKey]))<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p>606<\/p>\n<\/td>\n<td valign=\"top\">\n<p>606<\/p>\n<\/td>\n<td valign=\"top\">\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |&#8211;Bitmap(HASH:([p].[ProductKey]),<br \/>\nDEFINE:([Opt_Bitmap1010]))<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p>606<\/p>\n<\/td>\n<td valign=\"top\">\n<p>606<\/p>\n<\/td>\n<td valign=\"top\">\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 \u00a0\u00a0|\u00a0\u00a0\u00a0 |&#8211;Parallelism(Repartition Streams,<br \/>\nHash Partitioning, PARTITION <br \/>\nCOLUMNS:([p].[ProductKey]))<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p>606<\/p>\n<\/td>\n<td valign=\"top\">\n<p>606<\/p>\n<\/td>\n<td valign=\"top\">\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |&#8211;Clustered Index <br \/>\nScan(OBJECT:([AdventureWorksDW2012].[dbo].<br \/>\n[DimProduct].[PK_DimProduct_ProductKey] <br \/>\nAS [p]))<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p>3865472<\/p>\n<\/td>\n<td valign=\"top\">\n<p>2900982<\/p>\n<\/td>\n<td valign=\"top\">\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |&#8211;Parallelism(Repartition Streams, Hash Partitioning, <br \/>\nPARTITION COLUMNS:([fis].[ProductKey]))<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p>3865472<\/p>\n<\/td>\n<td valign=\"top\">\n<p>2900982<\/p>\n<\/td>\n<td valign=\"top\">\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |&#8211;Hash Match(Inner Join, <br \/>\nHASH:([od].[DateKey])=([fis].[OrderDateKey]))<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p>2191<\/p>\n<\/td>\n<td valign=\"top\">\n<p>2191<\/p>\n<\/td>\n<td valign=\"top\">\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |&#8211;Bitmap(HASH:([od].[DateKey]),<br \/>\nDEFINE:([Opt_Bitmap1009]))<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p>2191<\/p>\n<\/td>\n<td valign=\"top\">\n<p>2191<\/p>\n<\/td>\n<td valign=\"top\">\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |\u00a0\u00a0\u00a0 |&#8211;Parallelism(Repartition Streams, Hash Partitioning,<br \/>\nPARTITION <br \/>\nCOLUMNS:([od].[DateKey]))<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p>2191<\/p>\n<\/td>\n<td valign=\"top\">\n<p>2191<\/p>\n<\/td>\n<td valign=\"top\">\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |&#8211;Clustered Index <br \/>\nScan(OBJECT:([AdventureWorksDW2012].[dbo].[DimDate].[PK_DimDate_DateKey] <br \/>\nAS [od]))<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p>3865472<\/p>\n<\/td>\n<td valign=\"top\">\n<p>2900982<\/p>\n<\/td>\n<td valign=\"top\">\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |&#8211;Parallelism(Repartition Streams, Hash Partitioning, <br \/>\nPARTITION COLUMNS:([fis].[OrderDateKey]))<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p>3865472<\/p>\n<\/td>\n<td valign=\"top\">\n<p>3865472<\/p>\n<\/td>\n<td valign=\"top\">\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |&#8211;Clustered Index <br \/>\nScan(OBJECT:([AdventureWorksDW2012].[dbo].[FactInternetSales].<br \/>\n[PK_FactInternetSales_SalesOrderNumber_SalesOrderLineNumber] <br \/>\nAS [fis]), <br \/>\nWHERE:(PROBE([Opt_Bitmap1009],[AdventureWorksDW2012].<br \/>\n[dbo].[FactInternetSales].[OrderDateKey] <br \/>\nas [fis].[OrderDateKey],N'[IN ROW]&#8217;) AND <br \/>\nPROBE([Opt_Bitmap1010],[AdventureWorksDW2012].<br \/>\n[dbo].[FactInternetSales].[ProductKey] <br \/>\nas [fis].[ProductKey],N'[IN ROW]&#8217;)))<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p>0<\/p>\n<\/td>\n<td valign=\"top\">\n<p>1<\/p>\n<\/td>\n<td valign=\"top\">\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 |&#8211;Clustered Index <br \/>\nSeek(OBJECT:([AdventureWorksDW2012].<br \/>\n[dbo].[DimCurrency].[PK_DimCurrency_CurrencyKey] <br \/>\nAS [c]), <br \/>\nSEEK:([c].[CurrencyKey]=[AdventureWorksDW2012].<br \/>\n[dbo].[FactInternetSales].[CurrencyKey] <br \/>\nas [fis].[CurrencyKey]),\u00a0 <br \/>\nWHERE:([AdventureWorksDW2012].[dbo].[DimCurrency].[CurrencyName] as <br \/>\n[c].[CurrencyName]=N&#8217;Pakistan Rupee&#8217;) ORDERED FORWARD)<\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<h2>Why Care?<\/h2>\n<p>So what are the consequences of this? There are a few reasons that stand out for the problem that I&#8217;ve illustrated:<\/p>\n<ul>\n<li>Inflated query execution memory grants<\/li>\n<li>Reduced concurrency due to inflated query execution memory grants<\/li>\n<li>Excessive read I\/O from reading rows that aren&#8217;t actually required<\/li>\n<li>Buffer pool pressure (potentially) from all the data file pages being read from disk<\/li>\n<\/ul>\n<p>If you look at the Showplan XML of the previously executed query, there are additional details of the memory grant requirements:<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">&lt;MemoryGrantInfo\r\n\u00a0\u00a0\u00a0\u00a0SerialRequiredMemory=\"3072\"\r\n\u00a0\u00a0\u00a0\u00a0SerialDesiredMemory=\"9464\"\r\n\u00a0\u00a0\u00a0\u00a0RequiredMemory=\"30400\"\r\n\u00a0\u00a0\u00a0\u00a0DesiredMemory=\"36800\"\r\n\u00a0\u00a0\u00a0\u00a0RequestedMemory=\"36800\"\r\n\u00a0\u00a0\u00a0\u00a0GrantWaitTime=\"0\"\r\n\u00a0\u00a0\u00a0\u00a0GrantedMemory=\"36800\"\r\n\u00a0\u00a0\u00a0\u00a0MaxUsedMemory=\"12816\"\/&gt; <\/pre>\n<p>This runtime memory grant information was introduced in SQL Server 2012. The serial plan memory attributes<strong> SerialRequiredMemory<\/strong> and <strong>SerialDesiredMemory<\/strong> refer to non-parallel plans. The other attributes refer to the query in its current execution state, which in this case used parallel operations.<\/p>\n<p>The requested and actual memory was 36,800 KB. And what if multiple concurrent requests were attempting to execute queries with similar cardinality estimate issues? At a certain point, you may end up seeing waits for memory grants. Cardinality estimate issues can cause the requested query execution memory grant to be significantly inflated. If concurrency is important, bad cardinality estimates can hamper your overall workload throughput.<\/p>\n<p>Problems with cardinality estimates can also produce excessive read I\/O requests. For example, dropping all clean buffers (in a non-production environment, of course) prior to executing the previous query shows that more than 165,061 pages got loaded into the buffer pool &#8211; even though no rows were ultimately returned in my query:<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">-- Not for production use\r\nDBCC DROPCLEANBUFFERS;\r\nGO\r\n\r\n-- Execute the query here\r\n\r\nSELECT COUNT(*) AS [BufferCount]\r\nFROM sys.[dm_os_buffer_descriptors]; <\/pre>\n<p>Executing <strong>SET STATISTICS IO ON<\/strong> also shows the overall I\/O impact of the query (abridged results):<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">Table 'FactInternetSales'. Scan count 9, logical reads 167112, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. <\/pre>\n<p>And I&#8217;m not even mentioning the CPU overhead that is so tightly associated with I\/O operations. As you can see, when estimates are off, it impacts query performance, resource utilization, and concurrency.<\/p>\n<h2>Solutions<\/h2>\n<p>Cardinality estimate issues can occur for a variety of different reasons and so there are a variety of different solutions which include, amongst others:<\/p>\n<ul>\n<li>updating statistics<\/li>\n<li>changing statistics-related database settings<\/li>\n<li>disabling &#8220;norecompute&#8221; on indexes and statistics<\/li>\n<li>adding multi-column statistics for correlated columns<\/li>\n<\/ul>\n<p>The problem with the gatekeeper row cardinality estimate is particularly interesting because the standard solutions aren&#8217;t usually helpful.<\/p>\n<p>One way to approach this problem is to help the Query Optimizer understand more explicitly the cardinality of the gatekeeper column in the fact table itself, without relying on the byproduct of the Fact-to-Dimension join operation. For example:<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">-- CurrencyKey 75 = Pakistan Rupee\r\nSELECT\u00a0\u00a0[od].[CalendarYear] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[p].[ProductLine] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0SUM([fis].[SalesAmount]) AS [SalesAMT]\r\nFROM\u00a0\u00a0\u00a0\u00a0[dbo].[FactInternetSales] AS [fis]\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0INNER JOIN [dbo].[DimProduct] AS [p]\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0ON [fis].[ProductKey] = [p].[ProductKey]\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0INNER JOIN [dbo].[DimDate] AS [od]\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0ON [fis].[OrderDateKey] = [od].[DateKey]\r\nWHERE\u00a0\u00a0 [fis].[CurrencyKey] = 75\r\nGROUP BY [od].[CalendarYear] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[p].[ProductLine]\r\nORDER BY [od].[CalendarYear] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[p].[ProductLine];\r\nGO <\/pre>\n<p>I removed the INNER JOIN to DimCurrency and instead directly referenced a search condition for the fact table CurrencyKey. Given this direct predicate on the foreign key reference, the query execution plan shape changed significantly (click for larger version):<\/p>\n<p class=\"illustration\"><a href=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/1592-image3large.png\"><img decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/1592-1-86232a04-4cdc-4a8e-a1a1-e0b2a010bb5e.png\" alt=\"1592-1-86232a04-4cdc-4a8e-a1a1-e0b2a010b\" \/><\/a><\/p>\n<p>The estimated vs. actual row skew was also eliminated (remember that the estimates feed the cost model, so the plan is directly impacted by good or bad estimates) and large sets of rows were no longer being propagated up the tree:<\/p>\n<p class=\"illustration\"><a href=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/1592-image4large.png\"><img decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/1592-1-903dd16b-c994-43b8-9dd0-46dae8809174.png\" alt=\"1592-1-903dd16b-c994-43b8-9dd0-46dae8809\" \/><\/a><\/p>\n<p>If you are using a stored procedure to encapsulate the query, you could choose to pass the parameter in for direct use in the fact table predicate:<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">CREATE PROCEDURE [dbo].[ProductLineSales_by_CalendarYear]\r\n\u00a0\u00a0 @CurrencyKey INT \r\nAS\u00a0\u00a0\r\nSELECT\u00a0\u00a0[od].[CalendarYear] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[p].[ProductLine] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0SUM([fis].[SalesAmount]) AS [SalesAMT]\r\nFROM\u00a0\u00a0\u00a0\u00a0[dbo].[FactInternetSales] AS [fis]\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0INNER JOIN [dbo].[DimProduct] AS [p]\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0ON [fis].[ProductKey] = [p].[ProductKey]\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0INNER JOIN [dbo].[DimDate] AS [od]\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0ON [fis].[OrderDateKey] = [od].[DateKey]\r\nWHERE\u00a0\u00a0 [fis].[CurrencyKey] = @CurrencyKey\r\nGROUP BY [od].[CalendarYear] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[p].[ProductLine]\r\nORDER BY [od].[CalendarYear] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[p].[ProductLine];\r\nGO <\/pre>\n<p>However, you do need to be extremely careful about parameter sniffing in this case. For example, if I execute the <em>non<\/em>-gatekeeper row parameter first (initial compilation) followed by a gatekeeper row parameter, the non-gatekeeper plan gets used for consecutive executions, resulting in greatly diminished performance, and tempdb spills on the sort operator:<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">-- Pakistan Rupee currency\r\nEXECUTE [dbo].[ProductLineSales_by_CalendarYear] 75;\r\nGO\r\n\r\n-- Yen currency\r\nEXECUTE [dbo].[ProductLineSales_by_CalendarYear] 102;\r\nGO <\/pre>\n<p class=\"illustration\"><a href=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/1592-image5large.png\"><img decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/1592-1-2416eb3a-2f3a-48d8-a8de-759dd2de480f.png\" alt=\"1592-1-2416eb3a-2f3a-48d8-a8de-759dd2de4\" \/><\/a><\/p>\n<p>Also, a sub-query providing the CurrencyKey does <em>not<\/em> achieve the same results as when I directly provided the literal value. The following query uses the non-optimal plan that had the original cardinality estimate issues:<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">SELECT\u00a0\u00a0[od].[CalendarYear] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[p].[ProductLine] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0SUM([fis].[SalesAmount]) AS [SalesAMT]\r\nFROM\u00a0\u00a0\u00a0\u00a0[dbo].[FactInternetSales] AS [fis]\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0INNER JOIN [dbo].[DimProduct] AS [p]\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0ON [fis].[ProductKey] = [p].[ProductKey]\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0INNER JOIN [dbo].[DimDate] AS [od]\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0ON [fis].[OrderDateKey] = [od].[DateKey]\r\nWHERE\u00a0\u00a0 [fis].[CurrencyKey] = \r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 (SELECT [CurrencyKey]\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0FROM [dbo].[DimCurrency]\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0WHERE [CurrencyName] = N'Pakistan Rupee')\r\nGROUP BY [od].[CalendarYear] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[p].[ProductLine]\r\nORDER BY [od].[CalendarYear] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[p].[ProductLine];\r\nGO <\/pre>\n<p>Another solution to consider is the use of a covering nonclustered index along with an <strong>INDEX<\/strong> and\/or <strong>FORCESEEK<\/strong> hint. By default I&#8217;m not a fan of using hints but, when tackling a problem with the gatekeeper row cardinality estimate, it may be one of the few options available to you.<\/p>\n<p>In the following example, I create a covering index on the columns referenced for join conditions and an INCLUDE column for my sales amount:<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">CREATE NONCLUSTERED INDEX IX_FIS_CurrencyKey_ProductKey_DateKey\r\nON [dbo].[FactInternetSales]([CurrencyKey], [ProductKey], [OrderDateKey])\r\nINCLUDE ([SalesAmount]);\r\nGO <\/pre>\n<p>Adding this index will not, by itself, solve the problem. In my testing, the Query Optimizer still chose to perform a large scan of the Fact table. But adding the covering index in conjunction with a <strong>FORCESEEK<\/strong> hint <em>was <\/em>effective:<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">SELECT\u00a0\u00a0[od].[CalendarYear] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[p].[ProductLine] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0SUM([fis].[SalesAmount]) AS [SalesAMT]\r\nFROM\u00a0\u00a0\u00a0\u00a0[dbo].[FactInternetSales] AS [fis] WITH (FORCESEEK)\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0INNER JOIN [dbo].[DimProduct] AS [p]\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0ON [fis].[ProductKey] = [p].[ProductKey]\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0INNER JOIN [dbo].[DimCurrency] AS [c]\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0ON [fis].[CurrencyKey] = [c].[currencykey]\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0INNER JOIN [dbo].[DimDate] AS [od]\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0ON [fis].[OrderDateKey] = [od].[DateKey]\r\nWHERE\u00a0\u00a0 [c].[CurrencyName] = N'Pakistan Rupee'\r\nGROUP BY [od].[CalendarYear] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[p].[ProductLine]\r\nORDER BY [od].[CalendarYear] ,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[p].[ProductLine];\r\nGO <\/pre>\n<p>The Query Optimizer chose to join the <strong>DimCurrency<\/strong> and <strong>FactInternetSales<\/strong> tables first and the cardinality estimates were accurate (abridged query plan below):<\/p>\n<p class=\"illustration\"><img decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/1592-1-19c69aec-9f18-424c-96a5-7ccdf6577cdb.png\" alt=\"1592-1-19c69aec-9f18-424c-96a5-7ccdf6577\" \/><\/p>\n<p>Another potential solution to consider if you&#8217;re running on SQL Server 2012 Enterprise Edition is columnstore indexes. While using columnstore indexing doesn&#8217;t fix the cardinality estimate issues, it is able to side-step them in some scenarios due to leveraging columnar storage in combination with batch-execution mode processing:<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">-- Drop nonclustered index in order to eliminate QO tempting choices\r\nDROP INDEX IX_FIS_CurrencyKey_ProductKey_DateKey\r\nON [dbo].[FactInternetSales];\r\nGO\r\n\r\n-- Covering all supported data type columns \r\nCREATE NONCLUSTERED COLUMNSTORE INDEX [NCI_CSI_FactInternetSales] ON \r\n[dbo].[FactInternetSales]\r\n(\r\n\u00a0\u00a0 [ProductKey],\r\n\u00a0\u00a0 [OrderDateKey],\r\n\u00a0\u00a0 [DueDateKey],\r\n\u00a0\u00a0 [ShipDateKey],\r\n\u00a0\u00a0 [CustomerKey],\r\n\u00a0\u00a0 [PromotionKey],\r\n\u00a0\u00a0 [CurrencyKey],\r\n\u00a0\u00a0 [SalesTerritoryKey],\r\n\u00a0\u00a0 [SalesOrderNumber],\r\n\u00a0\u00a0 [SalesOrderLineNumber],\r\n\u00a0\u00a0 [RevisionNumber],\r\n\u00a0\u00a0 [OrderQuantity],\r\n\u00a0\u00a0 [UnitPrice],\r\n\u00a0\u00a0 [ExtendedAmount],\r\n\u00a0\u00a0 [UnitPriceDiscountPct],\r\n\u00a0\u00a0 [DiscountAmount],\r\n\u00a0\u00a0 [ProductStandardCost],\r\n\u00a0\u00a0 [TotalProductCost],\r\n\u00a0\u00a0 [SalesAmount],\r\n\u00a0\u00a0 [TaxAmt],\r\n\u00a0\u00a0 [Freight],\r\n\u00a0\u00a0 [CarrierTrackingNumber],\r\n\u00a0\u00a0 [CustomerPONumber],\r\n\u00a0\u00a0 [OrderDate],\r\n\u00a0\u00a0 [DueDate],\r\n\u00a0\u00a0 [ShipDate]\r\n)WITH (DROP_EXISTING = OFF)\r\nGO <\/pre>\n<p>If you execute the original, problematic query using the non-gatekeeper row, a plan is selected that now leverages the columnstore index and results in a fast execution (abridged query plan &#8211; click to enlarge):<\/p>\n<p class=\"illustration\"><a href=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/1592-image6large.png\"><img decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/1592-1-34b85545-b64e-4a16-bfcb-23dbe5e13d6b.png\" alt=\"1592-1-34b85545-b64e-4a16-bfcb-23dbe5e13\" \/><\/a><\/p>\n<p>Notice that the 3,865,472 rows associated with the columnstore index scan are quickly narrowed down in the parent operator, even though the cardinality estimate for the scan and parent repartition-stream operator are significantly skewed compared to the actual row count:<\/p>\n<p class=\"illustration\"><a href=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/1592-image7large.png\"><img decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/1592-1-ac25937a-1bca-4177-8c12-c3f9269ae9fa.png\" alt=\"1592-1-ac25937a-1bca-4177-8c12-c3f9269ae\" \/><\/a><\/p>\n<p>Even though we didn&#8217;t eliminate the cardinality estimate issues, the plan had far less I\/O impact, and thus far less buffer pool impact. The memory grant requests, however, actually increase in size:<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">&lt;MemoryGrantInfo\r\n\u00a0\u00a0\u00a0\u00a0SerialRequiredMemory=\"7680\"\r\n\u00a0\u00a0\u00a0\u00a0SerialDesiredMemory=\"13264\"\r\n\u00a0\u00a0\u00a0\u00a0RequiredMemory=\"66448\"\r\n\u00a0\u00a0\u00a0\u00a0DesiredMemory=\"72080\"\r\n\u00a0\u00a0\u00a0\u00a0RequestedMemory=\"72080\"\r\n\u00a0\u00a0\u00a0\u00a0GrantWaitTime=\"0\"\r\n\u00a0\u00a0\u00a0\u00a0GrantedMemory=\"72080\"\r\n\u00a0\u00a0\u00a0\u00a0MaxUsedMemory=\"8776\"\/&gt; <\/pre>\n<p>While columnstore indexing comes with significant potential performance benefits, there are trade-offs, such as making your table read-only for non-partition switch related data modifications. Not all queries can benefit from columnstore indexing, but if your gatekeeper-row issue is associated with a standard, relational data-warehouse query, you may want to consider exploring this option. You may find that a columnstore index solution requires far less refactoring of your queries and associated schema than other potential solutions.<\/p>\n<h2>Summary<\/h2>\n<p>Gatekeeper-row cardinality estimation issues can be difficult to spot because they are driven by the changing characteristics of your relational data warehouse. Look for actual and estimated row skews across operators in a star-schema query. If that skew is associated with a dimension row that significantly influences the rows that are returned from the fact table, you may need to refactor your queries in order to help the Query Optimizer to properly estimate cardinality. Other potential solutions include the use of a covering nonclustered index with indexing hints, and also the use of a columnstore index which, while not resolving the cardinality estimate issue, may compensate by virtue of the performance optimizations that are inherent to its architecture.<\/p>\n<\/div>\n<div>\u00a0<\/div>\n","protected":false},"excerpt":{"rendered":"<p>The Query Optimiser needs a good estimate of the number of rows likely to be returned by each physical operator in order to select the best query plan from the most likely alternatives. Sometimes these estimates can go so wildly wrong as to result in a very slow query. Joe Sack shows how it can happen with SQL Queries on a data warehouse with a star schema.&hellip;<\/p>\n","protected":false},"author":221918,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[143527],"tags":[5751,4170,4783,5084,5076,5752,4150],"coauthors":[17045],"class_list":["post-1435","post","type-post","status-publish","format-standard","hentry","category-database-administration-sql-server","tag-cardinality-estimates","tag-database-administration","tag-execution-plans","tag-optimiser","tag-query-optimizer","tag-row-counts","tag-sql"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/1435","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/221918"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=1435"}],"version-history":[{"count":4,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/1435\/revisions"}],"predecessor-version":[{"id":80967,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/1435\/revisions\/80967"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=1435"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=1435"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=1435"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=1435"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}