Stolen Pages, Ad-hoc queries and the sins of dynamic SQL in the application.

/*

One of the great advantages of doing development work on SQL Server machines with very modest hardware specifications is that mistakes are visible. The mistake of doing ad-hoc dynamic queries from an application will soon reveal itself. You might get a complaint from the developers that a database is running slowly. You'll notice a very high CPU usage, up to 100% but without any blocking during the periods of slow performance.

In extreme cases, You may even receive errors such as

Error: 701, Severity: 17, State: 1

There is insufficient system memory to run this query.

..or..

Msg 8645, Level 17, State 1, Procedure , Line 1

A time out occurred while waiting for memory resources to execute the query. Re-run the query.

You may be seeing the effects of the running of too many ad-hoc queries. The high CPU can be due to the query optimiser compiling a large number of Ad-Hoc queries. The memory stress is due to the extra memory required to store these. In other words, some developer is spraying a large number of Ad-hoc queries at the database instead of using either stored procedures or prepared statements. the chances are, they're doing something 'suboptimal' or, as we used to call it, Daft.

A compiled execution plan takes around 70KB depending on its size, and a stored procedure plan takes two to three times as much, depending on its complexity. The difference is that there will be only one plan for each stored procedure. With ad-hoc queries, you run the risk of having a separate plan for each query. We'll illustrate this in this article, and show you what to look out for.

Query plans have to be stored for re-use in the procedure cache, and SQL Server takes buffers from the LRU Buffer data store to do this. The word 'stolen' is a bit misleading as this is a perfectly legitimate exercise. Stolen pages are buffer cache pages that are 'stolen' to use for other server memory requests. Stolen pages are used for several miscellaneous server purposes such as procedure cache, sorting or for hashing operations (query workspace memory). It is also used as a generic memory store for allocations that are smaller than 8 KB, to store internal data structures such as locks, transaction context, and connection information. This is a simple way of allocating memory to routine tasks but if a server receives a huge rate of ad-hoc queries, it can lead to trouble. Unless SQL Server determines that it can automatically parameterize a query, or it determines that it is the same query, it is forced to generate a new execution plan. You are seeing the consequences of this with the starvation of memory for data buffers.

The first thing to check is the number of query plans being compiled. The SQL Server Performance Monitor will show many SQL Compilations/sec are being done. Ideally, the ratio of SQL Recompilations/sec to Batch Requests/sec should be very low.

DBCC MemoryStatus will indicate a rise in the number of stolen pages, and there are a host of DMVs that can point to the cause of the problems.

Once you are sure as to what is causing the problem, then there are several things you can do. Ideally, the best cure is to use stored procedures. If this solution isn't available for some reason, then develop parameterized queries in the applications so that SQL Server is enabled to reuse an existing plan. It is possible to force SQL Server to use a compiled plan for ad-hoc queries, but as it is so easy to form correct parameterised queries, this is hardly worth considering.

Let's show you an example. First, we'll prepare a 'person' table and populate it with, say, half a million rows.

*/

IF NOT EXISTS (SELECT 1 FROM information_schema.tables

WHERE table_name LIKE 'Person')

CREATE TABLE [dbo].[Person](

[Person_id] [int] IDENTITY(1,1) NOT NULL,

[ContactTitle] [varchar](50) NOT NULL CONSTRAINT [DF_Person_ContactTitle] DEFAULT (''),

[Title] [varchar](20) NOT NULL CONSTRAINT [DF_Person_Title] DEFAULT (''),

[FirrstName] [nvarchar](50) NOT NULL CONSTRAINT [DF_Person_FirstName] DEFAULT (''),

[NickName] [nvarchar](50) NOT NULL CONSTRAINT [DF_Person_NickName] DEFAULT (''),

[LastName] [nvarchar](50) NOT NULL CONSTRAINT [DF_Person_creator] DEFAULT (USER_NAME()),

[DateOfBirth] [datetime] NULL,

[insertiondate] [datetime] NULL CONSTRAINT [DF_Person_insertiondate] DEFAULT (GETDATE()),

[terminationdate] [datetime] NULL,

CONSTRAINT [PK_dbo_Person] PRIMARY KEY CLUSTERED

(

[Person_id] ASC

)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

) ON [PRIMARY]

/*Now we are going to simulate a programmer failing to come to grips with how to get information from a table and trying to do it by iterating through the rows using Ad-Hoc queries

*/

SET NOCOUNT ON --as we don't want it

--Lets start with a clean sheet.

CHECKPOINT --Writes all data pages that have been entered into the buffer cache and modified but not yet written to disk, for the current database to disk.

GO

DBCC DROPCLEANBUFFERS --remove all buffers from the buffer pool.

DBCC FREEPROCCACHE --Removes all elements from the plan cache

GO

--We'll create a temporary table to put our results into

--drop the temporary table if it exists and create it

IF EXISTS (SELECT 1 FROM tempdb.information_schema.tables

WHERE table_name LIKE '#names%') DROP TABLE #names

CREATE TABLE #names ([name] VARCHAR(80))

--now let's simulate a simple loop through a table looking for a particular name in a "Person" table, assuming that a programmer is making the simple mistake of using procedural code

DECLARE @ii INT --iteration counter

DECLARE @id VARCHAR(10)

SELECT @ii=1

WHILE @ii<2000 --and just look through 2000 rows.

BEGIN

SELECT @ii=@ii+1, @id=CONVERT(VARCHAR(5),@ii)

EXECUTE ('insert into #names (name) Select Lastname from person where person_ID='+@id)

END

--Hmm. Lets see what is in the query-plan cache. Just peep at a sample of plans

SELECT TOP 5 TEXT

FROM sys.dm_exec_cached_plans

CROSS APPLY sys.dm_exec_sql_text(plan_handle)

WHERE cacheobjtype ='Compiled Plan'

/* look at that in the cache! Each execution has created a cached plan

insert into #names (name) Select Lastname from person where person_ID=2000

insert into #names (name) Select Lastname from person where person_ID=1999

insert into #names (name) Select Lastname from person where person_ID=1998

insert into #names (name) Select Lastname from person where person_ID=1997

insert into #names (name) Select Lastname from person where person_ID=1996

How much memory did that lot take up, I wonder?*/

SELECT SUM(size_in_bytes)/1024/1024 AS "Megs"

(SUM(size_in_bytes)/1024)/COUNT(*) AS [Average size(k)],

AVG(refcounts) AS "average ref."

AVG(Usecounts) AS "average use"

ObjType

FROM sys.dm_exec_cached_plans cp

WHERE cacheobjtype ='Compiled Plan'

GROUP BY Objtype

/*

Megs Average size(k) average ref. average use ObjType

------ --------------- ------------ ----------- --------

78 40 2 1 Adhoc

So that simple loop in procedural code within an application took 78 megs of memory!

Now we've seen the damage, let's see if using a prameterised query is any better. We'll make up a little test harness.

*/

GO

CHECKPOINT

GO

DBCC DROPCLEANBUFFERS

DBCC FREEPROCCACHE

GO

SET NOCOUNT ON

-- create a temporary table

IF EXISTS (SELECT 1 FROM tempdb.information_schema.tables

WHERE table_name LIKE '#names%') DROP TABLE #names

CREATE TABLE #names ([name] VARCHAR(80))

--create a temporary timer table

DECLARE @log TABLE

(

Log_ID INT IDENTITY(1, 1),

"Event"VARCHAR(40),

"Stolen Pages"INT,

"iterations"INT,

InsertionDate DATETIME DEFAULT GETDATE()

)

DECLARE @ii INT, @IterationsToDo INT

DECLARE @id VARCHAR(8)

DECLARE @StolenPageCounter INT

--DBCC MemoryStatus

--find out the stolen page counter value

SELECT @StolenPageCounter=cntr_value FROM SYS.SYSPERFINFO

WHERE COUNTER_NAME='STOLEN PAGES'

AND OBJECT_NAME LIKE '%BUFFER MANAGER%'

SET @IterationsToDo=5000

INSERT INTO @log (event, "Stolen Pages"Iterations) SELECT 'Ad-hoc queries', @StolenPageCounter,@IterationsToDo

SELECT @ii=1

WHILE @ii<=@IterationsToDo

BEGIN

SELECT @ii=@ii+1, @id=CONVERT(VARCHAR(5),@ii)

EXECUTE ('insert into #names (name) Select Title + '' ''+Firstname+ '' ''+Lastname from person as philfactor where person_ID='+@id)

END

SELECT @StolenPageCounter=cntr_value FROM SYS.SYSPERFINFO

WHERE COUNTER_NAME='STOLEN PAGES'

AND OBJECT_NAME LIKE '%BUFFER MANAGER%'

INSERT INTO @log (event, "Stolen Pages" Iterations) SELECT 'Parameterised Queries', @StolenPageCounter,@IterationsToDo

SELECT @ii=1

WHILE @ii<=@IterationsToDo

BEGIN

SELECT @ii=@ii+1

/* the first time that the sp_executesql statement is executed,

SQL Server generates a parameterized plan for the SELECT statement

from person with id as the parameter. For all subsequent executions,

SQL Server reuses the plan with the new parameter value */

EXEC sp_executesql N'

insert into #names (name) Select Title + '' ''+Firstname+ '' ''+Lastname

from person as factorphil

where person_ID=@id',

N'@id int',

@id=@ii

END

SELECT @StolenPageCounter=cntr_value FROM SYS.SYSPERFINFO

WHERE COUNTER_NAME='STOLEN PAGES'

AND OBJECT_NAME LIKE '%BUFFER MANAGER%'

INSERT INTO @log (event,"Stolen Pages" SELECT 'Completed parameterised query', @StolenPageCounter

--first, we'll see what cached plans we have.

SELECT COUNT(*) AS [No.],

CONVERT(NUMERIC(9, 2),

SUM(size_in_bytes * 1.00) / 1024 / 1024) AS Mb,

(SUM(size_in_bytes) / 1024)/COUNT(*) AS [Average size (K)],

AVG(refcounts) AS "average ref."

AVG(Usecounts) AS "average use"

ObjType

FROM sys.dm_exec_cached_plans cp

GROUP BY Objtype

ORDER BY [Mb] DESC

/*

No. Mb Average size (K) average ref. average use ObjType

------ ------- ---------------- ------------ ----------- --------

5003 195.82 40 2 1 Adhoc

8 0.26 33 1 4 View

1 0.04 40 2 5000 Prepared

1 0.01 8 1 5000 Proc

*/

SELECT event,

[Pages Stolen] = (SELECT "Stolen Pages"

FROM @log f

WHERE f . Log_ID = g . Log_ID + 1)-"Stolen Pages"

[Time (Ms)] = DATEDIFF(ms, [InsertionDate],

(SELECT [InsertionDate]

FROM @log f

WHERE f . Log_ID = g . Log_ID + 1

)

FROM @log g

WHERE Log_ID < ( SELECT MAX (Log_ID) FROM @log )

/*

so we see that it is much faster to use parameterized queries and the number of stolen pages is far less

event Pages Stolen Time (Ms)

-------------------------- ------------ -----------

Ad-hoc queries 21193 3906

Parameterised Queries 4492 970

So now, lets pick out the two queries from the Procedure Cache, recognising them by the two strings I embedded into them

*/

SELECT 'parameterised query' AS [query type],

SUM(size_in_bytes)/1024 AS [memory consumed (K)],

COUNT(*) AS [number of plans]

FROM sys.dm_exec_cached_plans

CROSS APPLY sys.dm_exec_sql_text(plan_handle)

WHERE TEXT LIKE '%factorphil%'

UNION ALL

SELECT 'Ad-Hoc query' AS [query type],

SUM(size_in_bytes)/1024 AS [memory consumed (K)],

COUNT(*) AS [number of plans]

FROM sys.dm_exec_cached_plans

CROSS APPLY sys.dm_exec_sql_text(plan_handle)

WHERE TEXT LIKE '%philfactor%'

/*

query type memory consumed (K) number of plans

------------------- ------------------- ---------------

parameterised query 3072 2

Ad-Hoc query 205728 5001

*/

GO

SELECT LEFT([type], 20) AS [type], SUM(single_pages_kb) AS [Total_kb]

FROM sys.dm_os_memory_clerks

WHERE TYPE IN ('CACHESTORE_SQLCP','CACHESTORE_PHDR','CACHESTORE_OBJCP')

GROUP BY TYPE

ORDER BY SUM(single_pages_kb) DESC

/*

we can see the huge figure again from the DMV sys.dm_os_memory_clerks

type Total_kb

-------------------- --------------------

CACHESTORE_SQLCP 202928

CACHESTORE_PHDR 312

CACHESTORE_OBJCP 8

but probably the best way of viewing the damage is from this DMV sys.dm_os_memory_cache_counters

*/

SELECT

LEFT([name], 20) AS [name],

LEFT([type], 20) AS [type],

SUM([single_pages_kb] + [multi_pages_kb]) AS cache_kb,

SUM([entries_count]) AS No_Entries

FROM sys.dm_os_memory_cache_counters

WHERE TYPE IN ('CACHESTORE_SQLCP','CACHESTORE_PHDR','CACHESTORE_OBJCP')

GROUP BY [type], [name]

ORDER BY cache_kb DESC

/*

name type cache_kb No_Entries

------------- -------------------- -------- -------

SQL Plans CACHESTORE_SQLCP 202768 5005

Bound Trees CACHESTORE_PHDR 312 10

Object Plans CACHESTORE_OBJCP 8 0

Note that:

CACHESTORE_SQLCP are our ad-hoc cached SQL statements or batches that aren't in stored procedures, functions or triggers. Thes consist of dynamic ad-hoc SQL sent to the server by an application.

CACHESTORE_PHDR are algebrizer trees for views, constraints and defaults. An algebrizer tree is the parsed SQL text that resolves the table and column names.

CACHESTORE_OBJCP are compiled plans for stored procedures, functions and triggers.

So there you have it. This is a problem that can be very insidious, but once you know what to watch out for, it is very easy to put right. You just make sure that the developers do not try using too many ad hoc queries.

*/

Register for Simple Talk

Stolen Pages, Ad-hoc queries and the sins of dynamic SQL in the application.

About the author

Phil Factor

Phil Factor's contributions

Articles

Books

Top topics

Phil Factor's latest contributions:

Accessibility, and the Need for Ingenious Simplicity in Computer Scripts, Languages and Programs

Data Documents and Common Sense

AI and Databases

Recommended

About the author

Phil Factor

Phil Factor's contributions

Articles

Books

Top topics

Phil Factor's latest contributions:

Accessibility, and the Need for Ingenious Simplicity in Computer Scripts, Languages and Programs

Data Documents and Common Sense

AI and Databases