Batched deletions are a common strategy in SQL Server to manage large datasets without overloading the system, but poorly tuned deletes can cause blocking, long-running transactions, and heavy log usage. Learn how to monitor and optimize these processes for smooth, efficient database performance.
In previous articles I showed patterns for working with large amounts of data on big tables while keeping locking at a minimum. These processes can allow migrations and maintenance without requiring downtime but, in environments with unpredictable database workloads, there is a risk of heavy traffic starting at any time and disrupting a once smooth operation. In this article, I’ll demonstrate how to augment these processes to allow dynamic adjustment of the configuration.
For most systems, the main limitation these techniques run into is the speed and throughput of I/O (input/output). During periods of low traffic, a large batch size may perform great with no impact to production, but as traffic increases, the storage subsystem may not be able to keep up.
I’ll show two workarounds to deal with this issue: lowering the batch size, and introducing a small delay in between batches. Both will allow the process to continue running with less I/O demand but, if this still isn’t enough, we can easily stop the process and restart at a different time.
Logging Tables in SQL Server
Before we get into the configuration options, let’s consider how we can get feedback on our process while it’s running, so we’re better informed about the adjustments we want to make. Let’s review the code for purging in batches explored in a previous article:
SET NOCOUNT ON;

--control the number of rows deleted per iteration
DECLARE @BatchSize INT = 5000;
--variable used to tell the process to stop
DECLARE @Stop INT = 0;

IF (OBJECT_ID('tempdb..#ToProcess') IS NULL)
    CREATE TABLE #ToProcess (Id INT NOT NULL PRIMARY KEY CLUSTERED);

IF (OBJECT_ID('tempdb..#Batch') IS NULL)
    CREATE TABLE #Batch (Id INT NOT NULL PRIMARY KEY CLUSTERED);

-----------------Gather Ids------------------------------------
INSERT INTO #ToProcess (Id)
SELECT Id
FROM dbo.Posts
WHERE CreationDate < '2011';

-----------------Main Loop------------------------------------
WHILE (@Stop = 0)
BEGIN
    --Load up our batch table while deleting from the main set
    DELETE TOP (@BatchSize) #ToProcess
    OUTPUT DELETED.Id INTO #Batch (Id);

    --Once the rowcount is less than the batchsize,
    -- we can stop (after this loop iteration)
    IF @@ROWCOUNT < @BatchSize
        SELECT @Stop = 1;

    --Perform the DELETE
    DELETE FROM p
    FROM dbo.Posts p
    JOIN #Batch b ON p.Id = b.Id;

    --Clear out Batch table
    TRUNCATE TABLE #Batch;
END;
Now, we’ll add a new table to hold our logging:
IF (OBJECT_ID('dbo.PurgeLogging') IS NULL)
CREATE TABLE dbo.PurgeLogging
(
    LogId INT IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED
    ,StartTime DATETIME2(7) NOT NULL DEFAULT GETDATE()
    ,EndTime DATETIME2(7) NULL
    ,Duration_ms AS DATEDIFF(MILLISECOND, StartTime, ISNULL(EndTime, GETDATE()))
    ,RowsAffected INT NULL
    ,[BatchSize] INT NOT NULL
    ,DelayTime VARCHAR(8) NOT NULL
    ,ErrorCode INT NULL
    ,ErrorMessage NVARCHAR(255) NULL
);
This table will contain a row for every iteration of the loop, tracking how long the DELETE statement takes to run (that duration is what will alert us to any blocking). Depending on the latency demands of your system, even a 1-2 second iteration may be too slow, while other systems may run without issue as long as each iteration stays below the typical application command timeout of 30 seconds.
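For example, a quick check against the logging table can surface any iterations that ran longer than we’d like. This is just a monitoring sketch; the one-second threshold is an assumption you would tune to your own latency requirements:

--Flag iterations slower than 1 second (the threshold is an assumption; adjust to your environment)
SELECT LogId, StartTime, EndTime, Duration_ms, RowsAffected, [BatchSize], DelayTime
FROM dbo.PurgeLogging
WHERE Duration_ms > 1000
ORDER BY LogId DESC;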
Besides the time tracking, we’ll also record the values of our configuration parameters and any errors that come up during the operation. We do this by inserting a record at the beginning of our loop:
INSERT INTO dbo.PurgeLogging (StartTime, [BatchSize], DelayTime)
SELECT GETDATE(), @BatchSize, @DelayTime;

SELECT @LogId = SCOPE_IDENTITY();
And by updating that record at the end of our loop:
UPDATE dbo.PurgeLogging
SET EndTime = GETDATE()
    , RowsAffected = @RowsAffected
    , ErrorMessage = @ErrorMsg
    , ErrorCode = @ErrorNum
WHERE LogId = @LogId;
Configuration Tables in SQL Server
In our previous script we used a static variable to hold the BatchSize, but we’ll now store this in a table along with other parameters to adjust the process.
This allows us to make changes on the fly without stopping the script and rolling back a midflight transaction. It also opens up the possibility of using an automated process to tweak these parameters based on system load.
Let’s look at our configuration table:
IF (OBJECT_ID('dbo.PurgeConfig') IS NULL)
CREATE TABLE dbo.PurgeConfig
(
    ConfigId INT NOT NULL DEFAULT (1) PRIMARY KEY CLUSTERED
    , DelayTime VARCHAR(8) NOT NULL
    , [BatchSize] INT NOT NULL
    , [Stop] BIT NOT NULL DEFAULT (0)
    , CONSTRAINT CH_PurgeConfig_ConfigId CHECK (ConfigId = 1)
);
This table contains our parameters as well as a check constraint to ensure we only have one row stored at any time. We’ll update the running variables at the end of each loop (as well as at the beginning of the script):
SELECT @BatchSize = [BatchSize]
    , @DelayTime = [DelayTime]
    --ONLY UPDATE @Stop when it is 0
    , @Stop = CASE WHEN @Stop = 1 THEN 1 ELSE [Stop] END
FROM dbo.PurgeConfig;
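As a sketch of the automated tuning idea mentioned above, a scheduled job could compare recent durations against a target and back off the batch size when the process starts to slow down. The ten-iteration window, one-second target, and halving step below are assumptions, not recommendations:

--Sketch of automated throttling: if the last 10 iterations averaged over 1 second,
-- halve the batch size, but don't go below 1,000 rows (all thresholds are assumptions)
IF (
    SELECT AVG(Duration_ms)
    FROM (
        SELECT TOP (10) Duration_ms
        FROM dbo.PurgeLogging
        ORDER BY LogId DESC
    ) AS recent
) > 1000
BEGIN
    UPDATE dbo.PurgeConfig
    SET [BatchSize] = CASE WHEN [BatchSize] / 2 < 1000 THEN 1000 ELSE [BatchSize] / 2 END;
END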
Error Handling in SQL Server
While we are making improvements to the script, let’s add some polish by introducing error handling inside the main loop. This can be as simple as adding a TRY/CATCH block and some variables to store the error message and number.
You might also opt to set @Stop = 1 in this section if you want the script to stop any time it hits an error. In this example, I’m letting it continue because I can address the error and rerun the script later; any records that failed to delete will be scooped up by the insert to the #ToProcess table on the next run.
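If you do want the stop-on-error behavior, the CATCH block only needs one extra assignment; this variant is shown as an alternative to the continue-on-error version used in the full script below:

BEGIN CATCH
    --Record the error and end the loop after this iteration
    SELECT @ErrorMsg = ERROR_MESSAGE()
        , @ErrorNum = ERROR_NUMBER()
        , @Stop = 1;
END CATCH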
Putting It All Together
Now let’s look at what our script looks like with these new tables:
SET NOCOUNT ON;

--control the number of rows deleted per iteration
DECLARE @BatchSize INT;
--variable used to tell the process to stop
DECLARE @Stop INT = 0;
--add a delay between iterations
DECLARE @DelayTime VARCHAR(8);
--logging variable for rowcount
DECLARE @RowsAffected INT;
--logging variable for the log record
DECLARE @LogId INT;
--logging variables for errors
DECLARE @ErrorMsg NVARCHAR(4000)
    , @ErrorNum INT;

IF (OBJECT_ID('tempdb..#ToProcess') IS NULL)
    CREATE TABLE #ToProcess (Id INT NOT NULL PRIMARY KEY CLUSTERED);

IF (OBJECT_ID('tempdb..#Batch') IS NULL)
    CREATE TABLE #Batch (Id INT NOT NULL PRIMARY KEY CLUSTERED);

--------------Logging Table-----------------------------------
IF (OBJECT_ID('dbo.PurgeLogging') IS NULL)
CREATE TABLE dbo.PurgeLogging
(
    LogId INT IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED
    ,StartTime DATETIME2(7) NOT NULL DEFAULT GETDATE()
    ,EndTime DATETIME2(7) NULL
    ,Duration_ms AS DATEDIFF(MILLISECOND, StartTime, ISNULL(EndTime, GETDATE()))
    ,RowsAffected INT NULL
    ,[BatchSize] INT NOT NULL
    ,DelayTime VARCHAR(8) NOT NULL
    ,ErrorCode INT NULL
    ,ErrorMessage NVARCHAR(255) NULL
);

-----------------Configuration Table---------------------------
IF (OBJECT_ID('dbo.PurgeConfig') IS NULL)
BEGIN
    CREATE TABLE dbo.PurgeConfig
    (
        ConfigId INT NOT NULL DEFAULT (1) PRIMARY KEY CLUSTERED
        , DelayTime VARCHAR(8) NOT NULL
        , [BatchSize] INT NOT NULL
        , [Stop] BIT NOT NULL DEFAULT (0)
        , CONSTRAINT CH_PurgeConfig_ConfigId CHECK (ConfigId = 1)
    );

    -----------------Add Initial Configuration---------------------
    INSERT INTO dbo.PurgeConfig (DelayTime, [BatchSize], [Stop])
    VALUES ('00:00:00', 5000, 0);
END

--Read the starting configuration
SELECT @BatchSize = [BatchSize]
    , @DelayTime = [DelayTime]
    --ONLY UPDATE @Stop when it is 0
    , @Stop = CASE WHEN @Stop = 1 THEN 1 ELSE [Stop] END
FROM dbo.PurgeConfig;

-----------------Gather Ids------------------------------------
INSERT INTO #ToProcess (Id)
SELECT Id
FROM dbo.Posts
WHERE CreationDate < '2011';

-----------------Main Loop------------------------------------
WHILE (@Stop = 0)
BEGIN
    --Reset the per-iteration logging variables
    SELECT @RowsAffected = NULL, @ErrorMsg = NULL, @ErrorNum = NULL;

    --Create a logging record
    INSERT INTO dbo.PurgeLogging (StartTime, [BatchSize], DelayTime)
    SELECT GETDATE(), @BatchSize, @DelayTime;

    SELECT @LogId = SCOPE_IDENTITY();

    --Load up our batch table while deleting from the main set
    DELETE TOP (@BatchSize) #ToProcess
    OUTPUT DELETED.Id INTO #Batch (Id);

    --Once the rowcount is less than the batchsize,
    -- we can stop (after this loop iteration)
    IF @@ROWCOUNT < @BatchSize
        SELECT @Stop = 1;

    BEGIN TRY
        --Perform the DELETE
        DELETE FROM p
        FROM dbo.Posts p
        JOIN #Batch b ON p.Id = b.Id;

        --Store the rowcount
        SELECT @RowsAffected = @@ROWCOUNT;
    END TRY
    BEGIN CATCH
        --if we have an error, store the message and log it
        -- optionally you can stop the process here by setting @Stop = 1
        SELECT @ErrorMsg = ERROR_MESSAGE()
            , @ErrorNum = ERROR_NUMBER();
    END CATCH

    --Update the logging table
    UPDATE dbo.PurgeLogging
    SET EndTime = GETDATE()
        , RowsAffected = @RowsAffected
        , ErrorMessage = @ErrorMsg
        , ErrorCode = @ErrorNum
    WHERE LogId = @LogId;

    --Refresh running configuration from config table
    SELECT @BatchSize = [BatchSize]
        , @DelayTime = [DelayTime]
        --ONLY UPDATE @Stop when it is 0
        , @Stop = CASE WHEN @Stop = 1 THEN 1 ELSE [Stop] END
    FROM dbo.PurgeConfig;

    --Clear out Batch table
    TRUNCATE TABLE #Batch;

    --Wait before running the next batch
    WAITFOR DELAY @DelayTime;
END;
After we kick off this script we can monitor the progress by querying the PurgeLogging table:
SELECT TOP 100 *
FROM dbo.PurgeLogging
ORDER BY LogId DESC;
In this case, the results show each batch taking less than 200 milliseconds to complete:

Then, if we want to change the batch size or add a delay in between batches, we can update the PurgeConfig table like so:
UPDATE PurgeConfig
SET DelayTime = '00:00:02'
    , [BatchSize] = 100000;
We can see in our logging table that the change takes effect seamlessly:

Notice that the configured delay now shows up as a gap between each row’s StartTime and the previous row’s EndTime, which could help slower disks keep up.
We can also see that each batch is now taking more than a second – if we feel this is too long, we can update our PurgeConfig table once again to lower the batch size:

Conclusion
Adding logging and dynamic configuration to this technique allows us to tune the process to the unique capabilities and requirements of any environment. By looking at the time each loop iteration takes to execute, we can adjust the batch size to keep our impact at an acceptable level. If our I/O is saturated, we can add a delay in between batches to allow other processes to complete.
This technique can be used to purge old data as I have shown here, but it can also be used for more advanced processes like changing the datatype on a column while a table is being used, or deleting from multiple tables with foreign key relationships.