During one of our regular SQL Server heath checks, using sp_blitz, one of our largest production tables raised a worrying alert. The ID column of a table holding customer order information was an INT datatype, and it was running out of numbers.
The table was around 500GB with over 900 million rows. Based on the average number of inserts a day on that table, I estimated that we had eight months before inserts on that table would grind to a halt. This was an order entry table, subject to round-the-clock inserts due to customer activity. Any downtime to make the conversion to BIGINT was going to have to be minimal.
This article describes how I planned and executed a change from an INT to a BIGINT data type, replicating the process I used in a step by step guide for the AdventureWorks database. The technique creates a new copy of the table, with a BIGINT datatype, on a separate SQL Server instance, then uses object level recovery to move it into the production database.
Assessing the options
The obvious option for changing the datatype from INT to BIGINT was simply to ALTER the table directly. However, the requirement for minimal downtime ruled this out immediately. The ID column is a clustered Primary Key, which poses the first issue: you must drop the clustered index before running the ALTER TABLE command.
This was going to cause an outage of around nine hours and a considerable amount of logging. We could add extra disk capacity to cope with the logging requirements, but that length of downtime was not an option.
We didn’t have the luxury of being on Azure SQL Database, or SQL Server 2016 or 2017, each of which offer online rebuild options. There are limitations to online rebuilds though, one being this warning from MSDN:
Online alter column does not reduce the restrictions on when a column can be altered. References by index/stats, etc. might cause the alter to fail.
Another option involved the use of a trigger. This would entail copying all the data to a new table, creating all indexes and constraints, then creating a trigger to make sure inserts go to both tables. I was worried about many aspects of this idea, including maintenance and performance.
Another option suggested was to reseed the INT to use negative numbers. This means resetting the INT from -1 onwards to -2.147 billion rows, but I wasn’t a fan of the idea, and, in any event, this was not a long-term answer.
A standard approach I’d seen documented, was to create a replica table, except with a BIGINT datatype instead of INT and copy the data across in small batches. You then keep the two tables in sync, by capturing modifications to the original table (using a trigger or Change Data Capture) and applying them to the new table. Finally, during some brief downtime, switch the new table for the old. This was my preferred approach…but with a new twist. I created the replica table on a separate development instance, rather than side-by-side with the original in production. Instead of a trigger or Change Data Capture, I used SSIS packages to keep the tables in sync. Then, I used object-level restore to move the new table to production for the switch over.
I’d not seen this approach written about anywhere, but it seemed like a great way to minimize any possible disruption to production, and my instincts told me it would work!
The proof of concept
I did a lot of work in our test and development environments, making sure this approach was going to work exactly as intended. The following sections summarize the testing work. The demo mimics the steps I took as closely as possible but using the sample AdventureWorks database. I’ll assume you’ve restored the database to a development environment and start by creating the replica table.
Create the ‘replica’ table
In a newly restored AdventureWorks database, create a PersonNEW table alongside the original Person table, using a BIGINT data type for the clustered index column as shown in Listing 1. Note: to more closely follow along with the process used when I made the change in production, create the new table in a database on another instance.
CREATE TABLE Person.PersonNEW
BusinessEntityID BIGINT NOT NULL,
PersonType NCHAR(2) NOT NULL,
NameStyle dbo.NameStyle NOT NULL,
Title NVARCHAR(8) NULL,
FirstName dbo.Name NOT NULL,
MiddleName dbo.Name NULL,
LastName dbo.Name NOT NULL,
Suffix NVARCHAR(10) NULL,
EmailPromotion INT NOT NULL,
AdditionalContactInfo XML(CONTENT Person.AdditionalContactInfoSchemaCollection) NULL,
Demographics XML(CONTENT Person.IndividualSurveySchemaCollection) NULL,
rowguid UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL,
ModifiedDate DATETIME NOT NULL,
PRIMARY KEY CLUSTERED (BusinessEntityID ASC)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,
ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON
) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY];
Transfer across the data, then create indexes
I used SSIS to transfer all data across into the PersonNEW table, and then created all the necessary indexes and constraints. When you create the SSIS package, make sure to click Enable Identity Insert (see below). You will find this option under the Edit Mappings tab in Select Source Tables and Views. In my scenario there was an identity column so this was needed. I also didn’t want any discrepancies, since the ID is a unique number for each order used by many applications and throughout the company.
During testing, I periodically updated the data in the BIGINT table using SSIS packages. For example, if the last import stopped at ID 6000, I would create the next SSIS package with > 6000. I used the cluster key to eliminate scanning the table each time for the most efficient transfer. I did this every day to keep the transferring of data time down. Listing 2 shows the query to use in the SSIS package for the Person table.
/****** Script for SelectTopNRows command from SSMS ******/
WHERE BusinessEntityID > 6000
During testing, I also used Redgate’s SQL Data Compare after data transfers to verify that the data was copied exactly as expected.
Object level restore
The next step was testing the process on a separate staging server. I wanted to see if I could use object level recovery of the table into a database with a different name. To do this, I had to use a third-party SQL Server backup tool since object level recovery is not supported natively. I restored a fresh copy of AdventureWorks to the staging server, naming it AdventureWorksBIGINT. This represented the Production database in my test. I then restored the new table, PersonNEW, from the backup into the new staging database.
This was a smoke test to make sure that the same object level restore, from Development to Production would work exactly as expected. When restoring to Production, I restored the table using the object level recovery feature in my SQL Server backup tool.
Create a trigger to stop entries to the original table
During the switch over to the new table, changes to the data must be stopped. I used a trigger (Listing 3) to stop all change to the production table and stopped the application from being able to insert, update, or delete.
Listing 3: Trigger to stop changes
CREATE TRIGGER trReadOnly_Person ON [Person].[Person]
INSTEAD OF INSERT,
RAISERROR( 'Person table is read only.', 16, 1 )
--DROP TRIGGER trReadOnly_Person
Swap out the old table for the new one
Now that both the original and replica tables were in the same database, the final step was to do a swap of the tables, swapping the indexes, constraints, table names, foreign keys, a trigger and a couple database permissions for denying access to certain columns. You can download the test object flip script for AdventureWorks at the bottom of this article, but I won’t show it here. Looking back, I did overcomplicate the index name flip as only the primary key was needed in my environment. Bear in mind not all indexes need to be changed since you could reuse the same name in two different tables.
Before I did the swap, I checked which rows were last added to the table, so when I did the last transfer, I knew what the most recent row should be. I used Redgate’s SQL Data Compare to make sure none of the rows were different.
What I also decided to do while I could amend the replica table as I pleased as it’s on our development environment, was to compress the table, saving around 200GB. Finally, I took a differential backup to make sure we had an up to date backup of the production database on hand, in case any issues occurred.
Everything looked good, so I ‘flicked the object flip switch’.
Once all objects are renamed, you can remove the trigger to reopen the table. Listing 4 shows how to check that you can access the table and entries are being made or can be made.
INSERT INTO Person.Person (
SELECT TOP 1 BusinessEntityID + 1,
ModifiedDate FROM Person.Person
ORDER BY BusinessEntityID DESC
Check your tables you will see the Person table now has a BIGINT datatype!
Moving onto Production…
We ran a pilot in our Acceptance environment, which mimics our production set up, and it worked fine also.
When running on Acceptance and Production, the process was performed in the following steps:
- Restore a full database backup of the production database to a dev/test environment
- In the restored database, create the replica table with a BIGINT instead of INT
- Create the SSIS package and, with IDENTITY INSERT enabled, transfer the data
- Create all the indexes and constraints on the replica table
- Compress the new table (optional)
- Perform the object restore into the production database, keeping the table named as PersonNew .
- Update the PersonNew table periodically using SSIS packages to transfer data from a reporting instance in an availability group
- In the planned maintenance window, do one more SSIS transfer and then create the trigger to make the table read-only. We also shut down the applications that accesses this table
- Do a differential backup
- Do the table swap
- Check data consistency using SQL Data Compare
- Remove trigger and bring the APIs back online
This method cut down the downtime from a potential nine hours to less than fifteen minutes, and a lot of the intensive work was done away from the production instance. I did not see much impact on performance from the restore of the table using object level recovery.
There are many ways to make this and other datatype changes to production databases. The option you choose will often depend on the available downtime window you have available. The shorter that is, the more ‘creative’ you need to be with your solution, while always ensuring data integrity as the number one priority.
The method I’ve described minimized both downtime, and the potential to affect production server performance, since it allowed me to do most of the work to the table on a separate development instance.