Definition
An identity column has a name, initial seed and step. When a row is inserted into a table the column will take the value of the curent seed incremented by the step.
Note: An identity column is not guaranteed to be unique nor consecutive. You should always place a unique index on an identity column if your system requires uniqueness.
Creating And Using Identity Columns
We create an example table
1 |
CREATE TABLE #a(i INT IDENTITY(1,1), j INT) |
This is the usual way you will see an identity used and is the default. It gives the same result as …
1 |
CREATE TABLE #a(i INT IDENTITY, j INT) |
Inserting rows …
1 2 3 4 |
INSERT #a SELECT 1SELECT * FROM #a i j ----------- ----------- 1 1 |
… will work but is a bit confusing and may be version dependent. It is a good idea to always name the columns inserted and leave the others to default.
1 2 3 4 5 |
INSERT #a (j) SELECT 1SELECT * FROM #a i j ----------- ----------- 1 1 2 1 |
Note one use of an identity. We have inserted duplicate rows but can separate them by the identity column value. The identity also shows the order in which the rows were inserted.
We can find the current identity seed by using DBCC checkident …
1 2 |
DBCC checkident (#a) Checking identity information: current identity value '2', current column value '2'. |
So the current seed is 2 – remember the next value will be the current seed plus the step.
Failed Inserts
1 2 3 4 5 6 7 8 |
BEGIN TRANINSERT #a (j) SELECT 1ROLLBACK TRANSELECT * FROM #a i j ----------- ----------- 1 1 2 1 DBCC checkident (#a) Checking identity information: current identity value '3', current column value '3'. |
The table has not changed but we can see from the checkident that the current seed has been changed and we know that this is used to generate the next value.
Note: The next value is the step added to the current seed; not one more than the max value in the table, or even the step from the last or maximum value.
1 2 3 4 5 6 |
INSERT #a (j) SELECT 1SELECT * FROM #a i j ----------- ----------- 1 1 2 1 4 1 |
The situation above commonly happens when there is an index violation on j.
Values For Original Seed And Step
Note that the initial seed and step can be any integer value
1 2 3 4 5 6 |
CREATE TABLE #b (i INT IDENTITY(-7,5), j INT)INSERT #b (j) SELECT 1INSERT #b (j) SELECT 1INSERT #b (j) SELECT 1SELECT * FROM #b i j ----------- ----------- -7 1 -2 1 3 1 |
1 2 3 4 5 6 |
CREATE TABLE #c (i INT IDENTITY(1,-3), j INT)INSERT #c (j) SELECT 1INSERT #c (j) SELECT 1INSERT #c (j) SELECT 1SELECT * FROM #c i j ----------- ----------- 1 1 -2 1 -5 1 |
Inserting Specific Identity Values
We can insert a specific identity value to override the generated value. To do this execute a SET IDENTITY_INSERT #a ON and specify the column list explicitly.
1 2 3 4 5 6 7 |
SET IDENTITY_INSERT #a ONINSERT #a (i,j) SELECT 2,2SET IDENTITY_INSERT #a OFFSELECT * FROM #a i j ----------- ----------- 1 1 2 1 4 1 2 2 |
Remember that the identity doesn’t guarantee uniqueness? We now have 2 rows with the identity value 2.
What has happened to the seed?
1 2 |
DBCC checkident (#a) Checking identity information: current identity value '4', current column value '4'. |
Note that it is not affected by the previous insert. Let’s insert a higher value …
1 2 3 4 5 6 7 8 9 10 11 12 |
SET IDENTITY_INSERT #a ONINSERT #a (i,j) SELECT 10,3SET IDENTITY_INSERT #a OFFSELECT * FROM #a i j ----------- ----------- 1 1 2 1 4 1 2 2 10 3 DBCC checkident (#a) Checking identity information: current identity value '10', current column value '10'. |
This time the seed is updated – that is because the value we inserted was higher than the current seed. It will increase but not decrease.
But what happens if the step is negative?
1 2 3 4 5 6 7 8 9 10 11 |
SELECT * FROM #c i j ----------- ----------- 1 1 -2 1 -5 1 DBCC checkident (#c) Checking identity information: current identity value '-5', current column value '-5'. note: step is -3 |
1 2 3 4 5 6 7 8 9 10 11 12 |
SET IDENTITY_INSERT #c ON INSERT #c (i,j) SELECT -8,2SET IDENTITY_INSERT #c OFF SELECT * FROM #c i j ----------- ----------- 1 1 -2 1 -5 1 -8 2 DBCC checkident (#c) Checking identity information: current identity value '-8', current column value '-8'. |
1 2 3 4 5 6 7 8 9 10 11 |
SET IDENTITY_INSERT #c ONINSERT #c (i,j) SELECT 10,2SET IDENTITY_INSERT #c OFFSELECT * FROM #c i j ----------- ----------- 1 1 -2 1 -5 1 -8 2 10 2 DBCC checkident (#c) |
So the update of the seed takes into account the sign of the step.
Changing The Current Seed
We have seen that the current seed can be changed by an insert – but only in the direction of the step. A better way is to use DBCC checkident. This will take a reseed keyword and value to set the seed.
1 2 3 4 5 6 7 8 |
CREATE TABLE #d (i INT IDENTITY (5,2), j INT)INSERT #d (j) SELECT 1INSERT #d (j) SELECT 1SELECT * FROM #d i j ----------- ----------- 5 1 7 1 DBCC checkident(#d) Checking identity information: current identity value '7', current column value '7'. |
1 2 3 |
DBCC checkident(#d, reseed, 2) Checking identity information: current identity value '7', current column value '2'. |
Note this is the first time that the current seed has been different from the last allocated value.
1 2 3 4 5 6 |
INSERT #d (j) SELECT 2SELECT * FROM #d i j ----------- ----------- 5 1 7 1 4 2 |
We can also reset the current seed to it’s original value via a truncate table.
Note – a delete does not do this.
1 2 3 4 5 6 7 8 9 10 11 |
TRUNCATE TABLE #d DBCC checkident(#d) Checking identity information: current identity value 'NULL', current column value 'NULL'. INSERT #d (j) SELECT 1SELECT * FROM #d i j ----------- ----------- 5 1 DBCC checkident(#d) Checking identity information: current identity value '5', current column value '5'. |
Finding The Identity Value
A common requirement is to find the identity value for an inserted row. There are several statements associated with this
- scope_identity
- ident_current
- @@identity
- scope_identity()
- returns the last identity inserted in the current scope and session. This is usualy the only one of these functions that is useful. It is not affected by other connections or tables nor by triggers.
- @@identity
- will return the last identity value inserted in any scope. This means that if a trigger inserts into a table with an identity then that is the value returned. This means that adding replication or auditing triggers to a database can alter the value of @@identity. In earlier versions of sql server this was the only means of returning the identity value and care had to be taken.
- ident_current(‘table’)
- returns the last value inserted into that table on any connection. Remember to put the table name in quotes
As stated earlier scope_identity() is probably the only one of these functions that you will need to use.
Using Scope_Identity()
As stated earlier scope_identity() returns the last identity value inserted.
1 2 3 4 |
CREATE TABLE #t1 (i INT IDENTITY(5,1), j INT)INSERT #t1 (j) SELECT 1SELECT SCOPE_IDENTITY(), MAX(i) FROM #t1 --------------------------------------- ----------- 5 5 |
scope_identity also returns the value after a rollback
1 2 3 4 |
BEGIN TRANINSERT #t1 (j) SELECT 1ROLLBACK TRANSELECT SCOPE_IDENTITY(), MAX(i) FROM #t1 --------------------------------------- ----------- 6 5 |
but the value is not updated for a failure due to an index violation although the value is allocated
1 2 3 |
CREATE UNIQUE INDEX ix ON #t1 (j)INSERT #t1 (j) SELECT 1SELECT SCOPE_IDENTITY(), MAX(i) FROM #t1 --------------------------------------- ----------- 6 5 |
1 2 3 4 |
INSERT #t1 (j) SELECT 2SELECT SCOPE_IDENTITY(), MAX(i) FROM #t1 --------------------------------------- ----------- 8 8 |
Adding An Identity Column To A Table.
An identity column can be added to a table via an alter table statement. Values will be allocated to the column according to the seed and step.
In this case SCOPE_IDENTITY() will not return an allocated value.
1 2 3 4 5 6 7 |
CREATE TABLE #t2 (j INT)INSERT #t2 (j) SELECT 1INSERT #t2 (j) SELECT 1INSERT #t2 (j) SELECT 1INSERT #t2 (j) SELECT 1SELECT * FROM #t2 j ----------- 1 1 1 1 |
1 2 3 4 5 6 7 |
ALTER TABLE #t2 ADD i INT IDENTITY (5,2)SELECT * FROM #t2 j i ----------- ----------- 1 5 1 7 1 9 1 11 |
This can be useful for dealing with tables with duplicate rows.
Note – an existing column cannot be made into an identity. In this case you must drop the existing column and add a new one.
In this instance the existing values cannot be retained. To retain existing values create a new table and insert using identity_insert. Also as this will update all rows in a table it can take a very long time on large tables and increase the log size.
Select Into
An identity column can be included in a table created using a select into statement via the identity function
1 |
SELECT *, IDENTITY(INT,1,1) AS id INTO #tbl FROM sysobjects |
This is useful for creating a table from existing structures
Identity Datatypes
The identity must only contain integer values but the column can be of any numeric datatype (bigint, int, tinyint, numeric, decimal). This can be useful when values greater than be contained in a bigint are required.
Detecting Identity Columns And Their Properties
The existance of an identity column on a table can be checked via
1 |
SELECT OBJECTPROPERTY(OBJECT_ID('<tablename>'),'TableHasIdentity') |
Which will return 1 if an identity exists on the table.
Similarly …
1 |
SELECT COLUMNPROPERTY(OBJECT_ID('<tablename>'),'<columnname>','IsIdentity') |
… Will show if a column has the identity property.
A more useful way of obtaining this information is by using the catalog view sys.identity_columns which returns a row for each column in the database with an identity property.
1 |
SELECT TableName = OBJECT_NAME(OBJECT_ID) , ColumnName = name , OriginalSeed = seed_value , Step = increment_value , LastValue = last_value , IsNotForReplication = is_not_for_replicationFROM sys.identity_columns |
Character Values In An Identity Column
A common request is to hold a composite value and to allocate sequential values depending on the character part
eg.
1 2 3 4 5 |
a1 a2 a3 b1 b2 |
This is not possible and probably not even desirable. Notice that this column actually contains two values – what would be it’s purpose?
Perhaps it is trying to allocate a sequence to the character part. That definition highlights the mistake – the sequence value is separate to the character part and should be a separate column.
Now we can use an identity for the numeric value and easily calculate a consecutive value for the character value from this when accessing the table or in a view.
If it is required to keep the sequence value in the table – maybe for performance reasons then this could be maintained via a trigger.
Bulk Insert
If the table has an identity column then a bulk insert will often fail if the identity values are not held in the text file. The easiest way around this is to create a view on the tabke excluding the identity column and bulk insert into the view. Another option is to create a format file to use with the bulk insert. I would avoid this option if possible as it adds an external object and is more difficult to maintain.
It is tempting to assume that the identity values will be allocated in the order of rows in the text file but this is not the case. This is often an issue with unstructured data like XML. In this case an XML block cannot be parsed using the identity values. It often will work but cannot be guaranteed – especially if multiple threads are spawned – better not to rely on it.
To deal with such data import into text column (or varchar(max) in v2005+) then parse the data. This might be quite slow. You will not be able to define a column or row terminator and there is an interesting “feature” to be aware of in some versions of sql server: If the text file length is divisble by 4 the bulk insert would fail without giving an error. Test your version to see if it has this problem and if so you can check the file length and and a dummy character if it’s divisible by 4.
To Use An Identity Or Not
It is a question that often raises passions of almost religious fervour and a search will find many threads on the subject.
I have heard people say that every table in a database should have an identity column and that only those should be used in joins. Other people say that they have no place in a relational database and should never be used. I would not subscribe to either of these opinions but would use an identity where it seems sensible. When importing data into staging tables an identity can be useful to identify the rows which may otherwise contain duplicates. In the same situation it can be useful for batching rows to fit in with the memory available for processing. A lookup table needs an ID – why not make it an identity if it is not allocated from a script. It can be useful for allocating IDs – e.g. a customer ID but be careful about different systems allocating the same ID.
Disaster Recovery
There can be an issue with disaster recovery and standby systems. If the identity values are used in another database then the databases may get out of step. When the standby system is brought on-line there needs to be some means of checking that the values are consistent across the databases
Summary Of Points Covered
- An identity column has a name, initial seed and step.
- An identity column is not guaranteed to be unique nor consecutive
- We can find the current identity seed and change it by using dbcc checkident
- The next value allocated is the step added to the current seed.
- An insert failure can change the current seed value.
- An explicit value may be inserted via set identity_insert on and including the column list.
- A value explicitly inserted that is more than the current seed in the direction of the step will update the current seed.
- A truncate table (but not delete) will update the current seed to the original seed value.
- Scope_identity() can be used to find the last identity value allocated.
- An identity column can be added to a table but the identity property of an existing column cannot be changed.
- The identity function may be used to create an identity column on a table created using select into.
- Identity columns and their properties can be found via sys.identity_columns.
- Bulk insert cannot be guaranteed to allocate the identity values in the order of rows in a text file.
- It is sometimes easier to use a view to bulk insert into a table with an identity column.
Load comments