Set-based Constraint Violation Reporting in SQL Server

Introduction

Many of us have experience on more than one SQL-DBMS, but it is almost inevitable that you will become a specialist on one product or another. All SQL-DBMSs are, of course, based on the relational model as originally defined by Edgar Codd (how well SQL follows the relational model is another issue, which we won’t go into here). Though the basic principles of all products are the same, sometimes the details of particular implementations can lead to difficulties when you make the move from one product to another. I have worked with Sybase and Oracle in the past, but today I make my living with Microsoft SQL Server, so I am most up to date with that product and have the deepest knowledge of it.

It can be interesting to look at the way the different products solve the same problems, and to observe what features may be implemented in one and not in the other. One of these differences is the way that Oracle and SQL Server handle the reporting of multiple constraint violations from a single statement. The Oracle approach is elegant and straightforward, in SQL Server we have to put a lot of effort into devising a workaround.

Oracle doesn’t always do better in this sort of comparison; SQL Server specialists are often puzzled by the dire warning issued against using triggers by many Oracle experts. To those of you are familiar with SQL Server but not with Oracle, triggers in Oracle can be used effectively but it is considerably more challenging to use them correctly [1].

On the other hand Oracle specialists may be equally puzzled by what I am about to say about SQL Server.

The Problem

For the most part SQL Server deals fairly well with set based operations, but with constraints in general, some serious shortcomings quickly become apparent. Any SQL insert, update or delete command will cause an exception condition to be raised that returns one constraint violation message, no matter how many constraints are violated, how many different types of constraint are violated, or how many of the affected rows violate the constraint.

Before we go any further, let’s have a closer look at the kind of problems that SQL Server’s standard constraint violation reporting leads to.

We will use the following small database design for the examples.

CREATE TABLE dbo.customer

(

customer_id int NOT NULL,

customer_name varchar(100) NOT NULL,

CONSTRAINT "customer id must be unique" PRIMARY KEY

(

customer_id

)

);

INSERT INTO dbo.customer

(

customer_id,

customer_name

)

VALUES

(

100,

'Siemens'

);

CREATE TABLE dbo.sales_order(

customer_id int NOT NULL,

sales_order_number int NOT NULL,

sales_order_date date NOT NULL CONSTRAINT sales_order_date_default DEFAULT getdate(),

CONSTRAINT "sales order number must be unique" PRIMARY KEY CLUSTERED

(

sales_order_number

CONSTRAINT "sales order must have belong to valid customer"

FOREIGN KEY(customer_id)

REFERENCES dbo.customer (customer_id)

);

CREATE TABLE dbo.product(

product_name varchar(100) NOT NULL,

CONSTRAINT "product name must be unique" PRIMARY KEY CLUSTERED

(

product_name

)

);

CREATE TABLE sales_order_line

(

sales_order_number integer NOT NULL,

product_name varchar(100) NOT NULL,

quantity integer NOT NULL

CONSTRAINT quantity_default DEFAULT 1,

CONSTRAINT "order number and product name must be unique"

PRIMARY KEY (sales_order_number, product_name),

CONSTRAINT "sales order line must belong to valid order"

FOREIGN KEY (sales_order_number)

REFERENCES sales_order (sales_order_number),

CONSTRAINT "sales order line must be for valid product"

FOREIGN KEY (product_name)

REFERENCES product (product_name)

);

We now attempt the following insert:

insert into sales_order_Line

(

sales_order_number,

product_name,

quantity

)

values

(

100,

'Bolt',

100

)

Resulting in the following message:

Msg 547, Level 16, State 0, Line 3

The INSERT statement conflicted with the FOREIGN KEY constraint "sales order line must belong to valid order". The conflict occurred in database "playground", table "dbo.sales_order", column 'sales_order_number'.

The statement has been terminated.

Let’s say we now add the primary key reference to fix this problem:

INSERT INTO sales_order

(

customer_id,

sales_order_number

)

VALUES

(

100,

100

)

Attempting the insert on sales_order_line again, we once again get a foreign key violation constraint message, but for a different foreign key.

insert into sales_order_Line

(

sales_order_number,

product_name,

quantity

)

values

(

100,

'Bolt',

100

)

Msg 547, Level 16, State 0, Line 3

The INSERT statement conflicted with the FOREIGN KEY constraint "sales order line must be for valid product". The conflict occurred in database "playground", table "dbo.product", column 'product_name'.

The statement has been terminated.

Only one key violation is reported at a time. The situation gets even worse if we are inserting multiple rows. First let’s fix the foreign key violation on product:

1	INSERT INTO product (product_name) VALUES ('bolt')

Now we attempt to insert several rows into sales_product_line, some of which are valid and some of which will violate foreign key constraints:

insert into sales_order_Line

(

sales_order_number,

product_name,

quantity

)

values (100,'Bolt', 100),

(100, 'Bolt', 20),

(200, 'Bolt', 1),

(100, 'screw', 4)

We observe that the insert values contain not just foreign key violations, but also primary key violations.

When issuing the insert command, as usual, only one of the constraint violations is reported:

Msg 2627, Level 14, State 1, Line 1

Violation of PRIMARY KEY constraint 'order number and product name must be unique'. Cannot insert duplicate key in object 'dbo.sales_order_line'. The duplicate key value is (100, Bolt).

The statement has been terminated.

If your database is at the size of this demonstration, it would seem absurd to consider that this is a difficulty. However, with the enormous amounts of data being routinely imported into, or captured by, databases, a constraint violation exception can lead to a great deal of subsequent forensic work to find the rows that did it.

Our Oracle user friends would probably say at this point that they don’t understand what the problem is. We just use the LOG ERRORS clause on the insert. This feature has been available since Oracle 10g Version 2.

I created our test database in Oracle and issued the following commands:

1	EXECUTE DBMS_ERRLOG.CREATE_ERROR_LOG('sales_order_line', 'err$_sales_order_line');

This creates a table to hold the errors with the same columns as sales_order_line and some additional columns to hold the error information.

We then issue the insert with the LOG ERRORS clause added at the end (the syntax has been adapted as Oracle does not support multiple value lines in the insert statement, also I had to shorten the constraint names as Oracle will only accept 30 characters):

insert into sales_order_Line

(

sales_order_number,

product_name,

quantity

)

select 100,'Bolt', 100 from dual

union all

select 100, 'Bolt', 20 from dual

union all

select 200, 'Bolt', 1 from dual

union all

select 100, 'screw', 4 from dual

LOG ERRORS INTO err$_sales_order_line ('INSERT') REJECT LIMIT UNLIMITED;

This gives us the following result:

ORA ERR NUMBERSORA ERR MESG$ORA ERR ROWIDSORA ERR OPTYPSORA ERR TAGSSALES ORDER NUMBERPRODUCT NAMEQUANTITY1ORA-00001: Unique Constraint (SYSTEM.order nr: prod name not unique) verletzt|INSERT100Bolt1001ORA-00001: Unique Constraint (SYSTEM.order nr: prod name not unique) verletzt|INSERT100Bolt202291ORA-02291: Integritats-Constraint (SYSTEM.sales ord line not valid order) verletzt - ubergeordneter Schlussel nicht gefunden|INSERT200Bolt12291ORA-02291: Integritats-Constraint (SYSTEM.sales ord line not valid prod) verletzt - ubergeordneter Schlussel nicht gefunden|INSERT100screw4

A Solution

There are ways of avoiding this problem in SQL Server. Some workarounds are:

Using Stored procedures or Table-valued functions. Inserts and updates will be done by the application using a stored procedure. This would allow us to do whatever validation and checking is appropriate before the insertion is done.
Doing inserts a row at a time in a try-catch block This solution has been described here on Simple Talk in Handling Constraint Violations and Errors in SQL Server
The use of an INSTEAD OF trigger would allow us to find a way of executing this INSERT statement so that we get a report of all the constraint violations before we actually apply it.

The first two solutions will work well if you are in control of the import process. However, this isn’t always the case. By using the INSTEAD OF trigger, users can use their existing ways of inserting, deleting or updating data. In addition we can get much closer to the behaviour of the elegant Oracle solution. Clearly this cannot be achieved in a standard trigger on the table, because the database update has already taken place before the trigger has run and, if any constraints are violated, then the trigger code will never be executed.

Here is the INSTEAD OF insert trigger to report all the constraint violations for the primary key and the two foreign key constraints in a global temporary table.

CREATE TRIGGER dbo_sales_order_line_constraint_check_insert ON dbo.sales_order_line INSTEAD OF INSERT

DECLARE @error bit = 0

IF object_id('tempdb..##err_dbo_sales_order_line') IS NULL

BEGIN

CREATE TABLE ##err_dbo_sales_order_line(

sales_order_number int NOT NULL

,product_name varchar(100) NOT NULL

,quantity int NULL

,spid integer NOT NULL

,constraint_name sysname NOT NULL)

END

DELETE ##err_dbo_sales_order_line WHERE spid = @@spid

INSERT INTO ##err_dbo_sales_order_line(sales_order_number,product_name,quantity, spid, constraint_name)

SELECT sales_order_number,product_name,quantity,@@spid,'order number and product name must be unique' FROM inserted WHERE EXISTS

(

SELECT * FROM dbo.sales_order_line WHERE inserted.product_name = dbo.sales_order_line.product_name AND inserted.sales_order_number = dbo.sales_order_line.sales_order_number

)

OR EXISTS (SELECT * FROM inserted sub WHERE inserted.sales_order_number = sub.sales_order_number

AND inserted.product_name = sub.product_name

AND (inserted.quantity != sub.quantity

OR (inserted.quantity IS NULL AND sub.quantity IS NOT NULL) OR (inserted.quantity IS NOT NULL and sub.quantity IS NULL)))

IF @@rowcount > 0 SET @error = 1

INSERT INTO ##err_dbo_sales_order_line(sales_order_number,product_name,quantity, spid, constraint_name)

SELECT sales_order_number,product_name,quantity,@@spid,'sales order line must belong to valid order' FROM inserted

WHERE NOT EXISTS (SELECT * FROM dbo.sales_order WHERE

dbo.sales_order.sales_order_number = inserted.sales_order_number)

IF @@rowcount > 0 SET @error = 1

INSERT INTO ##err_dbo_sales_order_line(sales_order_number,product_name,quantity, spid, constraint_name)

SELECT sales_order_number,product_name,quantity,@@spid,'sales order line must be for valid product' FROM inserted

WHERE NOT EXISTS (SELECT * FROM dbo.product WHERE

dbo.product.product_name = inserted.product_name)

IF @@rowcount > 0 SET @error = 1

IF @error = 0

BEGIN

INSERT INTO dbo.sales_order_line

SELECT DISTINCT * FROM inserted

END

ELSE

BEGIN

RAISERROR('Constraint violation(s) - SELECT * FROM ##err_dbo_sales_order_line WHERE spid = @@spid for details', 16, 1)

END

We now do the insert again:

insert into sales_order_Line

(

sales_order_number,

product_name,

quantity

)

values (100,'Bolt', 100),

(100, 'Bolt', 20),

(200, 'Bolt', 1),

(100, 'screw', 4)

We now get this result:

Msg 50000, Level 16, State 1, Procedure dbo_sales_order_line_constraint_check_insert, Line 43

Constraint violation(s) - SELECT * FROM ##err_dbo_sales_order_line WHERE spid = @@spid for details

Msg 3609, Level 16, State 1, Line 1

The transaction ended in the trigger. The batch has been aborted.

We can now execute the query from the RAISERROR message:

1	SELECT * FROM ##err_dbo_sales_order_line WHERE spid = @@spid

Which now gives us all the rows that violated constraints and the corresponding constraint name.

sales_order_number product_name quantity spid constraint_name

100 Bolt 100 53 order number and product name must be unique

100 Bolt 20 53 order number and product name must be unique

200 Bolt 1 53 sales order line must belong to valid order

100 screw 4 53 sales order line must be for valid product

I have deliberately used temporary tables here as the we are probably only concerned with the results during the current session and we can save ourselves a lot of housekeeping by not holding the data permanently. Of course triggers for update and delete would also be required.

Now I can think of a number of objections to this approach.

Maintenance First of all, maintaining these triggers looks like a lot of work. We can ease the burden by generating the triggers automatically. The trigger code above was generated by a query (you can find the code for the query at the end of the article). Furthermore, the trigger could be regenerated in a DDL trigger whenever an ALTER TABLE is issued against the table. This way the trigger will always be in step with the constraints defined in the database.
Scalability. If we do a mass update and every row violates every constraint on the table then we could be faced with a massive number of output rows from the constraint reporting. It would be useful to have some way we could switch off or limit the number of rows returned. We could hard code a limit into the triggers but this seems a little arbitrary and may lead to misleading results.

Another refinement would be to have the INSTEAD OF triggers on views rather than on the tables themselves. If the view was updated then all the constraint violations would be reported, a direct update on the table would only report the first constraint on the first row that violated that constraint. Of course this would mean maintaining a whole set of views with the same structure as the underlying tables, but here again DDL triggers could come to our aid to modify the views automatically whenever the table structure changes (we would need a completely consistent naming convention for the views to achieve this).

Conclusion

As we have seen, different SQL-DBMS products can vary considerably in how they handle particular issues. Though it can be hard enough to keep up to date on the details of our “home” product, it can be very interesting to have a heads up from time to time and look at how the other half lives. Sometimes the grass on the other side of the fence is greener, but sometimes the concrete is grayer.

I think we can be fairly hopeful that Microsoft will eventually bring a similar kind of constraint reporting into the core product, thus matching the Oracle functionality, but until that time I hope that this article has provided some ideas about how a workaround could be developed.

References

[1] Oracle: Triggers Considered Harmful, Considered Harmful

Acknowledgments

Though I have been thinking about this problem for some time, I was prompted into action by the following article on Fabian Pascal’s dbdebunk website :
Weak Entities, Referential Constraints & Normalization

Appendix: Code to Generate Insert Trigger

100

101

102

103

104

105

106

107

108

109

110

111

112

113

114

115

116

117

118

119

120

121

122

123

124

125

126

127

128

129

130

131

132

133

134

135

136

137

138

139

140

141

142

143

144

145

146

147

148

149

150

151

152

153

154

155

156

157

158

159

160

161

162

163

164

165

166

167

168

169

170

171

172

173

174

175

176

177

178

179

180

181

182

183

184

185

186

187

188

189

190

191

192

193

194

195

196

197

198

199

200

201

202

203

204

205

206

207

208

209

210

211

212

213

214

215

216

217

218

219

220

221

222

223

224

225

226

227

228

229

230

231

232

233

234

235

236

237

238

239

240

241

242

243

244

245

246

247

248

249

250

251

252

253

254

255

256

257

258

259

260

261

262

ALTER VIEW generate_insert_constraint_reporting_trigger

SELECT

table_schema,

table_name,

/********************************************************

* Generate the create or alter trigger statement

********************************************************/

CASE WHEN EXISTS

(

SELECT *

FROM sys.triggers

WHERE name = tables.table_schema + '_' + tables.table_name

+ '_constraint_check_insert'

) THEN

'ALTER'

ELSE

'CREATE'

END +

' TRIGGER ' + tables.table_schema + '_' + tables.table_name

+ '_constraint_check_insert ON ' +

tables.table_schema + '.' + tables.table_name +

' INSTEAD OF INSERT ' + CHAR(10) +

'AS' + CHAR(10) +

'DECLARE @error bit = 0' + CHAR(10) +

/***************************************************

* Create global temporary table for results

***************************************************/

'IF object_id(''tempdb..##err_' + tables.table_schema + '_'

+ tables.table_name + ''') IS NULL' + CHAR(10) +

'BEGIN' + CHAR(10) +

CHAR(9) + 'CREATE TABLE ##err_' + tables.table_schema + '_'

+ tables.table_name + '(' + CHAR(10) + CHAR(9) +

(

STUFF(

(

SELECT

CHAR(10) + CHAR(9) + ',' + column_name + char(9) + data_type +

CASE WHEN data_type in ('varchar', 'nvarchar', 'char', 'nchar', 'binary', 'varbinary')

THEN

CASE WHEN CHARACTER_MAXIMUM_LENGTH = -1 THEN

'(max)'

ELSE

'(' + CAST(CHARACTER_MAXIMUM_LENGTH as varchar) + ')'

END

WHEN data_type in ('numeric', 'decimal') THEN

'(' + CAST(numeric_precision as varchar) + ','

+ CAST(numeric_scale as varchar) + ')'

WHEN data_type = 'float' THEN

'(' + CAST(numeric_precision as varchar) + ')'

ELSE

END + CASE WHEN IS_NULLABLE = 'YES' THEN

' NULL '

ELSE

' NOT NULL'

END

FROM

INFORMATION_SCHEMA.columns

WHERE

columns.TABLE_NAME = tables.table_name

FOR XML PATH(''),TYPE).value('.','VARCHAR(MAX)'

), 1, 3, '')

) + CHAR(10) + CHAR(9) + ',spid integer NOT NULL' +

CHAR(10) + CHAR(9) + ',constraint_name sysname NOT NULL' +

')' + CHAR(10) +

'END' + CHAR(10) +

'DELETE ##err_' + tables.table_schema + '_' + tables.table_name

+ ' WHERE spid = @@spid' + CHAR(10) +

/***************************************************

* Generate the primary key constraint check

***************************************************/

(SELECT

'INSERT INTO ##err_' + tables.table_schema + '_' + tables.table_name + '(' +

STUFF

(

SELECT

',' + column_name

FROM

INFORMATION_SCHEMA.columns

WHERE

table_schema = tables.table_schema

AND table_name = tables.table_name

FOR XML PATH(''),TYPE).value('.','VARCHAR(MAX)'

1, 1, ''

) + ', spid, constraint_name' +

')' + CHAR(10) +

'SELECT ' + STUFF

(

SELECT

',' + column_name

FROM

INFORMATION_SCHEMA.columns

WHERE

table_schema = tables.table_schema

AND table_name = tables.table_name

FOR XML PATH(''),TYPE).value('.','VARCHAR(MAX)'

1, 1, ''

) + ',@@spid,''' + constraint_name + ''' FROM inserted WHERE EXISTS

(

SELECT * FROM ' + tables.table_schema + '.'

+ tables.table_name +

' WHERE ' +

STUFF((

SELECT

' AND inserted.' + column_name

+ ' = ' + tables.table_schema + '.'

+ tables.table_name + '.' + column_name

FROM

INFORMATION_SCHEMA.key_column_usage

WHERE

table_constraints.constraint_name =

key_column_usage.constraint_name

AND table_constraints.table_name = tables.table_name

AND table_constraints.table_schema = tables.table_schema

FOR XML PATH(''),TYPE).value('.','VARCHAR(MAX)')

,1,4,'') + CHAR(10) + ')' + CHAR(10) +

/**********************************************************

* There may be duplicates not just in the existing data

* but in the inserted data too.

**********************************************************/

' OR EXISTS (SELECT * FROM inserted sub WHERE' +

STUFF((

SELECT

CASE WHEN KEY_COLUMN_USAGE.column_name IS NULL

AND columns.IS_NULLABLE = 'YES' THEN

' AND (inserted.' + columns.column_name +

' != sub.' + columns.column_name + CHAR(10) +

' OR (inserted.' + columns.column_name +

' IS NULL AND sub.' + columns.column_name + ' IS NOT NULL)' +

' OR (inserted.' + columns.column_name +

' IS NOT NULL and sub.' + columns.column_name + ' IS NULL))'

WHEN KEY_COLUMN_USAGE.column_name IS NULL

AND columns.IS_NULLABLE = 'NO' THEN

' AND inserted.' + columns.column_name

+ ' != sub.' + columns.column_name + CHAR(10)

ELSE

' AND inserted.' + columns.column_name + ' = sub.'

+ columns.column_name + CHAR(10)

END

FROM

INFORMATION_SCHEMA.columns

LEFT JOIN

INFORMATION_SCHEMA.key_column_usage

columns.column_name = key_column_usage.COLUMN_NAME

AND table_constraints.constraint_name = key_column_usage.constraint_name

WHERE

table_constraints.table_name = tables.table_name

AND table_constraints.table_schema = tables.table_schema

AND columns.table_schema = tables.table_schema

AND columns.table_name = tables.table_name

FOR XML PATH(''),TYPE).value('.','VARCHAR(MAX)')

,1,4,'') + ')' + CHAR(10) +

'IF @@rowcount > 0 SET @error = 1' + CHAR(10)

FROM INFORMATION_SCHEMA.table_constraints

WHERE

tables.TABLE_SCHEMA = table_constraints.table_schema

AND tables.table_name = table_constraints.table_name

AND table_constraints.constraint_type = 'primary key'

) + CHAR(10) +

/******************************************************

* Generate the foreign key constraint check

******************************************************/

COALESCE((

SELECT

'INSERT INTO ##err_' + tables.table_schema + '_' + tables.table_name + '(' +

STUFF

(

SELECT

',' + column_name

FROM

INFORMATION_SCHEMA.columns

WHERE

table_schema = tables.table_schema

AND table_name = tables.table_name

FOR XML PATH(''),TYPE).value('.','VARCHAR(MAX)'

1, 1, ''

) + ', spid, constraint_name' +

')' +

CHAR(10) + 'SELECT ' + STUFF

(

SELECT

',' + column_name

FROM

INFORMATION_SCHEMA.columns

WHERE

table_schema = tables.table_schema

AND table_name = tables.table_name

FOR XML PATH(''),TYPE).value('.','VARCHAR(MAX)'

1, 1, ''

) + ',@@spid,''' + name + ''' FROM inserted ' + CHAR(10) +

' WHERE NOT EXISTS (SELECT * FROM ' +

object_schema_name(referenced_object_id) + '.'

+ object_name(referenced_object_id) + ' WHERE ' + CHAR(10) +

STUFF((

SELECT

' AND ' + object_schema_name(referenced_object_id) + '.'

+ object_name(referenced_object_id) + '.' +

col_name(referenced_object_id, referenced_column_id)

+ ' = ' + 'inserted.' +

col_name(parent_object_id, parent_column_id)

FROM

sys.foreign_key_columns

WHERE

foreign_keys.object_id = foreign_key_columns.constraint_object_id

FOR XML PATH(''),TYPE).value('.','VARCHAR(MAX)'

),1,4,'') + ')' + CHAR(10) +

'IF @@rowcount > 0 SET @error = 1' + CHAR(10)

FROM

sys.foreign_keys

WHERE

OBJECT_SCHEMA_NAME(parent_object_id) = tables.table_schema

AND object_name(parent_object_id) = tables.table_name

FOR XML PATH(''),TYPE).value('.','VARCHAR(MAX)'

), '') + CHAR(10) +

/**********************************************

* If not constraint violations found then

* do the insert, otherwise raise an error

**********************************************/

'IF @error = 0 ' + CHAR(10) +

'BEGIN' + CHAR(10) +

CHAR(9) + 'INSERT INTO ' + tables.table_schema + '.' + tables.table_name + CHAR(10) +

CHAR(9) + 'SELECT DISTINCT * FROM inserted' + CHAR(10) +

'END' + CHAR(10) +

'ELSE' + CHAR(10) +

'BEGIN' + CHAR(10) +

CHAR(9) +

'RAISERROR(''Constraint violation(s) - SELECT * FROM ##err_'

+ tables.table_schema + '_' + tables.table_name

+ ' WHERE spid = @@spid for details'', 16, 1)' + CHAR(10) +

CHAR(9) + 'ROLLBACK' + CHAR(10) +

'END' as trigger_sql

FROM

information_schema.tables

Here is a test example:

select trigger_sql

from generate_insert_constraint_reporting_trigger

where table_name = 'sales_order_line'

Register for Simple Talk

Set-based Constraint Violation Reporting in SQL Server

Introduction

The Problem

A Solution

Conclusion

References

Acknowledgments

Appendix: Code to Generate Insert Trigger

Article tags

About the author

William Sisson

William Sisson's contributions

Articles

Books

Top topics

William Sisson's latest contributions:

Where in the Application Should Data Validation be Done?

Introduction

The Problem

A Solution

Conclusion

References

Acknowledgments

Appendix: Code to Generate Insert Trigger

Article tags

Recommended

About the author

William Sisson

William Sisson's contributions

Articles

Books

Top topics

William Sisson's latest contributions:

Where in the Application Should Data Validation be Done?