Operator Precedence versus the Confusing Constraint Translation

The other day, a person I work with was creating some fairly complex CHECK constraints to protect data in a new table that was being created (a practice I am 100% for: https://www.red-gate.com/simple-talk/blogs/eliminating-annoying-space-characters-via-check-constraints/. The multiple constraints were combined into just one, with several conditions condensed into a single Boolean expression, using several ANDs, ORs, and parentheticals. When the constraint was being reviewed, the scripted output of the constraint definition looked funny, as the parenthesis had all been removed.

You can see the same thing in the following table, which I will add a semi complex CHECK constraint, with similar use of parentheses, with a NOT thrown in to help the example be a bit more complex:

USE tempdb;

CREATE SCHEMA Demo;

CREATE TABLE Demo.ExampleConstraint

(

ExampleConstraintId int NOT NULL

CONSTRAINT PK_ExampleConstraint PRIMARY KEY,

Value1 int NULL,

Value2 int NULL

);

Then add this CHECK constraint:

ALTER TABLE Demo.ExampleConstraint

ADD CONSTRAINT CHKExampleConstraint

CHECK (

(Value1 IS NOT NULL AND Value2 IS NULL)

(Value1 IS NULL AND Value2 IS NOT NULL)

OR NOT Value1 = Value2 AND Value2 > 2 * Value1

);

Now, script out the constraint, or just use the definition you can get from the sys.check_constraint catalog view (something that would be really nice to have on the sys.foreign_keys table):

SELECT definition

FROM sys.check_constraints

WHERE OBJECT_SCHEMA_NAME(parent_object_id) = 'Demo'

AND OBJECT_NAME(parent_object_id) = 'ExampleConstraint';

You will see the code is not the same as when I coded it:

1 2	([Value1] IS NOT NULL AND [Value2] IS NULL OR [Value1] IS NULL AND [Value2] IS NOT NULL OR NOT [Value1]=[Value2] AND [Value2]>(2)*[Value1])

Looking as this, one thing stands out to the eye: the parenthesis have been removed from the comparisons, but added to the literal value of 2. (Also, everything is surrounded by square brackets, but as awful as that looks, it is a common thing to see in code that is generated/parsed, such as this or a query plan.) When the person who wrote the similar code saw this, they were concerned it was wrong, then the next person who did a code review was also concerned it was wrong, and when I saw it, I was sure it was right, but only because I had a bit more faith that if this were a problem with SQL Server’s toolset, I would probably have read about it!

Personally, I had to really look at the code, test out a similar expression in a CASE expression, and read this web page to remind me why it was actually right: Microsoft Docs: Operator Precedence. I mean, who writes code like this?

It all has to do with operator precedence, the operators in the expression are processed in the following order of:

1. Comparison Operators come first. So the IS NOT NULL, =, and > will be evaluated. So if Value1 = 2, and Value2 = NULL:

([Value1] IS NOT NULL AND [Value2] IS NULL OR [Value1] IS NULL AND [Value2] IS NOT NULL OR NOT [Value1]=[Value2] AND [Value2]>(2)*[Value1])

\ / \ / \ / \ / \ / \ /

( TRUE AND TRUE OR FALSE AND FALSE OR NOT UNKNOWN AND UNKNOWN )

Each of the comparisons returns a Boolean value, or UNKNOWN, which means a NULL was compared with something, not using a NULL safe comparison, like IS NOT NULL.

Now, here is where you instinctively start to think like simple arithmetic operations. 1 + 2 – 3 equals 0, no matter if you do 1 + 2 first, or 2 – 3 first. But <this> AND <that> OR <the other thing> is not the same if you do the AND or the OR comparison first, for sure.

2. NOT

We have one NOT operator, and NOT (UNKNOWN) still equals UNKNOWN (one of my favorite problems to talk about, not actually deal with in code!)

( TRUE AND TRUE OR FALSE AND FALSE OR NOT UNKNOWN AND UNKNOWN )

\ /

( TRUE AND TRUE OR FALSE AND FALSE OR UNKNOWN AND UNKNOWN )

3. AND

( TRUE AND TRUE OR FALSE AND FALSE OR UNKNOWN AND UNKNOWN )

\ / \ / \ /

( TRUE OR FALSE OR UNKNOWN )

4. OR

Which you can see is going to be false:

1	( TRUE OR FALSE OR UNKNNOWN )

So the values provided will obviously work, since TRUE OR anything will be TRUE. And of course, as long as one value is NULL and the other is NOT NULL, then the first two conditions will be met, and the last will not matter.

If both numbers are NOT NULL, then the first two conditions will FALSE, and the OR NOT Value1=Value2 AND Value2 > 2 * Value1 would actually matter. So if value2 is more than 2 times the size of value1, then the data would pass the CHECK constraint’s conditions.

Looking at this, you may think this is a good enough constraint, but it isn’t if you care about the case where both columns are NULL, then the output will be UNKNOWN OR UNKNOWN OR UNKNOWN. A constraint fails on a FALSE return value only. So:

1 2	INSERT Demo.ExampleConstraint(ExampleConstraintId, Value1, Value2) VALUES(1, NULL, NULL);

This means that the first two checks for the column values to have one NULL and one NOT NULL values is ineffective since both values being NULL will also succeed. If you want to make sure that exactly either 1 of the two columns is not null, you would need something like

--These parens ARE required in the parsed expression

(<original expression> )

AND (Value1 IS NOT NULL OR Value2 IS NOT NULL)

But the constraint will actually matter if both values are NOT NULL:

1 2	INSERT Demo.ExampleConstraint(ExampleConstraintId, Value1, Value2) VALUES(2, 1, 1);

Will not, because NOT Value1=Value2 AND Value2 > 2 * Value1 requires that Value1<>Value2

So while no one I know would have written the code this way, strictly because it is hard to read (minimally, formatting the code somewhat would give visual clues that the writer knew that AND has higher precedence than OR:

1 2	([Value1] IS NOT NULL AND [Value2] IS NULL OR [Value1] IS NULL AND [Value2] IS NOT NULL OR NOT [Value1]=[Value2] AND [Value2]>(2)*[Value1])

It is exactly the same Boolean expression as:

(

(Value1 IS NOT NULL AND Value2 IS NULL)

(Value1 IS NULL AND Value2 IS NOT NULL)

OR NOT Value1 = Value2 AND Value2 > 2 * Value1

)

In reality, had I been writing this expression, I would have been really likely to have put more parentheses in the expression, such as:

(

(Value1 IS NOT NULL AND Value2 IS NULL)

(Value1 IS NULL AND Value2 IS NOT NULL)

(NOT(Value1 = Value2) AND Value2 > 2 * Value1)

)

Just for clarity of reading the expression. Either way it was a good reminder of operator precedence and how SQL Server applies it to the saving and scripting of constraints.

Note:
In the expression: NOT [Value1]=[Value2] AND [Value2]>(2)*[Value1]; it probably seems like the Value1=Value2 is unnecessary for the [Value2]>(2)*[Value1] part, if value1 = -1 and value2 = -1, then the -1 > 2 * -1 will be true.

Register for Simple Talk

Operator Precedence versus the Confusing Constraint Translation

Article tags

About the author

Louis Davidson

Louis's contributions

Articles

Books

Top topics

Louis's latest contributions:

The good, the bad, and the awful taste muscle memory can have

Sponsoring Scenic City Summit

What about Loyalty?

Article tags

Recommended

About the author

Louis Davidson

Louis's contributions

Articles

Books

Top topics

Louis's latest contributions:

The good, the bad, and the awful taste muscle memory can have

Sponsoring Scenic City Summit

What about Loyalty?