Questions about T-SQL Expressions You Were Too Shy to Ask

Nobody seems to ask questions about SQL Expressions in forums, even though expressions can cause all sorts of problems. Even the books on T-SQL skate over them in haste to get to more complicated topics. It is time for frank, straight-forward Q&A, and who better than Robert Sheldon to give the A?

The Questions

  1. What exactly is a T-SQL expression
  2. I want to display outputted numbers as currency, with commas and dollar signs. Is that possible in T-SQL?
  3. I see the plus sign (+) used in a variety of ways in T-SQL script, but I can’t always make sense of how SQL Server arrives at the results it does. Is there a “right way” to use plus signs?
  4. The NULLIF expression makes no sense. I would think you’d use it to somehow ferret out a NULL value or to test a condition for a NULL value. But it appears to merely return a NULL if two values are equal. Is that all it’s doing?
  5. Can you use expressions in data manipulation language (DML) statements?
  6. The CASE expression allows me to leave out the ELSE clause. Is that important to include it?
  7. Can you shed light on the order in which the database engine processes the components in an expression?
  8. I get confused on how to treat constants in my expressions. Is there a trick in using them?
  9. Are there any advantages to using a COALESCE expression rather than a CASE expression when checking for the first non-NULL value in a list of values?
  10. I’m working on a query that uses the NOT IN operator in the WHERE clause. The operator checks values in a list returned by a subquery. When the returned list includes a NULL value, the query returns an empty resultset, even though I know I should be seeing results. Any idea what might be happening?
  11. What the heck is a ‘modulo’?
  12. Are all WHEN and ELSE clauses in a CASE expression evaluated if the first WHEN clause evaluates to TRUE?
  13. What’s the difference between the two ‘not equal to’ operators (<> and !=)?
  14. I’ve come across operators such as += in variable assignment SET statements. What do they mean?

What exactly is a T-SQL expression?”

An expression is a block of code in a T-SQL statement that is processed as a unit in order to return a scalar (single) data value. Each expression is made up of one or more types of components, including constants, columns, variables, operators, scalar functions, and scalar subqueries. SQL Server supports the use of expressions within different types of T-SQL statements and within various parts of those statements. For example, you can define expressions in the SELECT, WHERE, GROUP BY, and ORDER BY clauses of a SELECT statement.

The following T-SQL code contains a variety of expressions throughout the SET and SELECT statements:

The first expression shows up in the SET statement, after the equal sign. It is simply the GETDATE() function. An expression is often made up of only a single element, such as a column, constant, or in this case, function.

Next we move to the SELECT statement’s select list. The first expression we find defines the ProdName column. The expression concatenates the Name column with the last four digits of the ProductNumber column. The expression includes two concatenation operators (+), one literal string value (:), and the SUBSTRING scalar function, which extracts the last four digits from the ProductName column.

The next expression in the select list is the ReorderPoint column. The expression and name of the column are one and the same. This is followed by an expression that defines the NewPoint column. The expression multiplies the ReorderPoint value by 1.1, uses the ROUND function to round the result to a whole number, and then uses the CONVERT() function to convert the value to an integer.

Next we have an expression that defines the PointDiff column. In this case, the expression merely subtracts the ReorderPoint value from the SafetyStockLevel value, using the subtract operator to calculate the difference.

Our next expression defines the ProdColor column. Here we use a CASE expression to provide a default value should a NULL be returned.

The final expression in the select list defines the MonthDiff column. The expression uses the DATEDIFF function to calculate the difference, in months, between the SellStartDate value and the @date variable value.

The next expression is in the search condition of the WHERE clause. The expression compares the ProductID value to the range of values defined by the BETWEEN operator, the AND operator, and the literal values 316 and 320. You can use various operators to link expression elements together in this way, creating one large expression that is synthesized into a single unit that returns a scalar value. Note that the search condition can also be considered to be three expressions linked together by the BETWEEN and the AND operators: one column and two numerical constants.

Because of the expressions specified throughout the T-SQL code, our SELECT statement returns the following results:

ProdName

ReorderPoint

NewPoint

PointDiff

ProdColor

MonthDiff

Blade:2036

600

660

200

N/A

145

LL Crankarm:5965

375

413

125

Black

145

ML Crankarm:6738

375

413

125

Black

145

HL Crankarm:7457

375

413

125

Black

145

Chainring Bolts:2903

750

825

250

Silver

145

 Clearly, expressions play an important role in a T-SQL statement. They can be as simple the ones shown here or far more complex.

What we’ve not covered, however, is the FROM clause. In this case, the FROM clause includes only the schema and table name. In some cases, the FROM clause might include a table expression, such as a simple table-valued variable or an operand in an APPLY clause. However, those types of expressions are usually specifically referred to as table expressions. When the term expression is used by itself, it’s normally referring to a scalar expression, such as those highlighted in the example above.

To know when and where you can use expressions, you should refer to the statement syntax in SQL Server Books Online. Not only will this show you where you can create expressions, but also whether there are any limitations on the expressions you create.

“I want to display outputted numbers as currency, with commas and dollar signs. Is that possible in T-SQL?”

Yes, it’s possible, but before we go into how that is done, be aware that such operations are generally best done by the application at the presentation level, rather than at the database level. Still, there are times when you are required to convert data into its representational form. So let’s look at how to format your currency.

Prior to SQL Server 2012, to get data into a currency format (at least for currencies with a structure similar to the US dollar), you could cast the value to the MONEY type, if necessary, and then use the CONVERT function to convert the value to a character type, as shown in the following example:

Each expression in the SELECT list references one of the variables as its source data and is made up of multiple components. What provides the final format, however, is the CONVERT function. The function requires that the source data be either the MONEY or SMALLMONEY type. When you call the function, you specify a second argument along with the source value. That argument determines the format of the data. In this case, we specify 1 so the outputted data includes commas every three digits to the left of the decimal point. We also concatenate the value with a dollar sign to give it that final touch, as shown in the following results:

amount1

amount2

amount3

$555,555,555.56

$555,555,555.56

$555,555,555.56

 Overall, this isn’t too painful a process, unless you’re working with currencies that follow a different format, such as the Swiss franc, which takes apostrophes rather than commas. In such cases, you might have to add yet another function, such as REPLACE, to substitute the commas with apostrophes.

Fortunately, since the release of SQL Server 2012, you’ve been able to instead use the FORMAT function to output your numbers to a currency format. The FORMAT function returns a value in a specified format, based on a specified culture. To demonstrate how this works, let’s first recast our preceding expressions into ones that use the FORMAT function to configure the data as US dollar amounts:

Notice that we must first cast the character data (@var1) to the MONEY data type, but we can use the DECIMAL variable (@var2) as is. The second FORMAT argument (c) specifies that we’re formatting currency, and the third argument specifies the US symbol (en-us) for the cultural context. The SELECT statement returns the same results as the previous SELECT statement, with all values returned as the NVARCHAR type.

However, with the FORMAT function, we can specify other cultures. For example, the following example uses the German symbol (de-de) to provide the cultural context:

Now our results are specific to how the euro is represented in Germany, as shown in the following table:

amount1

amount2

amount3

555.555.555,56 â¬

555.555.555,56 â¬

555.555.555,56 â¬

 Certainly, the FORMAT function is a handy tool if you want your results to be displayed in a specific currency. But just to reiterate, such formatting is generally handled by the calling application, with the source data retrieved in its raw state.

“I see the plus sign (+) used in a variety of ways in T-SQL script, but I can’t always make sense of how SQL Server arrives at the results it does. Is there a “right way” to use plus signs?”

SQL Server is a tricky devil. As you’ve discovered, plus signs have all sorts of meaning in T-SQL. You can use them to concatenate string values, add numerical values together, or indicate that a numerical value is a positive number, rather than negative. The following SELECT statement demonstrates how to both concatenate and add values:

For the FullName column, we’ve used the plus sign to concatenate the FirstName and LastName columns with a space in between. For the TwoYearSales column, we’ve used the plus sign to add the SalesLastYear and SalesYTD columns together. In both cases, the database engine knows when to concatenate the values or when to add them together, as shown in the following results:

FullName

SalesLastYear

SalesYTD

TwoYearSales

Michael Blythe

$1,750,406.48

$3,763,178.18

$5,513,584.66

Linda Mitchell

$1,439,156.03

$4,251,368.55

$5,690,524.58

Jillian Carson

$1,997,186.20

$3,189,418.37

$5,186,604.57

Garrett Vargas

$1,620,276.90

$1,453,719.47

$3,073,996.36

Tsvi Reiter

$1,849,640.94

$2,315,185.61

$4,164,826.55

Pamela Ansman-Wolfe

$1,927,059.18

$1,352,577.13

$3,279,636.31

Shu Ito

$2,073,506.00

$2,458,535.62

$4,532,041.62

José Saraiva

$2,038,234.65

$2,604,540.72

$4,642,775.37

David Campbell

$1,371,635.32

$1,573,012.94

$2,944,648.25

Tete Mensa-Annan

$0.00

$1,576,562.20

$1,576,562.20

Lynn Tsoflias

$2,278,548.98

$1,421,810.92

$3,700,359.90

Rachel Valdez

$1,307,949.79

$1,827,066.71

$3,135,016.50

Jae Pak

$1,635,823.40

$4,116,871.23

$5,752,694.62

Ranjit Varkey Chudukatil

$2,396,539.76

$3,121,616.32

$5,518,156.08

There’s no real mystery here. When we use the plus sign with only character data, the database engine concatenates the values. In this case, the plus sign is considered a string concatenation operator. When we use the plus sign with numerical values, the database engine adds the values together. Under these circumstances, the plus sign is considered an addition operator.

Not surprisingly, when we mix and match the types of data, our results are less predictable. Take a look at the following example:

As you can see, we’ve declared seven variables, each defined with a different type. We then use the plus sign to combine the variables in different ways, giving us the following results:

var1var2

var3var6

var4var5

var7var3

var6var7

TestVarCX

1299.99

1001.0001

2015-05-21 00:00:00.000

2017-04-19 23:45:36.000

As we would expect, when we pair @var1 with @var2, both character types, we end up with a concatenated string. When we pair @var3 to @var6, both numerical types, the values are added together, even though @var3 and @var6 are different types.

Now we get to our third pairing: @var4 + @var5. This time around, we’re adding a DECIMAL value to BIT value. However, the database engine still treats both variables as numerical values and adds them together, even though SQL Server documentation suggests that you can’t do this with the BIT type. (And really, why would you?)

The next item in the select list, @var7 + @var3, pairs a DATETIME value with a SMALLINT value. This time, we get a date that is 300 days later than the original date. When your expression includes the plus sign, the database engine defaults to the data type with the highest precedence, in this case, DATETIME. Based on the type, the engine determines whether to use the plus sign to concatenate values or add them together. If the data type with the highest precedence is a numeric data type, the engine attempts to add the values. In this case, the engine adds the SMALLINT value to the DATETIME value as the specified number of days. (A DATETIME value is actually stored as two integers.)

Finally, we pair a MONEY date type with the DATETIME date type (@var6 + @var7). Once again, DATETIME takes precedence over MONEY, so the database engine adds 999.99 days to our original date to come up with a date in 2017.

We can verify how data type precedence works with plus signs by using the SQL_VARIANT_PROPERTY function to retrieve the data type of the expressions that pair the different variables:

The SELECT statement returns the returned data type for each expression, as shown in the following table:

var1var2

var3var6

var4var5

var7var3

var6var7

nvarchar

money

decimal

datetime

datetime

 In each case, the expression returns the type with the highest precedence and determines how the plus sign will be used.

So far in our examples, the database engine has been able to add or concatenate values without a problem because the engine could implicitly convert the data type with the lowest precedence to the type with the highest. But this isn’t always the case:

This time around, we’re using the plus sign to pair a NVARCHAR value with a SMALLINT value. Because SMALLINT has precedence over NVARCHAR, the database engine attempts to convert the value TestVar to the SMALLINT type, which of course is not possible. Consequently, the database engine returns the following error message:

The only way we can pair a numerical value with a string value is to concatenate them. In order to do so, we must explicitly convert the SMALLINT value to a character data type:

Now our SELECT statement returns the value TestVar300.

There’s one other use of the plus sign that you should be aware of-as a unary plus operator. SQL Server supports two types of unary operators, plus and minus, which are used to designate whether a numeric value is positive or negative:

All we’ve done here is to designate our two literal values as positive and negative numbers. The SELECT statement returns the following results:

var1

var2

123.45

-123.45

 Be aware, however, that you cannot use the positive unary operator to convert a negative number to a positive number. For that, you need to use the ABS mathematical function.

“The NULLIF expression makes no sense. I would think you’d use it to somehow ferret out a NULL value or to test a condition for a NULL value. But it appears to merely return a NULL if two values are equal. Is that all it’s doing?”

That’s part of what it’s doing. When a NULLIF expression compares two values, it returns the first value if the two values are not equal. If they are equal, the expression returns a NULL value of the same data type as the first specified value. Here’s what a NULLIF expression looks like in action:

In this case, the NULLIF expression returns valueA because the two values are not equal. If they were equal, the returned value would be NULL. In addition, the returned value would be NULL if the first value were NULL.

When you know how a NULLIF expression works, it doesn’t seem so bad, although its use still has a tendency to cause confusion. Luckily, we can achieve the same results by using a CASE statement:

The advantage of the CASE statement is that it’s a lot easier to understand. Sure, NULLIF provides simpler syntax, but if it creates confusion, what’s the point?

“Can you use expressions in data manipulation language (DML) statements?”

You can use expressions wherever T-SQL syntax permits them, including DML statements. Take a look at part of the syntax for the UPDATE statement’s SET clause:

 According to SQL Server Books Online, the expression placeholder can be a “variable, literal value, expression, or a subselect statement (enclosed with parentheses) that returns a single value.” This pretty much defines the term expression as we understand it. Note, however, that any expression you define must conform to the rules associated with the particular T-SQL statement in which the expression is defined. For example, an UPDATE statement’s UPDATE clause can include the TOP clause, as shown in the following syntax:

 In this case, expression “specifies the number or percent of rows that will be updated.” In other words, you cannot specify a non-numerical string value as your expression, as you can with other types of expressions. That said, it’s still considered an expression.

So, yes, you can use an expression in a DML statement. In fact, they’re used all the time. Let’s look at an example. Suppose we create the following table in our database and insert a row of data:

Now let’s use an UPDATE statement to modify that row of data:

Our SET clause is updating two columns. Each SET definition includes an expression that either adds the @AmtReceived value to the target column or subtracts that value. If we run the SELECT statement after the table has been updated, we come up with the following results:

ProductID

InStock

OnOrder

1

67

3

If you’re writing T-SQL statements, you’re writing expressions, even if you don’t realize it. That’s why it’s important to refer back to the statement’s syntax whenever you’re uncertain how to define a particular element. Expressions are supported throughout most statements. The syntax for the UPDATE statement, for example, includes the expression placeholder in 10 different places, which suggests that many of us might not be using some statements to their fullest capacity.

 “The CASE expression allows me to leave out the ELSE clause. Is it that important to include it?”

Yes, but…. The ELSE clause specifies the value to return if no other comparison operations (WHEN...THEN) evaluate to true. If you omit the ELSE clause, the database engine instead returns a NULL value. A NULL value might be fine, if that’s what you want. But what if it’s not? Take a look at the following example, which uses a CASE expression to determine how to return an employee’s full name:

If the MiddleName column is NULL, that column is not included in the expression. Only the first and last names are concatenated. If the MiddleName value is an initial, a period is added to the initial and concatenated with the first and last names. Otherwise, all three names are concatenated, as shown in the following results:

EmployeeID

FullName

Territory

275

Michael G. Blythe

Northeast

277

Jillian Carson

Central

279

Tsvi Michael Reiter

Southeast

Now let’s modify the CASE expression by removing the ELSE clause:

This time a NULL is returned if the MiddleName column has a value that is more than an initial, as the following table shows:

EmployeeID

FullName

Territory

275

Michael G. Blythe

Northeast

277

Jillian Carson

Central

279

NULL

Southeast

Clearly, you would not want to delete the ELSE clause in this instance. You could, however, add a WHEN...THEN condition that replaces the ELSE clause, in which case, you’re still returning NULL of none of the conditions are met, which might be fine under some circumstances. Consider the following example:

If the SalesQuota value is great than 250,000, a true is returned. If the value is less than or equal to 250,000, we get a false. However, if the column is NULL, neither condition can be true, so the CASE expression returns a NULL, as shown in the following results:

FullName

HigherQuota

Stephen Jiang

NULL

Michael Blythe

true

Linda Mitchell

false

Jillian Carson

false

Tsvi Reiter

true

Pamela Ansman-Wolfe

false

Shu Ito

false

David Campbell

false

Tete Mensa-Annan

true

Syed Abbas

NULL

Amy Alberts

NULL

You can choose to include an ELSE clause and then provide a default value, such as N/A (or even NULL), whether or not you think it’s necessary. In fact, many developers think you should never omit the ELSE clause, even if you believe it is impossible for one of the WHEN...THEN conditions not to be met. Someone might make a change to the database that affects the statement or your thinking about possible outcomes might not be correct.

“Can you shed light on the order in which the database engine processes the components in an expression?”

The way in which an expression’s elements are processed depends on several factors: their listed order, whether any subset of elements is enclosed in parentheses, and what operators are used to connect the various elements. Parentheses group elements together to ensure that those elements are processed as a unit before being incorporated into the rest of the expression. In addition, the expression as a whole adheres to the concept of operator precedence, in which the operator types determine the order in which the expression is evaluated. For example, the multiply and division arithmetic operators take precedence over the greater than (>) and lesser than (<) comparison operations, which take precedence over logical operators such as AND or ANY. (You can find details about operator precedence in the MSDN topic “Operator Precedence.”)

Let’s start with a few simple arithmetic-based expressions to give you a sense of how this works. The following example declares several INT variables, assigns an expression to each one, and returns the values from all three:

All we’re doing in each expression is multiplying and adding and subtracting values. However, because we use parentheses in the second and third expressions, our results come up quite different:

num1

num2

num3

4

14

0

The @num1 expression is fairly straightforward. We multiple 3 by 2, add 5, and subtract 7, which gives us a total of 4.

However, in the @num2 expression, we enclose the 2 + 5 elements in parentheses, so they’re processed first, giving us a total of 7. The 7 is then multiplied by the first 3, giving us 21. From there, we subtract 7, giving us a total of 14.

In the @num3 expression, the parentheses enclose the 2 + 5 - 7 elements, which are processed before the other elements. Because this returns 0, 3 is multiplied by 0, giving us a final result of 0.

Now let’s look at another example, which again declares three variables and applies different expressions to them:

Here’s what are results look like:

SalesDiff1

SalesDiff2

6500

500

 The first expression includes no parentheses. However, the multiply operator takes precedence over the addition and subtraction operators, so the multiplication is performed first. That means we first multiply the SalesQuota value by 2 and the SalesYTD value by 2, giving us the expression 6000 - 2500 + 3000, which equals 6500.

The second expression does use parentheses. The first set of parentheses returns a value of 6000. The second set of parentheses actually contains a third set, and those are the elements evaluated first, before the outer set of elements. After they’re evaluated, that part of the expression enclosed in the second set of parentheses becomes @SalesLastYear + 3000, giving us a total of 5500. That 5500 is then subtracted from the 6000, which is how we end up with 500.

Now let’s look at an expression that incorporates character data as well as numerical data. In the following SELECT statement, our WHERE clause includes both the OR and the AND logical operators:

The WHERE clause search condition contains three predicates separated by the AND operator and the OR operator. In its current state, the search condition states that the ListPrice value must be greater than 2.5 times the StandardCost value and the ProductLine value must equal r or the ProductLine value must equal t. According to the rules of operator precedence, the database engine first evaluates the AND logical operator and then the OR logical operator. Because of the placement of the operators, the first two predicates must both evaluate to true or the third predicate must evaluate to true. In other words, all returned rows must meet the first two conditions or meet the third condition. As a result, the statement returns 59 rows that qualify. The following table shows part of the results:

ProductName

StandardCost

ListPrice

ProductLine

Touring Front Wheel

96.7964

218.01

T

Touring Rear Wheel

108.7844

245.01

T

Touring-Panniers, Large

51.5625

125.00

T

Road Bottle Cage

3.3623

8.99

R

Racing Socks, M

3.3623

8.99

R

Racing Socks, L

3.3623

8.99

R

HL Touring Frame – Yellow, 60

601.7437

1003.91

T

LL Touring Frame – Yellow, 62

199.8519

333.42

T

HL Touring Frame – Yellow, 46

601.7437

1003.91

T

HL Touring Frame – Yellow, 50

601.7437

1003.91

T

HL Touring Frame – Yellow, 54

601.7437

1003.91

T

HL Touring Frame – Blue, 46

601.7437

1003.91

T

HL Touring Frame – Blue, 50

601.7437

1003.91

T

HL Touring Frame – Blue, 54

601.7437

1003.91

T

Suppose what we were really after was to return rows that met the first condition and either the second or third condition. In this case, we would have to modify our search condition slightly:

Notice that the second and third predicates are now enclosed in parentheses. Consequently, the database engine will first evaluate them as a unit. That means for all rows returned, ListPrice must be greater than 2.5 times the value of StandardCost and ProductLine must equal r or t. The following table shows all the results now returned by the query.

ProductName

StandardCost

ListPrice

ProductLine

Road Bottle Cage

3.3623

8.99

R

Racing Socks, M

3.3623

8.99

R

Racing Socks, L

3.3623

8.99

R

Road Tire Tube

1.4923

3.99

R

Touring Tire Tube

1.8663

4.99

T

LL Road Tire

8.0373

21.49

R

ML Road Tire

9.3463

24.99

R

HL Road Tire

12.1924

32.60

R

Touring Tire

10.8423

28.99

T

 Notice that we now have only nine rows and that each row meets both conditions for the ListPrice and ProductLine columns. By applying the parentheses, we have better controlled how the rules of operator precedence are applied.

“I get confused on how to treat constants in my expressions. Is there a trick in using them?”

A constant is a literal value that is either part of an expression or is the entire expression. The way in which you treat constants depends on the type of data. As a general rule, you enclose character data in single quotes, but leave numerical data without quotes, although it’s not quite as straightforward as this. Before we go into the oddities, though, let’s look at a few examples of character constants. The following T-SQL declares several character variables and then retrieves their values:

By default, when you’re working with character string constants, you enclose them in singe quotes. If you’re working with Unicode strings, you precede the opening quote with an uppercase N. Anything enclosed in the quotes is considered part of that string, even if the characters are numbers, operators, or other types of symbols, as shown in the following results:

var1

var2

var3

var4

some value

some value + 1

12345

 In SQL Server, we can instead enclose character strings in double quotes by setting the QUOTED_IDENTIFIER option to OFF, as shown in the following example:

When the QUOTED_IDENTIFIER option is set to OFF, we can use double quotes just like single quotes, as is the case with the @var1 declaration. The advantage of using double quotes is that you can pass in a special character, such as an apostrophe, without having to escape it by adding a second apostrophe, as we had to do for @var3. The SELECT statement now returns the results shown in the following table:

var1

var2

Var3

its value

it’s a value

it’s a value

Be aware, however, that the QUOTED_IDENTIFIER option has other implications besides being able to use double quotes for string constants. For example, the option affects the way identifiers are handled in T-SQL. In addition, the SQL Server Native Client Provider and ODBC driver set this option to ON, which can impact your code if you had set the option to OFF. For these reasons you’re usually better off leaving the option set to ON.

As previously mentioned, you typically don’t enclose numerical constants in quotes, even if they include currency symbols or scientific notation. For example, the following T-SQL declares four types of numerical variables, sets their values, and retrieves those values:

If we had enclosed the constants in quotes, the database engine would have interpreted the values as character data and have tried to convert it to the type of the applicable variable. For the first three variables, an implicit conversion would have worked fine, but it would mean unnecessary work on the part of the database engine. For the last variable, the engine would have returned the following error:

When we specify a numerical constant without quotes, the database engine assumes a numerical type consistent with the value. For example, the value 147 is assumed to be an INT, so no implicit conversion is necessary. The following table shows the results returned by the SELECT statement.

var1

var2

var3

var4

147

10.5800

68.24

87.9

Even with numeric constants, an implicit conversion might be called for. In the following SELECT statement, the constant 1000 is compared the value in the ListPrice column, which is configured with the MONEY data type:

The 1000 is treated as an INT value, so it must be implicitly converted to the MONEY type. But that’s not all what’s going on in the WHERE clause. Notice that it also includes a string constant (blue). The constant is compared to the Color column, which is configured with the NVARCHAR data type. This means that the database engine must convert the VARCHAR string to the NVARCHAR type, which gives us the following results:

ProductName

ListPrice

HL Touring Frame – Blue, 46

1003.91

HL Touring Frame – Blue, 50

1003.91

HL Touring Frame – Blue, 54

1003.91

HL Touring Frame – Blue, 60

1003.91

Touring-2000 Blue, 60

1214.85

Touring-1000 Blue, 46

2384.07

Touring-1000 Blue, 50

2384.07

Touring-1000 Blue, 54

2384.07

Touring-1000 Blue, 60

2384.07

Touring-2000 Blue, 46

1214.85

Touring-2000 Blue, 50

1214.85

Touring-2000 Blue, 54

1214.85

Like string constants, date and time values are also enclosed in single quotes, even though the values are stored as integers, as you can see in the following example:

The date and time data types will accept various formats when passing in a constant, as long as the value is enclosed in quotes. The following table shows the results returned by the SELECT statement:

var1

var2

var3

var4

2014-08-01 00:00:00.000

2014-07-28 16:24:27.293

2014-07-30

14:05:55.0010000

 You should also be aware of how to treat constants when working with some of the other data types. For example, a VARBINARY value takes an 0x prefix but is not enclosed in quotes. The same goes for a BIT value. However, a UNIQUEIDENTIFIER value is enclosed in quotes, as shown in the following T-SQL:

 Our SELECT statement now returns the following results:

var1

var2

var3

0x14AD

1

CC421A37-E462-4AE0-8451-38F837FC5A1A

 Again, we generally enclose string values in quotes and leave the quotes off for numerical values, but as you’ve seen, some areas can be somewhat gray. If you get confused, check out the MSDN topic “Constants.”

“Are there any advantages to using a COALESCE expression rather than a CASE expression when checking for the first non-NULL value in a list of values?”

A COALESCE expression is a syntactic shortcut for a CASE expression. That means, when the query optimizer gets ahold of a COALESCE expression, it rewrites it as a CASE one, so in that sense, they’re one in the same. Let’s look at a couple of examples. In the first one, we use COALESCE to return the first column that is not a NULL value:

For this particular product, the Size column and SizeUnitMeasureCode column each contains a NULL value, but the WeightUnit column contains an actual value, G, so the COALESCE expression returns that value, as shown in the following results:

ProductName

Size

SizeUnit

WeightUnit

FirstNotNull

HL Road Rear Wheel

NULL

NULL

 Now let’s look at what happens when we change the COALESCE expression to a CASE expression:

The SELECT statement returns the same results as the previous example. That said, the COALESCE expression is a lot simpler, although the logic of the CASE statement is easier to understand. In some cases, you might find that it’s better to use CASE so your intent is clear to other developers.

One other consideration with COALESCE. Developers sometimes run into problems with COALESCE expressions because of data type issues. The value returned by COALESCE is based on the data type of the input value with the highest precedence, which might or might not be the returned value. If an implicit conversion is required, and such a conversion is not permitted, COALESCE will return an error. See the article “Questions about SQL Server Data Types You were Too Shy to Ask” for more details about this particular issue.

All that said, figuring out the logic behind COALESCE is not too difficult, so choosing between CASE and COALESCE often comes down to nothing more than personal preference.

“I’m working on a query that uses the NOT IN operator in the WHERE clause. The operator checks values in a list returned by a subquery. When the returned list includes a NULL value, the query returns an empty resultset, even though I know I should be seeing results. Any idea what might be happening?”

Using NOT IN can be tricky, especially when NULL values are involved. Case in point. The following SELECT statement includes a WHERE clause that checks for specific colors:

Basically, what we’re doing is returning all products that are saleable items (FinishedGoodsFlag = 1) and whose color is not one of the colors of the non-saleable products. Unfortunately, when we run this query we receive an empty resultset.

If you’re familiar with the data, you might be surprised by these results. After all, the table contains 295 rows of saleable products. Either there are no colors unique to saleable items, or something is wrong.

We can investigate the problem by first running the subquery in the example above separate from the outer query:

The following table shows the subquery’s results, which includes three distinct items:

Color

NULL

Black

Silver

As you can see, we have black and silver non-saleable items, along with those items for which no color has been assigned (the NULL value). Now let’s modify the subquery to instead retrieve the colors of the orderable products:

This time around, we again receive the same three values, along with another seven colors, as shown in the following table:

Color

NULL

Black

Blue

Grey

Multi

Red

Silver

Silver/Black

White

Yellow

Clearly something is wrong. Rows that contain those seven additional colors should have been included in our original query results. The problem, it turns out, is the NULL value. Let’s recast the logic of the WHERE clause into several separate predicates, connected by the AND logical operator:

Essentially, our NOT IN operator indicates that the value in the Color column should not be Black and should not be Silver and should not be NULL, in addition to being a saleable product. The problem is with the equation Color <> NULL. For a row to be returned, all conditions must evaluate to true; however, a NULL cannot evaluate to true or false, which means the condition as a whole will never evaluate to true and no rows will ever be returned. If we were to run this statement, we would once again receive an empty resultset.

One way around this predicament is to add a second predicate to the WHERE clause in our subquery:

Now our subquery returns only the values Black and Silver, which means our outer query should return all rows except those whose Color value is one of the two colors. The following table shows part of the results, although the query actually returns 120 rows:

ProductName

Color

HL Road Frame – Red, 58

Red

Sport-100 Helmet, Red

Red

Mountain Bike Socks, M

White

Mountain Bike Socks, L

White

Sport-100 Helmet, Blue

Blue

AWC Logo Cap

Multi

Long-Sleeve Logo Jersey, S

Multi

Long-Sleeve Logo Jersey, M

Multi

Long-Sleeve Logo Jersey, L

Multi

Long-Sleeve Logo Jersey, XL

Multi

HL Road Frame – Red, 62

Red

HL Road Frame – Red, 44

Red

HL Road Frame – Red, 48

Red

If you were to run this query, you would see the Color values black and silver are not included. However, what’s odd about this is that rows with NULL for a color are not included either, even though NULL is no longer being returned by the subquery. This might be fine in some cases, but in others you might want to see those values. Before we start looking for a workaround, however, another consideration is performance. Many SQL Server folks recommend using an EXISTS operator rather than an IN operator anyway. So let’s rewrite the last statement using NOT EXISTS:

Our WHERE clause now specifies that we include only saleable products, that they be a known color, and that the color not exist within the list of non-orderable colors. As a result, the SELECT statement returns the same results as the preceding one, 120 rows. And if we want to include the saleable products with a Color value of NULL, we can simply remove the IS NOT NULL expression:

Our statement now returns 170 rows, which include those rows with a Color value of NULL. In theory, this statement should perform better than the ones using NOT NULL, but you might want to test that part out yourself to be sure. Just know that using NOT IN with NULL values can return unexpected results, so be prepared.

 “What the heck is a ‘modulo’?”

Remember when you first learned long division? You had the dividend, the divisor, and the remainder. The dividend is the number you start with. The divisor is the number you divide into the dividend. The remainder is what’s left. Modulo is a mathematical operator that returns the remainder.

Here’s how it works. In the follow example, @var1 serves as our dividend, and @var2 serves as our divisor:

The first expression in our SELECT clause uses the division operator (/) to divide @var1 by @var2. The second expression uses the modulo operator (%) to calculate the remainder that’s produced when you divide these two numbers. The following table shows the results.

Division

Remainder

4

0

No surprise here. When you divide 32 by 8, you get 4, with a remainder of 0. But now let’s produce a remainder other than 0:

This time around, we simply set @var1 to equal 36, rather than 32. Now our results show a remainder of 4.

That’s all there is to using the modulo operator. And you can use it with other numeric types:

In this case, we’re dividing 55.566 by 5 to come up with the following results:

Division

Remainder

11.111200000000000

0.5560

As you can see, the expression that contains the modulo operator returns a value of 0.5560. As for the expression that contains the division operator, the operator’s returned value is based on the argument type with the highest precedence, which in this case is DECIMAL.

“Are all WHEN and ELSE clauses in a CASE expression evaluated if the first WHEN clause evaluates to TRUE?”

According to the SQL Server documentation, a CASE expression evaluates each condition sequentially and stops with the first condition that evaluates to true. In other words, the expression will short circuit when a condition is met, saving the database engine from having to perform unnecessary processing. This is a good thing. The less work the engine has to perform, the better.

To get a sense of how this works, let’s look at an example. The following SELECT statement includes a CASE expression that sets up three possible conditions when concatenating the name-related columns in the vSalesPerson view:

The first CASE condition checks whether the MiddleName value is NULL. If it is, the FirstName and LastName columns are concatenated and the expression stops running. If the column contains a value other than NULL, the expression moves on to the next condition, which checks the length of the MiddleName value. If it equals 1, the three columns are concatenated and a period is added to the middle initial. The expression will then stop at that point. Otherwise, the ELSE condition is applied and all three columns are concatenated as is. The database engine repeats the process for each row and returns the following results:

EmployeeID

FullName

Territory

275

Michael G. Blythe

Northeast

277

Jillian Carson

Central

279

Tsvi Michael Reiter

Southeast

The CASE expression’s ability to short circuit can help to avoid unnecessary processing when it can be substituted for other statement elements. For example, suppose we define the following SELECT statement:

Nothing flashy here. For each row, the database engine tests for the three specified conditions and determines whether that row is returned. Consequently, the resultset includes any saleable product that is either black or red, which means 127 rows evaluate to true. Part of those results are shown in the following table:

ProductName

Color

HL Road Frame – Black, 58

Black

HL Road Frame – Red, 58

Red

Sport-100 Helmet, Red

Red

Sport-100 Helmet, Black

Black

HL Road Frame – Red, 62

Red

HL Road Frame – Red, 44

Red

HL Road Frame – Red, 48

Red

HL Road Frame – Red, 52

Red

HL Road Frame – Red, 56

Red

LL Road Frame – Black, 58

Black

LL Road Frame – Black, 60

Black

LL Road Frame – Black, 62

Black

 Suppose we want to reduce some of the work that the database engine has to perform. We can instead use a CASE expression in the WHERE clause, as shown in the following SELECT statement:

If the Color value equals black, the condition evaluates to true and the expression stops running, thus avoiding any unnecessary processing. Yet the statement returns the same results as the preceding example.

Under the right circumstances, using a CASE expression in this way can be a handy way to reduce your workloads. However, a CASE expression might not always short circuit in the way you hope. For example, suppose your T-SQL includes logic is similar to that shown in the following example:

All we’re doing is declaring an INT variable, setting its value to 0, and using a CASE expression to retrieve a specific value. Based on our understanding of the CASE expression’s ability to short circuit, we would expect the expression’s first condition to evaluate to true and the expression to then stop. However, what we get from our SELECT statement is the following error:

The error shows that the ELSE condition did in fact run, even though it shouldn’t have. It turns out that under certain circumstances, a CASE expression doesn’t behave like it’s supposed to behave. The problem appears to be related to aggregate functions (at least in part), so be aware that in some cases a CASE expression might not deliver the benefits you’re hoping for and you might have to come up with a workaround to make the CASE expression work the way you want it to.

“What’s the difference between the two ‘not equal to’ operators (<> and !=)?”

In T-SQL, the two comparisons operators are syntactically equivalent. They each perform a Boolean comparison between two expressions. If those expressions return non-NULL, unequal values, the comparison evaluates to true, otherwise, it evaluates to false. However, if either value is NULL, the evaluation returns a NULL. Here’s a sample of the two operators in action:

 The variable is set to a value of 1, and in each SELECT statement, that variable is compared to the constant 2. As a result, each SELECT statement returns the value not equal because each comparison evaluates to true. Even in terms of performance, there doesn’t appear to be much difference between the two.

However, there is one difference you should be aware of. The <> operator conforms to ANSI standards; the != operator does not, despite that fact that a number of relational database systems support both operators. That said, even if there is only the slightest chance that you might port your script to another system, you’re better off sticking with the <> operator.

“I’ve come across operators such as += in variable assignment SET statements. What do they mean?”

The add-equals operator (+=) is a compound operator, one of several types of compound operators supported in SQL Server. A compound operator takes the original value and in some way amends it, rather than replacing it. For example, the following T-SQL uses the add-equals operator to update the value of the @var1 variable:

We start by declaring @var1 and setting its value to 2. We then use a SET statement to add a value of 1 to the original value. The SELECT statement returns the new value, which is 3.

The add-equals operator makes it possible to add a value without having to specify the original value. Without the add-equals operator we would have to rewrite our SET statement as follows:

The add-equals operator merely provides the shorthand necessary to avoid having to explicitly specify the original value, yet the SELECT statement still returns a value of 3. Plus, SQL Server supports other compound operators as well. For example, the following T-SQL subtracts 1 from the original value:

In this case, we’re using the subtract-equals (-=) operator in our SELECT statement, rather than the add-equals operator. The subtract-equals operator works the same way as the add-equals operator, except that the second value is subtracted rather than added. As a result, our SELECT statement now returns a value of 1.

Another example of a compound operator is the multiply-equals operator, which is shown in the following example:

In this case, the SET statement uses the multiply-equals operator to multiple the variable value 2 by the constant value 3, giving us a total of 6. And compound operators are not limited to mathematical expressions. For example, the following T-SQL includes a SET statement that uses the add-equals operator to concatenate values:

In this example, the SET statement adds the string value and then some (including the opening space) to the variable value some value, giving us the value some value and then some. There are other compound operators as well, such as divide-equals (/=) and modulo-equals (%=). Check out the MSDN topic “Compound Operators” for more information.