Data Conversion in SQL Server

Most of the time, you do not have to worry about implicit conversion in SQL expressions, or when assigning a value to a column. Just occasionally, though, you'll find that data gets truncated, queries run slowly, or comparisons just seem plain wrong. Robert explains why you sometimes need to be very careful if you mix data types when manipulating values.

When working with data in a SQL Server database, you might find that some of that data needs to be converted from one data type to another. For example, suppose you need to compare two numbers, one configured with a numeric data type and one configured with a string data type. To perform the comparison, one of those values will have to be converted to the other’s data type.

SQL Server supports two types of data type conversion: implicit and explicit. Implicit means that the database engine will convert the data type automatically, a process invisible to the user. Explicit means that you must specify how the data type should be converted. If you don’t specify how SQL Server should convert the data types to do what you want (explicitly), it will try to guess your intentions (implicitly).

In this article, I introduce you to both types of data type conversions and provide examples that demonstrate how they work, with a few warnings about the sort of thing that can go wrong with implicit conversions. The examples are written against the AdventureWorks2008 sample database on a local instance of SQL Server 2008. A couple of examples use SQL Server 2008 syntax but you should have no problem running the others against the AdventureWorks database on an instance of SQL Server 2005.

NOTE: Data type conversion can also occur when an outside application retrieves data from a SQL Server database. However, this article is concerned only with data conversion between SQL Server objects.

Implicit Data Conversions

When data needs to be converted from one data type to another, SQL Server can often convert the value automatically (implicitly). For example, suppose you want to add two integers together and they’re configured with different data types, as in the following example:

In this example, I’ve configured the @Val1 variable with the CHAR data type and assigned a value of 2. I’ve configured the @Val2 variable with the INT data type and assigned a value of 3. When I add these values together, SQL Server automatically converts the @Val1 variable to an INT data type and produces a numeric sum, which I confirm by using the ISNUMERIC function. If the function returns a 0, the sum is not numeric. If the function returns a 1, it is numeric. The following table shows the query results, which indicate that the sum is a numeric value:

Total

IsNumber

5

Yes

If I were to change the second line to…

…then I’d get the result…

Total

IsNumber

2 please

No

If you mix data types such as INT and VARCHAR, SQL Server will always attempt to convert everything to the one that has the highest precedence. This is why you will sometimes get errors if you try to use implicit conversions to add a number to a string.

SQL Server supports a significant number of implicit data conversions that let you easily compare, combine, or move data. Data can be converted within a particular data type grouping, such as one type of string value to another (for instance, NCHAR to NVARCHAR) or between type groupings, such as a numeric value to a string value (for instance, INT to VARCHAR).

For the most part, SQL Server handles implicit conversions seamlessly, as in the first example above. However, you should be aware whenever the database engine performs implicit conversions because some conversions can lead to unexpected results.

For example, when SQL Server converts a string value of one size to a data type of a smaller size, the database engine truncates the value. In the following set of statements, I declare VARCHAR and CHAR variables and then assign values to them:

As you can see, I’ve configured @Var3 with the VARCHAR(4) data type and @Var4 with the CHAR(2) data type. After I assign the string abcd to @Val3, I then set the value of @Val4 to equal @Val3. Because @Val3 is made up of four characters and @Val4 can handle only two characters, the string is truncated, as the following results show:

Value3

Value4

abcd

ab

If you assign a VARCHAR value to a table that has insufficient space to contain it, using an INSERT or UPDATE statement, you would get a ‘truncation’ error. However, if you assign to a local variable or the parameter to a routine, you don’t: the value just gets truncated.

Another example of behavior to be aware of when implicit conversions are taking place is when you convert a number from a DECIMAL data type to an INT, as I do in the following example:

Notice that the @Val5 variable has been assigned a value of 12345.67. I then assign that variable to @Var6. However, because @Var6 is configured with an INT data type, it can handle only whole numbers, so the numbers after the decimal are truncated, as you can see in the following results:

Value5

Value6

12345.67

12345

The dangers of any conversion can also be graphically illustrated by this code.

Which gives the surprising result:

Equality Test

Val3 as an int

Val3

Yes! These numbers are equal

1000000000

1E+09

From what I’ve told you, you’ll see that the REAL value has insufficient precision to hold the INT, and the original value has been rounded down.

You should also be aware of the impact that implicit conversions can have on a query’s performance. To demonstrate what I mean, I’ve created and populated the following table in the AdventureWorks2008 database:

As you can see, the table includes a primary key configured with the NVARCHAR data type. Because the ProductID column is the primary key, it will automatically be configured with a clustered index. Next, I set the statistics IO to on so I can view information about disk activity:

Then I run the following SELECT statement to retrieve product information for product 350:

Because statistics IO is turned on, my results include the following information:

Table ‘ProductInfo’. Scan count 1, logical reads 6, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Two important items to notice is that the query performed a scan and that it took six logical reads to retrieve the data. Because my WHERE clause specified a value in the primary key column as part of the search condition, I would have expected an index seek to be performed, rather than I scan. To learn more about the scan, I viewed the query’s execution plan, which is shown in Figure 1.

1211-Sheldon1.jpg

Figure 1: Execution Plan Showing an Index Scan

As the figure confirms, the database engine performed a scan, rather than a seek. Figure 2 shows the details of that scan (accessed by hovering the mouse over the scan icon).

1211-Sheldon2.jpg

Figure 2: Detail of the Index Scan Operation

Notice that in the Predicate section, the CONVERT_IMPLICIT function is being used to convert the values in the ProductID column in order to compare them to the value of 350 (represented by @1) I passed into the WHERE clause. The reason that the data is being implicitly converted is because I passed the 350 in as an integer value, not a string value, so SQL Server is converting all the ProductID values to integers in order to perform the comparisons.

Because there are relatively few rows in the ProductInfo table, performance is not much of a consideration in this instance. But if your table contains millions of rows, you’re talking about a serious hit on performance. The way to get around this, of course, is to pass in the 350 argument as a string, as I’ve done in the following example:

Once again, the statement returns the product information and the statistics IO data, as shown in the following results:

Notice that no scans have been performed this time and only two logical reads are indicated. Again, if this were a table with millions of rows, this would represent a significant improvement in performance because the query would be performing an index seek, as confirmed in the execution plan (shown in Figure 3).

1211-Sheldon3.jpg

Figure 3: Execution Plan Showing an Index Seek

Now the index is being properly used to locate the record. And if you refer to Figure 4, you’ll see that the values in the ProductID column are no longer being implicitly converted before being compared to the 350 specified in the search condition.

1211-Sheldon4.jpg

Figure 4: Detail of the Index Seek Operation

As this example demonstrates, you need to be aware of how performance can be affected by implicit conversions, just like you need to be aware of any types of implicit conversions being conducted by the database engine. For that reason, you’ll often want to explicitly convert your data so you can control the impact of that conversion.

Explicit Data Conversions

Explicit conversions let you exercise more control over your data type conversions whenever you compare, combine, or move data from one database object to another. To support explicit conversions, SQL Server provides two important functions: CAST and CONVERT. The functions are similar in their ability to convert data. However, the CAST function conforms to ISO specifications, which makes it more portable. The CONVERT function, on the other hand, is specific to SQL Server, but it supports additional functionality that lets you better control the format of some types of data.

Working with the CAST Function

The CAST function lets you convert an expression, such as a value retrieved from a column or variable, to a different data type. The function takes the following syntax:

As the syntax indicates, you must first specify the CAST keyword and then the necessary arguments, enclosed in the parentheses. The first argument is the expression that you want to convert. This is followed by the AS keyword and then by the data type that you want to convert the expression to. When applicable, you can also specify a length for your data type.

Let’s look at an example that demonstrates how this works. In the following SELECT statement, I retrieve data from the Production.Product table in the AdventureWorks2008 database:

In this statement, I use the CAST function to explicitly convert the value in the Name column. The column is configured with the VARCHAR data type, and I’m converting the value to the BINARY(30) data type. The following table shows the results returned by the query:

ProductName

BinaryName

HL Road Frame – Red, 52

0x48004C00200052006F006100640020004600720061006D00650020002D00

As you can see, SQL Server has converted the data to a binary value. One issue worth noting, however, is what happens to a value when you convert it to a binary data type of a different length than the original. For example, if you convert a string value to a binary value that is larger or smaller, SQL Server pads or truncates the data on the right side of the value. For other data types, SQL Server pads or truncates the value on the left.

Low let’s look at converting data to a string value. In the next example, I convert a UNIQUEIDENTIFIER value and DATETIME value to VARCHAR data:

In the first CAST function, I include the length (36) with the VARCHAR data type to accommodate the size of the UNIQUEIDENTIFIER column. However, in the second instance of CAST, I do not include the length. It’s not necessary in this case because SQL Server automatically returns the DATETIME data in the predefined format, as shown in the following query results:

ProductName

RowGuid

ModifiedDate

HL Road Frame – Red, 52

FCFEA68F-310E-4E6E-9F99-BB17D011EBAE

Mar 11 2004 10:01AM

Notice that the UNIQUEIDENTIFIER value from the rowguid column is returned as it is stored in the source column. However, because this column is configured with the UNIQUEIDENTIFIER data type, if I had specified a length shorter than 36, the SELECT statement would have generated an error.

In the next example, I convert the value in the ListPrice column, configured with the MONEY data type, to several different data types:

The original ListPrice value for this product is 1431.50. In the first instance of the CAST function, I convert the ListPrice value to VARCHAR. In the second instance, I convert the value to DECIMAL. And in the third, I convert the value to INT. The following table shows the results returned by the SELECT statement:

ProductName

StringPrice

DecimalPrice

IntPrice

HL Road Frame – Red, 52

1431.50

1431.5000

1432

As you can see, the conversion to VARCHAR is fairly straightforward; the original value is returned as a string. The DECIMAL value is a little different. Because I define the scale as 4, SQL Server adds two zeros to the amount to conform to the specified scale. However, when I convert the value to INT, the database engine rounds the value to an integer, in this case 1432.

As this example demonstrates, when you convert numeric data, your values can sometimes be changed. For example, converting DECIMAL data to FLOAT or REAL can result in the loss of precision.

In the following SELECT statement, I use the CAST function to convert a SMALLINT value to VARCHAR (the ReorderPoint column), a BIT value to TINYINT (the MakeFlag column), and DECIMAL value to FLOAT (the Weight column):

The following table shows the results returned by the SELECT statement:

ProductName

ReorderPoint

MakeFlag

ProductWeight

HL Road Frame – Red, 52

375

1

2.2

As you’d expect, the VARCHAR conversion returns the original value as a string. In addition, the TINYINT conversion returns the value as an integer. However, because the MakeFlag column is a BIT column, the value returned will be only 0 or 1. (Any nonzero value is promoted to 1.) The FLOAT conversion returns a slightly different value than the original. The original value is 2.20, but the conversion to FLOAT drops the final 0.

Working with the CONVERT Function

Like the CAST function, the CONVERT function lets you convert an expression to a different data type. The function takes the following syntax:

The syntax elements in the CONVERT function are in a different order from the CAST function, but are essentially the same. You specify the data type, along with the length, if necessary, followed by the expression to be converted. (There is no AS keyword in CONVERT.) In addition, the CONVERT function supports one other argument that CAST does not-style. The style argument is an integer that represents a predefined format. The styles are specific to the data type that is being converted. You can find a list of supported styles for each data type in the topic “CAST and CONVERT (Transact-SQL)” in SQL Server Books Online.

NOTE: Not all data types support format styles, in which case, you might consider using the CAST function instead because it conforms to ISO specifications.

For example, when you convert a DATETIME value to VARCHAR, there are numerous styles that you can choose from. In the following SELECT statement, I convert the value from the SellStartDate column (configured with the DATETIME data type) to three different styles supported for the DATETIME data type:

By themselves, these style numbers mean nothing. It’s only when they’re passed as an argument into the CONVERT function along with a specific data type do they become meaningful. The following table shows the results returned by the SELECT statement:

ProductName

StartDate100

StartDate101

StartDate102

HL Road Frame – Red, 52

Jul  1 2001 12:00AM

07/01/2001

2001.07.01

As you can see, each style displays the date in a different format. There are about 21 styles to choose from, so you should usually be able to come up with a format that fits your needs. Here is a chart of the various formats.

Name

2-digit year

example

4-digit year

Example

Default

100

Jan  4 2011 11:27AM

USA

1

01/04/11

101

01/04/2011

ANSI

2

11.01.04

102

2011.01.04

British/French

3

04/01/11

103

04/01/2011

German

4

04.01.11

104

04.01.2011

Italian

5

04-01-11

105

04-01-2011

dd mon yy

6

04 Jan 11

106

04 Jan 2011

Mon dd, yy

7

Jan 04, 11

107

Jan 04, 2011

hh:mm:ss

8

11:27:09

108

11:27:09

Default + milliseconds

109

Jan  4 2011 11:27:09:907AM

USA

10

01-04-11

110

01-04-2011

JAPAN

11

11/01/04

111

2011/01/04

ISO

12

110104

112

20110104

Europe default(24h) + milliseconds

113

04 Jan 2011 11:27:09:907

hh:mi:ss:mmm (24h)

14

11:27:09:907

114

11:27:09:907

ODBC canonical (24h)

120

2011-01-04 11:27:09

ODBC canonical (24h)+ milliseconds

121

2011-01-04 11:27:09.907

ISO8601

126

2011-01-04T11:27:09.907

ISO8601 with time zone

127

2011-01-04T11:27:09.907

Hijri

130

29 Ùحر٠1432 11:27:09:907AM

Hijri

131

29/01/1432 11:27:09:907AM

A rather different danger of relying on implicit conversions for dates can be shown in this example.

The following results show how SQL Server converts these values:

Date_ID

TheDate

1

2011-01-01 00:00:00.000

2

2011-01-02 00:00:00.000

3

2011-01-03 00:00:00.000

4

2011-01-04 00:00:00.000

5

1900-01-01 00:00:00.000

You probably intended that last date to be a null, or unknown, date in the original table, but it has been translated to the start of the twentieth century.

A better approach would involve at least a check that the date was valid (and that the language setting was correct.)

The CONVERT function is also handy when you want to convert a MONEY value to a string value. In the following example, I use the CONVERT function to convert a variable value configured with the MONEY data type:

In this case, two of the supported arguments are 1 and 2. The SELECT statement returns the following results:

ProductCost

Cost1

Cost2

123456789.1234

123,456,789.13

123456789.1267

As you can see, style 1 adds commas to the amount but rounds the value to two decimal places. Style 2, on the other hand, does not add commas and rounds value to four decimal places.

A Good Start

The rules that govern data conversion in SQL Server-whether implicit or explicit-can be somewhat complicated. Not only are there issues of performance and truncation to take into consideration, but also the fact that not all data types can be converted from one to another. In addition, only some types can be implicitly converted by the database engine. The others must be explicitly converted.

At best, what I’ve provided here is an introduction to data type conversion. You should verify the implications of your conversions whenever you’re uncertain of the impact of what you’re doing. Two good topics worth referencing in SQL Server Books Online are “Data Type Conversion (Database Engine)” and “CAST and CONVERT (SQL Server).” Above all, you should be aware of all conversions that occur when moving, copying, or comparing data in SQL Server.