- “How do I use the PIVOT operator to rotate data in SQL Server?”
- “Can I specify column names different from the values in the pivoted column?”
- “I’m confused by how data is grouped when using the PIVOT operator. Can I control grouping?”
- “How do I avoid grouping on a column’s values when pivoting data so I can return a single-row result set that aggregates the entire dataset?”
- “SQL Server does not permit the use of column expressions in the PIVOT clause. Is there a way to get around this limitation?”
- “Can I group data by more than one column when I use the PIVOT operator? “
- “Is there a way to work around SQL Server’s restriction against using COUNT(*) in the PIVOT clause?”
- “Is it possible to pivot more than one column when pivoting data in SQL Server?”
- “I sometimes run queries against a SQL Server 2000 database, which doesn’t support the PIVOT operator. How do I pivot data without using PIVOT?”
- “Does the PIVOT operator let you perform multiple aggregations?”
- “How do I pivot data if I don’t know the values in the pivoted column?”
- “How do I unpivot data in SQL Server?”
“How do I use the PIVOT
operator to rotate data in SQL Server?”
You use the PIVOT
operator within your query’s FROM
clause to rotate and aggregate the values in a dataset. The data is pivoted based on one of the columns in the dataset. Each unique value in that column becomes its own column, which contains aggregated pivoted data.
To better understand how this all works, let’s start with the basic syntax for a query that uses the PIVOT
operator:
1 2 3 4 5 6 7 8 9 |
SELECT column_list FROM table_expression PIVOT ( aggregate_function(aggregate_column) FOR pivot_column IN( pivot_column_values ) ) [AS] pivot_table_alias [ORDER BY column_list]; |
For the SELECT
clause, you can specify an asterisk (*) or the individual columns, and for the FROM
clause, you can specify a table or table expression. If you use a table expression, then you must also define a table alias. You can also include an ORDER
BY
clause, but that’s optional. You’ll get to see these clauses in action as we progress through the questions. For now, let’s focus on the PIVOT
clause. You need to understand how this clause works to make sure your pivots work the way you want.
After you specify the PIVOT
keyword, you pass in what are essentially three arguments, enclosed in parentheses. The first is the aggregate function and the name of the column to be aggregated. You can use any aggregate function except the COUNT
function, when used with an asterisk, as in COUNT(*)
.
Next, you define the FOR
subclause, which specifies the column on which the pivot will be based. It is this column whose distinct values are turned into their own columns. The FOR
subclause also includes the IN
operator, which is where you specify the pivot column values that will be turned into columns. The values you specify here must exist in the pivot column or they will be ignored.
This will all make more sense when you see it in action. Let’s start with the following temporary table and data:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
CREATE TABLE #BookSales (BookType VARCHAR(20), SalesYear INT, BookSales MONEY); GO INSERT INTO #BookSales VALUES('Fiction', 2014, 11201); INSERT INTO #BookSales VALUES('Fiction', 2014, 12939); INSERT INTO #BookSales VALUES('Fiction', 2013, 10436); INSERT INTO #BookSales VALUES('Fiction', 2013, 9346); INSERT INTO #BookSales VALUES('Nonfiction', 2014, 7214); INSERT INTO #BookSales VALUES('Nonfiction', 2014, 5800); INSERT INTO #BookSales VALUES('Nonfiction', 2013, 8922); INSERT INTO #BookSales VALUES('Nonfiction', 2013, 7462); SELECT * FROM #BookSales; |
The SELECT
statement returns the following results, which include sales totals for the two book types over the course of two years:
BookType |
SalesYear |
BookSales |
Fiction |
2014 |
11201.00 |
Fiction |
2014 |
12939.00 |
Fiction |
2013 |
10436.00 |
Fiction |
2013 |
9346.00 |
Nonfiction |
2014 |
7214.00 |
Nonfiction |
2014 |
5800.00 |
Nonfiction |
2013 |
8922.00 |
Nonfiction |
2013 |
7462.00 |
Although this is a very simple dataset, it gives us what we need to demonstrate the PIVOT
operator. In the following SELECT
statement, I use the operator to turn the SalesYear
values into columns and to pivot and aggregate the sales totals:
1 2 3 4 5 |
SELECT * FROM #BookSales PIVOT(SUM(BookSales) FOR SalesYear IN([2013], [2014]) ) AS PivotSales; |
If you refer back to the syntax above, you can see that BookSales
is the aggregate column, SalesYear
is the pivot column, and the SalesYear
values, which are passed into the IN
operator, become the new columns. The following table shows the results returned by the SELECT
statement:
BookType |
2013 |
2014 |
Fiction |
19782.00 |
24140.00 |
Nonfiction |
16384.00 |
13014.00 |
An important point to note about the results is that the first column is BookType
, with a row for fiction sales and a row for nonfiction sales. The BookSales
values are aggregated according to year, with the data grouping based on the BookType
column. SQL Server automatically grouped the data by the values in this column, even though the column is not explicitly included in the PIVOT clause. (We’ll cover grouping more in depth in a separate question.)
In the preceding example, I used the SUM
function to aggregate the data, but we can use a different aggregate function. For example, the following PIVOT
clause uses the AVG
function:
1 2 3 4 5 |
SELECT * FROM #BookSales PIVOT(AVG(BookSales) FOR SalesYear IN([2013], [2014]) ) AS PivotSales; |
This time, the SELECT
statement returns sales averages, rather than sales totals, as shown in the following table:
BookType |
2013 |
2014 |
Fiction |
9891.00 |
12070.00 |
Nonfiction |
8192.00 |
6507.00 |
In the previous two examples, the IN
operator in the PIVOT
clause specified all the SalesYear
values, but you can specify whatever values you want to return, as long as they’re included in the pivoted column. For example, the IN
operator in the following PIVOT
clause includes only the 2014
value:
1 2 3 4 5 |
SELECT * FROM #BookSales PIVOT(SUM(BookSales) FOR SalesYear IN([2014]) ) AS PivotSales; |
As the following table shows, the results now include only two columns:
BookType |
2014 |
Fiction |
24140.00 |
Nonfiction |
13014.00 |
There is, of course, more to the PIVOT
operator than what’s shown here, but this should give you the basics. The key to understanding the operator is in knowing which column to aggregate, which column to pivot, and which one should be implicitly grouped.
“Can I specify column names different from the values in the pivoted column?”
Yes, you can, by using column aliases, but before we get into that, let’s return to the syntax I introduced in the previous question:
1 2 3 4 5 6 7 8 9 |
SELECT column_list FROM table_expression PIVOT ( aggregate_function(aggregate_column) FOR pivot_column IN( pivot_column_values ) ) [AS] pivot_table_alias [ORDER BY column_list]; |
Notice that I use the column_list
placeholder in the SELECT
clause. Before we dig any deeper into the select list, let’s again start with the #BookSales
temporary table and populate it:
1 2 3 4 5 6 7 8 9 10 11 12 |
CREATE TABLE #BookSales (BookType VARCHAR(20), SalesYear INT, BookSales MONEY); GO INSERT INTO #BookSales VALUES('Fiction', 2014, 11201); INSERT INTO #BookSales VALUES('Fiction', 2014, 12939); INSERT INTO #BookSales VALUES('Fiction', 2013, 10436); INSERT INTO #BookSales VALUES('Fiction', 2013, 9346); INSERT INTO #BookSales VALUES('Nonfiction', 2014, 7214); INSERT INTO #BookSales VALUES('Nonfiction', 2014, 5800); INSERT INTO #BookSales VALUES('Nonfiction', 2013, 8922); INSERT INTO #BookSales VALUES('Nonfiction', 2013, 7462); |
As you saw in the preceding examples, if we want to go with the default column names produced by the PIVOT
operator, we can use just an asterisk in the SELECT
clause:
1 2 3 4 5 |
SELECT * FROM #BookSales PIVOT(SUM(BookSales) FOR SalesYear IN([2013], [2014]) ) AS PivotSales; |
As expected, the query returns the data shown in the following table:
BookType |
2013 |
2014 |
Fiction |
19782.00 |
24140.00 |
Nonfiction |
16384.00 |
13014.00 |
Suppose we now want to better control how the columns are returned in the result set. Let’s revise our syntax a bit, with more detail in the SELECT
clause:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
SELECT group_column [ [AS] column_alias ], pivoted_column_1 [ [AS] column_alias ], pivoted_column_2 [ [AS] column_alias ], ... pivoted_column_last [ [AS] column_alias ] FROM table_expression PIVOT ( aggregate_function(aggregate_column) FOR pivot_column IN( pivot_column_values ) ) [AS] pivot_table_alias [ORDER BY column_list]; |
By default, the first column in the select list is the column on which the aggregated grouping is based. This is the column that you do not specify in the PIVOT
clause. The subsequent columns are based on the unique values in the pivoted column. If we were to repeat the previous example, but include the column names in the select list, our query would look something like the following:
1 2 3 4 5 |
SELECT BookType, 2013, 2014 FROM #BookSales PIVOT(SUM(BookSales) FOR SalesYear IN([2013], [2014]) ) AS PivotSales; |
All I’ve done here is specify the column names rather than use an asterisk. However, because we’re dealing with integers for some of the column names, the results we get are not what we might expect:
BookType |
(No column name) |
(No column name) |
Fiction |
2013 |
2014 |
Nonfiction |
2013 |
2014 |
In this case, the values 2013
and 2014
are treated as literals and returned as column values, rather than column names. We can easily remedy this by delimiting the column names:
1 2 3 4 5 |
SELECT BookType, [2013], [2014] FROM #BookSales PIVOT(SUM(BookSales) FOR SalesYear IN([2013], [2014]) ) AS PivotSales; |
Now we get the results we would expect:
BookType |
2013 |
2014 |
Fiction |
19782.00 |
24140.00 |
Nonfiction |
16384.00 |
13014.00 |
Of course, all we’ve achieved here is to return the same results we would have achieved if we had used an asterisk. However, we can use the select list to specify the column names in a different order than how the columns are listed in the IN
operator within the FOR
subclause:
1 2 3 4 5 |
SELECT BookType, [2014], [2013] FROM #BookSales PIVOT(SUM(BookSales) FOR SalesYear IN([2013], [2014]) ) AS PivotSales; |
Now the aggregated data is listed in an order different from the previous examples:
BookType |
2014 |
2013 |
Fiction |
24140.00 |
19782.00 |
Nonfiction |
13014.00 |
16384.00 |
We can in fact, even change the position of the group column, which in this case is BookType
:
1 2 3 4 5 |
SELECT [2013], BookType, [2014] FROM #BookSales PIVOT(SUM(BookSales) FOR SalesYear IN([2013], [2014]) ) AS PivotSales; |
In most cases, though, you’ll want to keep the group column in the first position, or you end up with results like those shown in the following table:
2013 |
BookType |
2014 |
19782.00 |
Fiction |
24140.00 |
16384.00 |
Nonfiction |
13014.00 |
Clearly, moving the group column around in this way can make the results less readable. Imagine what it would be like if you were dealing with 10 times the number of columns.
Another advantage of being able to specify the columns in the select list is that you can assign aliases to them. In the next example, I’ve assigned an alias to each of the three columns:
1 2 3 4 5 6 |
SELECT BookType AS BookCategory, [2013] AS Year2013, [2014] AS Year2013 FROM #BookSales PIVOT(SUM(BookSales) FOR SalesYear IN([2013], [2014]) ) AS PivotSales; |
As you’d expect, the column aliases are reflected in the query results:
BookCategory |
Year2013 |
Year2014 |
Fiction |
19782.00 |
24140.00 |
Nonfiction |
16384.00 |
13014.00 |
Whenever you find that the values in your pivot column do not make good column names for the pivoted data, you can easily assign aliases to them in this way to make your data more readable.
“I’m confused by how data is grouped when using the PIVOT operator. Can I control grouping?”
The PIVOT
operator is a bit odd in the fact that data is grouped implicitly based on the columns you do not specify in the PIVOT
operator. Let’s return to our example temporary table:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 |
CREATE TABLE #BookSales (BookType VARCHAR(20), SalesYear INT, BookSales MONEY); GO INSERT INTO #BookSales VALUES('Fiction', 2014, 11201); INSERT INTO #BookSales VALUES('Fiction', 2014, 12939); INSERT INTO #BookSales VALUES('Fiction', 2013, 10436); INSERT INTO #BookSales VALUES('Fiction', 2013, 9346); INSERT INTO #BookSales VALUES('Nonfiction', 2014, 7214); INSERT INTO #BookSales VALUES('Nonfiction', 2014, 5800); INSERT INTO #BookSales VALUES('Nonfiction', 2013, 8922); INSERT INTO #BookSales VALUES('Nonfiction', 2013, 7462); Now let's once again use the PIVOT to rotate the data based on the SalesYear column: SELECT * FROM #BookSales PIVOT(SUM(BookSales) FOR SalesYear IN([2013], [2014]) ) AS PivotSales; |
In this case, the source dataset is a simple three-column table. BookSales
serves as the aggregate column, and SalesYear
serves as the pivot column. That leaves BookType
, which automatically becomes the group column, as shown in the following results:
BookType |
2013 |
2014 |
Fiction |
19782.00 |
24140.00 |
Nonfiction |
16384.00 |
13014.00 |
In the real world, our source datasets are seldom that simple. Suppose, for example, our temp table included an additional column, a simple nullable BIT
column that indicates whether the totals have been verified, with 1
meaning YES
, 0
meaning NO
, and NULL
meaning we don’t know one way or the other:
1 2 3 4 5 6 7 8 9 10 11 12 13 |
CREATE TABLE #BookSales (BookType VARCHAR(20), SalesYear INT, BookSales MONEY, Verified BIT); GO INSERT INTO #BookSales VALUES('Fiction', 2014, 11201, 1); INSERT INTO #BookSales VALUES('Fiction', 2014, 12939, 1); INSERT INTO #BookSales VALUES('Fiction', 2013, 10436, NULL); INSERT INTO #BookSales VALUES('Fiction', 2013, 9346, 0); INSERT INTO #BookSales VALUES('Nonfiction', 2014, 7214, 0); INSERT INTO #BookSales VALUES('Nonfiction', 2014, 5800, 1); INSERT INTO #BookSales VALUES('Nonfiction', 2013, 8922, NULL); INSERT INTO #BookSales VALUES('Nonfiction', 2013, 7462, 0); |
Chances are, you might not actually make such a column nullable, preferring instead to default to 0
, but this is good enough for now to demonstrate a point, that is, grouping is based on all columns not specified in the PIVOT
clause. Let’s try it out. The following SELECT
statement is identical to the one in the preceding example:
1 2 3 4 5 |
SELECT * FROM #BookSales PIVOT(SUM(BookSales) FOR SalesYear IN([2013], [2014]) ) AS PivotSales; |
But now our results are quite different, skewing them in a direction we likely do not want to go:
BookType |
Verified |
2013 |
2014 |
Fiction |
NULL |
10436.00 |
NULL |
Nonfiction |
NULL |
8922.00 |
NULL |
Fiction |
0 |
9346.00 |
NULL |
Nonfiction |
0 |
7462.00 |
7214.00 |
Fiction |
1 |
NULL |
24140.00 |
Nonfiction |
1 |
NULL |
5800.00 |
The database engine has grouped the data by both the BookType
and Verified
columns, rather than just BookType
. To get around this, we need to specify a table expression in our FROM
clause that retrieves only the relevant columns:
1 2 3 4 5 6 7 |
SELECT * FROM (SELECT BookType, SalesYear, BookSales FROM #BookSales) Sales PIVOT(SUM(BookSales) FOR SalesYear IN([2013], [2014]) ) AS PivotSales; |
Now we get the results we expect, with the data grouped only by the BookType
table:
BookType |
2013 |
2014 |
Fiction |
19782.00 |
24140.00 |
Nonfiction |
16384.00 |
13014.00 |
In most cases, you’ll find you’ll need to use a table expression in the FROM
clause, rather than specifying only the base table. In fact, you’ll often find that most of the real engineering for pivoting data rests with the table expression.
“How do I avoid grouping on a column’s values when pivoting data so I can return a single-row result set that aggregates the entire dataset?”
The trick to using PIVOT
operator to aggregate and rotate data effectively often rests with getting the grouping right. This idea also applies to aggregating the entire dataset. Let’s return again to our basic three-column temporary table and perform a simple pivot:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 |
CREATE TABLE #BookSales (BookType VARCHAR(20), SalesYear INT, BookSales MONEY); GO INSERT INTO #BookSales VALUES('Fiction', 2014, 11201); INSERT INTO #BookSales VALUES('Fiction', 2014, 12939); INSERT INTO #BookSales VALUES('Fiction', 2013, 10436); INSERT INTO #BookSales VALUES('Fiction', 2013, 9346); INSERT INTO #BookSales VALUES('Nonfiction', 2014, 7214); INSERT INTO #BookSales VALUES('Nonfiction', 2014, 5800); INSERT INTO #BookSales VALUES('Nonfiction', 2013, 8922); INSERT INTO #BookSales VALUES('Nonfiction', 2013, 7462); SELECT * FROM #BookSales PIVOT(SUM(BookSales) FOR SalesYear IN([2013], [2014]) ) AS PivotSales; |
As expected, our query returns the results in the following table:
BookType |
2013 |
2014 |
Fiction |
19782.00 |
24140.00 |
Nonfiction |
16384.00 |
13014.00 |
But suppose we don’t want to group the data by the BookType
column and instead want to return the aggregated values for the entire year. For that, we need the right table expression, one that returns only the BookSales
and SalesYear
columns:
1 2 3 4 5 6 7 |
SELECT * FROM (SELECT BookSales, SalesYear FROM #BookSales) Sales PIVOT(SUM(BookSales) FOR SalesYear IN([2013], [2014]) ) AS PivotSales; |
As before, we’ve specified BookSales
table as our aggregate column and SalesYear
as our pivot column. However, because these are now the only two columns in our dataset, we no longer have a third column to serve as the group column. As a result, our query returns the following data:
2013 |
2014 |
36166.00 |
37154.00 |
We now have totals for each year, without the data being grouped by any one column. We can, however, specify a column in the first position that provides context to the other data:
1 2 3 4 5 6 7 8 |
SELECT 'Total Sales' AS [Year], [2013], [2014] FROM (SELECT BookSales, SalesYear FROM #BookSales) Sales PIVOT(SUM(BookSales) FOR SalesYear IN([2013], [2014]) ) AS PivotSales; |
In this case, I’ve created a column named Year
and assigned the value Total
Sales
to that column, as shown in the following results:
Year |
2013 |
2014 |
Total Sales |
36166.00 |
37154.00 |
This can be a handy strategy to make it easier to understand the data, but you need to be careful because the column name can be a little deceiving in that it actually points to the other column names, rather than the value in its own column. Be sure to give careful consideration to what you’re doing if you go down this route.
“SQL Server does not permit the use of column expressions in the PIVOT clause. Is there a way to get around this limitation?”
Like many of the PIVOT
clause limitations, the way to get around the restriction against column expressions is to use the table expression in the FROM
clause. There you can create whatever expressions you need. Let’s start with a different variation of our #BookSales
temporary table:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
CREATE TABLE #BookSales (BookType VARCHAR(20), SalesDate DATETIME, BookSales MONEY, BookCost MONEY); GO INSERT INTO #BookSales VALUES ('Fiction', '2014-12-23 15:11:12.017', 11201, 9043), ('Fiction', '2014-05-23 15:11:12.017', 12939, 8578), ('Fiction', '2013-11-23 15:11:12.017', 10436, 9843), ('Fiction', '2013-03-23 15:11:12.017', 9346, 7743), ('Nonfiction', '2014-09-23 15:11:12.017', 7214, 5677), ('Nonfiction', '2014-06-23 15:11:12.017', 5800, 4301), ('Nonfiction', '2013-12-23 15:11:12.017', 8922, 6895), ('Nonfiction', '2013-04-23 15:11:12.017', 7462, 6873); |
We now have a DATETIME
column, rather than a simple INT
column for the year, and we’ve added a column for the wholesale cost. Suppose we now want to calculate the net sales and extract only the year for our pivot. We can create a query similar to the following:
1 2 3 4 5 6 7 8 9 |
SELECT * FROM (SELECT BookType AS BookCategory, (BookSales - BookCost) AS NetSales, YEAR(SalesDate) AS SalesYear FROM #BookSales) AS Sales PIVOT(SUM(NetSales) FOR SalesYear IN([2013], [2014]) ) AS PivotSales; |
Once again, our table expression is doing the work for us by extracting the year and calculating the net sale. We then use the name of the computed columns in our PIVOT
clause, giving us the results shown in the following table:
BookCategory |
2013 |
2014 |
Fiction |
2196.00 |
6519.00 |
Nonfiction |
2616.00 |
3036.00 |
Although this is a simple example, it demonstrates the importance of getting the table expression right so your PIVOT
operator has exactly the data it needs to do its magic.
“Can I group data by more than one column when I use the PIVOT operator?”
Yes, you can group multiple columns, but you need to be sure you’re grouping only those columns that should be grouped. As mentioned earlier, the PIVOT
operator groups data by those columns in the data set that are not specified as the aggregate or pivot columns. To demonstrate multi-column grouping, let’s recast our #BookSales
temp table once again:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 |
CREATE TABLE #BookSales (BookType VARCHAR(20), BookSubtype VARCHAR(20), SalesYear INT, BookSales MONEY, Verified BIT); GO INSERT INTO #BookSales VALUES('Fiction', 'Adults', 2014, 11201, 1); INSERT INTO #BookSales VALUES('Fiction', 'Children', 2014, 12939, 0); INSERT INTO #BookSales VALUES('Fiction', 'Adults', 2013, 10436, 1); INSERT INTO #BookSales VALUES('Fiction', 'Children', 2013, 9346, 0); INSERT INTO #BookSales VALUES('Nonfiction', 'Adults', 2014, 7214, 0); INSERT INTO #BookSales VALUES('Nonfiction', 'Children', 2014, 5800, 1); INSERT INTO #BookSales VALUES('Nonfiction', 'Adults', 2013, 8922, 1); INSERT INTO #BookSales VALUES('Nonfiction', 'Children', 2013, 7462, 0); INSERT INTO #BookSales VALUES('Fiction', 'Adults', 2014, 14209, 0); INSERT INTO #BookSales VALUES('Fiction', 'Children', 2014, 11489, 0); INSERT INTO #BookSales VALUES('Fiction', 'Adults', 2013, 9909, 1); INSERT INTO #BookSales VALUES('Fiction', 'Children', 2013, 8726, 0); INSERT INTO #BookSales VALUES('Nonfiction', 'Adults', 2014, 4399, 1); INSERT INTO #BookSales VALUES('Nonfiction', 'Children', 2014, 5248, 0); INSERT INTO #BookSales VALUES('Nonfiction', 'Adults', 2013, 7740, 0); INSERT INTO #BookSales VALUES('Nonfiction', 'Children', 2013, 8267, 1); INSERT INTO #BookSales VALUES('Fiction', 'YA', 2014, 9854, 0); INSERT INTO #BookSales VALUES('Fiction', 'YA', 2013, 8756, 1); |
This time we’re also including the BookSubtype
column to provide an additional way to categorize the data. We can then include the column in our table expression, as shown in the following example:
1 2 3 4 5 6 7 8 |
SELECT * FROM (SELECT BookType, BookSubtype, SalesYear, BookSales FROM #BookSales) AS Sales PIVOT(SUM(BookSales) FOR SalesYear IN([2013], [2014]) ) AS PivotSales ORDER BY BookType, BookSubtype; |
The table expression defines our dataset, which includes the BookSubtype
column. However, as in the previous examples, the PIVOT
clause specifies only the BookSales
and SalesYear
columns, which means the data will be grouped by the BookType
and BookSubtype
columns when it is pivoted, as shown in the following results:
BookType |
BookSubtype |
2013 |
2014 |
Fiction |
Adults |
20345.00 |
25410.00 |
Fiction |
Children |
18072.00 |
24428.00 |
Fiction |
YA |
8756.00 |
9854.00 |
Nonfiction |
Adults |
16662.00 |
11613.00 |
Nonfiction |
Children |
15729.00 |
11048.00 |
Once again, the key to making this all work is to ensure that you get your table expression right so that you’re controlling the columns on which you group the data.
“Is there a way to work around SQL Server’s restriction against using COUNT(*) in the PIVOT clause?”
It’s true that you can’t use the COUNT
aggregate function with an asterisk in your PIVOT
clause, but you can use the function if you specify a one of the columns in the dataset. To demonstrate this, let’s create our temporary table once more:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
CREATE TABLE #BookSales (BookType VARCHAR(20), SalesYear INT, BookSales MONEY); GO INSERT INTO #BookSales VALUES('Fiction', 2014, 11201); INSERT INTO #BookSales VALUES('Fiction', 2014, 12939); INSERT INTO #BookSales VALUES('Fiction', 2014, 9856); INSERT INTO #BookSales VALUES('Fiction', 2013, 10436); INSERT INTO #BookSales VALUES('Fiction', 2013, 9346); INSERT INTO #BookSales VALUES('Nonfiction', 2014, 7214); INSERT INTO #BookSales VALUES('Nonfiction', 2014, 5800); INSERT INTO #BookSales VALUES('Nonfiction', 2013, 8922); INSERT INTO #BookSales VALUES('Nonfiction', 2013, 7462); INSERT INTO #BookSales VALUES('Nonfiction', 2013, 4372); |
Suppose that we now want to return the number of entries, rather than the total number of sales. One way we can do this is to pass the BookSales
column in as an argument to the COUNT
function:
1 2 3 4 5 |
SELECT * FROM #BookSales PIVOT(COUNT(BookSales) FOR SalesYear IN([2013], [2014]) ) AS PivotSales; |
The statement will now return a count, rather than a sale amount, as shown in the following table:
BookType |
2013 |
2014 |
Fiction |
2 |
3 |
Nonfiction |
3 |
2 |
Although this appears to work, the challenge with this approach is that it assumes the specified column is non-nullable, that is, does not contain NULL
values. But that’s not always the case. For example, suppose we insert the following two rows to serve as placeholders during the load process:
1 2 |
INSERT INTO #BookSales VALUES('Fiction', 2014, NULL); INSERT INTO #BookSales VALUES('Nonfiction', 2014, NULL); |
If we were to rerun the previous SELECT
statement, we would receive the same results, despite the fact that we’ve added two rows. That’s because the COUNT
function ignores NULL
values. Another approach we might consider is to pass the SalesYear
column into the function:
1 2 3 4 5 |
SELECT * FROM #BookSales PIVOT(COUNT(SalesYear) FOR SalesYear IN([2013], [2014]) ) AS PivotSales; |
Unfortunately, this confuses matters even more because the SalesYear
column is also the pivot column, giving us the following results:
BookType |
BookSales |
2013 |
2014 |
Fiction |
NULL |
0 |
1 |
Nonfiction |
NULL |
0 |
1 |
Nonfiction |
4372.00 |
1 |
0 |
Nonfiction |
5800.00 |
0 |
1 |
Nonfiction |
7214.00 |
0 |
1 |
Nonfiction |
7462.00 |
1 |
0 |
Nonfiction |
8922.00 |
1 |
0 |
Fiction |
9346.00 |
1 |
0 |
Fiction |
9856.00 |
0 |
1 |
Fiction |
10436.00 |
1 |
0 |
Fiction |
11201.00 |
0 |
1 |
Fiction |
12939.00 |
0 |
1 |
Because we left the BookSales
column out of the PIVOT
clause, it’s now treated as a group column. One way around this is to create a table expression in the FROM
clause so we return only the BookType
and SalesYear
columns:
1 2 3 4 5 6 7 |
SELECT * FROM (SELECT BookType, SalesYear FROM #BookSales) AS Sales PIVOT(COUNT(SalesYear) FOR SalesYear IN([2013], [2014]) ) AS PivotSales; |
This time we get the results we want:
BookType |
2013 |
2014 |
Fiction |
2 |
4 |
Nonfiction |
3 |
3 |
Unfortunately, with this approach we’re once again assuming that the column will contain no NULL
values. A safer approach might be to add a column to our dataset that contains the same constant for every row:
1 2 3 4 5 6 7 8 |
SELECT * FROM (SELECT BookType, SalesYear, 1 AS CntCol FROM #BookSales) AS Sales PIVOT(COUNT(CntCol) FOR SalesYear IN([2013], [2014]) ) AS PivotSales; |
In this case, I’ve created a column name CntCol
and set its value to 1
. This way, I don’t have to worry about NULL
values skewing the results. The following table shows the data returned by SELECT
statement:
BookType |
2013 |
2014 |
Fiction |
2 |
4 |
Nonfiction |
3 |
3 |
Adding a column in this way is a handy trick to know when you want to return reliable counts in your pivoted data and you can’t guarantee that the aggregate column will contain no NULL
values..
“Is it possible to pivot more than one column when pivoting data in SQL Server?”
Yes, it is possible to base your pivot on more than one column; however, you can’t do this within the PIVOT
clause. As with other PIVOT
limitations, you must use your table expression to define the logic and return the necessary dataset. Let’s start with yet another variation on our #BookSales
temp table:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
CREATE TABLE #BookSales (BookType VARCHAR(20), Region VARCHAR(20), SalesYear INT, BookSales MONEY); GO INSERT INTO #BookSales VALUES('Fiction', 'east', 2014, 11201); INSERT INTO #BookSales VALUES('Fiction', 'west', 2014, 12939); INSERT INTO #BookSales VALUES('Fiction', 'west', 2013, 10436); INSERT INTO #BookSales VALUES('Fiction', 'east', 2013, 9346); INSERT INTO #BookSales VALUES('Fiction', 'east', 2013, 4356); INSERT INTO #BookSales VALUES('Nonfiction', 'east', 2014, 7214); INSERT INTO #BookSales VALUES('Nonfiction', 'west', 2014, 8456); INSERT INTO #BookSales VALUES('Nonfiction', 'west', 2014, 5800); INSERT INTO #BookSales VALUES('Nonfiction', 'east', 2013, 8922); INSERT INTO #BookSales VALUES('Nonfiction', 'west', 2013, 7462); |
Suppose we now want to pivot the data based on both the Region
and SalesYear
columns so that we end up with columns for each year/region combination. The place to start, of course, is with our table expression:
1 2 3 4 5 6 7 8 9 10 |
SELECT * FROM (SELECT BookType AS BookCategory, (CAST(SalesYear AS VARCHAR(4)) + '_' + Region) AS SalesRegion, BookSales FROM #BookSales) AS Sales PIVOT(SUM(BookSales) FOR SalesRegion IN([2013_east], [2013_west], [2014_east], [2014_west]) ) AS PivotSales; |
In this case, I’m concatenating the SalesYear
and Region
columns to produce the SalesRegion
computed column. This will give me values such as 2014_east
and 2014_west
. I then specify SalesRegion
as the pivot column and pass in the column’s values to the IN
operator. The SELECT
statement returns the results shown in the following table:
BookCategory |
2013_east |
2013_west |
2014_east |
2014_west |
Fiction |
13702.00 |
10436.00 |
11201.00 |
12939.00 |
Nonfiction |
8922.00 |
7462.00 |
7214.00 |
14256.00 |
As you can see, each book category in the result set is broken down by the year/region combination. Again, it’s all about the table expression.
“I sometimes run queries against a SQL Server 2000 database, which doesn’t support the PIVOT operator. How do I pivot data without using PIVOT
?”
If you’re working on a SQL Server edition that predates SQL Server 2005 or if you don’t want to use the PIVOT
operator for another reason, you can revert to a series of CASE
expressions in the SELECT
clause that define each pivoted column. To demonstrate this, we’ll once more create the #BookSales
temporary table:
1 2 3 4 5 6 7 8 9 10 11 12 |
CREATE TABLE #BookSales (BookType VARCHAR(20), SalesYear INT, BookSales MONEY); GO INSERT INTO #BookSales VALUES('Fiction', 2014, 11201); INSERT INTO #BookSales VALUES('Fiction', 2014, 12939); INSERT INTO #BookSales VALUES('Fiction', 2013, 10436); INSERT INTO #BookSales VALUES('Fiction', 2013, 9346); INSERT INTO #BookSales VALUES('Nonfiction', 2014, 7214); INSERT INTO #BookSales VALUES('Nonfiction', 2014, 5800); INSERT INTO #BookSales VALUES('Nonfiction', 2013, 8922); INSERT INTO #BookSales VALUES('Nonfiction', 2013, 7462); |
As we’ve seen before, if we were to use the PIVOT
operator, we would end up with a query that looks similar to the following:
1 2 3 4 5 |
SELECT * FROM #BookSales PIVOT(SUM(BookSales) FOR SalesYear IN([2013], [2014]) ) AS PivotSales; |
With results like those shown in the following table:
BookType |
2013 |
2014 |
Fiction |
19782.00 |
24140.00 |
Nonfiction |
16384.00 |
13014.00 |
We can achieve the same results if we specifically group the data on the BookType
column and then use CASE
expressions in the SELECT
clause:
1 2 3 4 5 |
SELECT BookType, SUM(CASE WHEN SalesYear=2013 THEN BookSales END) AS [2013], SUM(CASE WHEN SalesYear=2014 THEN BookSales END) AS [2014] FROM #BookSales GROUP BY BookType; |
The SELECT
clause starts with the BookType
column and then includes a CASE
expression for each column we want to include in the pivoted result set. Each CASE
expression checks the value in the SalesYear
column and returns the related BookSales
value for that year.
For a simple example like this, the original way of pivoting data isn’t so bad. But if you’re dealing with a lot of unique values in your pivot column, it can become fairly tedious creating all those CASE
expressions. On the other hand, it’s nice to have an alternative you can turn to when you don’t want to-or you can’t-use the PIVOT
operator.
“Does the PIVOT operator let you perform multiple aggregations?”
Yes, you can perform multiple aggregations, but you can’t do it within a single PIVOT
clause. To demonstrate, let’s return once again to the #BookSales
table:
1 2 3 4 5 6 7 8 9 10 11 12 |
CREATE TABLE #BookSales (BookType VARCHAR(20), SalesYear INT, BookSales MONEY); GO INSERT INTO #BookSales VALUES('Fiction', 2014, 11201); INSERT INTO #BookSales VALUES('Fiction', 2014, 12939); INSERT INTO #BookSales VALUES('Fiction', 2013, 10436); INSERT INTO #BookSales VALUES('Fiction', 2013, 9346); INSERT INTO #BookSales VALUES('Nonfiction', 2014, 7214); INSERT INTO #BookSales VALUES('Nonfiction', 2014, 5800); INSERT INTO #BookSales VALUES('Nonfiction', 2013, 8922); INSERT INTO #BookSales VALUES('Nonfiction', 2013, 7462); |
Suppose we now want to return both sales totals and averages, in which case, we need to use the SUM
and AVG
aggregate functions. A fairly straightforward approach to doing this is to create two common table expressions (CTEs) that each pivot the data and then join the two CTEs together, as shown in the following example:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 |
WITH SalesTotal AS ( SELECT BookType, [2013] AS [2013_Total], [2014] AS [2014_Total] FROM #BookSales PIVOT(SUM(BookSales) FOR SalesYear IN([2013], [2014]) ) AS PivotSales ), SalesAvg AS ( SELECT BookType, [2013] AS [2013_Avg], [2014] AS [2014_Avg] FROM #BookSales PIVOT(AVG(BookSales) FOR SalesYear IN([2013], [2014]) ) AS PivotSales ) SELECT st.BookType, st.[2013_Total], sa.[2013_Avg], st.[2014_Total], sa.[2014_Avg] FROM SalesTotal AS st INNER JOIN SalesAvg AS sa ON st.BookType = sa.BookType; |
I first define a CTE named SalesTotal
that uses the SUM
function to aggregate the data. I then define a CTE named SalesAvg
that uses the AVG
function to aggregate the data. Notice that in each case the SELECT
statement returns pivot column names that are easily identifiable. After I define the two CTEs, I use an inner join to merge them together, based on the BookType
column. The statement returns the results shown in the following table:
BookType |
2013_Total |
2013_Avg |
2014_Total |
2014_Avg |
Fiction |
19782.00 |
9891.00 |
24140.00 |
12070.00 |
Nonfiction |
16384.00 |
8192.00 |
13014.00 |
6507.00 |
As you can see, for each book category, we now have a total sales amount as well as an average sales amount for each year. Having the capacity to use CTE’s to join multiple pivot operations helps to make the PIVOT
operator much more flexible.
“How do I pivot data if I don’t know the values in the pivoted column?”
Up to this point, the examples have assumed that we’ll know the values in our pivoted column at the time we write our query. Often, however, this isn’t the case, which means we have no way of defining the new columns. To overcome this limitation, we need to turn to dynamic SQL to construct our T-SQL statement on the fly. Let’s look at how this works, but first, another incarnation of the #BookSales
table:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
CREATE TABLE #BookSales (BookType VARCHAR(20), SalesYear INT, BookSales MONEY); GO INSERT INTO #BookSales VALUES('Fiction', 2014, 11201); INSERT INTO #BookSales VALUES('Fiction', 2014, 12939); INSERT INTO #BookSales VALUES('Fiction', 2013, 10436); INSERT INTO #BookSales VALUES('Fiction', 2013, 9346); INSERT INTO #BookSales VALUES('Fiction', 2012, 8956); INSERT INTO #BookSales VALUES('Fiction', 2012, 7809); INSERT INTO #BookSales VALUES('Nonfiction', 2014, 7214); INSERT INTO #BookSales VALUES('Nonfiction', 2014, 5800); INSERT INTO #BookSales VALUES('Nonfiction', 2013, 8922); INSERT INTO #BookSales VALUES('Nonfiction', 2013, 7462); INSERT INTO #BookSales VALUES('Nonfiction', 2012, 7739); INSERT INTO #BookSales VALUES('Nonfiction', 2012, 6842); |
Although we’re in on the ground floor in this case, in terms of having created and populated the table, let’s assume we have no idea what the values will be in the SalesYear
column, although we still want to pivot the data based on that column. In this case, we need to construct our query dynamically, pulling the data out of the pivot column as we need it. To do so, we must declare a couple variables, retrieve the list of values from the pivot column, and assemble our query, as shown in the following example:
1 2 3 4 5 6 7 8 9 10 11 12 |
DECLARE @sql AS NVARCHAR(2000);DECLARE @col AS NVARCHAR(2000); SELECT @col = ISNULL(@col + ', ', '') + QUOTENAME(SalesYear) FROM (SELECT DISTINCT SalesYear FROM #BookSales) AS BookTypes; SET @sql = N'SELECT BookType, ' + @col + 'FROM #BookSales PIVOT(SUM(BookSales) FOR SalesYear IN (' + @col + ')) AS PivotSales'; EXEC sp_executesql @sql; |
In this example, I first declare the @sql
and @col
variables. The @sql
variable will store the final SELECT
statement, and the @col
variable will store the list of values from the pivot column. Next, I use a SELECT
statement to retrieve a list of distinct values from the SalesYear
column and assign those values to the @col
variable.
Finally, I use a SET
statement to construct the final query, passing in the @col
variable to provide the column names to the IN
operator and select list, all of which I assign to the @sql
variable. I then use an EXECUTE
statement to run the final query, which returns the results shown in the following table:
BookType |
2012 |
2013 |
2014 |
Fiction |
16765.00 |
19782.00 |
24140.00 |
Nonfiction |
14581.00 |
16384.00 |
13014.00 |
Using dynamic SQL is a great way to get around the need to know the column values in advance, but be sure to take precautions against injection attacks when using dynamic SQL. For example, the QUOTENAME
function returns a delimited string to make sure the column name is a valid delimited identifier, helping to avoid injection through object names. Make sure you understand how SQL injection works before using dynamic SQL.
“How do I unpivot data in SQL Server?”
You can unpivot data in SQL Server by using the UNPIVOT
operator, which in many ways is similar to the PIVOT
operator. As you’ve seen, the PIVOT
operator rotates a column’s values, turning them into their own columns. The UNPIVOT
operator does the opposite by rotating columns and turning them into column values. Essentially, the PIVOT
operator denormalizes data, and the UNPIVOT
operator attempts to normalize it. For example, suppose our #BookSales
table look more like a denormalized Excel spreadsheet:
1 2 3 4 5 6 7 8 |
CREATE TABLE #BookSales (BookType VARCHAR(20), [2012] MONEY, [2013] MONEY, [2014] MONEY); GO INSERT INTO #BookSales VALUES('Fiction', 16765.00, 19782.00, 24140.00); INSERT INTO #BookSales VALUES('Nonfiction', 14581.00, 16384.00, 13014.00); SELECT * FROM #BookSales; |
As expected, the SELECT
statement returns the results shown in the following table:
BookType |
2012 |
2013 |
2014 |
Fiction |
16765.00 |
19782.00 |
24140.00 |
Nonfiction |
14581.00 |
16384.00 |
13014.00 |
Now suppose we want to bring a little more relational-like structure to the data. We can use the UNPIVOT
operator to reverse engineer our dataset:
1 2 3 4 5 |
SELECT BookType, SalesYear, SalesTotal FROM #BookSales UNPIVOT ( SalesTotal FOR SalesYear IN([2012], [2013], [2014]) ) AS UnpivotSales; |
Notice that the FROM
clause includes the UNPIVOT
subclause, similar to a PIVOT
clause. In the case of UNPIVOT
, however, we first specify a value column (SalesTotal
), which will display the sales amounts in our result set, and then include a FOR
subclause, which specifies the new pivot column (SalesYear
) and the values that will be added to that column. These values are the original column names. The SELECT
statement returns the results shown in the following table:
BookType |
SalesYear |
SalesTotal |
Fiction |
2012 |
16765.00 |
Fiction |
2013 |
19782.00 |
Fiction |
2014 |
24140.00 |
Nonfiction |
2012 |
14581.00 |
Nonfiction |
2013 |
16384.00 |
Nonfiction |
2014 |
13014.00 |
As with the PIVOT
operator, you want to be sure that the table expression in your FROM
clause returns the right data for the UNPIVOT
clause. However, you should not think of UNPIVOT
as merely an undoing of a PIVOT
operation. For example, suppose that this time around we start with a more normalized version of the #BookSales
table:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 |
CREATE TABLE #BookSales (BookType VARCHAR(20), SalesYear INT, BookSales MONEY); GO INSERT INTO #BookSales VALUES('Fiction', 2014, 11201); INSERT INTO #BookSales VALUES('Fiction', 2014, 12939); INSERT INTO #BookSales VALUES('Fiction', 2013, 10436); INSERT INTO #BookSales VALUES('Fiction', 2013, 9346); INSERT INTO #BookSales VALUES('Fiction', 2012, 8956); INSERT INTO #BookSales VALUES('Fiction', 2012, 7809); INSERT INTO #BookSales VALUES('Nonfiction', 2014, 7214); INSERT INTO #BookSales VALUES('Nonfiction', 2014, 5800); INSERT INTO #BookSales VALUES('Nonfiction', 2013, 8922); INSERT INTO #BookSales VALUES('Nonfiction', 2013, 7462); INSERT INTO #BookSales VALUES('Nonfiction', 2012, 7739); INSERT INTO #BookSales VALUES('Nonfiction', 2012, 6842); SELECT * FROM #BookSales |
The SELECT
statement returns the results shown in the following table:
BookType |
SalesYear |
BookSales |
Fiction |
2014 |
11201.00 |
Fiction |
2014 |
12939.00 |
Fiction |
2013 |
10436.00 |
Fiction |
2013 |
9346.00 |
Fiction |
2012 |
8956.00 |
Fiction |
2012 |
7809.00 |
Nonfiction |
2014 |
7214.00 |
Nonfiction |
2014 |
5800.00 |
Nonfiction |
2013 |
8922.00 |
Nonfiction |
2013 |
7462.00 |
Nonfiction |
2012 |
7739.00 |
Nonfiction |
2012 |
6842.00 |
Now let’s pivot the data in the same way we’ve been doing all along:
1 2 3 4 5 |
SELECT * FROM #BookSales PIVOT(SUM(BookSales) FOR SalesYear IN([2012], [2013], [2014]) ) AS PivotSales; |
Not surprisingly, the statement returns the following results:
BookType |
2012 |
2013 |
2014 |
Fiction |
16765.00 |
19782.00 |
24140.00 |
Nonfiction |
14581.00 |
16384.00 |
13014.00 |
Instead of just returning the results, however, let’s put them into the #PivotSales
temporary table:
1 2 3 4 5 6 |
SELECT * INTO #PivotSales FROM #BookSales PIVOT(SUM(BookSales) FOR SalesYear IN([2012], [2013], [2014]) ) AS PivotSales; |
Now let’s unpivot our new temporary table:
1 2 3 4 5 |
SELECT BookType, SalesYear, SalesTotal FROM #PivotSales UNPIVOT ( SalesTotal FOR SalesYear IN([2012], [2013], [2014]) ) AS UnpivotSales; |
The SELECT
statement returns the results in the following table:
BookType |
SalesYear |
SalesTotal |
Fiction |
2012 |
16765.00 |
Fiction |
2013 |
19782.00 |
Fiction |
2014 |
24140.00 |
Nonfiction |
2012 |
14581.00 |
Nonfiction |
2013 |
16384.00 |
Nonfiction |
2014 |
13014.00 |
If you compare these to the results we received when we first created the #BookSales
table, you’ll see that we don’t get the granularity of the original table. The UNPIVOT
operator has no insight into the past and can work only with the dataset in its present form. In general, this shouldn’t present a problem; just don’t expect the UNPIVOT
operator to be able to undo your PIVOT
operations.
Load comments