In my last article, “Working with the XML Data Type in SQL Server,” I included examples that returned data from columns and variables of the XML
data type. Based on those examples, it might seem that retrieving XML data is similar to retrieving objects configured with other data types. However, that’s true only when returning the entire XML value. If instead you want to work with individual components within an XML instance, such as when retrieving the value of a single attribute, you must use one of the five methods that available to the XML
data type-query()
, value()
, exist()
, nodes()
, or modify()
.
To use an XML
method, you call it in association with the database object configured with the XML
data type, as the following syntax illustrates:
1 |
DbObject.XmlMethod('XQuery'[, 'SqlType']) [AS TableAlias(ColumnAlias)] |
The DbObject placeholder refers to a column, variable, or parameter configured with the XML
data type. After the object name, you add a period, following by the name of the XML
method. Next you provide, in parentheses, the arguments required for the specific method.
Most of the XML
methods require only one argument: an XQuery expression that identifies the XML components to be retrieved or modified, and any actions to be taken on those components. XQuery, a powerful scripting language used specifically to access XML data, contains the elements necessary to create complex expressions that can include functions, operators, variables, and values. MSDN provides a complete XQuery language reference that you can refer to while you work with the XML
methods.
In addition to the XQuery expression, an XML
method might also require a Transact-SQL data type as a second argument or a table and column alias tagged on after the arguments. As we work through the article, you’ll see when and why these additional components are necessary.
To demonstrate how the various XML
methods work, I’ve written a number of examples that use the methods to retrieve and modify data. The examples are based on the Stores
table, which I created and populated with the following Transact-SQL script:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 |
USE AdventureWorks2008R2 GO IF OBJECT_ID('Stores') IS NOT NULL DROP TABLE Stores GO CREATE TABLE Stores ( StoreID INT PRIMARY KEY, Survey_untyped XML, Survey_typed XML(Sales.StoreSurveySchemaCollection) ); INSERT INTO Stores VALUES ( 292, '<StoreSurvey> <AnnualSales>800000</AnnualSales> <AnnualRevenue>80000</AnnualRevenue> <BankName>United Security</BankName> <BusinessType>BM</BusinessType> <YearOpened>1996</YearOpened> <Specialty>Mountain</Specialty> <SquareFeet>21000</SquareFeet> <Brands>2</Brands> <Internet>ISDN</Internet> <NumberEmployees>13</NumberEmployees> <Products Type="Bikes"> <Product>Mountain</Product> <Product>Road</Product> <Product>Racing</Product> </Products> <Products Type="Clothes"> <Product>Jerseys</Product> <Product>Jackets</Product> <Product>Shorts</Product> </Products> </StoreSurvey>', (SELECT Demographics FROM Sales.Store WHERE BusinessEntityID = 292) ); |
Notice that the table includes the Survey_untyped
column, which is an XML
untyped column, and the Survey_typed
column, which is an XML
typed column. The schema collection I associated with the typed column is already included in the AdventureWorks2008R2
database, which is where I’ve created the table. The INSERT
statement in the Transact-SQL above also shows the XML document that I added to the untyped column. For the typed column, I retrieved the following XML document from the Demographics
column (a typed XML
column) in the Sales.Store
table in the AdventureWorks2008R2
database:
1 2 3 4 5 6 7 8 9 10 11 12 13 |
<StoreSurvey xmlns="http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/StoreSurvey"> <AnnualSales>800000</AnnualSales> <AnnualRevenue>80000</AnnualRevenue> <BankName>United Security</BankName> <BusinessType>BM</BusinessType> <YearOpened>1996</YearOpened> <Specialty>Mountain</Specialty> <SquareFeet>21000</SquareFeet> <Brands>2</Brands> <Internet>ISDN</Internet> <NumberEmployees>13</NumberEmployees> </StoreSurvey> |
Notice that the XML includes the schema associated with that data. The schema comes from the same schema collection I used when I created the typed column in the Stores
table.
The data I added to the untyped column is nearly identical to the typed XML. The untyped XML doesn’t include the namespace information, of course. However, it does include additional product information, which we’ll use to demonstrate the XML
methods. So now that we have setup out of the way, let’s look at how those methods work.
The query() method
The query()
method retrieves a subset of untyped XML from the target XML instance. It’s probably the simplest and most straightforward of the XML methods. You need only specify the database object, the method name, and an XQuery expression, as shown in the following syntax:
1 |
DbObject.query('XQuery') |
You’ll notice that the XQuery expression is entirely enclosed in single quotes and those are enclosed in parentheses. The trick with any XML method is to get the XQuery expression right. But at its simplest, the expression must define the path of the element you want to retrieve.
In the following example, I use the query()
method to return data from the <StoreSurvey>
element in both the typed and untyped columns:
1 2 3 4 5 6 7 8 9 |
SELECT Survey_untyped.query('/StoreSurvey') AS Info_untyped, Survey_typed.query('declare namespace ns="http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/StoreSurvey"; /ns:StoreSurvey') AS Info_typed FROM Stores; |
For the untyped column, I specified the column name (Survey_untyped
) followed by a period and then the method name (query
). Within the parentheses and single quotes, I defined the XQuery expression (/StoreSurvey
). That’s all there is to it. Because the <StoreSurvey>
element is the root node, the entire XML document is returned.
The XQuery expression for the Survey_typed
column is a bit more complex. Because it is a typed column, the expression should be preceded by a namespace declaration. The namespace must be the same as the one referenced within the XML document stored in the typed column.
To declare a namespace, you specify the declare namespace
keywords, followed by the alias name (in this case, ns
). After the alias, you include an equal sign (=
) and then the schema path and name. You then end the declaration with a semi-colon (;
) to separate it from the main body of the XQuery expression. After you declare the namespace, you can use the alias within the expression to reference that namespace.
The expression itself is nearly identical to the one used for the untyped column, except that you must precede the element name with the namespace alias and a colon (ns:
). As with the untyped column, the expression will return the entire XML document because it specifies only the root node.
Although the preceding example is helpful in demonstrating the basics of using the query()
method, it’s not much use beyond that because you can just as easily retrieve all the column contents simply by specifying the column name. However, in the following example, I get more specific by limiting the results to the <AnnualSales>
child element:
1 2 3 4 5 6 7 8 9 |
SELECT Survey_untyped.query('/StoreSurvey/AnnualSales') AS Info_untyped, Survey_typed.query('declare namespace ns="http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/StoreSurvey"; /ns:StoreSurvey/ns:AnnualSales') AS Info_typed FROM Stores; |
The only difference between this example and the preceding one is that I added /AnnualSales
to the XQuery expression for the untyped column and /ns:AnnualSales
to the expression for the typed column. The XQuery for the untyped column returns the following results.
1 |
<AnnualSales>800000</AnnualSales> |
And the XQuery for the typed column returns these results:
1 |
<ns:AnnualSales xmlns:ns="http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/StoreSurvey">800000</ns:AnnualSales> |
The primary difference between the two is that the data returned by the typed column includes the namespace information. The element values themselves (800000) are the same in both columns. If the <AnnualSales>
element had included its own child elements, those too would have been displayed. For instance, the following example retrieves the <Products>
elements, which are child elements of <StoreSurvey>
, just like <AnnualSales>
:
1 2 3 4 5 6 |
SELECT Survey_untyped.query('/StoreSurvey/Products') AS Products FROM Stores; |
Because the XML document includes two <Products>
elements and those elements each include several <Product>
child elements, the SELECT
statement returns all product-related elements, as shown in the following results:
1 2 3 4 5 6 7 8 9 10 |
<Products Type="Bikes"> <Product>Mountain</Product> <Product>Road</Product> <Product>Racing</Product> </Products> <Products Type="Clothes"> <Product>jerseys</Product> <Product>jackets</Product> <Product>shorts</Product> </Products> |
If you want to return a specific element and its child elements, you can do so by referencing one of its attributes. For example, suppose you want to include only products from the Bikes category. To do so, you can modify your XQuery expression as follows:
1 2 3 4 5 |
SELECT Survey_untyped.query('/StoreSurvey/Products[@Type="Bikes"]') AS BikeProducts FROM Stores; |
Now the XQuery expression includes bracketed information that specifies the Type
attribute and its value. The attribute name is preceded by the at (@
) symbol, and the attribute value is preceded by an equal sign (=
) and enclosed in double quotes. As a result, the SELECT
statement now returns the following XML fragment:
1 2 3 4 5 6 |
<Products Type="Bikes"> <Product>Mountain</Product> <Product>Road</Product> <Product>Racing</Product> </Products> |
As you would expect, only elements whose products are in the Bikes category are returned. You could have just as easily specified “Clothes” rather than “Bikes,” and you would have received the product elements related to clothing.
The value() method
The value()
method returns a scalar value from the targeted XML document. The returned value is converted to the data type you specify when you call the method. The value()
method makes it easier to work with individual values in order to compare or combine them, either with other XML values or values of different types.
As the following syntax illustrates, when you call the value()
method, you must specify the XQuery expression and the Transact-SQL data type for the returned data:
1 |
DbObject.value('XQuery', 'SqlType') |
For example, suppose you want to pull the amount of sales from the <AnnualSales>
element in either the typed or untyped column in the Stores
table. You can specify the path in the XQuery expression plus a numerical data type as the second argument, as shown in the following SELECT
statement:
1 2 3 4 5 6 7 8 9 |
SELECT Survey_untyped.value('(/StoreSurvey/AnnualSales)[1]', 'int') AS Sales_untyped, Survey_typed.value('declare namespace ns="http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/StoreSurvey"; (/ns:StoreSurvey/ns:AnnualSales)[1]', 'int') AS Sales_typed FROM Stores; |
For the untyped column, the XQuery expression includes the element path as you saw in earlier examples. However, after the path, you must add an integer in brackets that indicates which element of that name you should retrieve. More often than not, you will simply add [1]
, as I’ve done here. This assures that, if there are multiple elements with the same name, only the first one will be returned. It also assures that only one element is being referenced at a time. In fact, even when there is only one element, as is the case with <AnnualSales>
, you must still include the [1]
because the value()
method expects a singleton value.
The second argument passed into the value()
method is the name of the data type, in this case, int
. That means an int
value will be returned by the method. Note that the method’s two arguments must each be enclosed in single quotes and separated with a comma.
As for the typed column, the namespace declaration and path are also similar to what you’ve seen in earlier examples. The only difference is that this expression must also include the [1]
, just as we did it for the untyped column. As a result, for each column, the SELECT
statement returns a single int
value of 800000
, without any additional element information.
You can also retrieve an attribute value, rather than an element value, by specifying the name of the attribute in your element path. For example, the following SELECT
statement retrieves the value of the Type
attribute for the second instance of the <Products>
element:
1 2 3 4 5 6 |
SELECT Survey_untyped.value('(/StoreSurvey/Products/@Type)[2]', 'varchar(10)') AS ProductType FROM Stores; |
The expression includes [2]
, rather than [1]
, in order to retrieve data from the second instance of <Products>
. As a result, the SELECT
statement now returns the varchar
value Clothes
.
As stated earlier, the value()
method is also handy if you want to combine or compare data. In the following example, I create a calculated column that’s based on two values returned from the untyped column:
1 2 3 4 5 6 7 |
SELECT Survey_untyped.value('(/StoreSurvey/AnnualSales)[1]', 'int') - Survey_untyped.value('(/StoreSurvey/AnnualRevenue)[1]', 'int') AS Expenses FROM Stores; |
The first instance of value()
retrieves the <AnnualSales>
value. The second value()
instance retrieves the <AnnualRevenue>
value. The second value is then subtracted from the first value to return a scalar value of 720000
.
You can also use XQuery functions in your expressions. For instance, in the following example, I use the concat
function to add a string to the <Specialty>
value:
1 2 3 4 5 6 7 |
SELECT Survey_untyped.value('concat("Bike specialty: ", (/StoreSurvey/Specialty)[1])', 'varchar(25)') AS Specialty FROM Stores; |
When you use the concat
function, you specify each element that you want to concatenate as an argument to the function and separate those arguments with a comma. The statement returns the value Bike specialty: Mountain
.
As you can see, the string has been concatenated with the <Specialty>
value. I could have just as easily concatenated multiple element values or added more string values.
The exist() method
The exist()
method lets you test for the existence of an element or one of its values. The method takes only one argument, the XQuery expression, as shown in the following syntax:
1 |
DbObject.exist('XQuery') |
The key to using the exist()
method is in understanding the values it returns. Unlike the query()
and value()
methods, the exist()
method doesn’t return XML content. Rather, the method returns one of the following three values:
- A
BIT
value of1
if the XQuery expression returns a nonempty result - A
BIT
value of0
if the XQuery expression returns an empty result. - A
NULL
value if the XML data type instance is null.
A good way to test how the exist()
method works is to use a variable to capture the method’s results, as I’ve done in the following example:
1 2 3 4 5 6 |
DECLARE @xml XML; DECLARE @exist BIT; SET @xml = (SELECT Survey_untyped FROM Stores); SET @exist = @xml.exist('/StoreSurvey[BusinessType="BM"]'); SELECT @exist; |
First, I declared the @xml
variable as type XML
. Then I declared the @exist
variable as type BIT
. I set the value of @xml
to equal the XML document in the Survey_untyped
column. I then used the exist()
method on the @xml
variable to test for the existence of /StoreSurvey[BusinessType="BM"]
within the XML document. In other words, the exist()
methods checks whether the <BusinessType>
child element exists and whether it contains a value of “BM”.
I then assigned the results returned by the exist()
method to the @exist
variable and used a SELECT
statement to return the contents of the variable. Because the XML document contains this child element and that element has a value of “BM”, the method returns a value of 1
. However, if I change “BM” to “BMX”, as I’ve done in the following example, the method returns 0
because the element and value don’t exist exactly as specified:
1 2 3 4 5 6 |
DECLARE @xml2 XML; DECLARE @exist2 BIT; SET @xml2 = (SELECT Survey_untyped FROM Stores); SET @exist2 = @xml2.exist('/StoreSurvey[BusinessType="BMX"]'); SELECT @exist2; |
Now that you have a sense of how the exist()
method works, let’s look at how you might use it in your queries. In the following example, I include a WHERE
clause that uses the exist()
method to again test for the existence of /StoreSurvey[BusinessType="BM"]
:
1 2 3 4 5 6 7 8 |
SELECT Survey_untyped.query('/StoreSurvey/Products[@Type="Bikes"]') AS BikeProducts FROM Stores WHERE Survey_untyped.exist('/StoreSurvey[BusinessType="BM"]') = 1; |
The WHERE
clause compares the results of the exist()
method to the number 1. If the results equal 1, that is, if the XML contains the specified element and values, the WHERE
clause evaluates to True
and the SELECT
statement returns the product information, as shown in the following results:
1 2 3 4 5 6 |
<Products Type="Bikes"> <Product>Mountain</Product> <Product>Road</Product> <Product>Racing</Product> </Products> |
If the exist()
method had specified a different <BusinessType>
value, as it does in the following example, the SELECT
statement would instead return an empty result set:
1 2 3 4 5 6 7 8 |
SELECT Survey_untyped.query('/StoreSurvey/Products[@Type="Bikes"]') AS BikeProducts FROM Stores WHERE Survey_untyped.exist('/StoreSurvey[BusinessType="BMX"]') = 1; |
Again, the key to the exist()
method is to remember that it returns only one of three values: 1
, 0
, or NULL
.
The nodes() method
The nodes()
method can be a bit more slippery to understand than the other XML
methods. To begin with, rather than returning XML or scalar values, the nodes()
method returns what is essentially a table that includes one column. That means you should use the method only in those parts of a statement that can handle rowset views, such as the FROM
clause. It also means that, when you call the nodes()
method, you must assign a table alias and column alias to the rowset view returned by the method, as shown in the following syntax:
1 |
DbObject.nodes('XQuery') AS TableAlias(ColumnAlias) |
The table and column aliases let you reference the rowset view in other parts of the statement. The method is useful when you want to shred an XML document, that is, decompose the document into a relational format. To better understand how this works, let’s look at an example that uses the nodes()
method on an XML
variable:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
DECLARE @bikes XML SET @bikes = '<Products> <Product>Mountain</Product> <Product>Road</Product> <Product>Racing</Product> </Products>' SELECT Category.query('./text()') AS BikeTypes FROM @bikes.nodes('/Products/Product') AS Bike(Category); |
First, I declared the @bikes
variable with the XML
data type. Then I set its value to equal the XML fragment. In the FROM
clause, I invoked the nodes()
method on the variable to shred the <Product>
elements. My goal was to retrieve the value from each instance of that element. I then provided a table alias (Bike
) and a column alias (Category
) to identify my rowset view. Now I essentially have a table with a single column that contains a row for each <Product>
element.
When you use the nodes()
method to create a rowset view of the XML data, you have to use one of the other XML
methods to retrieve the contents of that view. In this case, I used the query()
method in the SELECT
list to return the value of each row. Notice that to retrieve the value I used a period to reference the current context node and the text()
node function to retrieve only the element values, as shown in the following results:
1 2 3 4 |
Mountain Road Racing |
If you want to use the nodes()
method to retrieve data from a table, you can use the CROSS APPLY
operator in the FROM
clause to associate the table with the rowset view returned by the method, as shown in the following example:
1 2 3 4 5 6 7 8 |
SELECT Category.query('./text()') AS BikeTypes FROM Stores CROSS APPLY Survey_untyped.nodes('/StoreSurvey/Products[@Type="Bikes"]/Product') AS Bike(Category); |
In this case, I specified the XQuery path as StoreSurvey/Products[@Type="Bikes"]/Product'
in order to return only the bike-related products from the Survey_untyped
column. However the SELECT
list itself is the same as the preceding example, so this statement returns the same results as that example.
The modify() method
The methods we’ve looked at up to this point have been concerned only with reading data in an XML document, but the modify()
method lets you update that data. As the following syntax shows, the only argument you pass into the modify()
method is the XQuery expression:
1 |
DbObject.modify('XQuery') |
In this case, however, the XQuery expression is actually a special type of expression that uses the XML Data Modification Language (XML DML), which is a SQL Server extension to XQuery. The modify()
method lets you define XML DML expressions that can add, update, or delete elements within an XML document.
NOTE: This section covers only the basics of the modify()
method. You can find more information about the method and its uses in the XML Data Modification Language Workbench, which provides additional details and examples about modifying XML data.
Adding components to XML data
To add components to your XML data, you must specify the insert
keyword and target location in your expression. For instance, the following UPDATE
statement adds the <Comments>
element to the typed and untyped XML
columns in the Stores
table:
1 2 3 4 5 6 7 8 9 10 |
UPDATE Stores SET Survey_untyped.modify(' insert(<Comments>Largest bike store in region</Comments>) after(/StoreSurvey/NumberEmployees)[1]'), Survey_typed.modify('declare namespace ns= "http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/StoreSurvey"; insert(<ns:Comments>Largest bike store in region</ns:Comments>) after(/ns:StoreSurvey/ns:NumberEmployees)[1]') WHERE StoreID = 292; |
For the Survey_untyped
column, I first specified the modify()
method name, followed by the XML DML expression, which I enclosed in parentheses and single quotes, just like the other XML
methods. Within the expression, I included the insert
keyword and the element I wanted to add: (<Comments>Largest bike store in region</Comments>
). Notice that I enclosed the element in parenthesis and included the element’s value.
Next, I added the after
keyword and specified the location of where to add the new element. I also included the [1]
because the method requires a singleton value. Now the new element will be added after the first instance of the /StoreSurvey/NumberEmployees
element.
As you would expect, modifying the Survey_typed
column followed the same process, except that I also included the namespace declaration and aliases. An important issue to consider, however, when working with typed XML
columns is that any changes you make must conform to the schema that governs the XML content. For example, if I had tried to add the <Comments>
element to any other location within the document, I would have received a violation error. However, if you’ve set up your statement properly and adhered to the schema, the new element should be added to the XML document with no problem.
Updating components in XML data
To use the modify()
method to update XML data, you must include the replace value of
keywords, rather than the insert
keyword. You must then specify the component you want to update as well as the updated information. For example, in the following UPDATE
statement, I change the value of the <Comments>
element:
1 2 3 4 5 6 |
UPDATE Stores SET Survey_untyped.modify(' replace value of (/StoreSurvey/Comments/text())[1] with "2nd largest bike store in region" ') WHERE StoreID = 292; |
After I specified the replace value of
keywords, I added the element path along with the text()
node function, which let me change only the element’s value (without affecting the element itself). And as with the previous example, I also included a [1]
because a singleton value is expected. Next, I specified the with
keyword and the new element value, enclosed in double quotes. As a result, when I ran the statement, it replaced the old element value with the new one.
Deleting components from XML data
To use the modify()
method to remove data from an XML document, you must specify the delete
keyword, followed by the component you want to delete, as shown in the following example:
1 2 3 4 |
UPDATE Stores SET Survey_untyped.modify('delete(/StoreSurvey/Comments)[1]') WHERE StoreID = 292; |
In this case, it was simply a matter of specifying the path to the <Comments>
element, in parentheses, after the delete
keyword. Of course, all this was followed by [1]
to keep the database engine happy.
The XML Methods
The methods available to the XML data type provide you with a set of powerful tools for working with XML data. And as you can see from the examples, most of that power rests in your ability to create XQuery expressions that target the information you want to access. Yet the expressions shown here are relatively basic when compared to how extensive the XQuery language is. In fact, to make the most of what the XML methods offer, you must invest the time necessary to understand the various elements that make up that language. Until then, what I’ve shown you here should provide you with the first steps necessary to start accessing and updating your XML data. Just know that there’s a much bigger universe out there waiting to be discovered.
Load comments