Beginning SQL Server 2005 XML Programming

XML has been used to represent semi-structured (as well as unstructured) data such as documents and emails. If information in these models has to be queried, then XML is probably the simplest way to represent such information.

XML has been widely adopted as a platform-independent mechanism for representing data, and is also commonly used to exchange data between disparate and loosely-coupled systems, such as B2B applications and workflow solutions. More recently, XML has been used to represent semi-structured (as well as unstructured) data such as documents and emails. If information in these models has to be queried, then XML is probably the simplest way to represent such information.

For example, if documents are represented in XML, it is very easy to write a DOM or XPATH query to extract the contents of a section titled, for example, “Recipe”. Many applications also require other pieces of information that typically reside in a database. For example, consider an application that allows a user to personalize the look and feel of an application. It is not uncommon to store the preferences of the user against the user record in the database. XML lends itself well to storing such unstructured information to cope with an ever expanding list of user preferences.

When we store this kind of semi-structured or unstructured information as XML in the database, it is not always feasible to extract the content to the application tier and then process the XML using XML parsers. If the database platform is able to provide native XML processing capabilities then that is a massive advantage. In that way we can also take advantage of other database capabilities such as query optimization, indexing, backups and, most importantly of all, interoperability with relational data.

Fortunately, SQL Server 2005 now natively supports an XML data type and enterprise applications that rely heavily on XML processing can take advantage of these native capabilities. This article will provide:

  • A brief overview of the basic techniques for getting XML into and out of SQL Server. This section provides basic code examples and descriptions, together with links for further reading.
  • An introduction to the new XML data type, how to define XML columns and variables and load them with data
  • How to define typed XML columns using XML schema

XML Support in SQL Server 2000

SQL Server 2000 provided both server-side and client-side XML support and it is useful to briefly review these capabilities in order to fully appreciate how SQL Server 2005 extends and improves them. We will focus on the server-side capabilities in this article, since XML management in the database is a relatively new phenomenon. For further information, MSDN provides a very useful Survey of SQL Server 2000 XML Features article.

SQL Server 2000 Server-Side XML

The major elements of server-side XML support in SQL Server 2000 are as follows:

  1. The creation of XML fragments from relational data using the FOR XML extension to the SELECT statement
  2. The ability to shred XML data, using the OPENXML function, so that it can be imported into relational tables.
  3. Storing XML data natively in the database.


The FOR XML extension allows the creation of XML from relational data. It supports several “modifiers” that dictate the shape of the resulting XML fragment. Following is the full syntax of the FOR XML clause:

The following table explains each of the available options:




When specified, the results of a query are returned as an XML document. One of the three options, from RAW, AUTO and EXPLICIT, must be specified


Transforms each row in the result set into an XML element with a generic identifier <row /> as the element tag


Returns query results in a simple, nested XML tree. For each table in the FROM clause, for which there is at least one column listed in the SELECT clause, an XML element is created. The columns are mapped to the appropriate element attributes


Specifies the shape of the resulting XML tree explicitly. Using this mode, queries must be written in a particular way so that additional information about the nesting is specified


Returns the schema, but does not add the root element to the result


Specifies that the columns are returned as sub-elements to the table element. Otherwise, they are mapped as attributes


Specifies that the query returns the binary data in binary base64-encoded format

The following query uses AUTO mode and returns as an XML fragment information about a particular author. Note that we are using the pubs sample database that is shipped with SQL Server2000.

This query returns the following output:



The OPENXML function provides a rowset view over an XML document. Following is the full syntax of the OPENXML command:

Basically, the input to OPENXML is a “handle” to an XML document that is generated by making a call to the sp_xml_preparedocument procedure, and then a “row pattern” which is an XPATH used to identify the nodes in the XML document that are to be processed as rows. We can also specify the mapping information between the XML document and the relational rowset that is generated. The following example shows a simple usage of OPENXML:

The nodes are processed based on the rowset schema declaration provided after the WITH clause. The “2” in the query represents that we want to use element-centric mapping. The output of the above command is as follows:


You can refer to the full syntax of OPENXML, along with other complex examples, in the OPENXML section of SQL Server 2000 Books Online.

Storing XML in the Database

SQL Server 2000 also allows you to store XML documents and fragments in the database. However, since SQL Server 2000 does not support an XML data type, the XML needs to be stored in a TEXT, NTEXT or IMAGE column. When there is a need to process this XML, applications have to retrieve this XML into the application tier and then process the same using the MSXML APIs.

SQL Server 2000 Client Side XML

Client-side XML support in SQL Server 2000 comes in the form of SQLXML. Following is a brief summary of the main SQLXML technologies:

  • XML Views, which provide a bidirectional mapping between XML schemas and underlying relation tables. Essentially, mapping annotations are added to the XML Schema that defines your particular business object. This produces an an XML View that allows you to present the underlying data in a hierarchical, semi-structured format. A nice example of this is given in the previously-cited Survey of SQL Server 2000 XML Features article. XML views also support querying using XPATH and packaging the result as XML.
  • Creation of XML Templates that allow creation of dynamic sections in XML documents. FOR XML queries can be embedded within the XML document, and/or XPATH expressions over mapping queries. When the template is executed, the query block is replaced with the result of the query.

In SQL Server 2000, there are two ways to access SQLXML functionality:

  1. Via a SQLXMLOLEDB Provider that exposes SQLXML functionality through ADO
  2. Via HTTP access through an ISAPI filter. Using a configuration tool, you can set up a web site to receive incoming requests to execute XML templates, FOR XML and XPATH statements, and then apply transformations to the results

SQL Server 2005 XML Enhancements

Although SQL Server 2000 provides quite extensive XML support, as described, the major drawback is that it is not native, and also does not provide all of the features that traditional XML processing requires. Some of the limitations are:

  1. No support for XML schemas in the database for validating XML documents
  2. XML is only stored as text and thus cannot be queried using XPATH syntax. The only option is to extract the XML to the application tier using ADO (or other data access APIs) and perform manipulations
  3. Results generating XML cannot be stored in variables inside of procedures and functions. The only option available is to stream it to the client tier using FOR XML
  4. Support only for XPATH 1.0 queries. The poses restrictions on some of the types of queries that can be written

However, in spite of these limitations, the XML features of SQL Server 2000 are extensively used and there are many enterprise applications that actively depend on it.

SQL Server 2005 builds removes these limitations and offers XML as a full-fledged native XML data type along with querying capabilities. Enhancements have also been made to the FOR XML syntax. SQLXML 4.0 also brings with it many improvements. In this article, we will focus on the basics of the new XML data type.

The XML Data Type

The XML data type is a built-in data type similar to other built-in types in SQL Server 2005. You can define XML columns, XML variables, XML parameters for procedures and functions, and can also return XML as return values from functions. Internally, the XML data type is stored in a proprietary binary format. The maximum permissible size of the XML content is 2GB.

Defining XML Variables and Columns

To declare a variable of type XML is as simple as this:


In order to create an XML column (using untyped XML – more on this later), we could use the following:

Loading Data into XML Columns and Variables

Having defined an XML variable or column, we can store information in it via the following methods:

  1. Assigning a string to the XML variable / column
  2. Using the output of the SELECT…FOR XML clause
  3. Using bulkload

Simple String Assignment

We can load a string into our @xmlDoc variable as follows:

Note that string content is implicitly converted to the XML type. You can also explicitly convert the string to XML as shown:

In each example, we then select this XML content. If you run this example in SQL Server Management Studio with the “results to grid” option enabled, the XML output will be as shown below:


Notice that the XML content is hyperlinked and if you click this link, a new window opens and shows the entire XML document. This is a great improvement over SQL Server 2000 Query Analyzer, which would show the results of XML queries as a single line of output and also truncate the results if the result size is greater than 8192 bytes.

We can insert regular string content into the feedXML column of our rssFeeds table using the following method:


In the following example, we initialize the @xmlDOC variable using a SELECT…FOR XML statement:

The output of the above command is as follows (only partial results are shown. To see the full results, you need to click on the hyperlink in Management Studio):


Streaming results from FOR XML statements directly into XML variables is a great improvement over SQL Server 2000 (which did not allow this). Also note that the above example returns an XML fragment and not an XML document. If you need to return an XML document, you can use the new ROOT directive in the FOR XML command as shown:

Executing the above command produces the following output (again, only partial results are shown):


Bulkloading XML data using the OPENROWSET Function

SQL Server 2005 provides extensions to the OPENROWSET function to bulk load an XML document from the file system. The following example shows how to bulkload our @xmlDoc variable using this technique:

The RSS.XML file that is referred to in the above query is a sample RSS XML that I generated from the MSDN site. To do this, you can query the MSDN RSS site at: and then save results as XML from Internet Explorer.

The output of the above command is as follows:


Similarly, the following code uses the OPENROWSET command to load an XML document into our feedXML column:

The following output is generated:


Typing XML

Up to this point, the XML content that we have used is referred to as untyped XML. An untyped XML variable/column is one that is not associated with a schema. If we associate a schema with an XML type, it is then referred to as typed XML. Once an XML variable/column is typed using a schema, any data that is inserted into it is automatically validated against the schema, and SQL Server 2005 raises an error if the data does not conform to the defined schema rules.

It is usually recommended to associate schemas to XML content in SQL Server, since query optimization can use the information and also it provides better operational semantics. For example, we can perform arithmetic operations on numbers, but not strings. If no schema is attached to an XML, then all values are considered to be strings. MSDN provides further information about schemas and their maintenance.

Creating typed XML involves two basic steps: 1) Create an XML schema and 2) Register it in the database.

Creating a Schema

Using tools such as Visual Studio 2005 and XML Spy, creating a schema is pretty simple. Using the tool of your choice, create the following schema definition:

This schema defines an XML document of the form:

Save this schema in a file called ContactsSchema.xsd in the C:\Temp folder.

Registering a Schema in SQL Server 2005

You register a schema in SQL Server 2005 using the CREATE XML SCHEMA COLLECTION command. The following script loads our XML schema into the database.

In the following code, we type an XML variable to the MyContacts schema and then assign both valid and invalid content to the XML (by invalid, we mean that the content does not adhere to the schema):

Note how we have declared the XML variable. The schema reference is specified in parentheses after the variable. We have also used the new TRY…CATCH error handling syntax of SQL Server 2005 to capture the error and display the corresponding message.

When you execute the above snippet, the following output is generated:


You should also receive the following error message (I have wrapped the error message to make it readable):