{"id":1233,"date":"2011-11-07T00:00:00","date_gmt":"2011-11-07T00:00:00","guid":{"rendered":"https:\/\/test.simple-talk.com\/uncategorized\/working-with-the-xml-data-type-in-sql-server\/"},"modified":"2021-08-16T15:02:03","modified_gmt":"2021-08-16T15:02:03","slug":"working-with-the-xml-data-type-in-sql-server","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/databases\/sql-server\/learn\/working-with-the-xml-data-type-in-sql-server\/","title":{"rendered":"Working with the XML Data Type in SQL Server"},"content":{"rendered":"<div id=\"pretty\">\n<p class=\"start\">Not all SQL Server data types are created equal. Just look at the XML data type. On the surface, it might seem like your run-of-the-mill type, except, of course, being geared toward XML data; but the ways in which it&#8217;s used, how its data is queried, and when and how XML columns should be indexed quickly sets the type apart from the rest of the crowd. And those differences are what matter when working within the extensible world of XML.<\/p>\n<p>The XML data type, in fact, lies at the heart of understanding how to store and query XML data in a SQL Server database. That&#8217;s not to suggest that all XML data should be stored with the XML type, but knowing how the type works will help you determine when to use it and how to effectively access its data.<\/p>\n<p>In some cases, you shouldn&#8217;t use the XML data type, but instead use large object storage-<code>VARCHAR(MAX), NVARCHAR(MAX),<\/code> or <code>VARBINARY(MAX)<\/code>. For example, if you simply store your XML documents in the database and retrieve and update those documents as a whole-that is, if you never need to query or modify the individual XML components-you should consider using one of the large object data types. The same goes for XML files that you want to preserve in their original form, such as legal documents. If you need to retain an exact textual copy, use large object storage.<\/p>\n<p>But the rest of the time, you should consider the XML data type. The type ensures that the data is well formed according to ISO standards, and it supports fine-grained queries and modifications to specific elements and attributes within the XML. You can also index an XML column and associate its data with an XML schema collection in order to preserve its content and structure. In addition, the XML data type lets you store data that follows a structure too fluid and complex to fit easily into a relational model.<\/p>\n<p>However, when considering whether to use the XML data type, you should also be aware of its limitations. For instance, an XML column cannot be used as a key in an index, and a data value stored in an XML column cannot exceed 2 GB. You also cannot compare or sort data that uses the XML data type, nor can the data be used in a <code>GROUP BY<\/code> clause. For a complete description of the limitations on the XML data type, as well as other details about XML, see the topic &#8220;<a href=\"http:\/\/msdn.microsoft.com\/en-us\/library\/ms189887.aspx\">Implementing XML in SQL Server<\/a>&#8221; in SQL Server Books Online.<\/p>\n<h1>Creating XML Database Objects<\/h1>\n<p>SQL Server lets you assign the XML data type to columns, variables, or parameters and store in any of these objects either XML documents or fragments. The data is considered a document if it has a single top-level element. Otherwise it falls under the category of fragment.<\/p>\n<div class=\"note\">\n<p class=\"note\"><strong>NOTE:<\/strong> You can also assign the XML data type to values returned by a function. However, function return values usually require XML components more complex than what we&#8217;ll cover in this article. For this reason, functions will be covered in a later article, after those XML components have been discussed.<\/p>\n<\/div>\n<p>When you assign the XML data type to a column, variable, or parameter, you can optionally associate an XML schema collection with the object, thus ensuring that data within that object conforms to schema specifications. In such cases, the object is referred to as <i>typed<\/i>. An XML object with no associated schema collection is considered <i>untyped<\/i>.<\/p>\n<h2>Creating Untyped XML Objects<\/h2>\n<p>An untyped XML object still requires that the data be well formed according to ISO standards; however, the data is not bound to an XML schema collection. You should choose untyped XML (the default) if you don&#8217;t have a schema to associate with the data or you don&#8217;t want to adhere to the constraints imposed by a schema. For instance, you might have a workable schema but might also need to store nonconforming fragments temporarily in the XML column.<\/p>\n<p>To create an XML object in your database, you simply specify the XML data type as you would any other type. For instance, the following Transact-SQL code creates the <code>Resumes<\/code> tables, inserts data into the table, and then retrieves data from that table:<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">USE AdventureWorks2008R2;\r\nGO\r\n\u00a0\r\nIF OBJECT_ID('dbo.Resumes') IS NOT NULL\r\nDROP TABLE dbo.Resumes;\r\nGO\r\n\u00a0\r\nCREATE TABLE dbo.Resumes\r\n(\r\n\u00a0 CandidateID INT IDENTITY PRIMARY KEY,\r\n\u00a0 CandidateResume XML\r\n);\r\n\u00a0\r\nINSERT INTO Resumes (CandidateResume)\r\nSELECT Resume\r\nFROM HumanResources.JobCandidate;\r\n\u00a0\r\nSELECT * FROM Resumes;\r\n<\/pre>\n<p>Notice that the table includes the <code>CandidateResume<\/code> column, which is configured with the XML data type. Because this is an untyped column, you don&#8217;t have to specific any other parameters related to the XML. You can, of course, specify the nullability or other column properties, but XML is all you need to include to create an XML column.<\/p>\n<div class=\"note\">\n<p class=\"note\"><strong>NOTE:<\/strong> I created the examples in this article within the AdventureWorks2008R2 database on a local instance of SQL Server 2008 R2.<\/p>\n<\/div>\n<p>When I ran the example above, the<code> INSERT<\/code> statement added 13 rows from the <code>JobCandidate<\/code> table into the <code>Resumes<\/code> table. I verified that the new rows had been added by running the SELECT statement, which also returned 13 rows. Each returned row included an XML value in the <code>CandidateResume<\/code> column. (Note that, in SQL Server Management Studio, you can click an XML value to open a window that displays the entire XML document or fragment.)<\/p>\n<p>Because the <code>CandidateResume<\/code> column is untyped XML, the inserted data did not have to conform to a specific schema collection; however, the data still had to conform to the ISO standards that govern XML.<\/p>\n<p>You can create an XML variable just as easily as you can create a column. In the following example, I declare the <code>@Resume<\/code> variable and set its value equal to the XML data retrieved from the <code>Resumes<\/code> table for candidate 1:<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">DECLARE @resume XML;\r\n\u00a0\r\nSELECT @resume = CandidateResume\r\nFROM dbo.Resumes\r\nWHERE CandidateID = 1;\r\n\u00a0\r\nSELECT @resume AS Resume;\u00a0\r\n<\/pre>\n<p>As with the column definition, I simply specified the XML data type when I declared the variable. Again, the XML is untyped, which means that the data does not have to conform to a schema collection. When I ran the <code>SELECT<\/code> statement after declaring and setting the variable, it returned only a single value: the XML data I had inserted into the <code>Resumes<\/code> table for that candidate.<\/p>\n<p>As mentioned above, you can also assign the XML data type to a parameter. In the next example, I create a stored procedure that&#8217;s defined with an XML input parameter:<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">USE AdventureWorks2008R2;\r\nGO\r\n\u00a0\r\nIF OBJECT_ID ( 'dbo.AddResume', 'P' ) IS NOT NULL \r\nDROP PROCEDURE dbo.AddResume;\r\nGO\r\n\u00a0\u00a0\u00a0 \r\nCREATE PROCEDURE AddResume\r\n\u00a0 @resume XML\r\nAS\r\n\u00a0 INSERT INTO Resumes (CandidateResume)\r\n\u00a0 VALUES (@Resume);\r\nGO\u00a0\r\n<\/pre>\n<p>Notice that the <code>@Resume<\/code> parameter is untyped XML. The stored procedure uses that parameter to insert data into the <code>Resumes<\/code> table. After I created the procedure, I declared an XML variable named <code>@Resume2<\/code> and assigned XML data to the variable. (I retrieved the data from the <code>JobCandidate<\/code> table.) I then called the <code>@Resume2<\/code> variable when I executed the stored procedure, as shown in the following example:<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">DECLARE @resume2 XML;\r\n\u00a0\r\nSELECT @resume2 = Resume\r\nFROM HumanResources.JobCandidate\r\nWHERE JobCandidateID = 4;\r\n\u00a0\r\nEXEC AddResume @resume2;\r\n\u00a0\r\nSELECT * FROM Resumes;\u00a0\r\n<\/pre>\n<p>The stored procedure adds another row to the <code>Resumes<\/code> table. The row includes the XML data that I assigned to the <code>@Resume2<\/code> variable. When I ran the <code>SELECT <\/code>statement this time, it returned 14 rows and showed the new row that was added to the table.<\/p>\n<h2>Creating Typed XML Objects<\/h2>\n<p>If you have a schema collection available to associate with your XML data, you should consider using that collection. Typed XML has several advantages over untyped. To begin with, it provides another level of validation. Not only must the XML data be well formed (based on ISO standards), but it must also conform to the validation constraints and data type specifications defined by the schema. For example, the type information enforces more precise semantics on the XML values. In addition, storage and queries are better optimized than on an untyped column.<\/p>\n<p>However, before you can associate an XML object with a particularly schema, that schema must be registered in the database in which you&#8217;ll be defining the object. Only then can you reference that schema when you create your XML object.<\/p>\n<div class=\"note\">\n<p class=\"note\"><strong>NOTE:<\/strong> Registering schema collections is beyond the scope of this article. You can find details about how to register a collection in SQL Server Books Online, in the topic &#8220;<a href=\"http:\/\/msdn.microsoft.com\/en-us\/library\/ms176009.aspx\">CREATE XML SCHEMA COLLECTION (Transact-SQL)<\/a>.&#8221;<\/p>\n<\/div>\n<p>Once you&#8217;ve registered your schema collection, you can define your XML column, variable, or parameter. To do so, you must include the name of the collection in parenthesis after the XML data type name. For example, the following Transact-SQL re-creates the <code>Resumes<\/code> table with a typed XML column:<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">USE AdventureWorks2008R2;\r\nGO\r\n\u00a0\r\nIF OBJECT_ID('dbo.Resumes') IS NOT NULL\r\nDROP TABLE dbo.Resumes;\r\nGO\r\n\u00a0\r\nCREATE TABLE dbo.Resumes\r\n(\r\n\u00a0 CandidateID INT IDENTITY PRIMARY KEY,\r\n\u00a0 CandidateResume XML (HumanResources.HRResumeSchemaCollection)\r\n);\r\n\u00a0\r\nINSERT INTO Resumes (CandidateResume)\r\nSELECT Resume\r\nFROM HumanResources.JobCandidate;\r\n\u00a0\r\nSELECT * FROM Resumes;\u00a0\r\n<\/pre>\n<p>Notice that the <code>CandidateResume<\/code> column now includes a reference to the schema collection (in parentheses) after the data type name. In this case, I used <code>HumanResources.HRResumeSchemaCollection<\/code>, which is included in the <code>AdventureWorks2008R2<\/code> database. This is the same collection associated with the Resume column in the <code>JobCandidate<\/code> table.<\/p>\n<p>As you can see, the process for creating a typed XML object is fairly straightforward. As long as the schema collection is registered and you get the name right, you should have no problem. Something worth noting, though. By default, SQL Server permits typed XML objects to store data as either an XML document or fragment. You can override the default behavior and specify that the object be limited to only XML documents. To do so, precede the schema collection name with the <code>DOCUMENT<\/code> keyword. For example, the following column definition restricts the values to XML documents:<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">CandidateResume XML (DOCUMENT HumanResources.HRResumeSchemaCollection)<\/pre>\n<p>Regardless of whether you limit the XML to documents only, the process for creating a typed XML object is the same for columns, variables, and parameters. For instance, in the following example, I declare a typed XML variable and assign an XML value to the variable:<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">DECLARE @resume XML (HumanResources.HRResumeSchemaCollection);\r\n\u00a0\r\nSELECT @resume = CandidateResume\r\nFROM dbo.Resumes\r\nWHERE CandidateID = 1;\r\n\u00a0\r\nSELECT @resume AS Resume;\u00a0\r\n<\/pre>\n<p>As you can see, this example is nearly identical to my earlier example in which I declared an untyped variable. All I&#8217;ve done is add the schema name. Now any XML document or fragment assigned to this variable must adhere to the structure defined in the schema collection. And just like before, if I were to run the <code>SELECT<\/code> statement along with the other statements, one row with one value would be returned, and that value would be the XML data.<\/p>\n<p>At this point, you can probably see how easy it is to create a typed object. But just in case, here&#8217;s what it would look like if you re-created the <code>AddResume<\/code> stored procedure with a typed input parameter:<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">USE AdventureWorks2008R2;\r\nGO\r\n\u00a0\r\nIF OBJECT_ID ( 'dbo.AddResume', 'P' ) IS NOT NULL \r\nDROP PROCEDURE dbo.AddResume;\r\nGO\r\n\u00a0\u00a0\u00a0 \r\nCREATE PROCEDURE AddResume\r\n\u00a0 @resume XML (HumanResources.HRResumeSchemaCollection)\r\nAS\r\n\u00a0 INSERT INTO Resumes (CandidateResume)\r\n\u00a0 VALUES (@Resume);\r\nGO\r\n\u00a0\r\nDECLARE @resume2 XML;\r\n\u00a0\r\nSELECT @resume2 = Resume\r\nFROM HumanResources.JobCandidate\r\nWHERE JobCandidateID = 4;\r\n\u00a0\r\nEXEC AddResume @resume2;\r\n\u00a0\r\nSELECT * FROM Resumes;\u00a0\r\n<\/pre>\n<p>Again, there should be no surprises here. After I re-created the table and then re-created the stored procedure, I declared a variable to supply the parameter value to the stored procedure. When I ran the store procedure, it inserted the new row into the <code>Resumes<\/code> table, and the <code>SELECT<\/code> statement returned 14 rows.<\/p>\n<h1>Indexing an XML Column<\/h1>\n<p>Data in an XML column is stored as large binary objects (BLOBs). When no XML index is defined on the column, the database engine shreds the BLOBs at runtime during the query evaluation stage. This process can be quite time-consuming if your XML values are relatively large compared to the values being retrieved. In such cases, if you&#8217;re planning to generate a lot of queries, you should consider indexing your column.<\/p>\n<p>There are two types of XML indexes that you can create on an XML column:<\/p>\n<ul>\n<li><b>Primary:<\/b> Indexes all tags, paths, and values, along with such details as node type and document order information.<\/li>\n<li><b>Secondary:<\/b> One of three types of indexes (<code>PATH, VALUE<\/code>, and <code>PROPERTY<\/code>) that target specific query types used to retrieve data from the XML column.<\/li>\n<\/ul>\n<p>You must create a primary XML index before you can create any secondary XML indexes. In addition, the table that contains the XML column must be configured with a primary key and a clustered index based on that key. The database engine uses the primary key within the primary XML index to correlate rows.<\/p>\n<div class=\"note\">\n<p class=\"note\"><strong>NOTE:<\/strong> You can also create full-text indexes on XML columns. However, the indexes ignore the XML markup and include only element content.<\/p>\n<\/div>\n<h2>Creating a Primary XML Index<\/h2>\n<p>The primary XML index provides a shredded and persisted representation of the data in an XML column. The query process uses the index for queries that target specific components within the XML data. Queries that retrieve the full XML instance-that is, the entire XML document or fragment-do not use the primary XML index, but instead retrieve the data directly from the XML column.<\/p>\n<p>To create a primary XML index, you must specify the table and XML column where the index will be created, as shown in the following example:<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">CREATE PRIMARY XML INDEX idx_resumes_xml \r\nON Resumes(CandidateResume);\u00a0\r\n<\/pre>\n<p>After you provide the <code>CREATE PRIMARY XML INDEX<\/code> keywords, you must specify a name for the index (in this case, <code>idx_resumes_xml<\/code>). Next, add the <code>ON clause<\/code>, which includes the name of the table (<code>Resumes<\/code>) and the name of the XML column (<code>CandidateResume<\/code>).<\/p>\n<p>That&#8217;s all there is to creating a primary XML index. Once you&#8217;ve done that, you can create one or more secondary XML indexes.<\/p>\n<h2>Creating a Secondary XML Index<\/h2>\n<p>As mentioned above, SQL Server lets you define three types of secondary XML indexes:<\/p>\n<ul>\n<li><b>PATH:<\/b> Supports queries that use a significant number of path expressions.<\/li>\n<li><b>PROPERTY:<\/b> Supports queries that use path expressions to retrieve multiple values from individual XML instances.<\/li>\n<li><b>VALUE:<\/b> Supports queries that retrieve values without knowing the element or attribute names that contain those values.<\/li>\n<\/ul>\n<p>You can define any combination of secondary XML indexes on your XML column, as long as a primary XML index has first been defined on that column. For instance, you can define <code>PATH<\/code> and <code>VALUE<\/code> indexes or<code> PATH<\/code> and <code>PROPERTY<\/code> indexes or only one index or all three.<\/p>\n<div class=\"note\">\n<p class=\"note\"><strong>NOTE:<\/strong> To fully understand the differences between the three types of secondary XML indexes, you need a basic understanding of how you query specific XML components (as opposed to querying the entire XML instance). In my next article on XML, I&#8217;ll cover the various methods used to access the individual components within an XML document or fragment.<\/p>\n<\/div>\n<p>Creating a secondary XML index is slightly more complicated than creating a primary one, but not too terribly difficult. As the following example illustrates, you must first specify an index name as well as the target table and column:<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">CREATE XML INDEX idx_resumes_xml_value\r\nON Resumes(CandidateResume)\r\nUSING XML INDEX idx_resumes_xml\r\nFOR VALUE;\u00a0\r\n<\/pre>\n<p>The first two lines of this statement should look similar to the statement you use to create a primary XML index. The only thing missing is the<code> PRIMARY<\/code> keyword. In the <code>CREATE XML INDEX<\/code> clause, you provide a name for the index (in this case, <code>idx_resumes_xml_value<\/code>). In the <code>ON<\/code> clause, you provide the name of the table (<code>Resumes<\/code>) and the name of the XML column (<code>CandidateResume<\/code>).<\/p>\n<p>The next clause is <code>USING XML INDEX<\/code>. This is where you specify the name of the column&#8217;s primary XML index <code>(idx_resumes_xml<\/code>). Next, you define the <code>FOR <\/code>clause, which specifies the type of secondary index you want to create. In the example above, I created a <code>VALUE <\/code>secondary index.<\/p>\n<p>That&#8217;s all there is to creating a secondary index. If you want to create a <code>PROPERTY<\/code> or <code>PATH<\/code> index on the same column, you simply change the index name and the value in the <code>FOR<\/code> clause.<\/p>\n<p>You can verify that the primary and secondary XML indexes have been created by using the <code>sys.xml_indexes<\/code> catalog view to retrieve a list of indexes. For example, I used the following <code>SELECT<\/code> statement to retrieve those indexes whose names contain &#8216;resume&#8217;:<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">SELECT name AS IndexName\r\nFROM sys.xml_indexes\r\nWHERE name LIKE '%resumes%';\u00a0\r\n<\/pre>\n<p>The statement should return the following results:<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">IndexName\r\nidx_resumes_xml\r\nidx_resumes_xml_value\u00a0\r\n<\/pre>\n<p>You can remove these indexes from the database by using a <code>DROP INDEX <\/code>statement, as shown in the following example:<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">DROP INDEX idx_resumes_xml ON Resumes;<\/pre>\n<p>This statement drops the primary XML index from the XML column. When you drop a primary index, the secondary indexes are automatically removed. As a result, if I were to re-query the <code>sys.xml_indexes<\/code> catalog view, I would not see any of the XML indexes I created on the <code>Resumes<\/code> table.<\/p>\n<h1>Moving Forward with XML<\/h1>\n<p>As the examples have demonstrated, you can implement XML database objects and create XML indexes on columns with relative ease. And inserting data into those columns is little different from inserting data into other column types, assuming the data conforms to the type&#8217;s structure. The ingredient that&#8217;s missing from this mix is how you work with that data once you get it in there.<\/p>\n<p>As it turns out, the XML data type supports several methods for querying and manipulating XML data. Unfortunately, that&#8217;s a discussion that must wait till my next article, in which I&#8217;ll cover each of these methods and show you how to use them to work with XML data. Until then, you should now have a basic overview of the XML data type and how to get started using it when creating columns, variables, and parameters. From this foundation, you&#8217;ll be ready to jump into the world of the XML methods in no time at all.<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>The XML data type, introduced in SQL Server 2005, is a powerful construct.  When used wisely, it can provide useful extensions to SQL Server. Robert Sheldon, in the first part of a series, describes how create and index a typed XML column in a table, and discusses when you should consider using an XML Datatype.&hellip;<\/p>\n","protected":false},"author":221841,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[143525],"tags":[5508,4149,4150,5134,4151,4217,5507],"coauthors":[6779],"class_list":["post-1233","post","type-post","status-publish","format-standard","hentry","category-learn","tag-data-types","tag-learn-sql-server","tag-sql","tag-sql-prompt","tag-sql-server","tag-xml","tag-xml-data-type"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/1233","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/221841"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=1233"}],"version-history":[{"count":4,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/1233\/revisions"}],"predecessor-version":[{"id":75141,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/1233\/revisions\/75141"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=1233"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=1233"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=1233"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=1233"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}