Articles by Phil Factor about JSON and SQL Server:
- Consuming JSON Strings in SQL Server (Nov 2012)
- SQL Server JSON to Table and Table to JSON (March 2013)
- Producing JSON Documents From SQL Server Queries via TSQL (May 2014)
- Consuming hierarchical JSON documents in SQL Server using OpenJSON (Sept 2017)
- Importing JSON data from Web Services and Applications into SQL Server(October 2017)
If you need to provide JSON-based results from the database, you are faced with a problem, because SQL Server doesn’t , at the time of writing this, have native JSON-integration (see the article ‘Consuming hierarchical JSON documents in SQL Server using OpenJSON‘ in the list above for a SQL Server 2016 solution). You would normally craft the SQL to do this ‘by hand’ for the specific job, using one of a variety of techniques. However, what if the developers needed to have large variety of results, maybe even from SQL created by the application rather than the database? That’s cool, though there will be a performance hit to using a generic solution that will produce JSON from any SQL. Until SQL Server provides ‘native’ JSON support it will be rather slow, but not difficult.
In this article, I’ll be introducing a few ideas about providing a generic way to produce JSON and other types of data documents, from SQL.
The simplest strategy to provide a general way to output JSON is to convert your SQL result to XML by using the FOR XML
syntax of the SELECT
statement. Then you parse the XML, which has obligingly converted all the values into string form, gives you the names, their position in the hierarchy and even the DataType. When I say ‘simple’ I mean simpler than any other alternative. The advantage of using XML is that you can make use of the versatility of the WITH XML PATH
syntax, or the XPath SQL Server extensions, to specify the hierarchy. If you have a generic way to convert from any XML document, whether derived from SQL or not, to JSON, then it becomes even more useful!
If you’re not too concerned with performance, you can experiment with some functions I’ve published over the years for dealing with data documents. For handling the type of hierarchical information that is transferred via JSON, CSV or XML, I’ve published a variety of stored procedures and functions that all use a rather crude hierarchy table that is sufficient for the purpose. There is already a function that produces a JSON document from such a table. All we have to do is to add a new Table-Valued function that produces a hierarchy from XML. We then just create the SQL we want, get it to create XML and the rest is simple.
Back to AdventureWorks 2008, and here is an example of its use.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 |
DECLARE @MyHierarchy Hierarchy -- to pass the hierarchy table around insert into @MyHierarchy SELECT * from dbo.ParseXML( ---your SQL Goes here ---> (SELECT top 5 title,firstname,middleName,Lastname,suffix,emailAddress,Phone, AddressLine1,AddressLine2,City, postalcode,SP.Name,SP.CountryRegionCode as ContactString from AdventureWorks.person.contact c inner join AdventureWorks.Sales.Individual I ON C.ContactID = I.ContactID inner join AdventureWorks.Sales.CustomerAddress AS CA ON CA.CustomerID = I.CustomerID inner join AdventureWorks.Person.Address AS A ON A.AddressID = CA.AddressID inner join AdventureWorks.person.StateProvince SP ON A.StateProvinceID=SP.StateProvinceID ---You add this magic spell, making it XML, and giving a name for the 'list' of rows and the root for XML path ('customer'), root('customers') -- end of SQL ) ) SELECT dbo.ToJSON(@MyHierarchy) |
Well, that’s it. You just give the function consisting of any working SQL Query, adding the FOR XML PATH, ROOT to turn it into XML and you’ll get JSON, reflecting the structures you specify in the FOR XML query. As in much of life, there is some detail to attend to, which we’ll go into later. Firstly, this is fine for the small stuff, but this requires a lot of string manipulation, and SQL isn’t designed for doing that. If you just want to churn out JSON, we’ll also show you a ‘quick’ version that will dish out 20,000 rows of ‘flattened’ JSON in around ten to fifteen seconds. The limitations are that you lose any attributes, and you have to use the simple default ‘root’ and ‘row’ structure of the XML, but none of that will matter for the common jobs since you don’t need attributes, or nesting, and if you do simple queries with FOR XML PATH, ROOT
you don’t get ’em. So here we go with an example,
1 2 3 4 5 6 7 8 9 10 11 |
SELECT dbo.FlattenedJSON( (SELECT top 20 o.SalesOrderID, o.OrderDate, od.ProductID, p.Name, od.OrderQty, od.UnitPrice, od.LineTotal FROM AdventureWorks.Sales.SalesOrderHeader AS o JOIN AdventureWorks.Sales.SalesOrderDetail AS od ON o.SalesOrderID = od.SalesOrderID JOIN AdventureWorks.Production.Product AS p ON od.ProductID = p.ProductID WHERE p.Name like 'Road%' FOR XML path, root) ) |
And there you have your JSON ‘document’.
JSON representation of tabular results.
Although there is no ANSI SQL way of representing results from SQL Expressions as JSON, SELECT
statements usually produce results that are represented by ‘flattened’ or ‘raw’ JSON lists, like this
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 |
[ {"Name":"Road-650 Red, 44","OrderQty":"1","UnitPrice":"419.4589"}, {"Name":"Road-450 Red, 52","OrderQty":"1","UnitPrice":"874.7940"}, {"Name":"Road-650 Red, 52","OrderQty":"3","UnitPrice":"419.4589"}, {"Name":"Road-650 Black, 52","OrderQty":"5","UnitPrice":"419.4589"}, {"Name":"Road-450 Red, 58","OrderQty":"4","UnitPrice":"874.7940"}, {"Name":"Road-650 Red, 44","OrderQty":"5","UnitPrice":"419.4589"}, {"Name":"Road-650 Black, 58","OrderQty":"3","UnitPrice":"419.4589"}, {"Name":"Road-650 Black, 44","OrderQty":"2","UnitPrice":"419.4589"}, {"Name":"Road-150 Red, 56","OrderQty":"1","UnitPrice":"2146.9620"}, {"Name":"Road-450 Red, 44","OrderQty":"1","UnitPrice":"874.7940"}, {"Name":"Road-650 Red, 48","OrderQty":"3","UnitPrice":"419.4589"}, {"Name":"Road-450 Red, 52","OrderQty":"6","UnitPrice":"874.7940"}, {"Name":"Road-150 Red, 62","OrderQty":"1","UnitPrice":"2146.9620"}, {"Name":"Road-650 Red, 60","OrderQty":"3","UnitPrice":"419.4589"}, {"Name":"Road-650 Black, 60","OrderQty":"3","UnitPrice":"419.4589"}, {"Name":"Road-450 Red, 60","OrderQty":"1","UnitPrice":"874.7940"}, {"Name":"Road-650 Red, 60","OrderQty":"1","UnitPrice":"419.4589"}, {"Name":"Road-650 Red, 52","OrderQty":"1","UnitPrice":"419.4589"}, {"Name":"Road-150 Red, 56","OrderQty":"1","UnitPrice":"2146.9620"}, {"Name":"Road-450 Red, 44","OrderQty":"1","UnitPrice":"874.7940"} ] |
I just did the top twenty records just so we could see them in the article, but this should be good for 20,000 in a batch.
But this is not the only way of doing it. We can take the opportunity of JSON to return a nested hierarchy of data
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 |
{ "employee": { "EmployeeID" : 1537, "Sale_Order" : [ { "CustomerID" : 6812 , "Region" : "East Anglia" }, ... { "CustomerID" : 2543 , "Region" : "Wales" } ] } }, { "employee" : { "Employee_ID" : 5723, "Sale_Order" : [ { "Customer_ID" : 234 , "Region" : "London" }, ... { "Customer_ID" : 68125 , "Region" : "Midlands" } ] } } |
Or even using a similar device to CSV
1 2 3 4 5 6 7 8 |
{ "columns":[ "Make", "Model", "Year", "Color", "Mileage" ], "results": [ ["Volkswagen","Eurovan","2003","White","56,830"], ["Honda","CRV","2009","Black","35,600"] ] } |
Once they are suitably compressed, the obvious space-savings tend to vanish so the formats are down to personal preference. Also, the ‘flattened’ format can be formatted with indents to make it easier to read.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 |
[ { "Name":"Road-650 Red, 44", "OrderQty":"1", "UnitPrice":"419.4589" }, { "Name":"Road-650 Red, 48", "OrderQty":"3", "UnitPrice":"419.4589" }, { "Name":"Road-650 Red, 52", "OrderQty":"1", "UnitPrice":"419.4589" }, { "Name":"Road-150 Red, 56", "OrderQty":"1", "UnitPrice":"2146.9620" }, { "Name":"Road-450 Red, 44", "OrderQty":"1", "UnitPrice":"874.7940" } ] |
Getting your JSON out fast.
Because SQL produces, from the ‘document’ perspective, flat rows of values, we can cut some corners to get JSON out fast. Firstly, we’ll insist that the root is called ‘root’ and each row is called ‘row’. All we have to worry about now is to render every row in JSON format, with the right separators and brackets to represent a JSON list of objects, which consist of name/Value pairs.
Here is a simple version
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 |
DECLARE @cars xml; SET @cars = '<?xml version="1.0" encoding="UTF-8"?> <root> <row> <Make>Volkswagen</Make> <Model>Eurovan</Model> <Year>2003</Year> <Color>White</Color> </row> <row> <Make>Honda</Make> <Model>CRV</Model> <Year>2009</Year> <Color>Black</Color> <Mileage>35,600</Mileage> </row> </root>'; SELECT Stuff( --we want to snip out the leading comma (SELECT TheLine from --this is to glue each row into a string (SELECT ', {'+ --this is the start of the row, representing the row object in the JSON list --the local-name(.) is an eXPath function that gives you the name of the node Stuff((SELECT ',"'+coalesce(b.c.value('local-name(.)', 'NVARCHAR(255)'),'')+'":"'+ b.c.value('text()[1]','NVARCHAR(MAX)') +'"' -- 'text()[1]' gives you the text contained in the node from x.a.nodes('*') b(c) --get the row XML and split it into each node for xml path(''),TYPE).value('(./text())[1]','NVARCHAR(MAX)') ,1,1,'')+'}'--remove the first comma from @Cars.nodes('/root/*') x(a) --get every row ) JSON(theLine) --each row for xml path(''),TYPE).value('.','NVARCHAR(MAX)' ) ,1,1,'')--remove the first leading comma |
Which would give you this
1 2 |
{"Make":"Volkswagen","Model":"Eurovan","Year":"2003","Color":"White"}, {"Make":"Honda","Model":"CRV","Year":"2009","Color":"Black","Mileage":"35,600"} |
Hmm. This seems OK. I’ve tried several methods but this seems to work best. I don’t profess any expertise in SQL XML querying so please let me know if there is a faster way of doing this!
We now wrap it into a function and take care of properly escaping various whitespace characters
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 |
IF OBJECT_ID (N'dbo.FlattenedJSON') IS NOT NULL DROP FUNCTION dbo.FlattenedJSON GO CREATE FUNCTION dbo.FlattenedJSON (@XMLResult XML) RETURNS NVARCHAR(MAX) WITH EXECUTE AS CALLER AS BEGIN DECLARE @JSONVersion NVARCHAR(MAX), @Rowcount INT SELECT @JSONVersion = '', @rowcount=COUNT(*) FROM @XMLResult.nodes('/root/*') x(a) SELECT @JSONVersion=@JSONVersion+ STUFF( (SELECT TheLine FROM (SELECT ', {'+ STUFF((SELECT ',"'+COALESCE(b.c.value('local-name(.)', 'NVARCHAR(255)'),'')+'":"'+ REPLACE( --escape tab properly within a value REPLACE( --escape return properly REPLACE( --linefeed must be escaped REPLACE( --backslash too REPLACE(COALESCE(b.c.value('text()[1]','NVARCHAR(MAX)'),''),--forwardslash '\', '\\'), '/', '\/'), CHAR(10),'\n'), CHAR(13),'\r'), CHAR(09),'\t') +'"' FROM x.a.nodes('*') b(c) FOR XML PATH(''),TYPE).value('(./text())[1]','NVARCHAR(MAX)'),1,1,'')+'}' FROM @XMLResult.nodes('/root/*') x(a) ) JSON(theLine) FOR XML PATH(''),TYPE).value('.','NVARCHAR(MAX)' ) ,1,1,'') IF @Rowcount>1 RETURN '['+@JSONVersion+' ]' RETURN @JSONVersion END |
You can alter the routine to give you different formats. Here is a bare-bones version that gives you a rather similar document to the old CSV format that I’ve described earlier in the article.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 |
DECLARE @cars xml; SET @cars = '<?xml version="1.0" encoding="UTF-8"?> <root> <row> <Make>Volkswagen</Make> <Model>Eurovan</Model> <Year>2003</Year> <Color>White</Color> <Mileage>56,830</Mileage> </row> <row> <Make>Honda</Make> <Model>CRV</Model> <Year>2009</Year> <Color>Black</Color> <Mileage>35,600</Mileage> </row> </root>'; SELECT '{ "columns":['--start by creating a list of all the column headings, like the CSV header-line. +Stuff((SELECT ', "'+coalesce(x.a.value('local-name(.)', 'NVARCHAR(255)'),'')+'"' from @Cars.nodes('/root[1]/row[1]/*') x(a) --get first row for xml path(''),TYPE).value('(./text())[1]','NVARCHAR(MAX)') ,1,1,'')+' ],'--the end of the list of column headings + ' "results": ['--create a list (rows) of lists (tuples) +Stuff( --we want to snip out the leading comma (SELECT TheLine from --this is to glue each row into a string (SELECT ', ['+ --this is the start of the row, representing the row object in the JSON list --the local-name(.) is an eXPath function that gives you the name of the node Stuff((SELECT ',"'+ b.c.value('text()[1]','NVARCHAR(MAX)') +'"' -- 'text()[1]' gives you the text contained in the node from x.a.nodes('*') b(c) --get the row XML and split it into each node for xml path(''),TYPE).value('(./text())[1]','NVARCHAR(MAX)') ,1,1,'')+']'--remove the first comma from @Cars.nodes('/root/*') x(a) --get every row ) JSON(theLine) --each row for xml path(''),TYPE).value('.','NVARCHAR(MAX)' ) ,1,1,'')--remove the first leading comma +' ] }' |
1 2 3 4 5 6 7 8 |
{ "columns":[ "Make", "Model", "Year", "Color", "Mileage" ], "results": [ ["Volkswagen","Eurovan","2003","White","56,830"], ["Honda","CRV","2009","Black","35,600"] ] } |
More Complex JSON
Sometimes, you don’t want flattened JSON. Sometimes you have attributes, for example. You might have a more complex hierarchy. The FOR XML PATH syntax can produce truly startling XML. Just to test our routine out though, we’ll first take a classic XML example from the JSON documentation.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 |
Declare @XMLSample XML SELECT @XMLSample=' <glossary><title>example glossary</title> <GlossDiv><title>S</title> <GlossList> <GlossEntry id="SGML"" SortAs="SGML"> <GlossTerm>Standard Generalized Markup Language</GlossTerm> <Acronym>SGML</Acronym> <Abbrev>ISO 8879:1986</Abbrev> <GlossDef> <para>A meta-markup language, used to create markup languages such as DocBook.</para> <GlossSeeAlso OtherTerm="GML" /> <GlossSeeAlso OtherTerm="XML" /> </GlossDef> <GlossSee OtherTerm="markup" /> </GlossEntry> </GlossList> </GlossDiv> </glossary>' DECLARE @MyHierarchy Hierarchy -- to pass the hierarchy table around insert into @MyHierarchy SELECT * from dbo.ParseXML(@XMLSample) SELECT dbo.ToJSON(@MyHierarchy) |
Which would give you
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 |
{ "title" : "example glossary", "GlossDiv" : { "title" : "S", "GlossList" : { "GlossEntry" : { "GlossTerm" : "Standard Generalized Markup Language", "Acronym" : "SGML", "Abbrev" : "ISO 8879:1986", "GlossDef" : { "para" : "A meta-markup language, used to create markup languages such as DocBook.", "GlossSeeAlso" : { "OtherTerm" : "GML" }, "GlossSeeAlso" : { "OtherTerm" : "XML" } }, "GlossSee" : { "OtherTerm" : "markup" }, "ID" : "SGML", "SortAs" : "SGML" } } } } |
What we’ve done here is to convert the result into an intermediary SQL Table which is defined by a hierarchy type. Markup languages such as JSON and XML all represent object data as hierarchies. We then pass it to a separate function, ToJSON, to render it as JSON.
The first trick is to represent it as a Adjacency list hierarchy in a table. This Adjacency list is really the Database equivalent of any of the nested data structures that are used for the interchange of serialized information with the application, and can be used to create XML, CSV, OSX Property lists, Python nested structures or YAML as easily as JSON.
Adjacency list tables have the same structure whatever the data in them. This means that you can define a single Table-Valued Type and pass data structures around between stored procedures. However, they are best held at arms-length from the data, since they are not relational tables, but something more like the dreaded EAV (Entity-Attribute-Value) tables. Converting the data from its Hierarchical table form will be different for each application, but is easy with a CTE.
It is unlikely that we’d have to produce JSON from raw XML as we did in the first example, but you might be faced by a SQL query like this, embedded in this SQL harness
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 |
DECLARE @MyHierarchy Hierarchy -- to pass the hierarchy table around insert into @MyHierarchy SELECT * from dbo.ParseXML(( --->The Query starts here SELECT ProductModelID AS "@ProductModelID", Name AS "@ProductModelName", (SELECT top 20 ProductID AS "data()" FROM AdventureWorks2012.Production.Product p WHERE p.ProductModelID = p.ProductModelID FOR XML PATH ('') ) AS "@ProductIDs", ( SELECT top 10 Name AS "ProductName" FROM AdventureWorks2012.Production.Product pp WHERE pp.ProductModelID = pp.ProductModelID FOR XML PATH (''), type ) AS "ProductNames" FROM AdventureWorks2012.Production.ProductModel WHERE ProductModelID= 7 OR ProductModelID=9 FOR XML PATH('ProductModelData') ---> Query stops here )) SELECT dbo.ToJSON(@MyHierarchy) |
… which gives this:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
{ "ProductNames" : [ "HL Road Frame - Black, 58", "HL Road Frame - Red, 58", "Sport-100 Helmet, Red", "Sport-100 Helmet, Black", "Mountain Bike Socks, M", "Mountain Bike Socks, L", "Sport-100 Helmet, Blue", "AWC Logo Cap", "Long-Sleeve Logo Jersey, S", "Long-Sleeve Logo Jersey, M" ], "ProductModelID" : 9, "ProductModelName" : "LL Road Frame", "ProductIDs" : "680 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724" } |
The code to do all this is attached to this article, but it is worth going through the ParseXML
function that takes an XML fragment or document and creates a SQL Hierarchy table from it. It can be used to inspect the contents of XML documents, of course, but it is primarily for interchange and for getting the contents of an XML table when you don’t know the structure up-front well-enough to use XPath, or if, like me, your brain isn’t quite big enough to understand the full expanse of XPath.
Basically, the routine starts by saving the base nodes (the root node if it is there) in a temporary table along with their name and then repeatedly extracting any element and attribute associated with it, and all its nested nodes until it has done them all. It then works out what sort of DataType each one is. It does all this simply by exploring the XML Structure without prior knowledge, using XPath wildcard syntax, and then using an XPath function ‘local-name()’ to determine the name of each node. It uses ‘text()[1]” to get the value, if any, associated with the node and also finds any attributes associated with the node. It does not make any difference between an attribute or element, since no other document format seems to care. Although I got a lot of inspiration for this routine from XMLTable
routine of the great Jacob Sebastian (whose book on XSD should be on your bedside table) I didn’t follow the temptation to use a recursive CTE. No sir, iteration is good enough for me, and a lot quicker, it turns out! Jacob was the pioneer, and still wins the elegance prize for
XMLTable
, though.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
CREATE TYPE dbo.Hierarchy AS TABLE /*Markup languages such as JSON and XML all represent object data as hierarchies. Although it looks very different to the entity-relational model, it isn't. It is rather more a different perspective on the same model. The first trick is to represent it as a Adjacency list hierarchy in a table, and then use the contents of this table to update the database. This Adjacency list is really the Database equivalent of any of the nested data structures that are used for the interchange of serialized information with the application, and can be used to create XML, OSX Property lists, Python nested structures or YAML as easily as JSON. Adjacency list tables have the same structure whatever the data in them. This means that you can define a single Table-Valued Type and pass data structures around between stored procedures. However, they are best held at arms-length from the data, since they are not relational tables, but something more like the dreaded EAV (Entity-Attribute-Value) tables. Converting the data from its Hierarchical table form will be different for each application, but is easy with a CTE. You can, alternatively, convert the hierarchical table into XML and interrogate that with XQuery */ ( element_id INT primary key, /* internal surrogate primary key gives the order of parsing and the list order */ sequenceNo int NULL, /* the place in the sequence for the element */ parent_ID INT,/* if the element has a parent then it is in this column. The document is the ultimate parent, so you can get the structure from recursing from the document */ [Object_ID] INT,/* each list or object has an object id. This ties all elements to a parent. Lists are treated as objects here */ NAME NVARCHAR(2000),/* the name of the object, null if it hasn't got one */ StringValue NVARCHAR(MAX) NOT NULL,/*the string representation of the value of the element. */ ValueType VARCHAR(10) NOT null /* the declared type of the value represented as a string in StringValue*/ ) Go |
Now here is the code for the function that parses XML to fill a hierarchy table
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 |
IF OBJECT_ID (N'dbo.ParseXML') IS NOT NULL DROP FUNCTION dbo.ParseXML GO CREATE FUNCTION dbo.ParseXML( @XML_Result XML) /* Returns a hierarchy table from an XML document. Author: Phil Factor Revision: 1.2 date: 1 May 2014 example: DECLARE @MyHierarchy Hierarchy INSERT INTO @myHierarchy SELECT* from dbo.ParseXML((SELECT* from adventureworks.person.contact where contactID in (123,124,125) FOR XML path('contact'), root('contacts'))) SELECTdbo.ToJSON(@MyHierarchy) DECLARE @MyHierarchy Hierarchy INSERT INTO @myHierarchy SELECT* from dbo.ParseXML('<root><CSV><item Year="1997" Make="Ford" Model="E350" Description="ac, abs, moon" Price="3000.00" /><item Year="1999" Make="Chevy" Model="Venture "Extended Edition"" Description="" Price="4900.00" /><item Year="1999" Make="Chevy" Model="Venture "Extended Edition, Very Large"" Description="" Price="5000.00" /><item Year="1996" Make="Jeep" Model="Grand Cherokee" Description="MUST SELL! air, moon roof, loaded" Price="4799.00" /></CSV></root>') SELECTdbo.ToJSON(@MyHierarchy) */ RETURNS @Hierarchy TABLE ( Element_ID INT PRIMARY KEY, /* internal surrogate primary key gives the order of parsing and the list order */ SequenceNo INT NULL, /* the sequence number in a list */ Parent_ID INT,/* if the element has a parent then it is in this column. The document is the ultimate parent, so you can get the structure from recursing from the document */ [Object_ID] INT,/* each list or object has an object id. This ties all elements to a parent. Lists are treated as objects here */ [Name] NVARCHAR(2000),/* the name of the object */ StringValue NVARCHAR(MAX) NOT NULL,/*the string representation of the value of the element. */ ValueType VARCHAR(10) NOT NULL /* the declared type of the value represented as a string in StringValue*/ ) AS BEGIN DECLARE @Insertions TABLE( Element_ID INT IDENTITY PRIMARY KEY, SequenceNo INT, TheLevel INT, Parent_ID INT, [Object_ID] INT, [Name] VARCHAR(50), StringValue VARCHAR(MAX), ValueType VARCHAR(10), TheNextLevel XML, ThisLevel XML) DECLARE @RowCount INT, @ii INT --get the base-level nodes into the table INSERT INTO @Insertions (TheLevel, Parent_ID, [Object_ID], [Name], StringValue, SequenceNo, TheNextLevel, ThisLevel) SELECT 1 AS TheLevel, NULL AS Parent_ID, NULL AS [Object_ID], FirstLevel.value('local-name(.)', 'varchar(255)') AS [Name], --the name of the element FirstLevel.value('text()[1]','varchar(max)') AS StringValue,-- its value as a string ROW_NUMBER() OVER(ORDER BY (SELECT 1)) AS SequenceNo,--the 'child number' (simple number sequence here) FirstLevel.query('*'), --The 'inner XML' of the current child FirstLevel.query('.') --the XML of the parent FROM @XML_Result.nodes('/*') a(FirstLevel) --get all nodes from the XML SELECT @RowCount=@@RowCount --we need this to work out if we are rendering an object or a list. SELECT @ii=2 WHILE @RowCount>0 --while loop to avoid recursion. BEGIN INSERT INTO @Insertions (TheLevel, Parent_ID, [Object_ID], [Name], StringValue, SequenceNo, TheNextLevel, ThisLevel) SELECT --all the elements first @ii AS TheLevel, --(2 to the final level) a.Element_ID, --the parent node NULL, --we do this later. The object ID is merely a surrogate key to distinguish each node [then].value('local-name(.)', 'varchar(255)') AS [name], --the name [then].value('text()[1]','varchar(max)') AS [value], --the value ROW_NUMBER() OVER(PARTITION BY a.Element_ID ORDER BY (SELECT 1)),--the order in the sequence [then].query('*'), --the 'inner' XML for the node [then].query('.') --the XML from which this node was extracted FROM @Insertions a CROSS apply a.TheNextLevel.nodes('*') whatsNext([then]) WHERE a.TheLevel = @ii - 1 --only look at the previous level UNION ALL -- to pick out the attributes of the preceding level SELECT @ii AS TheLevel, a.Element_ID,--the parent node NULL,--we do this later. The object ID is merely a surrogate key to distinguish each node [then].value('local-name(.)', 'varchar(255)') AS [name], --the name [then].value('.','varchar(max)') AS [value],--the value ROW_NUMBER() OVER(PARTITION BY a.Element_ID ORDER BY (SELECT 1)),--the order in the sequence '' , ''--no nodes FROM @Insertions a CROSS apply a.ThisLevel.nodes('/*/@*') whatsNext([then])--just find the attributes WHERE a.TheLevel = @ii - 1 OPTION (RECOMPILE) SELECT @RowCount=@@ROWCOUNT SELECT @ii=@ii+1 END; --roughly type the DataTypes (no XSD available here) UPDATE @Insertions SET [Object_ID]=CASE WHEN StringValue IS NULL THEN Element_ID ELSE NULL END, ValueType = CASE WHEN StringValue IS NULL THEN 'object' WHEN LEN(StringValue)=0 THEN 'string' WHEN StringValue LIKE '%[^0-9.-]%' THEN 'string' WHEN StringValue LIKE '[0-9]' THEN 'int' WHEN RIGHT(StringValue, LEN(StringValue)-1) LIKE'%[^0-9.]%' THEN 'string' WHEN StringValue LIKE'%[0-9][.][0-9]%' THEN 'real' WHEN StringValue LIKE '%[^0-9]%' THEN 'string' ELSE 'int' END--and find the arrays UPDATE @Insertions SET ValueType='array' WHERE Element_ID IN( SELECT candidates.Parent_ID FROM ( SELECT Parent_ID, COUNT(*) AS SameName FROM @Insertions --where they all have the same name (a sure sign) GROUP BY [Name],Parent_ID --no lists in XML HAVING COUNT(*)>1) candidates INNER JOIN @Insertions insertions ON candidates.Parent_ID= insertions.Parent_ID GROUP BY candidates.Parent_ID HAVING COUNT(*)=MIN(SameName))-- INSERT INTO @Hierarchy (Element_ID,SequenceNo, Parent_ID, [Object_ID], [Name], StringValue,ValueType) SELECT Element_ID, SequenceNo, Parent_ID, [Object_ID], [Name], COALESCE(StringValue,''), ValueType FROM @Insertions--and insert them into the hierarchy. RETURN END |
You can, of course, use a routine like this to get XML into a relational format but it is much easier to use the XML VALUE()
method together with the NODE()
method to do this sort of work for you.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 |
SELECT parent_ID, convert(varchar(20),Max(Case when Name='Make' then stringValue else '' end)) as [Make], convert(varchar(20),max(Case when Name='Model' then stringValue else '' end)) as [Model], convert(int,max(Case when Name='Year' then stringValue else '' end)) as [Year], convert(varchar(10),max(Case when Name='Color' then stringValue else '' end)) as [Color], convert(int,replace(max(Case when Name='Mileage' then stringValue else '' end),',','')) as [Mileage] from (SELECT Element_ID, SequenceNo, Parent_ID, [Object_ID], Name, StringValue, Valuetype from dbo.ParseXML( '<?xml version="1.0" encoding="UTF-8"?> <root> <row> <Make>Volkswagen</Make> <Model>Eurovan</Model> <Year>2003</Year> <Color>White</Color> <Mileage>56,830</Mileage> </row> <row> <Make>Honda</Make> <Model>CRV</Model> <Year>2009</Year> <Color>Black</Color> <Mileage>35,600</Mileage> </row> </root>')) x(Element_ID, SequenceNo, Parent_ID, [Object_ID], Name, StringValue, Valuetype) where parent_ID>1 group by Parent_ID |
Which would give:
1 2 3 4 5 6 |
parent_ID Make Model Year Color Mileage ----------- --------------- --------------- ----------- ---------- ----------- 2 Volkswagen Eurovan 2003 White 56830 3 Honda CRV 2009 Black 35600 (2 row(s) affected) |
Conclusions
If you can do so, it is nowadays often much easier for the application developer to deal with XML or JSON than with a SQL Result, particularly if the data required is hierarchical in nature. If the volume of data isn’t great, then it should be possible to deliver this quickly without undue CPU burden on the server, thereby avoiding a separate conversion process within the application. We already do this with XML so why not JSON, or any other format required?
SQL Server isn’t geared up for complex string manipulation, and isn’t therefore an obvious place to engage in this sort of activity. We need native JSON-integration in relational databases just as we already have XML-integration. After all, Sybase and PostgreSQL already have it so why not SQL Server?. Occasionally, the data as seen from the application perspective just isn’t table-based, and it is much easier, and sometimes quicker, to deliver what the application requires in a format it can consume. Whilst we await native JSON-integration, we must do the best we can with the tools that are available
Load comments