Producing JSON Documents from SQL Server queries via TSQL

Although SQL Server supports XML well, XML's little cousin JSON gets no love. This is frustrating now that JSON is in so much demand. Maybe, Phil suggests, it is possible to leverage all that XML, and XPath, goodness in SQL Server to produce JSON in a versatile way from SQL Queries? Yes, it so happens that there are plenty of alternatives.

Articles by Phil Factor about JSON and SQL Server:

  1. Consuming JSON Strings in SQL Server (Nov 2012)
  2. SQL Server JSON to Table and Table to JSON (March 2013)
  3. Producing JSON Documents From SQL Server Queries via TSQL (May 2014)
  4. Consuming hierarchical JSON documents in SQL Server using OpenJSON (Sept 2017)
  5. Importing JSON data from Web Services and Applications into SQL Server(October 2017)

If you need to provide JSON-based results from the database, you are faced with a problem, because SQL Server doesn’t , at the time of writing this,  have native JSON-integration (see the article ‘Consuming hierarchical JSON documents in SQL Server using OpenJSON‘ in the list above for a SQL Server 2016 solution). You would normally  craft the SQL to do this ‘by hand’ for the specific job, using one of a variety of techniques. However, what if the developers needed to have large variety of results, maybe even from SQL created by the application rather than the database? That’s cool, though there will be a performance hit to using a generic solution that will produce JSON from any SQL. Until SQL Server provides ‘native’ JSON support it will be rather slow, but not difficult.

In this article, I’ll be introducing a few ideas about providing a generic way to produce JSON and other types of  data documents, from SQL.

The simplest strategy to provide a general way to output JSON is to convert your SQL result to XML by using the FOR XML syntax of the SELECT statement.  Then you parse the XML, which has obligingly converted all the values into string form, gives you the names, their position in the hierarchy and even the DataType. When I say ‘simple’ I mean simpler than any other alternative. The advantage of using XML is that you can make use of the versatility of the WITH XML PATH syntax, or the XPath SQL Server extensions, to specify the hierarchy. If you have a generic way to convert  from any XML document, whether derived from SQL or not, to JSON, then it becomes even more useful!

If you’re not too concerned with performance, you can experiment with some functions I’ve published over the years for dealing with data documents. For handling the type of hierarchical information that is transferred via JSON, CSV or XML, I’ve published a variety of stored procedures and functions that all use a rather crude hierarchy table that is sufficient for the purpose. There is already a function that produces a JSON document from such a table. All we have to do is to add a new Table-Valued function that produces a hierarchy from XML. We then just create the SQL we want, get it to create XML and the rest is simple.

Back to AdventureWorks 2008, and here is an example of its use.

Well, that’s it. You just give the function consisting of any working SQL Query, adding the FOR XML PATH, ROOT  to turn it into XML  and you’ll get JSON, reflecting the structures you specify in the FOR XML query. As in much of life, there is some detail to attend to, which we’ll go into later.  Firstly, this is fine for the small stuff, but this requires a lot of string manipulation, and SQL isn’t designed for doing that. If you just want to churn out JSON, we’ll also show you a ‘quick’ version that will dish out  20,000 rows of ‘flattened’ JSON  in around ten to fifteen seconds. The limitations are that you lose any attributes, and you have to use the simple default ‘root’ and ‘row’ structure of the XML, but none of that will matter for the common jobs since you don’t need attributes, or nesting, and if you do simple queries with FOR XML PATH, ROOT you don’t get ’em.  So here we go with an example,

And there you have your JSON ‘document’.

JSON representation of tabular results.

Although there is no ANSI SQL way of representing results from SQL Expressions as JSON, SELECT statements usually produce results that are represented by ‘flattened’  or ‘raw’ JSON lists, like this

I just did the top twenty records just so we could see them in the article, but this should be good  for 20,000 in a batch.

But  this is not the only way of doing it. We can take the opportunity of JSON to return a nested hierarchy of data

Or even using a similar device to CSV

Once they are suitably compressed, the obvious space-savings tend to vanish so the formats are down to personal preference. Also, the ‘flattened’ format can be formatted with indents to make it easier to read. 

Getting your JSON out fast.

Because SQL produces, from the ‘document’ perspective,  flat rows of values, we can cut some corners to get JSON out fast. Firstly, we’ll insist that the root is called ‘root’ and each row is called ‘row’.  All we have to worry about now is to render every row in JSON format, with the right separators and brackets to represent a JSON list of  objects, which consist of name/Value pairs.

Here is a simple version

Which would give you this

Hmm. This seems OK. I’ve tried several methods but this seems to work best. I don’t profess any expertise in SQL XML querying so please let me know if there is a faster way of doing this!

We now wrap it into a function and take care of properly escaping various whitespace characters

You can alter the routine to give you different formats. Here is a bare-bones version that gives you a rather similar document to the old CSV format that I’ve described earlier in the article.

More Complex JSON

Sometimes, you don’t want flattened JSON.  Sometimes you have attributes, for example. You might have a more complex hierarchy.  The FOR XML PATH syntax can produce truly startling XML. Just to test our routine out though, we’ll  first take a classic XML example from the JSON documentation.

Which would give you

What we’ve done here is to convert the result into an intermediary SQL Table which is defined by a hierarchy type. Markup languages such as JSON and XML all represent object data as hierarchies.  We then pass it to a separate function, ToJSON, to render it as JSON.

The first trick is to represent it as a Adjacency list hierarchy in a table. This Adjacency list is really the Database equivalent of any of the nested data structures that are used for the interchange of serialized information with the application, and can be used to create XML, CSV, OSX Property lists, Python nested structures or YAML as easily as JSON.

Adjacency list tables have the same structure whatever the data in them. This means that you can define a single Table-Valued Type and pass data structures around between stored procedures. However, they are best held at arms-length from the data, since they are not relational tables, but something more like the dreaded EAV (Entity-Attribute-Value) tables. Converting the data from its Hierarchical table form will be different for each application, but is easy with a CTE.

It is unlikely that we’d have to produce JSON from raw XML as we did in the first example, but you might be faced by a SQL query like this, embedded in this SQL harness

… which gives this:

The code to do all this is attached to this article, but it is worth going through the ParseXML function that takes an XML fragment or document and creates a SQL Hierarchy table from it. It can be used to inspect the contents of XML documents, of course, but it  is primarily for interchange and for getting the contents of an XML table when you don’t know the structure up-front well-enough to use XPath, or if, like me, your brain isn’t quite big enough to understand the full expanse of XPath.

Basically, the routine starts by saving the base nodes (the root node if it is there) in a temporary table along with their name and then repeatedly extracting any element and attribute associated with it, and all its nested nodes until it has done them all. It then works out what sort of DataType each one is. It does all this simply by exploring the XML Structure without prior knowledge, using XPath  wildcard syntax, and then using an XPath function ‘local-name()’ to determine the name of each node. It uses ‘text()[1]” to get the value, if any, associated with the node and also finds any attributes associated with the node. It does not make any difference between an attribute or element, since no other document format seems to care. Although I got a lot of inspiration for this routine from XMLTable routine of  the great Jacob Sebastian  (whose book on XSD should be on your bedside table) I didn’t follow the temptation to use a recursive CTE. No sir, iteration is good enough for me, and a lot quicker, it turns out! Jacob was the pioneer, and still wins the elegance prize for  XMLTable, though.

Now here is the code for the function that parses XML to fill a hierarchy table

You can, of course, use a routine like this to get XML into a relational format but it is much easier to use the XML VALUE() method  together with the NODE() method to do this sort of work for you.

Which would give:

Conclusions

If you can do so, it is nowadays often much easier for the application developer to deal with XML or JSON than with a SQL Result, particularly if the data required is hierarchical in nature.  If the volume of data isn’t great, then it should be possible to deliver this quickly without undue CPU burden on the server, thereby avoiding a separate conversion process within the application.  We already do this with XML so why not JSON, or any other format  required?

SQL Server isn’t geared up for complex string manipulation, and isn’t therefore an obvious place to engage in this sort of activity. We need native JSON-integration in relational databases just as we already have XML-integration. After all, Sybase and PostgreSQL already have it so why not SQL Server?. Occasionally, the data  as seen from the application perspective just isn’t table-based, and it is much easier, and sometimes quicker,  to deliver what the application requires in a format it can consume.  Whilst we await native JSON-integration, we must do the best we can with the tools that are available

Further Reading