{"id":88974,"date":"2020-11-04T16:56:58","date_gmt":"2020-11-04T16:56:58","guid":{"rendered":"https:\/\/www.red-gate.com\/simple-talk\/?p=88974"},"modified":"2024-01-17T19:45:04","modified_gmt":"2024-01-17T19:45:04","slug":"unwrapping-json-to-sql-server-tables","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/blogs\/unwrapping-json-to-sql-server-tables\/","title":{"rendered":"Unwrapping JSON to SQL Server Tables"},"content":{"rendered":"<p>If you know the structure and contents of a JSON document, then it is possible to turn this into one or more relational tables, but even then I dare to you claim that it is easy to tap in a good OpenJSON SELECT statement to do it. If you don\u2019t know what\u2019s in that JSON file, then you\u2019re faced with sweating over a text editor trying to work it all out. You long to just get the contents into a relational table and take it on from there. Even then, You\u2019ve got several struggles before that table appears in the result pane. You must get the path to the tabular data correct, you have to work out the SQL Datatypes, and you need to list the full panoply of keys. Let\u2019s face it: it is a chore. Hopefully, all that is in the past with these helper functions.<\/p>\n<p>What about being able to do this, for example \u2026<\/p>\n<pre class=\"lang:tsql decode:true\">EXECUTE TablesFromJSON @TheJSON= N'[\r\n    {\"name\":\"Phil\", \"email\":\"PhilipFactor@geeMail.com\"},\r\n    {\"name\":\"Geoff\", \"email\":\"Geoff2435@geeMail.com\"},\r\n    {\"name\":\"Mo\", \"email\":\"MoHussain34@geeMail.com\"},\r\n    {\"name\":\"Karen\", \"email\":\"KarenAlott34@geeMail.com\"},\r\n    {\"name\":\"Bob\", \"email\":\"bob32@geeMail.com\"}\r\n]'<\/pre>\n<p>\u2026and getting this?<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone  wp-image-101269\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2024\/01\/PhilFactor_MissingPicture1.png\" alt=\"\" width=\"236\" height=\"126\" \/><\/p>\n<p>Or if there is more than one table somewhere there\u2026<\/p>\n<pre class=\"lang:tsql decode:true \">  DECLARE @json NVARCHAR(MAX) =\r\n  N'{\r\n\t\"id\": \"0001\",\r\n\t\"type\": \"donut\",\r\n\t\"name\": \"Cake\",\r\n\t\"ppu\": 0.55,\r\n\t\"batters\":\r\n\t\t{\r\n\t\t\t\"batter\":\r\n\t\t\t\t[\r\n\t\t\t\t\t{ \"id\": \"1001\", \"type\": \"Regular\" },\r\n\t\t\t\t\t{ \"id\": \"1002\", \"type\": \"Chocolate\" },\r\n\t\t\t\t\t{ \"id\": \"1003\", \"type\": \"Blueberry\" },\r\n\t\t\t\t\t{ \"id\": \"1004\", \"type\": \"Devil''s Food\" }\r\n\t\t\t\t]\r\n\t\t},\r\n\t\"topping\":\r\n\t\t[\r\n\t\t\t{ \"id\": \"5001\", \"type\": \"None\" },\r\n\t\t\t{ \"id\": \"5002\", \"type\": \"Glazed\" },\r\n\t\t\t{ \"id\": \"5005\", \"type\": \"Sugar\" },\r\n\t\t\t{ \"id\": \"5007\", \"type\": \"Powdered Sugar\" },\r\n\t\t\t{ \"id\": \"5006\", \"type\": \"Chocolate with Sprinkles\" },\r\n\t\t\t{ \"id\": \"5003\", \"type\": \"Chocolate\" },\r\n\t\t\t{ \"id\": \"5004\", \"type\": \"Maple\" }\r\n\t\t]\r\n}';\r\nEXECUTE TablesFromJson @JSON\r\n<\/pre>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone  wp-image-101265\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2024\/01\/PhilFactor_MissingPicture2.png\" alt=\"\" width=\"263\" height=\"361\" \/><\/p>\n<p>You may not want table results straight out of your JSON: I\u2019m really just showing off, but I\u2019m going to describe some routines that are useful to me for dealing with JSON import. Your requirements may be more subtle.<\/p>\n<p>The first stage is to get a good representation of a json document so you can work on it in SQL. I do this with a multi-statement table-valued function, but you\u2019ll see that it is a mostly pretty simple unwrapping of the json document. Built into it is a way of gauging the most appropriate SQL Datatype for each value. Unless you adopt JSON Schema, there is no onus on being consistent in assigning values to keys, so you have to test every simple value (i.e. everything other than arrays or objects)<\/p>\n<pre class=\"lang:tsql decode:true\">CREATE OR alter FUNCTION [dbo].[UnwrapJson]\r\n\/**\r\nsummary:   &gt;\r\n  This multi-statement table-valued function talkes a JSON string and\r\n  unwraps it into a relational hierarchy table that also retains\r\n  the path to each element in the JSON document, and calculates the\r\n  best-fit sql datatype fpr every simple value\r\nAuthor: Phil Factor\r\nRevision: 1.0\r\ndate: 1 Nov 2020\r\nexample:\r\n  - SELECT * FROM UnwrapJson (N'[  \r\n    {\"name\":\"Phil\", \"email\":\"PhilipFactor@gmail.com\"},  \r\n    {\"name\":\"Bob\", \"email\":\"bob32@gmail.com\"}  \r\n    ]')\r\nreturns:   &gt;\r\n  id, level, [key], Value, type, SQLDatatype, parent, path\r\n \r\n**\/    \r\n(\r\n    @JSON NVARCHAR(MAX)\r\n)\r\nRETURNS @Unwrapped TABLE \r\n  (\r\n  [id] INT IDENTITY, --just used to get a unique reference to each json item\r\n  [level] INT, --the hierarchy level\r\n  [key] NVARCHAR(100), --the key or name of the item\r\n  [Value] NVARCHAR(MAX),--the value, if it is a null, int,binary,numeric or string\r\n  type INT, --0 TO 5, the JSON type, null, numeric, string, binary, array or object\r\n  SQLDatatype sysname, --whatever the datatype can be parsed to\r\n  parent INT, --the ID of the parent\r\n  [path] NVARCHAR(4000) --the path as used by OpenJSON\r\n  )\r\nAS begin\r\nINSERT INTO @Unwrapped ([level], [key], Value, type, SQLDatatype, parent,\r\n[path])\r\nVALUES\r\n  (0, --the level\r\n   NULL, --the key,\r\n   @json, --the value,\r\n   CASE WHEN Left(ltrim(@json),1)='[' THEN 4 ELSE 5 END, --the type\r\n   'json', --SQLDataType,\r\n   0 , --no parent\r\n   '$' --base path\r\n  );\r\nDECLARE @ii INT = 0,--the level\r\n@Rowcount INT = -1; --the number of rows from the previous iteration\r\nWHILE @Rowcount &lt;&gt; 0 --while we are still finding levels\r\n  BEGIN\r\n    INSERT INTO @Unwrapped ([level], [key], Value, type, SQLDatatype, parent,\r\n    [path])\r\n      SELECT [level] + 1 AS [level], new.[Key] AS [key],\r\n        new.[Value] AS [value], new.[Type] AS [type],\r\n-- SQL Prompt formatting off\r\n\/* in order to determine the datatype of a json value, the best approach is to a determine\r\nthe datatype that can be parsed. It JSON, an array of objects can contain attributes that arent\r\nconsistent either in their name or value. *\/\r\n       CASE \r\n        WHEN new.Type = 0 THEN 'bit null'\r\n\t\tWHEN new.[type] IN (1,2)  then COALESCE(\r\n  \t\t  CASE WHEN TRY_CONVERT(INT,new.[value]) IS NOT NULL THEN 'int' END, \r\n  \t\t  CASE WHEN TRY_CONVERT(NUMERIC(14,4),new.[value]) IS NOT NULL THEN 'numeric' END,\r\n  \t\t  CASE WHEN TRY_CONVERT(FLOAT,new.[value]) IS NOT NULL THEN 'float' END,\r\n\t\t  CASE WHEN TRY_CONVERT(MONEY,new.[value]) IS NOT NULL THEN 'money' END,\r\n  \t\t  CASE WHEN TRY_CONVERT(DateTime,new.[value],126) IS NOT NULL THEN 'Datetime2' END,\r\n\t\t  CASE WHEN TRY_CONVERT(Datetime,new.[value],127) IS NOT NULL THEN 'Datetime2' END,\r\n\t\t  'nvarchar')\r\n\t   WHEN new.Type = 3 THEN 'bit'\r\n\t   WHEN new.Type = 5 THEN 'object' ELSE 'array' END AS SQLDatatype,\r\n        old.[id],\r\n        old.[path] + CASE WHEN old.type = 5 THEN '.' + new.[Key] \r\n\t\t\t\t\t   ELSE '[' + new.[Key] COLLATE DATABASE_DEFAULT + ']' END AS path\r\n-- SQL Prompt formatting on\r\n      FROM @Unwrapped old\r\n        CROSS APPLY OpenJson(old.[Value]) new\r\n          WHERE old.[level] = @ii AND old.type IN (4, 5);\r\n    SELECT @Rowcount = @@RowCount;\r\n    SELECT @ii = @ii + 1;\r\n  END;\r\n  return\r\nEND\r\ngo\r\n<\/pre>\n<p>Now, you have the document as a table. You can now unpick this in several ways, but we want to get all the tables that are embedded in the json. These are generally represented as arrays of objects, each of these objects representing a \u2018document\u2019 or \u2018row\u2019. In JSON, you can, of course, have an object or array as a value for one of the keys within the object; equivalent to storing xml or json in a relational column- but we\u2019re after the simple columns.<\/p>\n<p>Because we are looking for the tables, we can get a quick listing of them. (in reality, you\u2019d probably only want a sample of each if you\u2019re just browsing)<\/p>\n<p>First we place the output of dbo.unwrapJSON into a temporary table. I\u2019ve chosen #jsonObject. Now we can see the rows, if there are any table structures within the JSON.<\/p>\n<pre class=\"lang:tsql decode:true \">SELECT parent.path AS [TheTable] ,grandchild.*\r\n      FROM\r\n        (SELECT path, id FROM #jsonObject WHERE type = 4) Parent --start with an array\r\n        INNER JOIN #jsonObject Child\r\n          ON Child.parent = Parent.id AND child.type IN (4, 5) --either array or object\r\n        INNER JOIN #jsonObject GrandChild\r\n          ON GrandChild.parent = Child.id AND GrandChild.type NOT IN (4, 5)\r\n<\/pre>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-101266\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2024\/01\/PhilFactor_MissingPicture3.png\" alt=\"\" width=\"992\" height=\"647\" \/><\/p>\n<p>Well, that\u2019s fine as far as it goes, but it doesn\u2019t go far enough. What I want is the OpenJSON query that I can execute to get the actual result.<\/p>\n<p>Here is an inline table function that does just that, using the expression I\u2019ve just shown you.<\/p>\n<pre class=\"lang:tsql decode:true \">CREATE OR alter FUNCTION [dbo].[OpenJSONExpressions]\r\n\/**\r\nsummary:   &gt;\r\n  This inline table-valued function talkes a JSON string and\r\n  locates every table structure. Then it creates an OpenJSON\r\n  Statement that can then be executed to create that table\r\n  from the original JSON.\r\nAuthor: Phil Factor\r\nRevision: 1.0\r\ndate: 1 Nov 2020\r\nexample:\r\n  - SELECT * FROM OpenJSONExpressions (N'[  \r\n    {\"name\":\"Phil\", \"email\":\"PhilipFactor@gmail.com\"},  \r\n    {\"name\":\"Bob\", \"email\":\"bob32@gmail.com\"}  \r\n    ]')\r\nreturns:   &gt;\r\n  expression\r\n \r\n**\/    \r\n(\r\n   @JSON NVARCHAR(MAX)\r\n    \r\n)\r\nRETURNS TABLE AS RETURN\r\n(\r\nWITH UnwrappedJSON (id, [level], [key], [Value], [type], SQLDatatype, parent,\r\n                   [path]\r\n                   )\r\nAS (SELECT id, [level], [key], [Value], [type], SQLDatatype, parent, [path]\r\n      FROM dbo.UnwrapJson(@json) )\r\n  SELECT 'Select * from openjson(@json,''' + path + ''')\r\nWITH ('  + String_Agg(\r\n                       [name] + ' ' + datatype + ' ' --the WITH statement\r\n-- SQL Prompt formatting off\r\n   + case when datatype='nvarchar' then '('+length+')' \r\n     WHEN datatype='numeric' then  '(14,4)' ELSE '' end,', ')\r\n   WITHIN GROUP ( ORDER BY  TheOrder  ASC  )    +')' as expression\r\n-- SQL Prompt formatting on\r\n    FROM\r\n      (\r\n      SELECT Parent.path, GrandChild.[key] AS [name], Min(GrandChild.id) AS TheOrder,\r\n\t    Max(GrandChild.SQLDatatype) AS datatype,\r\n        Convert(NVARCHAR(100), Max(Len(GrandChild.Value))) AS length\r\n        FROM\r\n          (SELECT path, id FROM UnwrappedJSON WHERE type = 4) Parent\r\n          INNER JOIN UnwrappedJSON Child\r\n            ON Child.parent = Parent.id AND child.type IN (4, 5)\r\n          INNER JOIN UnwrappedJSON GrandChild\r\n            ON GrandChild.parent = Child.id AND GrandChild.type NOT IN (4, 5)\r\n        GROUP BY Parent.path, GrandChild.[key]\r\n      ) TheFields\r\n    GROUP BY path\r\n\t)\r\nGO\r\n\r\n<\/pre>\n<p>So we try it out with some JSON that has two table in it.<\/p>\n<pre class=\"lang:tsql decode:true\">SELECT * FROM OpenJSONExpressions (\r\n  N'{\"employees\":[  \r\n    {\"name\":\"Mo\", \"email\":\"mojaiswal@gmail.com\", \"StartDate\":\"2012-11-07T18:26:20\"},  \r\n    {\"name\":\"Bob\", \"email\":\"bob456@gmail.com\", \"StartDate\":\"2015-06-20\"},  \r\n\t{\"name\":\"Phil\", \"email\":\"PhilipFactor@gmail.com\", \"StartDate\":\"2015-08-01T08:05:20\"},\r\n    {\"name\":\"Susan\", \"email\":\"Su87@me.com\", \"StartDate\":\"2012-13-07\"}  \r\n],\"Customers\":[  \r\n    {\"name\":\"The Kamakaze Laxative Company\", \"contact\":\"Karen\", \"email\":\"Enquiries@KLCEnterprises.co.uk\", \"CustomerSince\":\"2012-11-07\"},  \r\n    {\"name\":\"GreenFence Softwear limited\", \"contact\":\"Dick\", \"email\":\"Bob@GreenFenceSoftwear.com\", \"CustomerSince\":\"2015-06-20\"},  \r\n\t{\"name\":\"Grimm and Grivas, Solicitors\",  \"contact\":\"Jaz\", \"email\":\"GrimmGrivas@gmail.com\", \"CustomerSince\":\"2015-08-01T08:05:20\"}\r\n]\r\n}  ')\r\n<\/pre>\n<p>From this we get the result \u2026<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-101267\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2024\/01\/PhilFactor_MissingPicture4.png\" alt=\"\" width=\"1305\" height=\"135\" \/><\/p>\n<p>Which are the following queries\u2026<\/p>\n<pre class=\"lang:tsql decode:true \">Select * from openjson(@json,'$.Customers')\r\nWITH (name nvarchar (29), contact nvarchar (5), email nvarchar (30), CustomerSince Datetime2 )\r\n\r\nSelect * from openjson(@json,'$.employees')\r\nWITH (name nvarchar (5), email nvarchar (22), StartDate nvarchar (19))\r\n<\/pre>\n<p>Well, those look suspiciously-executable; so we\u2019ll do just that, in the following procedure<\/p>\n<pre class=\"lang:tsql decode:true\">go\r\nCREATE PROCEDURE TablesFromJSON @TheJSON NVARCHAR(MAX)\r\n\/**\r\nsummary:   &gt;\r\n  This procedure returns a table for every one found  in a JSON \r\n  string \r\nAuthor: Phil Factor\r\nRevision: 1.0\r\ndate: 1 Nov 2020\r\nexample:\r\n  - EXECUTE TablesFromJSON @TheJSON= N'[  \r\n    {\"name\":\"Phil\", \"email\":\"PhilipFactor@geeMail.com\"},  \r\n    {\"name\":\"Geoff\", \"email\":\"Geoff2435@geeMail.com\"},\r\n    {\"name\":\"Mo\", \"email\":\"MoHussain34@geeMail.com\"},\r\n    {\"name\":\"Karen\", \"email\":\"KarenAlott34@geeMail.com\"},\r\n\t{\"name\":\"Bob\", \"email\":\"bob32@geeMail.com\"}   \r\n    ]'\r\nreturns:   &gt;\r\n  expression\r\n \r\n**\/    \r\nAS\r\nDECLARE @expressions TABLE (id INT IDENTITY, TheExpression NVARCHAR(MAX));\r\nINSERT INTO @expressions (TheExpression)\r\n  SELECT expression FROM OpenJSONExpressions(@TheJSON);\r\nDECLARE @RowCount INT = -1, @ii INT = 1, @expressionToExcecute NVARCHAR(MAX);\r\nWHILE @RowCount &lt;&gt; 0\r\n  BEGIN\r\n    SELECT @expressionToExcecute = TheExpression FROM @expressions WHERE id = @ii;\r\n    SELECT @RowCount = @@RowCount;\r\n    SELECT @ii = @ii + 1;\r\n    IF @RowCount &gt; 0\r\n      EXECUTE sp_executesql @expressionToExcecute, N'@JSON NVARCHAR(MAX)',\r\n        @JSON = @TheJSON;\r\n  END;\r\nGO \r\nSo we try it out\u2026\r\nEXECUTE TablesFromJSON  \r\n  N'{\"employees\":[  \r\n    {\"name\":\"Mo\", \"email\":\"mojaiswal@gmail.com\", \"StartDate\":\"2012-11-07T18:26:20\"},  \r\n    {\"name\":\"Bob\", \"email\":\"bob456@gmail.com\", \"StartDate\":\"2015-06-20\"},  \r\n\t{\"name\":\"Phil\", \"email\":\"PhilipFactor@gmail.com\", \"StartDate\":\"2015-08-01T08:05:20\"},\r\n    {\"name\":\"Susan\", \"email\":\"Su87@me.com\", \"StartDate\":\"2012-13-07\"}  \r\n],\"Customers\":[  \r\n    {\"name\":\"The Kamakaze Laxative Company\", \"contact\":\"Karen\", \"email\":\"Enquiries@KLCEnterprises.co.uk\", \"CustomerSince\":\"2012-11-07\"},  \r\n    {\"name\":\"GreenFence Softwear limited\", \"contact\":\"Dick\", \"email\":\"Bob@GreenFenceSoftwear.com\", \"CustomerSince\":\"2015-06-20\"},  \r\n\t{\"name\":\"Grimm and Grivas, Solicitors\",  \"contact\":\"Jaz\", \"email\":\"GrimmGrivas@gmail.com\", \"CustomerSince\":\"2015-08-01T08:05:20\"}\r\n]\r\n}  '\r\n<\/pre>\n<p>And voila! Two results.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone  wp-image-101268\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2024\/01\/PhilFactor_MissingPicture5.png\" alt=\"\" width=\"610\" height=\"238\" \/><\/p>\n<p>So there we have it. I have to admit that the TablesFromJSON procedure isn\u2019t quite so practical as I\u2019d like because it is impossible to get more than one result from a stored procedure within SQL (no problem from an application, of course). It turned out to be very useful in testing all the code out, though.<\/p>\n<p>I use all three routines. I hope that they\u2019re useful to you. If you spot a bug, then let me know in the comments.<\/p>\n<p>These are stored with my JSON\/SQL Server routines here <a href=\"https:\/\/github.com\/Phil-Factor\"><strong>Phil-Factor<\/strong><\/a><strong>\/<a href=\"https:\/\/github.com\/Phil-Factor\/JSONSQLServerRoutines\">JSONSQLServerRoutines<\/a><\/strong><\/p>\n<ul>\n<li><a href=\"https:\/\/github.com\/Phil-Factor\/JSONSQLServerRoutines\/blob\/master\/UnwrapJSON.sql\">UnwrapJSON.sql<\/a><\/li>\n<li><a href=\"https:\/\/github.com\/Phil-Factor\/JSONSQLServerRoutines\/blob\/master\/TablesFromJSON.sql\">TablesFromJSON.sql<\/a><\/li>\n<li><a href=\"https:\/\/github.com\/Phil-Factor\/JSONSQLServerRoutines\/blob\/master\/OpenJSONExpressions.sql\">OpenJSONExpressions.sql<\/a><\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>If you know the structure and contents of a JSON document, then it is possible to turn this into one or more relational tables, but even then I dare to you claim that it is easy to tap in a good OpenJSON SELECT statement to do it. If you don\u2019t know what\u2019s in that JSON&#8230;&hellip;<\/p>\n","protected":false},"author":154613,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[2],"tags":[],"coauthors":[6813],"class_list":["post-88974","post","type-post","status-publish","format-standard","hentry","category-blogs"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/88974","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/154613"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=88974"}],"version-history":[{"count":4,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/88974\/revisions"}],"predecessor-version":[{"id":101264,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/88974\/revisions\/101264"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=88974"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=88974"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=88974"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=88974"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}