{"id":78695,"date":"2018-05-09T16:29:51","date_gmt":"2018-05-09T16:29:51","guid":{"rendered":"https:\/\/www.red-gate.com\/simple-talk\/?p=78695"},"modified":"2021-09-29T16:21:02","modified_gmt":"2021-09-29T16:21:02","slug":"importing-json-collections-sql-server","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/databases\/sql-server\/t-sql-programming-sql-server\/importing-json-collections-sql-server\/","title":{"rendered":"Importing JSON Collections into SQL Server"},"content":{"rendered":"<p>It is fairly easy to import JSON collections of documents into SQL Server if there is an underlying \u2018explicit\u2019 table schema available to them. If each of the documents has a different schema, then you have little chance. Fortunately, schema-less data collections are rare.<\/p>\n<p>In this article we\u2019ll start simply and work through a couple of simple examples before creating a SQL Server database schema with ten tables, constraints and keys. Once those are in place, we\u2019ll import a single JSON document, filling the ten tables with the data from 70,000 fake records.<\/p>\n<p>Let&#8217;s start gently, putting simple collections into strings that we will insert into a table. We\u2019ll then try slightly trickier JSON documents with embedded arrays and so on. We&#8217;ll start with the example of sheep-counting words, collected from many different parts of Great Britain and Brittany. The simple aim is to put them into a table. 
I use sheep-counting words not because they are of general importance, but because they can represent whatever data you are trying to import.<\/p>\n<p>You will need access to SQL Server 2016 or later, or Azure SQL Database or Azure SQL Data Warehouse, to play along, and you can download the data and code from <a href=\"https:\/\/github.com\/Phil-Factor\/JsonUpsertToSQLServer\">GitHub<\/a>.<\/p>\n<h2>Converting Simple JSON Arrays of Objects to Table-sources<\/h2>\n<p>We will start off by creating a simple table that we want to import into.<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">  DROP TABLE IF EXISTS SheepCountingWords \r\n  CREATE TABLE SheepCountingWords\r\n    (\r\n    Number INT NOT NULL,\r\n    Word VARCHAR(40) NOT NULL,\r\n    Region VARCHAR(40) NOT NULL,\r\n    CONSTRAINT NumberRegionKey PRIMARY KEY  (Number,Region)\r\n    );\r\n  GO<\/pre>\n<p>We then choose a simple JSON format.<\/p>\n<pre class=\"lang:c# theme:vs2012\">  [{\r\n    \"number\": 11,  \"word\": \"Yan-a-dik\"\r\n  }, {\r\n    \"number\": 12,  \"word\": \"Tan-a-dik\"\r\n  }, {\r\n    \"number\": 13,  \"word\": \"Tethera-dik\"\r\n  }, {\r\n    \"number\": 14,  \"word\": \"Pethera-dik\"\r\n  }, {\r\n    \"number\": 15,  \"word\": \"Bumfit\"\r\n  }, {\r\n    \"number\": 16,  \"word\": \"Yan-a-bumtit\"\r\n  }, {\r\n    \"number\": 17,  \"word\": \"Tan-a-bumfit\"\r\n  }, {\r\n    \"number\": 18,  \"word\": \"Tethera-bumfit\"\r\n  }, {\r\n    \"number\": 19,  \"word\": \"Pethera-bumfit\"\r\n  }, {\r\n    \"number\": 20,  \"word\": \"Figgot\"\r\n  }] <\/pre>\n<p>We can very easily use OpenJSON to create a table-source that reflects the contents.<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">  SELECT  Number, Word\r\n      FROM\r\n      OpenJson('[{\r\n    \"number\": 11,  \"word\": \"Yan-a-dik\"\r\n  }, {\r\n    \"number\": 12,  \"word\": \"Tan-a-dik\"\r\n  }, {\r\n    \"number\": 13,  \"word\": \"Tethera-dik\"\r\n  }, {\r\n    \"number\": 14,  \"word\": 
\"Pethera-dik\"\r\n  }, {\r\n    \"number\": 15,  \"word\": \"Bumfit\"\r\n  }, {\r\n    \"number\": 16,  \"word\": \"Yan-a-bumtit\"\r\n  }, {\r\n    \"number\": 17,  \"word\": \"Tan-a-bumfit\"\r\n  }, {\r\n    \"number\": 18,  \"word\": \"Tethera-bumfit\"\r\n  }, {\r\n    \"number\": 19,  \"word\": \"Pethera-bumfit\"\r\n  }, {\r\n    \"number\": 20,  \"word\": \"Figgot\"\r\n  }] '\r\n  )WITH (Number INT '$.number', Word VARCHAR(30) '$.word')<\/pre>\n<p>Once you have a table source, the quickest way to insert JSON into a table will always be the straight insert, even after an existence check. It is a good practice to make the process idempotent by only inserting the records that don&#8217;t already exist. I&#8217;ll use the MERGE statement just to keep things simple, though the left outer join with a null check is faster. The MERGE is often more convenient because it will accept a table-source such as a result from the OpenJSON function. We\u2019ll create a <em>temporary<\/em> procedure to insert the JSON data into the table.<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">  DROP PROCEDURE IF EXISTS #MergeJSONwithCountingTable;\r\n   GO\r\n  CREATE PROCEDURE #MergeJSONwithCountingTable @json NVARCHAR(MAX),\r\n    @source NVARCHAR(MAX)\r\n  \/**\r\n  Summary: &gt;\r\n    This inserts, or updates, into a table (dbo.SheepCountingWords) a JSON string consisting \r\n    of sheep-counting words for numbers between one and twenty used traditionally by sheep\r\n    farmers in Gt Britain and Brittany. 
it allows records to be inserted or updated in any\r\n    order or quantity.\r\n    \r\n  Author: PhilFactor\r\n  Date: 20\/04\/2018\r\n  Database: CountingSheep\r\n  Examples:\r\n     - EXECUTE #MergeJSONwithCountingTable @json=@OneToTen, @Source='Lincolnshire'\r\n     - EXECUTE #MergeJSONwithCountingTable @Source='Lincolnshire', @json='[{\r\n       \"number\": 11, \"word\": \"Yan-a-dik\"}, {\"number\": 12, \"word\": \"Tan-a-dik\"}]'\r\n  Returns: &gt;\r\n    nothing\r\n  **\/\r\n  AS\r\n  MERGE dbo.SheepCountingWords AS target\r\n  USING\r\n    (\r\n    SELECT DISTINCT Number, Word, @source\r\n      FROM\r\n      OpenJson(@json)\r\n      WITH (Number INT '$.number', Word VARCHAR(20) '$.word')\r\n    ) AS source (Number, Word, Region)\r\n  ON target.Number = source.Number AND target.Region = source.Region\r\n  WHEN MATCHED AND (source.Word &lt;&gt; target.Word) THEN\r\n    UPDATE SET target.Word = source.Word\r\n  WHEN NOT MATCHED THEN \r\n    INSERT (Number, Word, Region)\r\n      VALUES\r\n        (source.Number, source.Word, source.Region);\r\n  GO<\/pre>\n<p>Now we try it out. 
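<\/p>\n<p>As an aside, the faster left-outer-join alternative mentioned above would look something like this sketch. It reuses the same table and parameters as the procedure but, unlike the MERGE, it only inserts: updating changed words would need a separate statement.<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">  --sketch only: insert just the rows that do not already exist\r\n  INSERT INTO dbo.SheepCountingWords (Number, Word, Region)\r\n    SELECT DISTINCT src.Number, src.Word, @source\r\n      FROM OpenJson(@json)\r\n      WITH (Number INT '$.number', Word VARCHAR(40) '$.word') AS src\r\n        LEFT OUTER JOIN dbo.SheepCountingWords AS existing\r\n          ON existing.Number = src.Number AND existing.Region = @source\r\n      WHERE existing.Number IS NULL;<\/pre>\n<p>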
Let&#8217;s assemble a couple of simple JSON strings from a table-source.<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">  DECLARE @oneToTen NVARCHAR(MAX) =\r\n  \t(\r\n  \tSELECT LincolnshireCounting.number, LincolnshireCounting.word\r\n  \tFROM\r\n  \t\t(\r\n  \t\tVALUES (1, 'Yan'), (2, 'Tan'), (3, 'Tethera'), (4, 'Pethera'),\r\n  \t\t(5, 'Pimp'), (6, 'Sethera'), (7, 'Lethera'), (8, 'Hovera'),\r\n  \t\t(9, 'Covera'), (10, 'Dik')\r\n  \t\t) AS LincolnshireCounting (number, word)\r\n  \tFOR JSON AUTO\r\n  \t)\r\n  DECLARE @ElevenToTwenty NVARCHAR(MAX) =\r\n      (\r\n      SELECT LincolnshireCounting.number, LincolnshireCounting.word\r\n      FROM\r\n  \t\t(\r\n  \t\tVALUES (11, 'Yan-a-dik'), (12, 'Tan-a-dik'), (13, 'Tethera-dik'),\r\n  \t\t(14, 'Pethera-dik'), (15, 'Bumfit'), (16, 'Yan-a-bumtit'),\r\n  \t\t(17, 'Tan-a-bumfit'), (18, 'Tethera-bumfit'),\r\n  \t\t(19, 'Pethera-bumfit'), (20, 'Figgot')\r\n  \t\t) AS LincolnshireCounting (number, word)\r\n      FOR JSON AUTO\r\n      )<\/pre>\n<p>Now we can <strong>EXECUTE<\/strong> the procedure to store the Sheep-Counting Words in the table<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">  EXECUTE #MergeJSONwithCountingTable @json=@ElevenToTwenty, @Source='Lincolnshire'\r\n  EXECUTE #MergeJSONwithCountingTable @json=@OneToTen, @Source='Lincolnshire'\r\n  --and make sure that we are protected against duplicate inserts\r\n  EXECUTE #MergeJSONwithCountingTable @Source='Lincolnshire', @json='[{\r\n    \"number\": 11, \"word\": \"Yan-a-dik\"}, {\"number\": 12, \"word\": \"Tan-a-dik\"}]'<\/pre>\n<p>Check to see that they were imported correctly by running this query:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">  SELECT * FROM SheepCountingWords<\/pre>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"235\" height=\"415\" class=\"wp-image-78696\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/05\/word-image.png\" \/><\/p>\n<h2>Converting to Table-source JSON 
Arrays of Objects that have Embedded Arrays<\/h2>\n<p>What if you want to import the sheep-counting words from several regions? So far, what we\u2019ve been doing is fine for a collection that models a single table. However, real life isn&#8217;t like that. Not even Sheep-Counting Words are like that. A little internalized <a href=\"https:\/\/www.red-gate.com\/simple-talk\/opinion\/opinion-pieces\/chris-date-and-the-relational-model\/\">Chris Date<\/a> will be whispering in your ear that there are two relations here, a region and the name for a number.<\/p>\n<p>Your JSON for a database of sheep-counting words will more likely look like this (I\u2019ve just reduced it to two numbers in the <strong>sequence<\/strong> array rather than the original twenty). Each JSON document in our collection has an embedded array.<\/p>\n<pre class=\"lang:c# theme:vs2012\">  [{\r\n     \"region\": \"Wilts\",\r\n     \"sequence\": [{\r\n        \"number\": 1,\r\n        \"word\": \"Ain\"\r\n     }, {\r\n        \"number\": 2,\r\n        \"word\": \"Tain\"\r\n     }]\r\n  }, {\r\n     \"region\": \"Scots\",\r\n     \"sequence\": [{\r\n        \"number\": 1,\r\n        \"word\": \"Yan\"\r\n     }, {\r\n        \"number\": 2,\r\n        \"word\": \"Tyan\"\r\n     }]\r\n  }] <\/pre>\n<p>After a bit of thought, we remember that the <strong>OpenJSON<\/strong> function actually allows you to put a JSON value in a column of the result. 
This means that you just need to <strong>CROSS APPLY<\/strong> each embedded array, passing to the \u2018cross-applied\u2019 <strong>OpenJSON<\/strong> function the JSON fragment representing the array, which it will then parse for you.<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">  SELECT Number, Word, Region\r\n    FROM\r\n    OpenJson('[{\r\n     \"region\": \"Wilts\",\r\n     \"sequence\": [{\r\n        \"number\": 1,\r\n        \"word\": \"Ain\"\r\n     }, {\r\n        \"number\": 2,\r\n        \"word\": \"Tain\"\r\n     }]\r\n  }, {\r\n     \"region\": \"Scots\",\r\n     \"sequence\": [{\r\n        \"number\": 1,\r\n        \"word\": \"Yan\"\r\n     }, {\r\n        \"number\": 2,\r\n        \"word\": \"Tyan\"\r\n     }]\r\n  }]'       )\r\n    WITH (Region NVARCHAR(30) N'$.region', sequence NVARCHAR(MAX) N'$.sequence' AS JSON)\r\n      OUTER APPLY\r\n    OpenJson(sequence) --to get the number and word within each array element \r\n    WITH (Number INT N'$.number', Word NVARCHAR(30) N'$.word');<\/pre>\n<p>I haven\u2019t found the fact documented anywhere, but you can leave out the path elements from the column declaration of the WITH statement if the columns are exactly the same as the JSON keys, with matching case.<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">    SELECT number, word, region\r\n    FROM\r\n    OpenJson('[{\r\n     \"region\": \"Wilts\",\r\n     \"sequence\": [{\r\n        \"number\": 1,\r\n        \"word\": \"Ain\"\r\n     }, {\r\n        \"number\": 2,\r\n        \"word\": \"Tain\"\r\n     }]\r\n  }, {\r\n     \"region\": \"Scots\",\r\n     \"sequence\": [{\r\n        \"number\": 1,\r\n        \"word\": \"Yan\"\r\n     }, {\r\n        \"number\": 2,\r\n        \"word\": \"Tyan\"\r\n     }]\r\n  }]'       )\r\n    WITH (region NVARCHAR(30), sequence NVARCHAR(MAX)  AS JSON)\r\n      OUTER APPLY\r\n    OpenJson(sequence) --to get the number and word within each array element \r\n    WITH (number INT, word 
NVARCHAR(30));<\/pre>\n<p>The ability to drill into sub-arrays by cross-joining OpenJSON function calls allows us to easily insert a large collection with a number of documents that have embedded arrays. This is looking a lot more like something that could, for example, tackle the import of a MongoDB collection as long as it was exported as a document array with commas between documents. I\u2019ll include, with the download on GitHub, the JSON file that contains all the sheep-counting words that have been collected. Here is the updated stored procedure:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">  DROP PROCEDURE IF EXISTS #MergeJSONWithEmbeddedArraywithCountingTable;\r\n  GO\r\n  CREATE PROCEDURE #MergeJSONWithEmbeddedArraywithCountingTable @json NVARCHAR(MAX)\r\n  \/**\r\n  Summary: &gt;\r\n    This inserts, or updates, into a table (dbo.SheepCountingWords) a JSON collection \r\n    consisting of documents with an embedded array containing sheep-counting words for\r\n    numbers between one and twenty used traditionally by sheep farmers in Gt Britain and \r\n    Brittany. 
it allows records to be inserted or updated in any order or quantity.\r\n    \r\n  Author: PhilFactor\r\n  Date: 20\/04\/2018\r\n  Database: CountingSheep\r\n  Examples:\r\n     - EXECUTE #MergeJSONWithEmbeddedArraywithCountingTable @json=@AllTheRegions, \r\n     - EXECUTE #MergeJSONWithEmbeddedArraywithCountingTable @json='\r\n      [{\"region\":\"Wilts\",\"sequence\":[{\"number\":1,\"word\":\"Ain\"},{\"number\":2,\"word\":\"Tain\"}]},\r\n       {\"region\":\"Scots\",\"sequence\":[{\"number\":1,\"word\":\"Yan\"},{\"number\":2,\"word\":\"Tyan\"}]}]'\r\n  Returns: &gt;\r\n    nothing\r\n  **\/\r\n  AS\r\n  MERGE dbo.SheepCountingWords AS target\r\n  USING\r\n    (\r\n    SELECT DISTINCT Number, Word, Region\r\n    FROM OpenJson(@json) \r\n    WITH (Region NVARCHAR(30) N'$.region', sequence NVARCHAR(MAX) N'$.sequence' AS JSON)\r\n      OUTER APPLY\r\n    OpenJson(sequence)\r\n    WITH (Number INT N'$.number', Word NVARCHAR(30) N'$.word')\r\n    ) AS source (Number, Word, Region)\r\n  ON target.Number = source.Number AND target.Region = source.Region\r\n  WHEN MATCHED AND (source.Word &lt;&gt; target.Word) THEN\r\n    UPDATE SET target.Word = source.Word\r\n  WHEN NOT MATCHED THEN INSERT (Number, Word, Region)\r\n                        VALUES\r\n                          (source.Number, source.Word, source.Region);\r\n  GO<\/pre>\n<p>We can now very quickly ingest the whole collection into our table, pulling the data in from file. We include this file with the download on GitHub, so you can try it out. 
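<\/p>\n<p>One defensive touch may be worth adding once the file has been read into a variable: <strong>OpenJSON<\/strong> simply raises an error when handed malformed input, so it can be better to fail early, with a clearer message, by checking ISJSON first. A sketch:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">  --guard against null or malformed input before shredding it\r\n  IF @json IS NULL OR IsJson(@json) = 0\r\n    THROW 50000, 'The input was null or not valid JSON', 1;\r\n  EXECUTE #MergeJSONWithEmbeddedArraywithCountingTable @json;<\/pre>\n<p>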
There are thirty-three different regions in the JSON file.<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">  DECLARE @JSON nvarchar(max)\r\n  SELECT @json = BulkColumn\r\n   FROM OPENROWSET (BULK 'D:\\raw data\\YanTanTethera.json', SINGLE_BLOB) as jsonFile\r\n   EXECUTE #MergeJSONWithEmbeddedArraywithCountingTable @JSON\r\n  --The file must be UTF-16 Little Endian<\/pre>\n<p>We can now check that it is all in and correct.<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">  SELECT SheepCountingWords.Number,\r\n    Max(CASE WHEN SheepCountingWords.Region = 'Ancient British' THEN\r\n               SheepCountingWords.Word ELSE '' END\r\n       ) AS [Ancient British],\r\n    Max(CASE WHEN SheepCountingWords.Region = 'Borrowdale' THEN\r\n               SheepCountingWords.Word ELSE '' END\r\n       ) AS Borrowdale,\r\n    Max(CASE WHEN SheepCountingWords.Region = 'Bowland' THEN\r\n               SheepCountingWords.Word ELSE '' END\r\n       ) AS Bowland,\r\n    Max(CASE WHEN SheepCountingWords.Region = 'Breton' THEN\r\n               SheepCountingWords.Word ELSE '' END\r\n       ) AS Breton,\r\n  \/*\r\n  Many columns missing here. Full source of query included on GitHub  *\/\r\n    Max(CASE WHEN SheepCountingWords.Region = 'Wilts' THEN\r\n               SheepCountingWords.Word ELSE '' END\r\n       ) AS Wilts\r\n    FROM SheepCountingWords\r\n    GROUP BY SheepCountingWords.Number\r\n    ORDER BY SheepCountingWords.Number<\/pre>\n<p>Giving &#8230;<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"710\" height=\"429\" class=\"wp-image-78697\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/05\/word-image-1.png\" \/><\/p>\n<p>Just as a side-note, this data was collected for this article from various places on the internet, but mainly from <a href=\"https:\/\/en.wikipedia.org\/wiki\/Yan_Tan_Tethera\">Yan Tan Tethera<\/a>. Each table was pasted into Excel and tidied up. 
The JSON code was created by using three simple functions: one for the cell-level value, one for the row value and a final summation. This allowed simple adding, editing and deleting of data items. The technique is only suitable where columns are of fixed length.<\/p>\n<h2>Importing a More Complex JSON Data Collection into a SQL Server Database<\/h2>\n<p>We have successfully imported the very simplest JSON files into SQL Server. Now we need to consider those cases where the JSON document or collection represents more than one table.<\/p>\n<p>In any relational database, we can take one of two approaches to JSON data: we can accommodate it, meaning we treat it as an \u2018atomic\u2019 unit and store the JSON unprocessed, or we can assimilate it, meaning that we turn the data into a relational format that can be easily indexed and accessed.<\/p>\n<ul>\n<li>To accommodate JSON, we store it as a CLOB, usually NVARCHAR(MAX), with extra columns containing the extracted values for the data fields with which you would want to index the data. This is fine where all the database has to do is to store an application object without understanding it.<\/li>\n<li>To assimilate JSON, we need to extract all the JSON data and store it in a relational form.<\/li>\n<\/ul>\n<p>Our example represents a very simple customer database with ten linked tables. We will first accommodate the JSON document by creating a table (<strong>dbo.JSONDocuments<\/strong>) that merely stores, in each row, the reference to the customer, along with all the information about that customer, each aspect (addresses, phones, email addresses and so on) in separate columns as CLOB JSON strings.<\/p>\n<p>We then use this table to successively assimilate each JSON column into the relational database.<\/p>\n<p>This means that we need to parse the full document only once.<\/p>\n<p>To be clear about the contents of the JSON file, we will be cheating by using spoof data. 
We would never have unencrypted personal information in a database or a JSON file. Credit card information would never be unencrypted. This data is generated entirely by <a href=\"https:\/\/www.red-gate.com\/products\/sql-development\/sql-data-generator\/index\">SQL Data Generator<\/a>, and the JSON collection contains 70,000 documents. <a href=\"https:\/\/www.red-gate.com\/hub\/product-learning\/sql-data-generator\/generating-test-data-json-files-using-sql-data-generator\">The method of doing it is described here<\/a>.<\/p>\n<p>We\u2019ll make other compromises. We\u2019ll have no personal identifiers either. We will simply use the document order. In reality, the JSON would store the surrogate key of person_id.<\/p>\n<p>The individual documents will look something like this:<\/p>\n<pre class=\"lang:c# theme:vs2012\">  { \r\n      \"Full Name\" : \"Mr Philip Fortescue Factor esq.\", \r\n      \"Name\" : {\r\n          \"Title\" : \"Mr\", \r\n          \"First Name\" : \" Philip\", \r\n          \"Middle Name\" : \" Fortescue\", \r\n          \"Last Name\" : \"Factor\", \r\n          \"Suffix\" : \"Esq.\"\r\n      }, \r\n      \"Addresses\" : [\r\n          {\r\n              \"type\" : \"Home\", \r\n              \"Full Address\" : \"83a Manor Parkway  Crook , York, Cornwall CF9 7MU\", \r\n              \"County\" : \"Cornwall\", \r\n              \"Dates\" : {\r\n                  \"Moved In\" : \"2012-04-01T16:11:29.570\"\r\n              }\r\n          }\r\n      ], \r\n      \"Notes\" : [\r\n          {\r\n              \"Text\" : \"you! your like to products. happy your management time) thank all plus service. 
Thank our our that as promise in\", \r\n              \"Date\" : \"2014-11-06T04:05:05.020\"\r\n          }\r\n      ], \r\n      \"Phones\" : [\r\n          {\r\n              \"TypeOfPhone\" : \"Home\", \r\n              \"DiallingNumber\" : \"01593 611001\", \r\n              \"Dates\" : {\r\n                  \"From\" : \"2004-10-20T15:57:47.480\"\r\n              }\r\n          }\r\n      ], \r\n      \"EmailAddresses\" : [\r\n          {\r\n              \"EmailAddress\" : \"Kian1@AAL.com\", \r\n              \"StartDate\" : \"2013-05-25\", \r\n              \"EndDate\" : \"2017-06-01\"\r\n          }, \r\n          {\r\n              \"EmailAddress\" : \"William1@Gmoul.com\", \r\n              \"StartDate\" : \"2017-06-31\" \r\n          }\r\n      ], \r\n      \"Cards\" : [\r\n          {\r\n              \"CardNumber\" : \"4684453190680369\", \r\n              \"ValidFrom\" : \"2017-05-14\", \r\n              \"ValidTo\" : \"2026-01-26\", \r\n              \"CVC\" : \"262\"\r\n          }, \r\n          {\r\n              \"CardNumber\" : \"5548597043927766\", \r\n              \"ValidFrom\" : \"2017-03-16\", \r\n              \"ValidTo\" : \"2028-10-27\", \r\n              \"CVC\" : \"587\"\r\n          }, \r\n          {\r\n              \"CardNumber\" : \"4896940995709652\", \r\n              \"ValidFrom\" : \"2015-08-17\", \r\n              \"ValidTo\" : \"2022-12-03\", \r\n              \"CVC\" : \"220\"\r\n          }\r\n      ]\r\n  }<\/pre>\n<p>We will import this into a SQL Server database designed like this:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"793\" height=\"862\" class=\"wp-image-78698\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/05\/word-image-2.png\" \/><\/p>\n<p>The build script is included with the download on GitHub.<\/p>\n<p>So, all we need now is the batch to import the JSON file that contains the collection and populate the table with the data. 
We will now describe individual parts of the batch.<\/p>\n<p>We start out by reading the customersUTF16.json file into a variable.<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">  DECLARE @CJSON nvarchar(max)\r\n  SELECT @Cjson = BulkColumn\r\n   FROM OPENROWSET (BULK 'D:\\raw data\\customersUTF16.json', SINGLE_BLOB) as j\r\n  --text encoding must be little-endian UTF-16<\/pre>\n<p>The next step is to create a table at the document level, with the main arrays within each document represented by columns. (In some cases, there are sub-arrays. The phone numbers, for example, have an array of dates.) This means that this initial slicing of the JSON collection needs to be done only once. In our case, there are:<\/p>\n<ul>\n<li>The details of the Name,<\/li>\n<li>Addresses,<\/li>\n<li>Credit Cards,<\/li>\n<li>Email Addresses,<\/li>\n<li>Notes,<\/li>\n<li>Phone numbers<\/li>\n<\/ul>\n<p>We fill this table via a call to OpenJSON. By doing this, we have the main details of each customer available to us when slicing up embedded arrays. The batch is designed so it can be rerun and should be idempotent. 
This means that there is less of a requirement to run the process in a single transaction.<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">  DROP TABLE IF EXISTS dbo.JSONDocuments;\r\n  CREATE TABLE dbo.JSONDocuments\r\n    (\r\n    Document_id INT NOT NULL,\r\n        [Full_Name] NVARCHAR(30) NOT NULL,\r\n   \t  Name NVARCHAR(MAX) NOT NULL,--holds a JSON object\r\n  \t  Addresses NVARCHAR(MAX) NULL,--holds an array of JSON objects\r\n  \t  Cards NVARCHAR(MAX) NULL,--holds an array of JSON objects\r\n  \t  EmailAddresses NVARCHAR(MAX) NULL,--holds an array of JSON objects\r\n  \t  Notes NVARCHAR(MAX) NULL,--holds an array of JSON objects\r\n  \t  Phones NVARCHAR(MAX) NULL,--holds an array of JSON objects\r\n  \t  CONSTRAINT JSONDocumentsPk PRIMARY KEY (Document_id)\r\n    ) ON [PRIMARY];<\/pre>\n<p>Now we fill this table with a row for each document, each representing the entire data for a customer. Each item of root data, such as the id and the customer&#8217;s full name, is held as a column. All other columns hold JSON. This table will be an \u2018accommodation\u2019 to the JSON data, in that each row represents a customer, but each JSON document in the collection is shredded to provide a JSON string that represents the attributes and relations of that customer. 
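<\/p>\n<p>It is worth recalling what the next batch relies on: when <strong>OpenJSON<\/strong> is called without a WITH clause, it returns its default schema of three columns, key (the property name, or the element index within an array), value and type (0 null, 1 string, 2 number, 3 boolean, 4 array, 5 object). A minimal sketch, using one of the earlier sheep-counting documents:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">  SELECT [key], value, type\r\n    FROM OpenJson(N'{\"region\":\"Wilts\",\"sequence\":[{\"number\":1,\"word\":\"Ain\"}]}');\r\n  --returns two rows: region (type 1, a string) and sequence (type 4, an array)<\/pre>\n<p>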
We can now assimilate this data step-by-step<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">  INSERT INTO dbo.JSONDocuments ( Document_id,Full_name,[Name],Addresses, Cards, EmailAddresses, Notes, Phones)\r\n   SELECT [key] AS Document_id,Full_name,[Name],Addresses, Cards, EmailAddresses, Notes, Phones \r\n    FROM OpenJson(@CJSON) AS EachDocument\r\n        CROSS APPLY OpenJson(EachDocument.Value) \r\n  \t  WITH (\r\n  \t      [Full_Name] NVARCHAR(30) N'$.\"Full Name\"', \r\n  \t\t  Name NVARCHAR(MAX) N'$.Name' AS JSON,\r\n  \t\t  Addresses NVARCHAR(MAX) N'$.Addresses' AS JSON,\r\n  \t\t  Cards NVARCHAR(MAX) N'$.Cards' AS JSON,\r\n  \t\t  EmailAddresses NVARCHAR(MAX) N'$.EmailAddresses' AS JSON,\r\n  \t\t  Notes NVARCHAR(MAX) N'$.Notes' AS JSON,\r\n  \t\t  Phones NVARCHAR(MAX) N'$.Phones' AS JSON)<\/pre>\n<p>First we need to create an entry in the person table if it doesn&#8217;t already exist, as that has the person_id. We need to do this first because otherwise the foreign key constraints will protest.<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">  SET IDENTITY_INSERT [Customer].[Person] On\r\n  MERGE [Customer].[Person] AS target\r\n  USING\r\n    (--get the required data for the person table and merge it with what is there\r\n    SELECT JSONDocuments.Document_id, Title, FirstName, \r\n         MiddleName, LastName, Suffix\r\n      FROM dbo.JSONDocuments\r\n        CROSS APPLY\r\n      OpenJson(JSONDocuments.Name)\r\n      WITH\r\n        (\r\n        Title NVARCHAR(8) N'$.Title', FirstName VARCHAR(40) N'$.\"First Name\"',\r\n        MiddleName VARCHAR(40) N'$.\"Middle Name\"',\r\n        LastName VARCHAR(40) N'$.\"Last Name\"', Suffix VARCHAR(10) N'$.Suffix'\r\n        )\r\n    ) AS source (person_id, Title, FirstName, MiddleName, LastName, Suffix)\r\n  ON target.person_id = source.person_id \r\n  WHEN NOT MATCHED THEN \r\n    INSERT (person_id, Title, FirstName, MiddleName, LastName, Suffix)\r\n      VALUES\r\n        (source.person_id, 
source.Title, source.FirstName, \r\n  \t  source.MiddleName, source.LastName, source.Suffix);\r\n  SET IDENTITY_INSERT [Customer].[Person] Off<\/pre>\n<p>Now we do the notes. We\u2019ll tackle these first because they are a bit awkward: there is a many-to-many relationship between notes and people, because the same standard note, such as an overdue invoice payment, can be associated with many customers. We\u2019ll use a table variable to allow us to guard against inserting duplicate records.<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">  DECLARE @Note TABLE (document_id INT NOT NULL, Text NVARCHAR(MAX) NOT NULL, Date DATETIME)\r\n  INSERT INTO @Note (document_id, Text, Date)\r\n    SELECT JSONDocuments.Document_id, Text, Date\r\n      FROM dbo.JSONDocuments\r\n        CROSS APPLY OpenJson(JSONDocuments.Notes) AS TheNotes\r\n        CROSS APPLY\r\n      OpenJson(TheNotes.Value)\r\n      WITH (Text NVARCHAR(MAX) N'$.Text', Date DATETIME N'$.Date')\r\n  \tWHERE Text IS NOT null\r\n  --if the notes are new then insert them\r\n  INSERT INTO Customer.Note (Note)\r\n    SELECT DISTINCT newnotes.Text\r\n      FROM @Note AS newnotes\r\n        LEFT OUTER JOIN Customer.Note\r\n          ON note.notestart = Left(newnotes.Text,850)--just compare the first 850 chars\r\n      WHERE note.note IS NULL \r\n  \/* now fill in the many-to-many table relating notes to people, making sure that you\r\n  do not duplicate anything *\/\r\n  INSERT INTO Customer.NotePerson (Person_id, Note_id)\r\n    SELECT newnotes.document_id, note.note_id\r\n      FROM @Note AS newnotes\r\n        INNER JOIN Customer.Note\r\n          ON note.note = newnotes.Text\r\n  \t  LEFT OUTER JOIN Customer.NotePerson\r\n  \t    ON NotePerson.Person_id=newnotes.document_id\r\n  \t\tAND NotePerson.note_id=note.note_id\r\n  \t\tWHERE NotePerson.note_id IS null<\/pre>\n<p>Addresses are complicated because they involve three tables. 
There is the address, which is the physical place, the abode, which records when and why the person was associated with the place, and a third table which constrains the type of abode. We create a table variable to support the various queries without any extra shredding.<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">  DECLARE @addresses TABLE\r\n    (\r\n    person_id INT NOT null,\r\n    Type NVARCHAR(40) NOT null,\r\n    Full_Address NVARCHAR(200)NOT null,\r\n    County NVARCHAR(30) NOT null,\r\n    Start_Date DATETIME NOT null,\r\n    End_Date DATETIME null\r\n    );\r\n  --stock the table variable with the address information\r\n  INSERT INTO @Addresses(person_id, Type,Full_Address, County, [Start_Date], End_Date)\r\n  SELECT Document_id, Address.Type,Address.Full_Address, Address.County, \r\n           WhenLivedIn.[Start_date],WhenLivedIn.End_date\r\n      FROM dbo.JSONDocuments\r\n        CROSS APPLY\r\n      OpenJson(JSONDocuments.Addresses) AllAddresses\r\n  \t  CROSS APPLY \r\n  \t   OpenJson(AllAddresses.value)\r\n      WITH\r\n        (\r\n        Type NVARCHAR(8) N'$.type', Full_Address NVARCHAR(200) N'$.\"Full Address\"',\r\n        County VARCHAR(40) N'$.County',Dates NVARCHAR(MAX) AS json\r\n        ) Address\r\n      CROSS APPLY\r\n  \tOpenJson(Address.Dates) WITH\r\n        (\r\n        Start_date datetime N'$.\"Moved In\"',End_date datetime N'$.\"Moved Out\"'\r\n        )WhenLivedIn\r\n  --first make sure that the types of address exists and add if necessary\r\n  INSERT INTO Customer.Addresstype (TypeOfAddress)\r\n    SELECT DISTINCT NewAddresses.Type\r\n      FROM @addresses AS NewAddresses\r\n        LEFT OUTER JOIN Customer.Addresstype\r\n          ON NewAddresses.Type = Addresstype.TypeOfAddress\r\n      WHERE Addresstype.TypeOfAddress IS NULL;\r\n  --Fill the Address table with addresses ensuring uniqueness \r\n  INSERT INTO Customer.Address (Full_Address, County)\r\n  SELECT DISTINCT NewAddresses.Full_Address, 
NewAddresses.County\r\n    FROM @addresses AS NewAddresses\r\n      LEFT OUTER JOIN Customer.Address AS currentAddresses\r\n        ON NewAddresses.Full_Address = currentAddresses.Full_Address\r\n    WHERE currentAddresses.Full_Address IS NULL;\r\n  --and now the many-to-many Abode table\r\n  INSERT INTO Customer.Abode (Person_id, Address_ID, TypeOfAddress, Start_date,\r\n  End_date)\r\n    SELECT newAddresses.person_id, address.Address_ID, newAddresses.Type,\r\n      newAddresses.Start_Date, newAddresses.End_Date\r\n      FROM @addresses AS newAddresses\r\n        INNER JOIN customer.address\r\n          ON newAddresses.Full_Address = address.Full_Address\r\n        LEFT OUTER JOIN Customer.Abode\r\n          ON Abode.person_id = newAddresses.person_id\r\n         AND Abode.Address_ID = address.Address_ID\r\n      WHERE Abode.person_id IS NULL;<\/pre>\n<p>Credit cards are much easier since they are a simple sub-array.<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">  INSERT INTO customer.CreditCard (Person_id, CardNumber, ValidFrom, ValidTo, CVC)\r\n    SELECT JSONDocuments.Document_id AS Person_id, new.CardNumber, new.ValidFrom,\r\n      new.ValidTo, new.CVC\r\n      FROM dbo.JSONDocuments\r\n        CROSS APPLY OpenJson(JSONDocuments.Cards) AS TheCards\r\n        CROSS APPLY\r\n      OpenJson(TheCards.Value)\r\n      WITH\r\n        (\r\n        CardNumber VARCHAR(20), ValidFrom DATE N'$.ValidFrom',\r\n        ValidTo DATE N'$.ValidTo', CVC CHAR(3)\r\n        ) AS new\r\n        LEFT OUTER JOIN customer.CreditCard\r\n          ON JSONDocuments.Document_id = CreditCard.Person_id\r\n         AND new.CardNumber = CreditCard.CardNumber\r\n      WHERE CreditCard.CardNumber IS NULL;<\/pre>\n<p>Email Addresses are also simple. 
We\u2019re on the downhill slopes now.<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">  INSERT INTO Customer.EmailAddress (Person_id, EmailAddress, StartDate, EndDate)\r\n    SELECT JSONDocuments.Document_id AS Person_id, new.EmailAddress,\r\n      new.StartDate, new.EndDate\r\n      FROM dbo.JSONDocuments\r\n        CROSS APPLY OpenJson(JSONDocuments.EmailAddresses) AS TheEmailAddresses\r\n        CROSS APPLY\r\n      OpenJson(TheEmailAddresses.Value)\r\n      WITH\r\n        (\r\n        EmailAddress NVARCHAR(40) N'$.EmailAddress',\r\n        StartDate DATE N'$.StartDate', EndDate DATE N'$.EndDate'\r\n        ) AS new\r\n        LEFT OUTER JOIN Customer.EmailAddress AS email\r\n          ON JSONDocuments.Document_id = email.Person_id\r\n         AND new.EmailAddress = email.EmailAddress\r\n      WHERE email.EmailAddress IS NULL;<\/pre>\n<p>Now we add these customers\u2019 phones. The various dates for the start and end of the use of the phone number are held in a subarray within the individual phone objects. 
That makes things slightly more awkward.<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">  DECLARE @phones TABLE\r\n    (\r\n    Person_id INT,\r\n    TypeOfPhone NVARCHAR(40),\r\n    DiallingNumber VARCHAR(20),\r\n    Start_Date DATE,\r\n    End_Date DATE\r\n    );\r\n  INSERT INTO @phones (Person_id, TypeOfPhone, DiallingNumber, Start_Date,\r\n  End_Date)\r\n    SELECT JSONDocuments.Document_id, EachPhone.TypeOfPhone,\r\n      EachPhone.DiallingNumber, [From], [To]\r\n      FROM dbo.JSONDocuments\r\n        CROSS APPLY OpenJson(JSONDocuments.Phones) AS ThePhones\r\n        CROSS APPLY\r\n      OpenJson(ThePhones.Value)\r\n      WITH\r\n        (\r\n        TypeOfPhone NVARCHAR(40), DiallingNumber VARCHAR(20), Dates NVARCHAR(MAX) AS JSON\r\n        ) AS EachPhone\r\n        CROSS APPLY\r\n      OpenJson(EachPhone.Dates)\r\n      WITH ([From] DATE, [To] DATE);\r\n  --insert any new phone types\r\n  INSERT INTO Customer.PhoneType (TypeOfPhone)\r\n    SELECT DISTINCT new.TypeOfPhone\r\n      FROM @phones AS new\r\n        LEFT OUTER JOIN Customer.PhoneType\r\n          ON PhoneType.TypeOfPhone = new.TypeOfPhone\r\n      WHERE PhoneType.TypeOfPhone IS NULL AND new.TypeOfPhone IS NOT NULL;\r\n  --insert all new phones\r\n  INSERT INTO Customer.Phone (Person_id, TypeOfPhone, DiallingNumber, Start_date,\r\n  End_date)\r\n    SELECT new.Person_id, new.TypeOfPhone, new.DiallingNumber, new.Start_Date,\r\n      new.End_Date\r\n      FROM @phones AS new\r\n        LEFT OUTER JOIN Customer.Phone\r\n          ON Phone.DiallingNumber = new.DiallingNumber\r\n         AND Phone.Person_id = new.Person_id\r\n      WHERE Phone.Person_id IS NULL AND new.TypeOfPhone IS NOT NULL;<\/pre>\n<h2>Conclusion<\/h2>\n<p>JSON support in SQL Server was a long time coming, but now that we have it, it opens up several possibilities.<\/p>\n<p>No SQL Server developer or admin needs to rule out using JSON for ETL (Extract, Transform, Load) processes to pass data between 
JSON-based document databases and SQL Server. The features that SQL Server provides are sufficient, and far easier to use than SQL Server\u2019s XML support.<\/p>\n<p>A typical SQL Server database is far more complex than the simple example used in this article, but it is certainly not an outrageous idea that a database could have its essential static data drawn from JSON documents: these are more versatile than VALUES clauses and more efficient than individual INSERT statements.<\/p>\n<p>I\u2019m inclined to smile on the idea of transferring data between the application and database as JSON. It is usually easier for front-end application programmers, and we database folks can, at last, do all the checks and transformations needed to accommodate data within the arcane relational world, rather than insist on the application programmer doing it. It also decouples the application and database to the extent that the two no longer need to shadow each other in terms of revisions.<\/p>\n<p>JSON collections of documents represent an industry-standard way of transferring data. JSON is today\u2019s CSV, and it is good to know that SQL Server can support it.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Microsoft introduced native support for JSON in SQL Server in version 2016. 
In this article, Phil Factor shows how to import JSON documents into SQL Server database tables, even if there is a bit of complexity in the original data.&hellip;<\/p>\n","protected":false},"author":154613,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[143531],"tags":[5134],"coauthors":[6813],"class_list":["post-78695","post","type-post","status-publish","format-standard","hentry","category-t-sql-programming-sql-server","tag-sql-prompt"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/78695","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/154613"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=78695"}],"version-history":[{"count":5,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/78695\/revisions"}],"predecessor-version":[{"id":78722,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/78695\/revisions\/78722"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=78695"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=78695"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=78695"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=78695"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}