{"id":83807,"date":"2019-04-09T10:30:40","date_gmt":"2019-04-09T10:30:40","guid":{"rendered":"https:\/\/www.red-gate.com\/simple-talk\/?p=83807"},"modified":"2019-08-15T13:43:52","modified_gmt":"2019-08-15T13:43:52","slug":"using-json-for-matrices-in-sql-server","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/blogs\/using-json-for-matrices-in-sql-server\/","title":{"rendered":"Using JSON for matrices in SQL Server."},"content":{"rendered":"<p>From SQL Server 2017, it becomes more practical to use JSON arrays for representing and processing a matrix. SQL Server can read them, and update values in them but can&#8217;t create them. To do this, you have to create the JSON as a string. The great advantage of them is that you can pass them between procedures and functions as easily as any other string, and quickly turn them into tables.<\/p>\n<p>For the SQL Server developer, matrices are probably most valuable for solving more complex string-searching problems, using <a href=\"https:\/\/www.quora.com\/How-should-I-explain-dynamic-programming-to-a-4-year-old\/answer\/Jonathan-Paulson\">Dynamic Programming<\/a>. Once you get into the mindset of this sort of technique, a number of seemingly-intractable problems become easier.\u00a0 Here are <a href=\"https:\/\/blog.usejournal.com\/top-50-dynamic-programming-practice-problems-4208fed71aa3\">fifty\u00a0common data structure problems<\/a> that can be solved using Dynamic programming. Until SQL Server 2017, these were hard to do in SQL because of the lack of support for this style of programming.\u00a0 Memoization, one of the principles behind the technique is easy to do in SQL but it is very tricky to convert existing procedural algorithms to use table variables. It is usually easier and quicker to use strings as pseudo-variables as I did\u00a0 with\u00a0<a href=\"https:\/\/www.red-gate.com\/simple-talk\/blogs\/string-comparisons-in-sql-edit-distance-and-the-levenshtein-algorithm\/\">Edit Distance and the Levenshtein algorithm<\/a>,\u00a0<a href=\"https:\/\/www.red-gate.com\/simple-talk\/blogs\/string-comparisons-in-sql-the-longest-common-subsequence\/\">the longest common subsequence<\/a>, and\u00a0 the <a href=\"https:\/\/www.red-gate.com\/simple-talk\/blogs\/string-comparisons-in-sql-the-longest-common-substring\/\">Longest Common Substring<\/a>. The problem with doing this is that the code to fetch the array values can be very difficult to decypher or debug. JSON can do it very easily with <a href=\"https:\/\/docs.microsoft.com\/en-us\/sql\/relational-databases\/json\/json-path-expressions-sql-server?view=sql-server-2017\">path array references<\/a>.<\/p>\n<p>Would it be possible to use JSON arrays to solve one of these problems? If so, is it much slower? I thought it might be interesting to convert the Lowest Common Subsequence problem into a json-based form and run over a few tests back-to-back. The conclusion was, for those with the TLDR habit, was that it took twice to three times as long to run, but produced code that was easier to write, understand and debug.\u00a0 I suspect there are ways and means to make it faster.<\/p>\n<pre class=\"theme:ssms2012-simple-talk font:consolas font-size:14 line-height:16 nums-toggle:false wrap:true wrap-toggle:false lang:tsql decode:true \">IF Object_Id(N'LCS') IS NOT NULL DROP FUNCTION LCS;\r\nGO\r\nCREATE FUNCTION LCS\r\n  \/**\r\nsummary:   &gt;\r\n The longest common subsequence (LCS) problem is the problem of finding the\r\n longest subsequence common to all sequences in two sequences. It differs\r\n from problems of finding common substrings: unlike substrings, subsequences\r\n are not required to occupy consecutive positions within the original\r\n sequences. For example, the sequences \"1234\" and \"1224533324\" have an LCS\r\n of \"1234\":\r\nAuthor: Phil Factor\r\nRevision: 1.0\r\ndate: 05 April 2019\r\nexample:\r\n code: |\r\n     Select dbo.lcs ('1234', '1224533324')\r\n     Select dbo.lcs ('thisisatest', 'testing123testing')\r\n     Select dbo.lcs ( 'XMJYAUZ', 'MZJAWXU') \r\n     Select dbo.lcs ( 'beginning-middle-ending',\r\n       'beginning-diddle-dum-ending')\r\nreturns:   &gt;\r\n  the longest common subsequence as a string\r\n**\/\r\n  (@xString VARCHAR(MAX), @yString VARCHAR(MAX))\r\nRETURNS VARCHAR(MAX)\r\nAS\r\n  BEGIN\r\n\r\n    DECLARE @ii INT = 1; --inner index\r\n    DECLARE @jj INT = 1; --next loop index\r\n    DECLARE @West INT; --array reference number to left\r\n    DECLARE @NorthWest INT; --array reference previous left\r\n    DECLARE @North INT; --array reference previous\r\n    DECLARE @Max INT; --holds the maximum of two values\r\n    DECLARE @Current INT; --current number of matches\r\n    DECLARE @Matrix NVARCHAR(MAX);\r\n    DECLARE @PreviousRow NVARCHAR(2000); -- the previous matrix row\r\n    DECLARE @JSON NVARCHAR(4000); --json work variable\r\n    DECLARE @Numbers TABLE (jj INT);\r\n-- SQL Prompt formatting off\r\nINSERT INTO @numbers(jj) --this is designed for words of max 40 characters\r\nVALUES(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12),(13),(14),(15),\r\n      (16),(17),(18),(19),(20),(21),(22),(23),(24),(25),(26),(27),(28),\r\n\t  (29),(30),(31),(32),(33),(34),(35),(36),(37),(38),(39),(40)\r\n-- SQL Prompt formatting on\r\n--the to start with, the first row is all zeros.\r\n    SELECT @PreviousRow =\r\n      N'[' + Replicate('0,', Len(@xString) + 1) + N'\"'\r\n      + Substring(@yString, 1, 1) + N'\"]';\r\n    SELECT @Matrix = @PreviousRow;--add this to the matrix\r\n\t\/* we now build the matrix in bottom up fashion.  *\/\r\n    WHILE (@ii &lt;= Len(@yString))\r\n      BEGIN\r\n        SELECT @West = 0, @JSON = NULL;\r\n\t\t--now create a row in just one query\r\n        SELECT @NorthWest =\r\n          Json_Value(@PreviousRow, '$[' + Cast(jj - 1 AS VARCHAR(5)) + ']'),\r\n          @North =\r\n            Json_Value(@PreviousRow, '$[' + Cast(jj AS VARCHAR(5)) + ']'),\r\n          @Max = CASE WHEN @West &gt; @North THEN @West ELSE @North END,\r\n          @Current =\r\n            CASE WHEN Substring(@xString, jj, 1) = Substring(@yString, @ii, 1) THEN\r\n                   @NorthWest + 1 ELSE @Max END,\r\n          @JSON =\r\n            Coalesce(@JSON + ',', '[0,')\r\n            + Coalesce(Cast(@Current AS VARCHAR(5)), 'null'), @West = @Current\r\n          FROM @Numbers AS f\r\n          WHERE f.jj &lt;= Len(@xString);\r\n\t\t  --and store the result as the previous row\r\n        SELECT @PreviousRow =\r\n               @JSON + N',\"' + Substring(@yString, @ii, 1) + N'\"]';\r\n          --and add the reow to the matrix\r\n        SELECT @Matrix = Coalesce(@Matrix + ',\r\n\t\t       ', '') + @PreviousRow, @ii = @ii + 1;\r\n      END;\r\n    --we add the boundong brackets.\r\n    SELECT @Matrix = N'[' + @Matrix + N']';\r\n    SELECT @ii = Len(@yString), @jj = Len(@xString);\r\n    DECLARE @previousColScore INT, @PreviousRowScore INT, @Ychar NCHAR;\r\n    DECLARE @Subsequence NVARCHAR(4000) = '';\r\n    WHILE (@Current &gt; 0)\r\n      BEGIN\r\n        SELECT @Ychar = Substring(@yString, @ii, 1);\r\n        IF (@Ychar = Substring(@xString, @jj, 1))\r\n-- If current character in X[] and Y[] are same, then it is part of LCS\r\n          SELECT @ii = @ii - 1, @jj = @jj - 1,\r\n            @Subsequence = @Ychar + @Subsequence, @Current = @Current - 1;\r\n        ELSE\r\n--If not same, then find the larger of two and traverse in that direction \r\n          BEGIN\r\n\t\t    --find out the two scores, one to the north and one to the west\r\n            SELECT @PreviousRowScore =\r\n              Json_Value(\r\n                          @Matrix,\r\n                          'strict $[' + Convert(VARCHAR(5), @ii - 1) + ']['\r\n                          + Convert(VARCHAR(5), @jj) + ']'\r\n                        ),\r\n              @previousColScore =\r\n                Json_Value(\r\n                            @Matrix,\r\n                            'strict $[' + Convert(VARCHAR(5), @ii) + ']['\r\n                            + Convert(VARCHAR(5), @jj - 1) + ']'\r\n                          );\r\n           --either go north or west\r\n            IF @PreviousRowScore &lt; @previousColScore SELECT @jj = @jj - 1;\r\n            ELSE SELECT @ii = @ii - 1;\r\n          END;\r\n      END;\r\n    RETURN @Subsequence;\r\n  END;\r\nGO\r\n-- Now we do a quick test and timing with the old version\r\nDECLARE @timing DATETIME;\r\nSELECT @timing = GetDate();\r\n\r\nIF dbo.LongestCommonSubsequence('1234', '1224533324') &lt;&gt; '1234'\r\n  RAISERROR('test 1 failed', 16, 1);\r\nIF dbo.LongestCommonSubsequence('thisisatest', 'testing123testing') &lt;&gt; 'tsitest'\r\n  RAISERROR('test 2 failed', 16, 1);\r\nIF dbo.LongestCommonSubsequence('Patient', 'Complaint') &lt;&gt; 'Paint'\r\n  RAISERROR('test 3 failed', 16, 1);\r\nIF dbo.LongestCommonSubsequence('XMJYAUZ', 'MZJAWXU') &lt;&gt; 'MJAU'\r\n  RAISERROR('test 4 failed', 16, 1);\r\nIF dbo.LongestCommonSubsequence('yab', 'xabyrbyab') &lt;&gt; 'yab' RAISERROR(\r\n'test 5 failed', 16, 1\r\n);\r\nIF dbo.LongestCommonSubsequence(\r\n'beginning-middle-ending', 'beginning-diddle-dum-ending'\r\n) &lt;&gt; 'beginning-iddle-ending'\r\n  RAISERROR('test 6 failed', 16, 1);\r\n\r\nSELECT DateDiff(MILLISECOND, @timing, GetDate()) AS [ms FOR traditional way];\r\n--now do the same test run with the current function\r\nSELECT @timing = GetDate();\r\n\r\nIF dbo.LCS('1234', '1224533324') &lt;&gt; '1234' RAISERROR('test 1 failed', 16, 1);\r\nIF dbo.LCS('thisisatest', 'testing123testing') &lt;&gt; 'tsitest' RAISERROR(\r\n'test 2 failed', 16, 1\r\n);\r\nIF dbo.LCS('Patient', 'Complaint') &lt;&gt; 'Paint'\r\n  RAISERROR('test 3 failed', 16, 1);\r\nIF dbo.LCS('XMJYAUZ', 'MZJAWXU') &lt;&gt; 'MJAU' RAISERROR('test 4 failed', 16, 1);\r\nIF dbo.LCS('yab', 'xabyrbyab') &lt;&gt; 'yab' RAISERROR('test 5 failed', 16, 1);\r\nIF dbo.LCS('beginning-middle-ending', 'beginning-diddle-dum-ending') &lt;&gt; 'beginning-iddle-ending'\r\n  RAISERROR('test 6 failed', 16, 1);\r\n\r\nSELECT DateDiff(MILLISECOND, @timing, GetDate()) AS [ms FOR JSON-based] ;<\/pre>\n<p>This returns &#8230;<\/p>\n<pre class=\"theme:ssms2012-simple-talk font:consolas font-size:14 line-height:16 lang:yaml decode:true \">ms FOR traditional way\r\n----------------------\r\n10\r\n\r\n(1 row affected)\r\n\r\nms FOR JSON-based\r\n-----------------\r\n30\r\n\r\n(1 row affected)<\/pre>\n<p>&nbsp;<\/p>\n<p>These tests form part of the build script for the procedure to try to make sure that as few as possible mistakes are left in!\u00a0<\/p>\n<p>Obviously, I&#8217;d like a bit more speed in the JSON querying but it is acceptable unless one is doing a lot of this sort of querying. I&#8217;m happy to us JSON for doing this because it is quicker to get things up-and-running. In my article &#8216;<a href=\"https:\/\/www.red-gate.com\/simple-talk\/sql\/t-sql-programming\/fuzzy-searches-sql-server\/\">Doing Fuzzy Searches in SQL Server<\/a>&#8216;, I show how it is possible to cut down on the amount of searches by filtering the likely candidates with conventional SQL Commands first.<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>From SQL Server 2017, it becomes more practical to use JSON arrays for representing and processing a matrix. SQL Server can read them, and update values in them but can&#8217;t create them. To do this, you have to create the JSON as a string. The great advantage of them is that you can pass them&#8230;&hellip;<\/p>\n","protected":false},"author":154613,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[2],"tags":[],"coauthors":[6813],"class_list":["post-83807","post","type-post","status-publish","format-standard","hentry","category-blogs"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/83807","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/154613"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=83807"}],"version-history":[{"count":4,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/83807\/revisions"}],"predecessor-version":[{"id":83811,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/83807\/revisions\/83811"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=83807"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=83807"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=83807"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=83807"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}