Using JSON for matrices in SQL Server.

From SQL Server 2017, it becomes more practical to use JSON arrays for representing and processing a matrix. SQL Server can read them, and update values in them but can’t create them. To do this, you have to create the JSON as a string. The great advantage of them is that you can pass them between procedures and functions as easily as any other string, and quickly turn them into tables.

For the SQL Server developer, matrices are probably most valuable for solving more complex string-searching problems, using Dynamic Programming. Once you get into the mindset of this sort of technique, a number of seemingly-intractable problems become easier.  Here are fifty common data structure problems that can be solved using Dynamic programming. Until SQL Server 2017, these were hard to do in SQL because of the lack of support for this style of programming.  Memoization, one of the principles behind the technique is easy to do in SQL but it is very tricky to convert existing procedural algorithms to use table variables. It is usually easier and quicker to use strings as pseudo-variables as I did  with Edit Distance and the Levenshtein algorithmthe longest common subsequence, and  the Longest Common Substring. The problem with doing this is that the code to fetch the array values can be very difficult to decypher or debug. JSON can do it very easily with path array references.

Would it be possible to use JSON arrays to solve one of these problems? If so, is it much slower? I thought it might be interesting to convert the Lowest Common Subsequence problem into a json-based form and run over a few tests back-to-back. The conclusion was, for those with the TLDR habit, was that it took twice to three times as long to run, but produced code that was easier to write, understand and debug.  I suspect there are ways and means to make it faster.

This returns …

 

These tests form part of the build script for the procedure to try to make sure that as few as possible mistakes are left in! 

Obviously, I’d like a bit more speed in the JSON querying but it is acceptable unless one is doing a lot of this sort of querying. I’m happy to us JSON for doing this because it is quicker to get things up-and-running. In my article ‘Doing Fuzzy Searches in SQL Server‘, I show how it is possible to cut down on the amount of searches by filtering the likely candidates with conventional SQL Commands first.