{"id":307,"date":"2007-09-20T00:00:00","date_gmt":"2007-09-20T00:00:00","guid":{"rendered":"https:\/\/test.simple-talk.com\/uncategorized\/quantifying-text-differences-in-tsql\/"},"modified":"2021-09-29T16:22:15","modified_gmt":"2021-09-29T16:22:15","slug":"quantifying-text-differences-in-tsql","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/databases\/sql-server\/t-sql-programming-sql-server\/quantifying-text-differences-in-tsql\/","title":{"rendered":"Quantifying Text differences in TSQL"},"content":{"rendered":"<p>Last revised: Feb 3rd 2014 <\/p>\n<div id=\"pretty\">\n<p class=\"start\">When you compare two pieces of text, or strings, in an expression in SQL Server, you just get a value of &#8216;true&#8217; if the two are the same, &#8216;false&#8217; if they are not, or &#8216;null&#8217; if either piece of text is null. <\/p>\n<p>A simple matter of an extra space character is often enough to tip the balance. This is quite unlike real life: if we look at two pieces of text, we judge them to be the same, almost the same, quite similar, or nothing like each other, with many shades in between. Surely, we need to quantify differences? <\/p>\n<div class=\"float-left\"><img decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/435-mediaevalSmall.jpg\" alt=\"435-mediaevalSmall.jpg\" \/><\/p>\n<p>Text difference algorithms are as <br \/>old as the hills, but not in TSQL<\/p>\n<\/div>\n<p>In IT applications, there are many occasions when one needs more of a measure of the differences between pieces of text than a simple &#8216;yes, they&#8217;re the same\/no, they are not&#8217;. A typical problem is finding duplicates in database entries where the understanding of &#8216;duplicate&#8217; allows for minor differences. Finding a reliable algorithm for quantifying similarity in text is quite hard, especially one that is set-based. 
TSQL has no native support for regular expressions, or other means of making life easier for this sort of work. <\/p>\n<p>I find this problem quite intriguing. I think that there is a general consensus that the Levenshtein string distance algorithm is the best for giving you the difference on a character-by-character basis, and I provide some code at the end for doing this. The algorithm was developed by Vladimir Levenshtein in 1965. It tells you the minimum number of edits required to turn one string into another by breaking down string transformation into three basic operations: adding, deleting, and replacing a character. Each operation is assigned a cost of 1. Leaving a character unchanged has a cost of 0. There are some other algorithms. I&#8217;m not at all convinced by &#8216;soundex&#8217; algorithms: they don&#8217;t seem to help much. <\/p>\n<p>I decided that what I wanted was a difference based on words rather than characters. I find the solution that I give below, the difference counter, pretty handy, though it sometimes gives a score for differences that I don&#8217;t like. Try it yourself with a variety of strings and you&#8217;ll see that it makes a pretty good attempt. It is, of course, slow, because of the tedium of breaking down text into words and white-space. In normal use, this is only done once, when the text is imported into the database and placed in an &#8216;inversion table&#8217;. One can then use this data to test the similarity of the original text, which is much faster. Just so as to include those stuck on SQL Server 2000, I&#8217;ve made the function use an NTEXT parameter rather than a VARCHAR(MAX), though the latter would have made for a simpler routine. <\/p>\n<p class=\"quote\">&#8220;Cleaning data is not<br \/> &#160; an exact science&#8221;<\/p>\n<p>In reality, every time one has to check for differences, the requirements are subtly different. 
Cleaning data is not an exact science. I generally prefer to ignore &#8216;white-space&#8217;, including new-lines and punctuation, when checking for differences. My approach is to break down text into words and &#8216;not-words&#8217;, or white-space. I refer to these as different types of token. The table function I give below allows you to define a word in terms of the characters that make it up, because these characters differ between languages. The routine is generally, though not always, much faster if one uses a &#8216;number table&#8217;, but I decided that creating one for this article would be a distraction for the reader.<\/p>\n<p>With the use of the &#8216;parsing&#8217; table-function, I then do a simple full outer join between the two collections of words, and count the number of times that the minimum &#8216;best-fit&#8217; between words changes in the sequence of words. This is, of course, an approximation: I should be using sliders and other devices that use iteration. At some point one has to hand over to the computer scientists. I tend to stop at the point where the routine does the job I want. <\/p>\n<p>As a flourish, I&#8217;ve provided, at the end, a variation of the function that returns a single-row table giving the first point at which the two samples of text diverge. It is really just a by-product of the first routine, but I slipped it in to give a suggestion of the many ways the routine can be adapted for particular purposes. It is surprisingly handy for applications such as summary reports of the latest changes made to stored procedures! <\/p>\n<p>First, the &#8216;classic&#8217; Levenshtein string distance in TSQL (using strings instead of arrays):<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">\tcreate FUNCTION Levenshtein_Distance(@Source nvarchar(4000), @Target nvarchar(4000))\n\tRETURNS int\n\tAS\n\t\/*\n\tThe Levenshtein string distance algorithm was developed by Vladimir Levenshtein in 1965. 
It tells you the number of edits required to turn one string into another by breaking down string transformation into three basic operations: adding, deleting, and replacing a character. Each operation is assigned a cost of 1. Leaving a character unchanged has a cost of 0.\n\tThis is a translation of 'Fast, memory efficient Levenshtein algorithm' By Sten Hjelmqvist, originally converted to SQL by Arnold Fribble\n\thttp:\/\/www.codeproject.com\/Articles\/13525\/Fast-memory-efficient-Levenshtein-algorithm\n\t*\/\n\tBEGIN\n\t&#160; Declare&#160; @MaxDistance int\n\t&#160; Select @MaxDistance=200\n\t&#160; DECLARE @SourceStringLength int, @TargetStringLength int, @ii int, @jj int, @SourceCharacter nchar, @Cost int, @Cost1 int,\n\t&#160;&#160;&#160;&#160;&#160; -- create two work vectors of integer distances, one character per distance;\n\t&#160;&#160;&#160;&#160;&#160; -- they must be able to hold one entry for every character of @Target\n\t&#160;&#160;&#160; @Current_Row nvarchar(4000), @Previous_Row nvarchar(4000), @Min_Cost int\n\t&#160; SELECT @SourceStringLength = LEN(@Source), \n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; @TargetStringLength = LEN(@Target), \n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; @Previous_Row = '', \n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; @jj = 1, @ii = 1, \n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; @Cost = 0\n\t&#160;&#160;&#160; -- do the degenerate cases\n\t&#160;&#160;&#160; if @Source = @Target return (@Cost);\n\t&#160;&#160;&#160; if @SourceStringLength= 0 return @TargetStringLength;\n\t&#160;&#160;&#160; if @TargetStringLength= 0 return @SourceStringLength;\n\t&#160;&#160;&#160; \n\t&#160;&#160;&#160; -- initialize the previous row of distances\n\t&#160;&#160;&#160; -- this row is edit distance for an empty source string\n\t&#160; &#160;&#160;-- the distance is just the number of characters to delete from the target\n\t&#160; WHILE @jj &lt;= @TargetStringLength\n\t&#160;&#160;&#160; SELECT @Previous_Row = @Previous_Row + NCHAR(@jj), @jj = @jj + 1\n\t\n\t&#160; WHILE @ii &lt;= @SourceStringLength\n\t&#160; 
BEGIN\n\t&#160;&#160;&#160; SELECT @SourceCharacter = SUBSTRING(@Source, @ii, 1),\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; @Cost1 = @ii, \n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; @Cost = @ii, \n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; @Current_Row = '', \n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; @jj = 1, \n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; @Min_Cost = 4000\n\t&#160;&#160;&#160; WHILE @jj &lt;= @TargetStringLength\n\t&#160;&#160;&#160; BEGIN&#160; -- use formula to fill in the rest of the row\n\t&#160;&#160;&#160;&#160;&#160; SET @Cost = @Cost + 1\n\t&#160;&#160;&#160;&#160;&#160; --v1[j + 1] = Minimum(v1[j] + 1, v0[j + 1] + 1, v0[j] + cost);\n\t&#160;&#160;&#160;&#160;&#160; SET @Cost1 = @Cost1 - CASE WHEN @SourceCharacter = SUBSTRING(@Target, @jj, 1) THEN 1 ELSE 0 END\n\t&#160;&#160;&#160;&#160;&#160; IF @Cost &gt; @Cost1 SET @Cost = @Cost1\n\t&#160;&#160;&#160;&#160;&#160; SET @Cost1 = UNICODE(SUBSTRING(@Previous_Row, @jj, 1)) + 1\n\t&#160;&#160;&#160;&#160;&#160; IF @Cost &gt; @Cost1 SET @Cost = @Cost1\n\t&#160;&#160;&#160;&#160;&#160; IF @Cost &lt; @Min_Cost SET @Min_Cost = @Cost\n\t&#160;&#160;&#160;&#160;&#160; SELECT @Current_Row = @Current_Row + NCHAR(@Cost), @jj = @jj + 1\n\t&#160;&#160;&#160; END\n\t&#160;&#160;&#160; IF @Min_Cost &gt; @MaxDistance return -1\n\t&#160;&#160;&#160; -- copy current row to previous row for next iteration\n\t&#160;&#160;&#160; SELECT @Previous_Row = @Current_Row, @ii = @ii + 1\n\t&#160; END\n\t&#160; RETURN&#160; @Cost \n\tEND\n\tGO\n\t<\/pre>\n<p>&#8230;and now the equivalent system for detecting word differences<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">IF&#160;OBJECT_ID(N'dbo.uftWordTokens')&#160;IS&#160;NOT&#160;NULL&#160; \n&#160;&#160;DROP&#160;FUNCTION&#160;dbo.uftWordTokens \nGO \n\n\/*------------------------------------------------------------*\/ \nCREATE&#160;FUNCTION&#160;[dbo].[uftWordTokens] 
\n&#160;&#160;( \n&#160;&#160;&#160;&#160;@string&#160;NTEXT, \n&#160;&#160;&#160;&#160;@WordStartCharacters&#160;VARCHAR(255)&#160;=&#160;'a-z', \n&#160;&#160;&#160;&#160;@WordCharacters&#160;VARCHAR(255)&#160;=&#160;'-a-z''' \n&#160;&#160;) \nRETURNS&#160;@Results&#160;TABLE \n&#160;&#160;( \n&#160;&#160;&#160;&#160;SeqNo&#160;INT&#160;IDENTITY(1,&#160;1), \n&#160;&#160;&#160;&#160;Item&#160;VARCHAR(255), \n&#160;&#160;&#160;&#160;TokenType&#160;INT \n&#160;&#160;) \nAS&#160;\/* \nThis&#160;table&#160;function&#160;produces&#160;a&#160;table&#160;which&#160;divides&#160;up&#160;the&#160;words&#160;and&#160; \nthe&#160;spaces&#160;between&#160;the&#160;words&#160;in&#160;some&#160;text&#160;and&#160;produces&#160;a&#160;table&#160;of&#160;the \ntwo&#160;types&#160;of&#160;token&#160;in&#160;the&#160;sequence&#160;in&#160;which&#160;they&#160;were&#160;found \n*\/ \n&#160;&#160;&#160;BEGIN \n&#160;&#160;&#160;&#160;DECLARE&#160;@Pos&#160;INT,&#160;&#160;&#160;&#160;--index&#160;of&#160;current&#160;search \n&#160;&#160;&#160;&#160;&#160;&#160;@WhereWeAre&#160;INT,--index&#160;into&#160;string&#160;so&#160;far \n&#160;&#160;&#160;&#160;&#160;&#160;@ii&#160;INT,&#160;&#160;&#160;&#160;--the&#160;number&#160;of&#160;words&#160;found&#160;so&#160;far \n&#160;&#160;&#160;&#160;&#160;&#160;@next&#160;INT,&#160;&#160;--where&#160;the&#160;next&#160;search&#160;starts&#160; \n&#160;&#160;&#160;&#160;&#160;&#160;@size&#160;INT&#160;&#160;&#160;--the&#160;total&#160;size&#160;of&#160;the&#160;text \n\n&#160;&#160;&#160;&#160;SELECT&#160;&#160;@ii&#160;=&#160;0,&#160;@WhereWeAre&#160;=&#160;1,&#160;@size&#160;=&#160;DATALENGTH(@string) \n\n\n&#160;&#160;&#160;&#160;WHILE&#160;@Size&#160;&gt;=&#160;@WhereWeAre \n&#160;&#160;&#160;&#160;&#160;&#160;BEGIN \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;SELECT&#160;&#160;@pos&#160;=&#160;PATINDEX('%['&#160;+&#160;@wordStartCharacters&#160;+&#160;']%', 
\n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;SUBSTRING(@string,&#160;@whereWeAre,&#160;4000)) \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;IF&#160;@pos&#160;&gt;&#160;0&#160; \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;BEGIN \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;IF&#160;@pos&#160;&gt;&#160;1&#160; \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;INSERT&#160;&#160;INTO&#160;@Results \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;(&#160;item,&#160;tokentype&#160;) \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;SELECT&#160;&#160;SUBSTRING(@String,&#160;@whereWeAre,&#160;@pos&#160;-&#160;1),&#160;2 \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;SELECT&#160;&#160;@next&#160;=&#160;@WhereWeAre&#160;+&#160;@pos,&#160;@ii&#160;=&#160;@ii&#160;+&#160;1 \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;SELECT&#160;&#160;@pos&#160;=&#160;PATINDEX('%[^'&#160;+&#160;@wordCharacters&#160;+&#160;']%', \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;SUBSTRING(@string,&#160;@next,&#160;4000)&#160;+&#160;'&#160;') \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;INSERT&#160;&#160;INTO&#160;@Results \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;(&#160;item,&#160;tokentype&#160;) 
\n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;SELECT&#160;&#160;SUBSTRING(@String,&#160;@next&#160;-&#160;1,&#160;@pos),&#160;1 \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;SELECT&#160;&#160;@WhereWeAre&#160;=&#160;@next&#160;+&#160;@pos&#160;-&#160;1 \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;END \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;ELSE&#160; \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;BEGIN \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;IF&#160;LEN(REPLACE( \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;SUBSTRING(@String,&#160;@whereWeAre,&#160;4000),&#160;'&#160;',&#160;'!' \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;))&#160;&gt;&#160;0&#160; \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;INSERT&#160;&#160;INTO&#160;@Results \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;(&#160;item,&#160;tokentype&#160;) \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;SELECT&#160;&#160;SUBSTRING(@String,&#160;@whereWeAre,&#160;4000),&#160;2 \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;SELECT&#160;&#160;@whereWeAre&#160;=&#160;@WhereWeAre&#160;+&#160;4000 \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;END \n&#160;&#160;&#160;&#160;&#160;&#160;END \n&#160;&#160;&#160;&#160;RETURN \n&#160;&#160;&#160;END \n\n\/*&#160;Tests: \nSELECT&#160;&#160;'['&#160;+&#160;item&#160;+&#160;']',&#160;tokentype \nFROM&#160;&#160;&#160;&#160;dbo.uftWordTokens('This&#160;&#160;&#160;&#160;&#160;has \n&#160;been&#160;relentlessly&#160; 
\n,^----tested',&#160;DEFAULT,&#160;DEFAULT)&#160;&#160;&#160;&#160;&#160;&#160;&#160; \nSELECT&#160;&#160;'['&#160;+&#160;item&#160;+&#160;']',&#160;tokentype \nFROM&#160;&#160;&#160;&#160;dbo.uftWordTokens('This&#160;has&#160;been&#160;relentlessly&#160;tested&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;!', \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;DEFAULT,&#160;DEFAULT)&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; \nSELECT&#160;&#160;item,&#160;tokentype \nFROM&#160;&#160;&#160;&#160;dbo.uftWordTokens('This&#160;has&#160;been',&#160;DEFAULT,&#160;DEFAULT)&#160;&#160;&#160; \nSELECT&#160;&#160;'['&#160;+&#160;item&#160;+&#160;']',&#160;tokentype \nFROM&#160;&#160;&#160;&#160;dbo.uftWordTokens('&#160;&lt;!--&#160;23&#160;343.43&#160;&#160;&lt;div&gt;Hello&#160;there&#160;&#160;....&#160;--&gt;', \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;DEFAULT,&#160;DEFAULT) \n\n*\/ \nGO \n\nIF&#160;OBJECT_ID(N'dbo.ufnDifferencesInText')&#160;IS&#160;NOT&#160;NULL&#160; \n&#160;&#160;DROP&#160;FUNCTION&#160;dbo.ufnDifferencesInText \nGO \n\/*------------------------------------------------------------*\/ \nCREATE&#160;FUNCTION&#160;dbo.ufnDifferencesInText \n&#160;&#160;( \n&#160;&#160;&#160;&#160;@Sample&#160;NTEXT, \n&#160;&#160;&#160;&#160;@comparison&#160;NTEXT \n&#160;&#160;) \nRETURNS&#160;INT \nAS&#160;BEGIN \n&#160;&#160;&#160;&#160;DECLARE&#160;@results&#160;TABLE \n&#160;&#160;&#160;&#160;&#160;&#160;( \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;token_ID&#160;INT&#160;IDENTITY(1,&#160;1), \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;sequenceNumber&#160;INT, \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;Sample_ID&#160;INT, \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;Item&#160;VARCHAR(255), 
\n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;TokenType&#160;INT \n&#160;&#160;&#160;&#160;&#160;&#160;) \n\/* \nThis&#160;function&#160;returns&#160;the&#160;number&#160;of&#160;differences&#160;it&#160;found&#160;between&#160;two&#160;pieces \nof&#160;text \n*\/ \n&#160;&#160;&#160;&#160;INSERT&#160;&#160;INTO&#160;@results \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;(&#160;SequenceNumber,&#160;Sample_ID,&#160;Item,&#160;Tokentype&#160;) \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;SELECT&#160;&#160;seqno,&#160;1,&#160;item,&#160;tokentype \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;FROM&#160;&#160;&#160;&#160;dbo.uftWordTokens(@sample,&#160;DEFAULT,&#160;DEFAULT) \n\n&#160;&#160;&#160;&#160;INSERT&#160;&#160;INTO&#160;@results \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;(&#160;SequenceNumber,&#160;Sample_ID,&#160;Item,&#160;Tokentype&#160;) \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;SELECT&#160;&#160;seqno,&#160;2,&#160;item,&#160;tokentype \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;FROM&#160;&#160;&#160;&#160;dbo.uftWordTokens(@comparison,&#160;DEFAULT,&#160;DEFAULT) \n&#160;&#160;&#160;&#160;DECLARE&#160;@closestMatch&#160;TABLE \n&#160;&#160;&#160;&#160;&#160;&#160;( \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;sequenceNumber&#160;INT, \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;skew&#160;INT \n&#160;&#160;&#160;&#160;&#160;&#160;) \n&#160;&#160;&#160;&#160;INSERT&#160;&#160;INTO&#160;@closestMatch \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;(&#160;sequencenumber,&#160;skew&#160;) \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;SELECT&#160;&#160;COALESCE(a.sequencenumber,&#160;b.sequencenumber), 
\n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;COALESCE(MIN(ABS(COALESCE(b.sequenceNumber,&#160;1000)&#160; \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;-&#160;COALESCE(a.sequencenumber,&#160;1000))), \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;-1) \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;FROM&#160;&#160;&#160;&#160;(&#160;SELECT&#160;&#160;* \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;FROM&#160;&#160;&#160;&#160;@results \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;WHERE&#160;&#160;&#160;sample_ID&#160;=&#160;1&#160;AND&#160;tokentype&#160;=&#160;1 \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;)&#160;a&#160;FULL&#160;OUTER&#160;JOIN&#160;(&#160;SELECT&#160;&#160;* \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;FROM&#160;&#160;&#160;&#160;@results \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;WHERE&#160;&#160;&#160;sample_ID&#160;=&#160;2&#160; 
\n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;AND&#160;tokentype&#160;=&#160;1 \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;)&#160;b&#160;ON&#160;a.item&#160;=&#160;b.item \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;GROUP&#160;BY&#160;COALESCE(a.sequencenumber,&#160;b.sequencenumber) \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;ORDER&#160;BY&#160;COALESCE(a.sequencenumber,&#160;b.sequencenumber) \n\n\n\n&#160;&#160;&#160;&#160;RETURN&#160;(&#160;SELECT&#160;SUM(CASE&#160;WHEN&#160;a.skew&#160;-&#160;b.skew&#160;=&#160;0&#160;THEN&#160;0 \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;ELSE&#160;1 \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;END) \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;FROM&#160;&#160;&#160;@closestmatch&#160;a&#160;INNER&#160;JOIN&#160;@closestMatch&#160;b&#160; \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;ON&#160;b.sequenceNumber&#160;=&#160;a.sequenceNumber&#160;+&#160;2 \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;) \n&#160;&#160;&#160;END \nGO \n\nSELECT&#160;&#160;dbo.ufnDifferencesInText('I&#160;am&#160;a&#160;piece&#160;of&#160;text', 
\n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;'I&#160;am&#160;a&#160;piece&#160;of&#160;text') \n--0 \nSELECT&#160;&#160;dbo.ufnDifferencesInText('I&#160;am&#160;a&#160;piece&#160;of&#160;text', \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;'I&#160;am&#160;not&#160;a&#160;piece&#160;of&#160;text') \n--1 \nSELECT&#160;&#160;dbo.ufnDifferencesInText('I&#160;am&#160;a&#160;piece&#160;of&#160;text', \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;'I&#160;am&#160;piece&#160;a&#160;a&#160;a&#160;of&#160;text') \n--2 \nSELECT&#160;&#160;dbo.ufnDifferencesInText('I&#160;&#160;piece&#160;of&#160;text',&#160; \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;'I&#160;am&#160;a&#160;piece&#160;of&#160;text') \n--1 \nSELECT&#160;&#160;dbo.ufnDifferencesInText('I&#160;&#160;am&#160;a&#160;pot&#160;of&#160;jam',&#160; \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;'I&#160;am&#160;a&#160;piece&#160;of&#160;text') \n--3 \nSELECT&#160;&#160;dbo.ufnDifferencesInText('I&#160;&#160;am&#160;a&#160;pot&#160;of&#160;jam', 
\n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;'I&#160;&#160;am&#160;a&#160;pot&#160;of&#160;jam&#160;beloved&#160;by&#160;humans') \n--3 \nSELECT&#160;&#160;dbo.ufnDifferencesInText('I&#160;am&#160;a&#160;piece&#160;of&#160;text', \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;'text&#160;of&#160;piece&#160;a&#160;am&#160;I') \n--4 \nSELECT&#160;&#160;dbo.ufnDifferencesInText('I&#160;am&#160;a&#160;piece&#160;of&#160;text', \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;'this&#160;is&#160;completely&#160;different') \n--5 \nSELECT&#160;&#160;dbo.ufnDifferencesInText('I&#160;am&#160;a&#160;piece&#160;of&#160;text',&#160;'') \n--5 \nSELECT&#160;&#160;dbo.ufnDifferencesInText('',&#160;'I&#160;am&#160;a&#160;piece&#160;of&#160;text') \n--5 \n\nSELECT&#160;&#160;dbo.ufnDifferencesInText('Call me Ishmael. Some years ago -- never mind how long precisely -- having little or no money in my purse, and nothing particular to interest me on shore, I thought I would sail about a little and see the watery part of the world. It is a way I have of driving off the spleen, and regulating the circulation. 
Whenever I find myself growing grim about the mouth; whenever it is a damp, drizzly November in my soul; whenever I find myself involuntarily pausing before coffin warehouses, and bringing up the rear of every funeral I meet; and especially whenever my hypos get such an upper hand of me, that it requires a strong moral principle to prevent me from deliberately stepping into the street, and methodically knocking people''s hats off -- then, I account it high time to get to sea as soon as I can. This is my substitute for pistol and ball. With a philosophical flourish Cato throws himself upon his sword; I quietly take to the ship. There is nothing surprising in this. If they but knew it, almost all men in their degree, some time or other, cherish very nearly the same feelings towards the ocean with me.'\n,&#160;'Call me Ishmael. Some years ago -- never mind how long precisely -- having little or no money in my purse, and nothing particular to interest me on shore, I thought I would sail about a little and see the watery part of the world. It is a way I have of driving off the spleen, and regulating the circulation. Whenever I find myself growing grim about the mouth; whenever it is a damp, drizzly November in my soul; whenever I find myself involuntarily pausing before coffin warehouses, and bringing up the rear of every funeral I meet; and especially whenever my hypos get such an upper hand of me, that it requires a strong moral principle to prevent me from deliberately stepping into the street, and methodically knocking people''s hats off -- then, I account it high time to get to sea as soon as I can. This is my substitute for pistol and ball. With a philosophical flourish Cato throws himself upon his sword; I quietly take to the ship. There is nothing surprising in this. 
If they but knew it, almost all men in their degree, some time or other, cherish very nearly the same feelings towards the ocean with me.') \n\n\n--&#160;============================================= \n--&#160;Description: A&#160;routine&#160;that&#160;returns&#160;a&#160;single-row&#160;which&#160; \n--&#160;gives&#160;the&#160;context&#160;of&#160;the&#160;first&#160;difference&#160;between&#160;two&#160; \n--&#160;strings \n--&#160;============================================= \nIF&#160;OBJECT_ID(N'dbo.uftShowFirstDifference')&#160;IS&#160;NOT&#160;NULL&#160; \n&#160;&#160;DROP&#160;FUNCTION&#160;dbo.uftShowFirstDifference \nGO \nCREATE&#160;FUNCTION&#160;uftShowFirstDifference \n&#160;&#160;( \n&#160;&#160;&#160;&#160;--&#160;Add&#160;the&#160;parameters&#160;for&#160;the&#160;function&#160;here \n&#160;&#160;&#160;&#160;@sample&#160;NTEXT, \n&#160;&#160;&#160;&#160;@comparison&#160;NTEXT \n&#160;&#160;) \nRETURNS&#160;@result&#160;TABLE \n&#160;&#160;( \n&#160;&#160;&#160;&#160;--&#160;Add&#160;the&#160;column&#160;definitions&#160;for&#160;the&#160;TABLE&#160;variable&#160;here \n&#160;&#160;&#160;&#160;first&#160;VARCHAR(2000), \n&#160;&#160;&#160;&#160;second&#160;VARCHAR(2000), \n&#160;&#160;&#160;&#160;[where]&#160;INT \n&#160;&#160;) \nAS&#160;BEGIN \n&#160;&#160;&#160;&#160;DECLARE&#160;@results&#160;TABLE \n&#160;&#160;&#160;&#160;&#160;&#160;( \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;token_ID&#160;INT&#160;IDENTITY(1,&#160;1), \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;sequenceNumber&#160;INT, \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;Sample_ID&#160;INT, \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;Item&#160;VARCHAR(255), \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;TokenType&#160;INT \n&#160;&#160;&#160;&#160;&#160;&#160;) \n&#160;&#160;&#160;&#160;INSERT&#160;&#160;INTO&#160;@results 
\n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;(&#160;SequenceNumber,&#160;Sample_ID,&#160;Item,&#160;Tokentype&#160;) \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;SELECT&#160;&#160;seqno,&#160;1,&#160;item,&#160;tokentype \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;FROM&#160;&#160;&#160;&#160;dbo.uftWordTokens(@sample,&#160;DEFAULT,&#160;DEFAULT) \n\n&#160;&#160;&#160;&#160;INSERT&#160;&#160;INTO&#160;@results \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;(&#160;SequenceNumber,&#160;Sample_ID,&#160;Item,&#160;Tokentype&#160;) \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;SELECT&#160;&#160;seqno,&#160;2,&#160;item,&#160;tokentype \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;FROM&#160;&#160;&#160;&#160;dbo.uftWordTokens(@comparison,&#160;DEFAULT,&#160;DEFAULT) \n&#160;&#160;&#160;&#160;DECLARE&#160;@closestMatch&#160;TABLE \n&#160;&#160;&#160;&#160;&#160;&#160;( \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;sequenceNumber&#160;INT, \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;skew&#160;INT \n&#160;&#160;&#160;&#160;&#160;&#160;) \n&#160;&#160;&#160;&#160;INSERT&#160;&#160;INTO&#160;@closestMatch \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;(&#160;sequencenumber,&#160;skew&#160;) \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;SELECT&#160;&#160;COALESCE(a.sequencenumber,&#160;b.sequencenumber), \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;COALESCE(MIN(ABS(COALESCE(b.sequenceNumber,&#160;1000)&#160; \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;-&#160;COALESCE(a.sequencenumber,&#160;1000))), 
\n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;-1) \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;FROM&#160;&#160;&#160;&#160;(&#160;SELECT&#160;&#160;* \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;FROM&#160;&#160;&#160;&#160;@results \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;WHERE&#160;&#160;&#160;sample_ID&#160;=&#160;1&#160;AND&#160;tokentype&#160;=&#160;1 \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;)&#160;a&#160;FULL&#160;OUTER&#160;JOIN&#160;(&#160;SELECT&#160;&#160;* \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;FROM&#160;&#160;&#160;&#160;@results \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;WHERE&#160;&#160;&#160;sample_ID&#160;=&#160;2&#160; \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;AND&#160;tokentype&#160;=&#160;1 
\n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;)&#160;b&#160;ON&#160;a.item&#160;=&#160;b.item \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;GROUP&#160;BY&#160;COALESCE(a.sequencenumber,&#160;b.sequencenumber) \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;ORDER&#160;BY&#160;COALESCE(a.sequencenumber,&#160;b.sequencenumber) \n\n\n\n&#160;&#160;&#160;&#160;DECLARE&#160;@first&#160;VARCHAR(2000) \n&#160;&#160;&#160;&#160;DECLARE&#160;@firstDifference&#160;INT \n&#160;&#160;&#160;&#160;DECLARE&#160;@second&#160;VARCHAR(2000) \n&#160;&#160;&#160;&#160;SELECT&#160;&#160;@FirstDifference&#160;=&#160;MIN(sequenceNumber) \n&#160;&#160;&#160;&#160;FROM&#160;&#160;&#160;&#160;@closestMatch \n&#160;&#160;&#160;&#160;WHERE&#160;&#160;&#160;skew&#160;&lt;&gt;&#160;0 \n&#160;&#160;&#160;&#160;SELECT&#160;&#160;@first&#160;=&#160;'',&#160;@second&#160;=&#160;'' \n&#160;&#160;&#160;&#160;SELECT&#160;TOP&#160;10 \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;@first&#160;=&#160;COALESCE(@First,&#160;'')&#160;+&#160;item \n&#160;&#160;&#160;&#160;FROM&#160;&#160;&#160;&#160;@results \n&#160;&#160;&#160;&#160;WHERE&#160;&#160;&#160;sample_ID&#160;=&#160;1&#160;AND&#160;sequenceNumber&#160;&gt;=&#160;@FirstDifference \n&#160;&#160;&#160;&#160;ORDER&#160;BY&#160;SequenceNumber \n&#160;&#160;&#160;&#160;SELECT&#160;TOP&#160;10 \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;@second&#160;=&#160;COALESCE(@second,&#160;'')&#160;+&#160;item \n&#160;&#160;&#160;&#160;FROM&#160;&#160;&#160;&#160;@results \n&#160;&#160;&#160;&#160;WHERE&#160;&#160;&#160;sample_ID&#160;=&#160;2&#160;AND&#160;sequenceNumber&#160;&gt;=&#160;@FirstDifference \n&#160;&#160;&#160;&#160;ORDER&#160;BY&#160;SequenceNumber 
\n&#160;&#160;&#160;&#160;INSERT&#160;&#160;INTO&#160;@result \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;(&#160;first,&#160;Second,&#160;[where]&#160;) \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;SELECT&#160;&#160;[first]&#160;=&#160;@First,&#160;[second]&#160;=&#160;@second, \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;[where]&#160;=&#160;@FirstDifference \n\n&#160;&#160;&#160;&#160;RETURN&#160; \n&#160;&#160;&#160;END \nGO \n\nSELECT&#160;&#160;* \nFROM&#160;&#160;&#160;&#160;dbo.uftShowFirstDifference('I&#160;am&#160;a&#160;piece&#160;of&#160;text', \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;'I&#160;am&#160;a&#160;piece&#160;of&#160;text') \n--&#160;NULL \nSELECT&#160;&#160;* \nFROM&#160;&#160;&#160;&#160;dbo.uftShowFirstDifference('I&#160;am&#160;a&#160;piece&#160;of&#160;text', \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;'I&#160;am&#160;not&#160;a&#160;piece&#160;of&#160;text') \n--a&#160;piece&#160;of&#160;text&#160;&#160;&#160;&#160;&#160;not&#160;a&#160;piece&#160;of&#160;text&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;5 \nSELECT&#160;&#160;* \nFROM&#160;&#160;&#160;&#160;dbo.uftShowFirstDifference('I&#160;am&#160;a&#160;piece&#160;of&#160;text', \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;'I&#160;am&#160;piece&#160;a&#160;a&#160;a&#160;of&#160;text') 
\n--a&#160;piece&#160;of&#160;text&#160;&#160;&#160;&#160;&#160;piece&#160;a&#160;a&#160;a&#160;of&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;5 \nSELECT&#160;&#160;* \nFROM&#160;&#160;&#160;&#160;dbo.uftShowFirstDifference('I&#160;&#160;piece&#160;of&#160;text',&#160; \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;'I&#160;am&#160;a&#160;piece&#160;of&#160;text') \n--piece&#160;of&#160;text&#160;&#160;&#160;&#160;&#160;&#160;&#160;am&#160;a&#160;piece&#160;of&#160;text&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;3 \nSELECT&#160;&#160;* \nFROM&#160;&#160;&#160;&#160;dbo.uftShowFirstDifference('I&#160;&#160;am&#160;a&#160;pot&#160;of&#160;jam', \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;'I&#160;am&#160;a&#160;piece&#160;of&#160;text') \n--pot&#160;of&#160;jam&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;piece&#160;of&#160;text&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;7 \nSELECT&#160;&#160;&#160;* \nFROM&#160;&#160;&#160;&#160;dbo.uftShowFirstDifference('I&#160;&#160;am&#160;a&#160;pot&#160;of&#160;jam', \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;'I&#160;&#160;am&#160;a&#160;pot&#160;of&#160;jam&#160;beloved&#160;by&#160;humans') \n--&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;beloved&#160;by&#160;humans&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;13 
\nSELECT&#160;&#160;* \nFROM&#160;&#160;&#160;&#160;dbo.uftShowFirstDifference('I&#160;am&#160;a&#160;piece&#160;of&#160;text', \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;'text&#160;of&#160;piece&#160;a&#160;am&#160;I') \n--I&#160;am&#160;a&#160;piece&#160;of&#160;&#160;&#160;&#160;&#160;text&#160;of&#160;piece&#160;a&#160;am&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;1 \nSELECT&#160;&#160;* \nFROM&#160;&#160;&#160;&#160;dbo.uftShowFirstDifference('I&#160;am&#160;a&#160;piece&#160;of&#160;text', \n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;'this&#160;is&#160;completely&#160;different') \n--I&#160;am&#160;a&#160;piece&#160;of&#160;&#160;&#160;&#160;&#160;this&#160;is&#160;completely&#160;different&#160;1 \nSELECT&#160;&#160;* \nFROM&#160;&#160;&#160;&#160;dbo.uftShowFirstDifference('I&#160;am&#160;a&#160;piece&#160;of&#160;text',&#160;'') \n--I&#160;am&#160;a&#160;piece&#160;of&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;1 \nSELECT&#160;&#160;* \nFROM&#160;&#160;&#160;&#160;dbo.uftShowFirstDifference('',&#160;'I&#160;am&#160;a&#160;piece&#160;of&#160;text') \n--&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;I&#160;am&#160;a&#160;piece&#160;of&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;1 \n\n<\/pre>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>In TSQL there is a limit to the way you can compare text strings. They&#8217;re either equal or not. 
Sooner or later, usually when cleaning data, something more subtle is required!&hellip;<\/p>\n","protected":false},"author":213195,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[143531],"tags":[4150,4252,4190,4821],"coauthors":[],"class_list":["post-307","post","type-post","status-publish","format-standard","hentry","category-t-sql-programming-sql-server","tag-sql","tag-t-sql-programming","tag-tsql","tag-tsql-sql-server-quantify-text-differences-transact"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/307","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/213195"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=307"}],"version-history":[{"count":5,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/307\/revisions"}],"predecessor-version":[{"id":92566,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/307\/revisions\/92566"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=307"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=307"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=307"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=307"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}