{"id":94805,"date":"2022-09-13T18:50:45","date_gmt":"2022-09-13T18:50:45","guid":{"rendered":"https:\/\/www.red-gate.com\/simple-talk\/?p=94805"},"modified":"2022-09-13T18:51:45","modified_gmt":"2022-09-13T18:51:45","slug":"t-sql-tuesday-154-sql-server-2022-is-distinct-from","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/blogs\/t-sql-tuesday-154-sql-server-2022-is-distinct-from\/","title":{"rendered":"T-SQL Tuesday #154 \u2013 SQL Server 2022, IS DISTINCT FROM"},"content":{"rendered":"<p><img loading=\"lazy\" decoding=\"async\" width=\"250\" height=\"250\" class=\"wp-image-94806 aligncenter\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2022\/09\/word-image-1.png\" \/> Despite my recent job change from full time T-SQL code jockey\/architect to website editor for Simple-Talk, I will always be at heart at T-SQL programmer. While I will mostly be writing professionally in the foreseeable future will in support of an article, I will also continue writing code for several databases that I use for multiple personal reason housed on a SQL Server Express server on the desk beside me.<\/p>\n<p>The Invitation: <a href=\"https:\/\/glennsqlperformance.com\/2022\/09\/05\/t-sql-tuesday-154-invitation-sql-server-2022\/\">T-SQL Tuesday #154 Invitation \u2013 SQL Server 2022 (glennsqlperformance.com)<\/a><\/p>\n<p>When I saw the topic of this T-SQL Tuesday was to write about what I have been doing with SQL Server 2022, I figured I would note the two things I have been doing recently. First, I wrote a chapter about security changes in SQL Server 2002 for a forthcoming book entitled &#8220;<a href=\"https:\/\/www.amazon.com\/SQL-Server-2022-Administration-Inside\/dp\/0137899882\">SQL Server 2022 Administration Inside Out&#8221;<\/a> for Pearson with a great group of people that should be out later this year, early next at the latest. There are a few things I found out writing that chapter that I am keen to use, one of them being the more granular <code>UNMASK<\/code> permissions for the Dynamic Data Masking feature, but since I won\u2019t be writing multi-user production code in the future, I probably won\u2019t be masking any data, much less need granular masking capabilities.<\/p>\n<p>The other thing I have been doing is trying out some of the new features coming in SQL Server 2022 that I will absolutely be using even in my hobby databases. There are tons of new features in 2022, as there always is. But the ones that excite me are the T-SQL improvements. In this article I am going to highlight 1 feature that is immediately a standout.<\/p>\n<p><strong>IS (NOT) DISTINCT FROM<\/strong><\/p>\n<p>The <em><strong>MOST<\/strong> <\/em>exciting change from a T-SQL standpoint is: <a href=\"https:\/\/docs.microsoft.com\/en-us\/sql\/t-sql\/queries\/is-distinct-from-transact-sql?view=sql-server-ver16\"><code>IS NOT DISTINCT FROM<\/code><\/a>. This feature solves an age-old issue for T-SQL programmers and is worth its weight in gold. It is basically an equals comparison operator like =, but treats <code>NULL<\/code> as an individual value. Unlike =, this new operator returns only <code>TRUE<\/code> or <code>FALSE<\/code>, but not <code>UNKNOWN<\/code>. Writing queries that compare to values that can contain <code>NULL<\/code> is tedious, mostly because of code like the following:<\/p>\n<pre class=\"lang:none theme:none\">SELECT CASE WHEN 1 = NULL THEN 'True' Else 'False' end,\r\n       CASE WHEN NOT(1 = NULL) THEN 'True' Else 'False' end<\/pre>\n<p>The fact that both of these comparisons return False is confusing at times even to me, and I have written on the whole <code>NULL<\/code> comparison and negating <code>NULL<\/code> values things about as many times as I have dealt with it in production code. But using <code>IS DISTINCT FROM<\/code>, this is no longer the case:<\/p>\n<pre class=\"lang:none theme:none\">SELECT CASE WHEN 1 IS DISTINCT FROM NULL \r\n               THEN 'True' Else 'False' end,\r\n       CASE WHEN NOT 1 IS DISTINCT FROM NULL\r\n               THEN 'True' Else 'False' end<\/pre>\n<p>Where this is really important is doing a query where you are looking for differences between two sets of data (often for a merge type operation). So consider the following table (from <code>WideWorldImporters<\/code>, which you can get <a href=\"https:\/\/docs.microsoft.com\/en-us\/sql\/samples\/wide-world-importers-what-is?view=sql-server-ver16\">here<\/a>) :<\/p>\n<pre class=\"lang:none theme:none\">SELECT COUNT(*), \r\n       SUM(CASE WHEN LatestRecordedPopulation IS NULL \r\n           THEN 1 ELSE 0 END)\r\nFROM Application.Cities;<\/pre>\n<p>This returns 37940 total rows and 11048 rows with a NULL population value. Now, let&#8217;s join the table to itself on the PK and the population value<\/p>\n<pre class=\"lang:none theme:none\">SELECT *\r\nFROM   Application.Cities\r\n          JOIN Application.Cities AS C2\r\n            ON C2.CityID = Cities.CityID\r\nWHERE.LatestRecordedPopulation \r\n                         = Cities.LatestRecordedPopulation;<\/pre>\n<p>This returns 26892 rows, which you can do the math, is 37940-11048. Looking at this, without thinking about <code>NULL<\/code> values (who does initially?), this has to return every row in the table. But clearly not. Usually this becomes obvious when a few customers living in one of those cities isn\u2019t showing up on a report (or maybe even not getting their shipments.)<\/p>\n<p>The pre-SQL Server 2022 way of handling this properly this was to do something like this:<\/p>\n<pre class=\"lang:none theme:none \">SELECT *\r\nFROM   Application.Cities\r\n          JOIN Application.Cities AS C2\r\n               ON C2.CityID = Cities.CityID\r\n                  AND C2.LatestRecordedPopulation = \r\n                                      Cities.LatestRecordedPopulation \r\n                    OR (C2.LatestRecordedPopulation IS NULL\r\n                        AND Cities.LatestRecordedPopulation IS NULL);<\/pre>\n<p>Now we have checked the either they are the same value, or they both have a value of <code>NULL<\/code>. This query returns every row in the table, but it is kind of tricky code. And looking for differences is even more difficult, because you have to check to see if the values are different, if column 1 is null and column2 is not, and again vice versa. Another way this is often done is to change the population comparison to<\/p>\n<pre class=\"lang:none theme:none\">AND COALESCE(C2.LatestRecordedPopulation.-100) = \r\n             COALESCE(Cities.LatestRecordedPopulation,-100)<\/pre>\n<p>Which is safe from a correctness standpoint (assuming you can coalesce your values to something that is 100% not possible), but not from a performance one. This eliminates index seek utilization for these columns and makes it slower. That isn\u2019t always an issue, but for larger data sets, you may end up with more scans than you hoped.<\/p>\n<p>Using the new syntax, we can simply write this as:<\/p>\n<pre class=\"lang:none theme:none\">SELECT *\r\nFROM   Application.Cities\r\n         JOIN Application.Cities AS C2\r\n           ON C2.CityID = Cities.CityID\r\n              AND C2.LatestRecordedPopulation \r\n                    IS NOT DISTINCT FROM \r\n                           Cities.LatestRecordedPopulation;<\/pre>\n<p>The name of the operator might be a little bit confusing because of the words <code>FROM<\/code> and <code>DISTINCT<\/code>, it really makes sense. <code>DISTINCT<\/code> has a seemingly different usage here, but really it is the same meaning. If the value is the same, it is not <em>distinct<\/em> from one another, and if it is different, it is distinct. And the <code>DISTINCT<\/code> operator in the <code>SELECT<\/code> clause honors <code>NULL<\/code> values as a single bucket too. Now, go back and change the previous query to<code> IS DISTINCT FROM<\/code> and 0 rows will be returned.<\/p>\n<p>This feature would have saved me many many hours over the years! Is this alone a reason to upgrade to SQL Server 2022 alone? Since I have never been the one to write those checks, and I use the free Express edition for my hobby databases\u2026 I can say an unqualified \u201cYes\u201d to that!<\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Despite my recent job change from full time T-SQL code jockey\/architect to website editor for Simple-Talk, I will always be at heart at T-SQL programmer. While I will mostly be writing professionally in the foreseeable future will in support of an article, I will also continue writing code for several databases that I use for&#8230;&hellip;<\/p>\n","protected":false},"author":56085,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[2],"tags":[],"coauthors":[19684],"class_list":["post-94805","post","type-post","status-publish","format-standard","hentry","category-blogs"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/94805","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/56085"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=94805"}],"version-history":[{"count":8,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/94805\/revisions"}],"predecessor-version":[{"id":94821,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/94805\/revisions\/94821"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=94805"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=94805"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=94805"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=94805"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}