{"id":83734,"date":"2019-04-03T14:15:20","date_gmt":"2019-04-03T14:15:20","guid":{"rendered":"https:\/\/www.red-gate.com\/simple-talk\/?p=83734"},"modified":"2022-04-24T15:42:32","modified_gmt":"2022-04-24T15:42:32","slug":"sql-for-cosmos-db-tips-and-tricks","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/databases\/nosql\/sql-for-cosmos-db-tips-and-tricks\/","title":{"rendered":"Cosmos DB SQL Workarounds: LEFT, RIGHT, YEAR, MONTH, DAY, and Wildcard Search via SUBSTRING and String Functions"},"content":{"rendered":"<p><strong>The series so far:<\/strong><\/p>\n<ol>\n<li><a href=\"https:\/\/www.red-gate.com\/simple-talk\/sql\/nosql-databases\/introduction-to-sql-for-cosmos-db\/\">Introduction to SQL for Cosmos DB<\/a><\/li>\n<li><a href=\"https:\/\/www.red-gate.com\/simple-talk\/sql\/nosql-databases\/sql-for-cosmos-db-tips-and-tricks\">SQL For Cosmos DB \u2013 Tips and Tricks<\/a><\/li>\n<li><a href=\"https:\/\/www.red-gate.com\/simple-talk\/sql\/nosql-databases\/sql-for-cosmos-db-handling-complex-json-structures\/\">SQL For Cosmos DB \u2013 Handling Complex JSON Structures<\/a><\/li>\n<\/ol>\n\n<p>This article will help you extend your basic Cosmos DB SQL querying abilities. Here, you will learn a few of the tricks that you might need to overcome the limitations of the flavour of Cosmos DB SQL.<\/p>\n<h2>String Functions for Wildcard Searches<\/h2>\n<p>The Cosmos DB engine is not bad at searching through vast swathes of JSON data. If you need to use a wildcard search for a text inside a specific attribute, the <code>CONTAINS()<\/code> function is a good starting point. You use it rather like a T-SQL LIKE in a <code>WHERE<\/code> clause.<\/p>\n<p>NOTE: Load the <a href=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2019\/05\/simplecars2.zip\">SimpleCars2<\/a> data into the Cosmos DB emulator. For help with this, see the first article in this series.<\/p>\n<p>Run the examples in this article against SimpleCars2.<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">SELECT  s.CustomerName, s.InvoiceNumber\r\nFROM    s\r\nWHERE   CONTAINS(s.CustomerName, \"Wheels\")<\/pre>\n<p>This query finds the following documents in the source collection:<\/p>\n<pre class=\"lang:c# theme:vs2012\">[\r\n    {\r\n        \"CustomerName\": \"Wonderland Wheels\",\r\n        \"InvoiceNumber\": \"GBPGB011\"\r\n    },\r\n    {\r\n        \"CustomerName\": \"Wonderland Wheels\",\r\n        \"InvoiceNumber\": \"GBPGB011\"\r\n    }\r\n]<\/pre>\n<p>You need to be aware that the search text is case sensitive. What do you do to make a wildcard search case-insensitive? The answer is simply to force the attribute and the text that you are searching to be in the same case, much as you would do for case-sensitive collations in SQL Server:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">SELECT  s.CustomerName\r\nFROM    s\r\nWHERE   CONTAINS(UPPER(s.CustomerName), \"WHEELS\")<\/pre>\n<p>Alternatively, if you need to set up a Wildcard search that applies to the start or the end of the contents of an attribute, similar to <code>LIKE \u2018%sometext\u2019<\/code> in T-SQL, you can use this approach:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">SELECT  s.CustomerName\r\nFROM    s\r\nWHERE   STARTSWITH(s.CustomerName, \"Won\")\r\n        OR\r\n        ENDSWITH(s.CustomerName, \"Vehicles\")<\/pre>\n<p>This more focused wildcard search gives the following result:<\/p>\n<pre class=\"lang:c# theme:vs2012\">[\r\n    {\r\n        \"CustomerName\": \"Wonderland Wheels\"\r\n    },\r\n    {\r\n        \"CustomerName\": \"Wonderland Wheels\"\r\n    },\r\n    {\r\n        \"CustomerName\": \"Birmingham Executive Prestige Vehicles\"\r\n    }\r\n]<\/pre>\n<p>This query will still be case sensitive, but you can use the <code>UPPER()<\/code> or <code>LOWER()<\/code> functions to ensure that a case-insensitive search is applied.<\/p>\n<h2>Null Handling in Cosmos DB SQL<\/h2>\n<p>One reassuring shared feature between T-SQL and Cosmos DB SQL is that nulls, expressed as the unquoted lowercase word null in JSON, will \u2018infect\u2019 an entire calculation and return a null value if any single attribute is null. This is something that you will have to learn to handle in JSON documents.<\/p>\n<p>However, one of the joys of a semi-structured format like JSON is that no attribute is compulsory. The potential consequence is that attributes may be missing in certain documents, and a missing attribute will also return a null in a calculation. In essence, you have to guard against two different types of nulls, missing elements, and actual null values.<\/p>\n<p>To make this clearer, suppose that you have written some preventative code like this:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">SELECT  s.InvoiceNumber\r\n       ,s.Cost + s.RepairsCost \r\n       + s.PartsCost + s.TransportInCost \r\n       + (IS_NULL(s.LineItemDiscount) ? 0 : s.LineItemDiscount) \r\n           AS Costs\r\nFROM   s<\/pre>\n<p>Assuming the query works (and there is no reason that it should not) the output should be:<\/p>\n<pre class=\"lang:c# theme:vs2012\">[\r\n    {\r\n        \"InvoiceNumber\": \"GBPGB011\",\r\n        \"Costs\": 7425\r\n    },\r\n    {\r\n        \"InvoiceNumber\": \"GBPGB011\",\r\n        \"Costs\": 66400\r\n    },\r\n    {\r\n        \"InvoiceNumber\": \"GBPGB001\" ,\r\n        \"Costs\": 56425\r\n    },\r\n    {\r\n        \"InvoiceNumber\": \"GBPGB002\" ,\r\n        \"Costs\": 185650\r\n    },\r\n    {\r\n        \"InvoiceNumber\": \"GBPGB003\"\r\n\t  \r\n    },\r\n    {\r\n        \"InvoiceNumber\": \"EURDE004\"\r\n    },\r\n    {\r\n        \"InvoiceNumber\": \"EURFR005\"\r\n    }\r\n]<\/pre>\n<p>Here you are combining the Cosmos DB SQL <code>IS_NULL()<\/code> function, which detects a missing attribute, with ternary logic to imitate the T-SQL <code>ISNULL()<\/code> function. What is more, you have enclosed the <code>NULL<\/code> detection ternary logic inside parentheses like this:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">(IS_NULL(s.LineItemDiscount) ? 0 : s.LineItemDiscount)<\/pre>\n<p>This ensures that the output works as you expect. Indeed, forgetting to enclose the <code>IS_NULL()<\/code> logic inside parentheses will cause problems.<\/p>\n<p>If you run this query against the SimpleCars2 collection, you will see that this technique works flawlessly for Invoice <em>GBPGB002<\/em> where there is a <code>LineItemDiscount<\/code> attribute that is set to null in the source JSON document. Unfortunately, the total costs are missing for three invoices, numbers <em>GBPGB003<\/em>, <em>GBPGB004<\/em>, and <em>GBPGB005<\/em>. If you look at these documents, you can see that they are all missing the <code>LineItemDiscount<\/code> attribute.<\/p>\n<p>Undeterred, you try and remedy the situation using the <code>IS_DEFINED()<\/code> function, like this (and please note that the ternary logic is reversed compared to the <code>IS_NULL()<\/code> function):<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">SELECT  s.InvoiceNumber\r\n       ,s.Cost + s.RepairsCost \r\n       + s.PartsCost + s.TransportInCost \r\n       + (IS_DEFINED(s.LineItemDiscount) ? s.LineItemDiscount : 0) \r\n           AS Costs\r\nFROM   s<\/pre>\n<p>This time, the output looks like this:<\/p>\n<pre class=\"lang:c# theme:vs2012\">[\r\n    {\r\n        \"InvoiceNumber\": \"GBPGB011\",\r\n        \"Costs\": 7475\r\n    },\r\n    {\r\n        \"InvoiceNumber\": \"GBPGB011\",\r\n        \"Costs\": 66900\r\n    },\r\n    {\r\n        \"InvoiceNumber\": \"GBPGB001 ,\r\n        \"Costs\": 59125\r\n    },\r\n    {\r\n        \"InvoiceNumber\": \"GBPGB002\"\r\n    },\r\n    {\r\n        \"InvoiceNumber\": \"GBPGB003\",\r\n        \"Costs\": 16410\r\n    },\r\n    {\r\n        \"InvoiceNumber\": \"EURDE004\",\r\n        \"Costs\": 10600\r\n    },\r\n    {\r\n        \"InvoiceNumber\": \"EURFR005\",\r\n        \"Costs\": 17970\r\n    }\r\n]<\/pre>\n<p>In this case, you will see costs for invoices <em>GBPGB003<\/em>, <em>GBPGB004<\/em>, and <em>GBPGB005<\/em> where there is no attribute at all for line item discount but not for <em>GBPGB002<\/em>, the document with the actual null value.<\/p>\n<p>This example shows that there are, in fact, two types of nulls in JSON, missing attributes and attributes that contain a null. One way to ensure that both of these potential traps are handled is code like this:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">SELECT  s.InvoiceNumber\r\n       ,s.Cost + s.PartsCost + s.TransportInCost \r\n       + (IS_DEFINED(s.LineItemDiscount) \r\n          AND NOT IS_NULL(s.LineItemDiscount) ? s.LineItemDiscount : 0)\r\n           AS Costs\r\nFROM   s<\/pre>\n<p>Finally, you obtain the result that you were looking for:<\/p>\n<pre class=\"lang:c# theme:vs2012\">[\r\n    {\r\n        \"InvoiceNumber\": \"GBPGB011\",\r\n        \"Costs\": 7225\r\n    },\r\n    {\r\n        \"InvoiceNumber\": \"GBPGB011\",\r\n        \"Costs\": 66400\r\n    },\r\n    {\r\n        \"InvoiceNumber\": \"GBPGB001\",\r\n        \"Costs\": 56950\r\n    },\r\n    {\r\n        \"InvoiceNumber\": \"GBPGB002\",\r\n        \"Costs\": 180150\r\n    },\r\n    {\r\n        \"InvoiceNumber\": \"GBPGB003\",\r\n        \"Costs\": 15750\r\n    },\r\n    {\r\n        \"InvoiceNumber\": \"EURDE004\",\r\n        \"Costs\": 10100\r\n    },\r\n    {\r\n        \"InvoiceNumber\": \"EURFR005\",\r\n        \"Costs\": 16610\r\n    }\r\n]<\/pre>\n<p>Here you are using simple logic to detect, firstly, if the attribute is defined and secondly to ensure that, if it is present, it does not contain a null. This kind of approach comes close to basic null handling in T-SQL. Once you reach a certain level of proficiency in Cosmos DB, you can write your own user-defined functions in JavaScript to handle these kinds of issues.<\/p>\n<p>Given the clear limitations of Cosmos DB SQL, you will have to become inventive when writing SQL queries. You will have to write complex SQL using combinations of the available functions to achieve your goals.<\/p>\n<p>Take a look at a handful of elementary workarounds to give you some ideas.<\/p>\n<h2>SUBSTRING() for LEFT()<\/h2>\n<p>Cosmos DB SQL currently does not have an equivalent to the SQL <code>LEFT()<\/code> function. Instead, you need to use a little ingenuity and apply the <code>SUBSTRING()<\/code> function like this:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">SELECT  s.CustomerName ,s.SalePrice\r\nFROM    simplecars AS s\r\nWHERE   SUBSTRING(s.CustomerName, 0, 1) = \"M\"<\/pre>\n<p>You should obtain the following result:<\/p>\n<pre class=\"lang:c# theme:vs2012\">[\r\n    {\r\n        \"CustomerName\": \"Magic Motors\",\r\n        \"SalePrice\": 65000\r\n    }\r\n]<\/pre>\n<h2>SUBSTRING() for RIGHT()<\/h2>\n<p>Just as there is no <code>LEFT()<\/code> function, there is no <code>RIGHT()<\/code> function. However, combining the <code>SUBSTRING()<\/code> and <code>REVERSE()<\/code> functions like this can solve certain challenges:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">SELECT  s.CustomerName ,s.SalePrice\r\nFROM    simplecars AS s\r\nWHERE   SUBSTRING(REVERSE(s.CustomerName), 0, 1) = \"r\"<\/pre>\n<p>This time the result is:<\/p>\n<pre class=\"lang:c# theme:vs2012\">[\r\n    {\r\n        \"CustomerName\": \"WunderKar\",\r\n        \"SalePrice\": 11500\r\n    }\r\n]<\/pre>\n<p>Remember, that if you are looking for more than one character at the right of the string using this technique, then you will need to reverse the string that you are searching for. That is, instead of looking for \u2018red\u2019 you would have to enter \u2018der.\u2019<\/p>\n<p>If entering a search text in reverse order causes you any discomfort, then you can always add a second <code>REVERSE()<\/code> like this:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">SELECT  s.CustomerName ,s.SalePrice\r\nFROM    saleselements AS s\r\nWHERE   REVERSE(SUBSTRING(REVERSE(s.CustomerName), 0, 3)) = \"Kar\"<\/pre>\n<p>Once again, this is case-sensitive unless you extend the SQL to enforce case-insensitivity, like this:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">SELECT  s.CustomerName ,s.SalePrice\r\nFROM    saleselements AS s\r\nWHERE   SUBSTRING(REVERSE(UPPER(s.CustomerName)), 0, 1) = \"R\"<\/pre>\n<h2>Impersonate T-SQL YEAR(), MONTH(), and DAY()<\/h2>\n<p>There is no <code>YEAR()<\/code> function in Cosmos DB SQL. However, one simple alternative is to apply the <code>SUBSTRING()<\/code> function to the date string and isolate the relevant date element. To specify JSON documents for a specific year, you could write:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">SELECT s.SaleDate\r\nFROM   s\r\nWHERE  SUBSTRING(s.SaleDate, 0, 4) = \"2015\"<\/pre>\n<p>The output from this query displays the sale date for six of the seven documents in the collection.<\/p>\n<p>Extending this principle, you can isolate the month like this:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">SELECT s.SaleDate\r\nFROM   s\r\nWHERE  SUBSTRING(s.SaleDate, 5, 2) = \"02\"<\/pre>\n<p>Only two objects are returned in this case. The day can be isolated in this way:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">SELECT s.SaleDate\r\nFROM   s\r\nWHERE  SUBSTRING(s.SaleDate, 8, 2) = \"25\"<\/pre>\n<p>This filter only returns a single object.<\/p>\n<p>Although they are probably self-evident, for the sake of completeness it is worth noting that you can use:<\/p>\n<p>The operators <code>&gt;=<\/code>, <code>&lt;=<\/code>, <code>&lt;&gt;<\/code> or <code>!=<\/code><\/p>\n<p><code>IN<\/code> for a set of years, etc.<\/p>\n<p>Once you start creating more complex <code>WHERE<\/code> clauses that mix and match these elements, you will find that you can specify date and time ranges extremely easily.<\/p>\n<h2>Rounding up to nearest 1000<\/h2>\n<p>As a simple example of how to extend the Cosmos DB SQL functions, take a look at the following short code snippet that extends the <code>TRUNC()<\/code> function to round up to the nearest thousand:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">SELECT  TRUNC(ROUND(s.TotalSalePrice \/ 1000)) * 1000 AS RoundedUp\r\nFROM    s<\/pre>\n<p>Here, the output looks like this:<\/p>\n<pre class=\"lang:c# theme:vs2012\">[\r\n    {\r\n        \"RoundedUp\": 89000\r\n    },\r\n    {\r\n        \"RoundedUp\": 89000\r\n    },\r\n    {\r\n        \"RoundedUp\": 65000\r\n    },\r\n    {\r\n        \"RoundedUp\": 220000\r\n    },\r\n    {\r\n        \"RoundedUp\": 20000\r\n    },\r\n    {\r\n        \"RoundedUp\": 12000\r\n    },\r\n    {\r\n        \"RoundedUp\": 20000\r\n    }\r\n]<\/pre>\n<h2>Conclusion<\/h2>\n<p>There are many other workarounds that you can apply to overcome the limitations of the Cosmos DB implementation of SQL, and this short list could be extended to cover a range of options. However, this would be like a trip back to SQL Server circa 1994, and even this would not cover all the challenges that you are likely to face. Suffice it to say that you are likely to require some ingenuity and write extremely convoluted SQL to do something that would be simple in T-SQL or to consider other solutions.<\/p>\n<p>Cosmos DB has a short answer to this, and that it to write your own user-defined functions in JavaScript. This is a rich and immensely powerful language, and it can be fairly easy to extend Cosmos DB SQL with a plethora of functions that answer your specific requirements. However, the purpose of this article is to look at the Cosmos DB flavor of SQL, and so learning how to add extensions in a language that you may not be familiar with will have to wait for another day.<\/p>\n<p>The next article in the series will cover how to query more complex JSON documents.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Practical workarounds for common T-SQL idioms in Azure Cosmos DB SQL API &#8211; wildcard searches using CONTAINS and STARTSWITH, null handling, SUBSTRING-based equivalents for LEFT and RIGHT, string extraction for YEAR\/MONTH\/DAY, and rounding patterns. For developers migrating from SQL Server to Cosmos DB.&hellip;<\/p>\n","protected":false},"author":2181,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[137091,143535],"tags":[],"coauthors":[12351],"class_list":["post-83734","post","type-post","status-publish","format-standard","hentry","category-azure","category-nosql"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/83734","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/2181"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=83734"}],"version-history":[{"count":4,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/83734\/revisions"}],"predecessor-version":[{"id":84225,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/83734\/revisions\/84225"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=83734"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=83734"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=83734"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=83734"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}