{"id":2143,"date":"2016-01-13T00:00:00","date_gmt":"2016-01-13T00:00:00","guid":{"rendered":"https:\/\/test.simple-talk.com\/uncategorized\/declarative-sql-using-check-default\/"},"modified":"2021-09-29T16:21:22","modified_gmt":"2021-09-29T16:21:22","slug":"declarative-sql-using-check-default","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/databases\/sql-server\/t-sql-programming-sql-server\/declarative-sql-using-check-default\/","title":{"rendered":"Declarative SQL: Using CHECK() &amp; DEFAULT"},"content":{"rendered":"<div id=\"pretty\">\n<p>In this series, there are &#8230; <\/p>\n<ul>\n<li> 1.&#160;<a href=\"https:\/\/www.simple-talk.com\/sql\/t-sql-programming\/declarative-sql-using-unique-constraints-\/\">Declarative SQL: Using UNIQUE Constraints<\/a><\/li>\n<li>2. D<a id=\"ctl00_ctl00_MainContent_MainContent_AllArticles1_repArticles_ctl01_lnkTitle\" href=\"https:\/\/www.simple-talk.com\/sql\/t-sql-programming\/declarative-sql-using-references\/\">eclarative SQL: Using References<\/a><\/li>\n<li> 3.&#160;<a id=\"ctl00_ctl00_MainContent_MainContent_AllArticles1_repArticles_ctl00_lnkTitle\" href=\"https:\/\/www.simple-talk.com\/sql\/t-sql-programming\/declarative-sql-using-check()--default-\/\">Declarative SQL: Using CHECK() &amp; DEFAULT<\/a><\/li>\n<\/ul>\n<p class=\"start\"> SQL signaled a major leap from a file system to a relational databases system. A file is basically passive; it just holds the data. SQL, in contrast, allows you to specify, define it, and protect its integrity. While the choice of data type for a column is an important part of this, we will, in this article, be more interested in the <code>CHECK  ()<\/code> and <code>DEFAULT <\/code>options because  they are important for ensuring that your data is correct.  <\/p>\n<p> Traditionally, file descriptions are in the application programs,  such as the DATA DIVISION in COBOL, the FORMAT statement in FORTRAN, <code>fopen()<\/code> in C, or the Pascal &#8216;<code>TYPE  &lt;file name&gt;: FILE OF &lt;record descriptor&gt;<\/code>&#8216; declaration.  <\/p>\n<p> But the Data Definition Language (DDL) in SQL is a <i>separate,  active sub-language<\/i>. It is the DDL that defines the data types, not the host  program; the DDL applies <code>CHECK<\/code>  constraints, not the host program; the DDL maintains certain relationships, not  the host program. <\/p>\n<h2> CHECK()<\/h2>\n<p> We probably all underuse the <code> CHECK()<\/code> constraints. Part of the problem is cultural: They do not exist in  other programming languages. People generally rely on &#8216;transfer of training&#8217;  when picking up new skills, and it is more difficult to assimilate things that  they have never come across; but a more subtle part of the problem is those SQL <code>NULL<\/code>s! Our language has three logical  values &#8211; &#8211; <i>True<\/i>, <i>False<\/i>, and <i>Unknown<\/i>. The <i>Unknown<\/i> value shows up when you try to  use a predicate with a <code>NULL<\/code> value.  One of the principles of SQL is that <code>NULL<\/code>s  propagate. This is why good SQL avoids <code> NULL<\/code>-able columns; the old heuristic was that fewer than 5% of the columns  in a schema should be <code>NULL<\/code>-able.  <\/p>\n<p> When you do a Data-manipulation Language (DML) statement, (<code>INSERT<\/code>, <code>MERGE<\/code>, <code>UPDATE<\/code>, <code>DELETE<\/code> and <code>SELECT<\/code>), an <i>Unknown <\/i> logical result from a predicate is treated as a <i>False<\/i>. But an <i>Unknown<\/i> result in a DDL  <code>CHECK () <\/code>clause is treated as a <i>True<\/i>. We call  this the &#8216;benefit of the doubt&#8217; option in SQL. Look at this skeleton code to see  what I mean: <\/p>\n<pre class=\"theme:ssms2012 lang:tsql highlight:0 decode:true\">CREATE TABLE Test\n(test_id CHAR(3) NOT NULL PRIMARY KEY,\nnull_val INTEGER \n&#160;CONSTRAINT oversized_value \n&#160;CHECK (null_val &lt; 10));\n\t<\/pre>\n<p> Now try these test insertion statements.   <\/p>\n<pre>INSERT INTO Test VALUES ('A', 1); -- no problems\nINSERT INTO Test VALUES ('B', 11); -- caught by constraint\nINSERT INTO Test VALUES ('C', NULL); -- benefit of the doubt &#160;\n\t<\/pre>\n<p> If you want to play with a bit, try these simple queries.  \t<\/p>\n<pre>\t\tSELECT * FROM Test WHERE null_val &lt; 10;\nSELECT * FROM Test WHERE null_val &gt; 10;\nSELECT * FROM Test WHERE null_val = 10;\n\t<\/pre>\n<p> The simplest <code>CHECK<\/code>()  constraints on the column use basic predicates. They establish such rules as  minimums and\/or maximums for numeric values or allowed values for a column. They  use a <code>LIKE<\/code> predicate to define a  regular expression for strings. They can include expressions, but you would, in  general, prefer to keep these fairly simple. This information is passed on the  optimizer along with statistics.  <\/p>\n<p> The most overlooked constraint for numeric data is one that will  prevent negative or zero values in a column.  <\/p>\n<pre>CHECK (order_qty &gt; 0)\nCHECK (item_cnt &gt;= 0)\n<\/pre>\n<p> There is a horror story about a simple order entry system that  allowed negative quantities to be ordered. When it would do the price  extensions, you got a negative total and turn that into a refund to the  customer. It did not take too long for less than honest people to figure out how  to game the system.<\/p>\n<p> Another overlooked constraint is for strings. You can enforce upper  and lower case conventions with the <code>UPPER<\/code>()  or <code>LOWER<\/code>() functions. Likewise, even  simple regular expressions can save a lot of grief. <\/p>\n<pre>CHECK (xyz_code = UPPER (xyz_code))\nCHECK (zip_code LIKE '[0-9][0-9][0-9][0-9][0-9]'\n<\/pre>\n<p> The ISO convention is to use only unaccented uppercase ASCII Latin  letters, digits and limited set of punctuation marks for standardized encoding  schemes, so this simple constraint saves you a lot of programming. This subset  of ASCII is part of all the Unicode character sets for languages, so that  everyone on earth can write ISO encoding strings in their native languages.  <\/p>\n<p> The <code>IN<\/code>() predicate is  useful for limiting the column to a fixed list of constant values. Think of it  as a &#8220;local table lookup&#8221; in DDL. But the question becomes when do you use the <code>IN<\/code>() predicate or the <code>REFERENCES<\/code> clause. My rule of thumb  is that if the list of possible values is short and static, then use the <code>IN<\/code>() predicate in a <code>CHECK<\/code>() constraint. If the list of  possible values is long or is dynamic, then it is better to reference an actual  lookup table. The definition of &#8220;long&#8221; and &#8220;static&#8221; are fuzzy; remember, this is  a heuristic, not mathematics. SQL Server supports several thousand values in the <code>IN<\/code>() predicate.  <\/p>\n<p> Some SQL products have special optimizations for long lists; they  sort the list and use a binary search, put it in a hash table, or self-organize  it. Do not depend on this in SQL Server. A really long list probably be should  be sorted in the code from most likely to least likely values.<\/p>\n<h3> Computed Checks <\/h3>\n<div class=\"indent\">\n<p><b> &#160;<\/b>Since the constraint eventually is to evaluate to a predicate, we  often forget that you can do a lot of math inside a predicate. Consider, for  example, a simple check digit like the one used in many barcode and credit card  validation applications.<\/p>\n<p> Each digit in the barcode is assigned a weight by its position in  the sequence within the barcode string. The final position holds the check  digit. If you are faced with validating the old ten digit UPC barcode where the  10th digit is the check digit then we need to use modulus 10 arithmetic:  Multiply each digit by its weight. The multiplication will automatically cast  each digit to an integer, but you could be explicit and use a <code>CAST<\/code>() function. This also adds documentation.  <\/p>\n<p> Some of these expressions will be a bit unnatural and complicated  to write without loops. However, the payoff is that it is done, one way, one  time in one place in the database, so there is no chance of an application  program forgetting to do. People too often forget that the function of the  database is to <i>maintain data integrity<\/i>, as well as to store it  <\/p>\n<pre class=\"theme:ssms2012 lang:tsql highlight:0 decode:true\">CREATE TABLE Products\n(barcode CHAR(10) NOT NULL,\n&#160;CONSTRAINT All_Numeric_Barcode \n&#160;CHECK (CAST(barcode AS INTEGER) NOT LIKE '%[^0-9]%'),\n&#160;CONSTRAINT Valid_Checkdigit\n&#160;CHECK (\n&#160;SUBSTRING (CAST(barcode AS INTEGER), 10, 1) --- check digit\n&#160;= (SUBSTRING (CAST(barcode AS INTEGER), 1, 1)\n&#160;+ 2 * SUBSTRING (CAST(barcode AS INTEGER), 2, 1)\n&#160;+ 3 * SUBSTRING (CAST(barcode AS INTEGER), 3, 1)\n&#160;+ 4 * SUBSTRING (CAST(barcode AS INTEGER), 4, 1)\n&#160;+ 5 * SUBSTRING (CAST(barcode AS INTEGER), 5, 1)\n&#160;+ 6 * SUBSTRING (CAST(barcode AS INTEGER), 6, 1)\n&#160;+ 7 * SUBSTRING (CAST(barcode AS INTEGER), 7, 1)\n&#160;+ 8 * SUBSTRING (CAST(barcode AS INTEGER), 8, 1)\n&#160;+ 9 * SUBSTRING (CAST(barcode AS INTEGER), 9, 1))\n&#160;% 10)\n);&#160;\n\t<\/pre>\n<p> See how easy it is to avoid procedural code in favor of declarative  code? <\/p>\n<p> As an exercise, try doing the Luhn algorithm  (https:\/\/en.wikipedia.org\/wiki\/Luhn_algorithm) with <code>CASE<\/code> expressions instead of the look-up table. The <code>CHECK<\/code>() constraint is a little long,  but pretty fast.<\/p>\n<\/p><\/div>\n<h3> CASE Expressions<\/h3>\n<div class=\"indent\">\n<p> You can use a CASE expression in a CHECK constraint. It becomes a  replacement for if-then-else logic procedural code. The general outline for this  idiom is:<\/p>\n<pre class=\"theme:ssms2012 lang:tsql highlight:0 decode:true\">CONSTRAINT &lt;constraint_name&gt;\nCHECK\n(CASE WHEN &lt;predicate_1&gt; THEN 'T' \n&#160;WHEN &lt;predicate_2&gt; THEN 'T'\n&#160;WHEN &lt;predicate_3&gt; THEN 'F'\n&#160;... \n&#160;ELSE NULL END = 'T');\n<\/pre>\n<p> The <code>CASE<\/code> expression  tests the <code>WHEN<\/code> clauses in the order  they are written, and stops at the first predicate that evaluates to <code>TRUE<\/code>. Therefore, you can get a  performance boost by putting the most likely test, at the front of the <code>CASE<\/code> expression. Finally, be careful  about the <code>ELSE<\/code> clause; I like to use  &#8220;<code>ELSE NULL<\/code>&#8221; explicitly so that I can  use it as a marker when I discover a need to put more test code in the  expression. <\/p>\n<\/div>\n<h3> Computed Columns <\/h3>\n<div class=\"indent\">\n<p> Computed columns are a bit like a VIEW but in it in the DDL rather  than the DML. The syntax is simply to declare the column as  <\/p>\n<pre>&lt;column_name&gt; AS (&lt;formula&gt;) [PERSISTED]<\/pre>\n<p> The optional keyword <code> PERSISTED<\/code>, says that the computation of the formula will actually be  materialized in the table. The SQL engine looks to see if any of the component  columns of the formula are changed and re-computes the value of the column anew.  Without the <code>PERSISTED<\/code> keyword, the  SQL engine performs the computation only when the column name is used in the  DML.<\/p>\n<p> Because division by zero is a common problem, you will find this  trick to be handy. The quotient of a division by zero becomes a <code>NULL<\/code>, rather than an error.<\/p>\n<pre class=\"theme:ssms2012 lang:tsql highlight:0 decode:true\">&#160;CREATE TABLE Zero_Divide_Test\n(..\n&#160;numerator INTEGER NOT NULL,\n&#160;denominator INTEGER NOT NULL,\n&#160;quotient AS (numerator\/NULLIF(denominator, 0)) \n.. );\n\t<\/pre>\n<p class=\"MsoBodyText\">I would not use a user-defined  function (UDF) in a computed column. In general, they cannot help the optimizer,  they do not port to other RDBMS, and there will be problems with indexing the  column if they are not flagged as being deterministic and precise. <\/p>\n<p class=\"MsoBodyText\">Persisted computed columns can  be indexed in SQL Server. They must be deterministic; meaning that the formula  always returns the same result when given the same inputs. There are also some  precision requirements. Precision requirements are usually not a problem; we do  not use approximate data types very often.<\/p>\n<\/div>\n<h3> Multi-Column CHECK Constraints  <\/h3>\n<div class=\"indent\">\n<p><code> CHECK<\/code>()  constraints come in two flavors. So far we have talked about the simple column  level constraint. However, a <code>CHECK<\/code>()  constraint can be a multi-column predicate. Probably the most common example of  this is with temporal values. So it has to be modeled, using the ISO open  intervals approach. This means that we have (<code>start_date,  end_date<\/code>) pairs in the tables. But it also means we need a constraint for <code>CHECK<\/code>(<code>start_date &lt;= end_date<\/code>); for each interval. <\/p>\n<p> Status attributes are related to temporal constraints and have to  be modeled with multiple columns. A status is a state of being and things exist  in time as well as space. A transition constraint says that the status can  change only in certain ways over time. For example, you have to be born before  you can die, you have to get married before you can be divorced and you can be  re-married after a divorce.<\/p>\n<p> In order to enforce a transition constraint, we need the current  state and the previous state of an entity. But we also need to limit the pairs  of valid current and previous states. Transition constraints are usually modeled  with the state transition diagram. This is made up of arrows and boxes that show  the direction of flow in the model (see diagram). <\/p>\n<p> A declarative way to enforce transition constraints is put the  state transitions into a separate table and then reference the legal  transitions. This requires that the referencing table have both the previous,  and the current state in two separate columns. Using this example, we would have  something like this:<\/p>\n<pre class=\"theme:ssms2012 lang:tsql highlight:0 decode:true\">CREATE TABLE State_Changes\n(previous_state VARCHAR(15) NOT NULL,\n&#160;current_state VARCHAR(15) NOT NULL, \nPRIMARY KEY (previous_state, current_state));\n&#160;\nINSERT INTO State_Changes \nVALUES ('Born', 'Born'), -- initial state\n&#160;('Born', 'Married'), \n&#160;('Born', 'Dead'), \n&#160;('Married', 'Divorced'), \n&#160;('Married', 'Dead'), \n&#160;('Divorced', 'Married'),\n&#160;('Divorced', 'Dead'),\n&#160;('Dead', 'Dead'); -- terminal state\n<\/pre>\n<p> <img decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/2340-clip_image002.jpg\" width=\"386\" alt=\"2340-clip_image002.jpg\" \/><\/p>\n<p class=\"MsoBodyText\"> &#160;The referencing table looks like this. <\/p>\n<pre class=\"theme:ssms2012 lang:tsql highlight:0 decode:true\">CREATE TABLE MyLife \n(..\n&#160;previous_state VARCHAR(15) DEFAULT 'Born' NOT NULL,\n&#160;current_state VARCHAR(15) DEFAULT 'Born' NOT NULL,\n&#160;FOREIGN KEY (previous_state, current_state)\n&#160;REFERENCES State_Changes (previous_state, current_state)\n&#160;ON UPDATE CASCADE, \n..);\nIt is not always realized by developers that a foreign key can be multi-column. \n&#160;\n<\/pre>\n<\/div>\n<h2> DEFAULT<\/h2>\n<p> The <code>DEFAULT<\/code> clause is  attached to a column in the table and gives a value that can be used to  construct a complete row with an <code>INSERT  INTO<\/code> statement. There are some other uses for defaults, but I will skip  those for now. <\/p>\n<p> The value is usually a constant of the same data type column to  which it belongs. The reason that I specified that the types must match, is that  a lot of programmers <\/p>\n<ul>\n<li>put the default value in quotation marks, even when they are  \t\t&#160;numeric types, <\/li>\n<li>do not use standard data formats <\/li>\n<li>oversize or under-size the value. <\/li>\n<\/ul>\n<p> \tBesides being sloppy programming, these habits require the database  to cast the default value to the proper data type, size and precision, which can  only waste resources.<\/p>\n<p> Many programmers do not know that the values list of an <code>INSERT INTO<\/code> statement can include the  word <code>DEFAULT<\/code>. If the column is <code>NULL<\/code>-able, then the &#8220;default <code>DEFAULT<\/code>&#8221; is <code>NULL<\/code>. It is probably less surprising that almost nobody knows about  the <code>DEFAULT VALUES<\/code> option. This is a  shorthand for a <code>VALUES<\/code> list made up  of nothing but defaults. Obviously, this assumes that all the columns have a  default value declared, which&#160;isn&#8217;t&#160;often the case.<\/p>\n<pre class=\"theme:ssms2012 lang:tsql highlight:0 decode:true\">INSERT INTO &lt;table name&gt; [(&lt;column_list&gt;)] \nVALUES ({ DEFAULT | NULL | &lt;expression&gt; } [,..n]) [,..n] \n&#160;| DEFAULT VALUES \n<\/pre>\n<p> The <code>DEFAULT<\/code> option does  not have to be a simple constant. It can be a system-level function, such as <code>CURRENT_TIMESTAMP<\/code> (please stop using  the old Sybase\/UNIX <code>getdate()<\/code>  function call). This can be very handy for time-stamping a row. SQL Server has a  range of such functions that, when used as <code>DEFAULT<\/code>s, are valuable for auditing changes in the data.<\/p>\n<p> But the default can also use a sequence.&#160; First, you need to use a <code>CREATE SEQUENCE<\/code> statement. This statement creates a schema level  object that is accessible to any user. It is not part of a table. It is not a  procedure that belong to one user. The idea is that a user can invoke the  sequence with a special syntax that is used wherever an integer value of the  sequence&#8217;s data type would work.<\/p>\n<p> If you want a physical model, imagine you are in the butcher store.  You walk in and pull a service ticket number from a roll of tickets on the  counter. Sequence numbers are generated outside the scope of the current  transaction, just like the tickets. The basic options are fairly simple.&#160; Let&#8217;s look at the roll of tickets (i.e. <code>CREATE SEQUENCE<\/code>). It has to be  declared with parameters that define behavior.<\/p>\n<pre class=\"theme:ssms2012 lang:tsql highlight:0 decode:true\">CREATE SEQUENCE Service_Ticket_Seq\n&#160;AS INTEGER\n&#160;START WITH 1\n&#160;INCREMENT BY 1\n&#160;MINVALUE 1\n&#160;MAXVALUE 100\n&#160;CYCLE;\n\t<\/pre>\n<p> I can put a <code>SEQUENCE<\/code> in  the <code>DEFAULT<\/code> clause of the DDL for  table:<\/p>\n<pre class=\"theme:ssms2012 lang:tsql highlight:0 decode:true\">CREATE TABLE Butcher_Shop\n(ticket_nbr INTEGER DEFAULT NEXT VALUE FOR Service_Ticket_Seq,\n..);\n<\/pre>\n<p> The numbering will cycle when the sequence reaches 100 in this  example and start over at 1. Besides being declarative and giving you lots of  control, the <code>NEXT VALUE<\/code> can be used  anywhere that allows its data type. <\/p>\n<h1> Conclusion <\/h1>\n<p><b> &#160;<\/b>SQL has evolved to be much more declarative than when it was first  created. I would recommend that you pick a feature you do not know well, and  spend an afternoon seeing how far you can push it. You will be surprised how  much procedural code you can remove.<\/p>\n<\/p><\/div>\n","protected":false},"excerpt":{"rendered":"<p>SQL is unusual is that data is not passively stored. Instead you use declarative SQL to specify the rules that underlie the data and its integrity. When used properly, constraints can avoid having to provide a lot of logic elsewhere. CHECK() and DEFAULT can do a lot to ensure that your data is correct&hellip;<\/p>\n","protected":false},"author":96214,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[143531],"tags":[],"coauthors":[],"class_list":["post-2143","post","type-post","status-publish","format-standard","hentry","category-t-sql-programming-sql-server"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/2143","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/96214"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=2143"}],"version-history":[{"count":5,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/2143\/revisions"}],"predecessor-version":[{"id":92511,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/2143\/revisions\/92511"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=2143"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=2143"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=2143"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=2143"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}