{"id":1553,"date":"2013-01-10T00:00:00","date_gmt":"2013-01-10T00:00:00","guid":{"rendered":"https:\/\/test.simple-talk.com\/uncategorized\/unique-constraints-in-sql\/"},"modified":"2021-09-29T16:21:45","modified_gmt":"2021-09-29T16:21:45","slug":"unique-constraints-in-sql","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/databases\/sql-server\/t-sql-programming-sql-server\/unique-constraints-in-sql\/","title":{"rendered":"UNIQUE Constraints in SQL"},"content":{"rendered":"<div id=\"pretty\">\n<p class=\"start\">The Relational Model did not pop out of Dr. Codd&#8217;s head all at once. And SQL has been changing since 1986, when the first ANSI\/ISO Standard was issued. My keynote speech at an SQL conference in Prague in 2012 was on &#8220;Our Enemy, The Punch Card&#8221; and the theme was that in order to do good SQL, we had to overcome a mindset based on finding data by a physical location in a sequential file. This is where keys came in, but let me get to that in a minute. <\/p>\n<p>Dr. Codd was just as trapped in the world of punch cards, tape files and ISAM as the rest of us. The term &#8220;key&#8221; meant the sort key of a file that was required to do any reasonable processing with the original sequential file systems. A deck of punch cards is read in one and only one order; you cannot &#8220;back space&#8221; a card reader. The first magnetic tape drives mimicked that design and did not allow bi-directional access. In fact, the original Sybase SQL Server required that cursors had to &#8220;re-wind&#8221; a table to the start and move forward to read a prior row. Yes, this was much faster on a disk drive than a tape drive, but it was the same mindset. <\/p>\n<p>The model for cursors in ANSI\/ISO Standard SQL was directly copied from the IBM magnetic tape file system of the day. The idea was simple; make it possible to replace COBOL and FORTRAN tape routines with SQL commands, one for one. The<code> IDENTITY<\/code> table property (not column!) was also taken from the IBM tape file model. You cannot write to a tape in parallel, so each tape record gets its record (position) number from a physical queue. When you wanted to delete a tape record, you would set a bit flag at the start of the record. The system knew to skip over it until eventually a utility program could do some garbage collection, and make a next-generation copy of the data without the deleted records. Whether or not to re-number the new sequential file depended on how the record number has been used by the application. <\/p>\n<p>This why we have<code> PRIMARY KEY<\/code> in SQL and why a table must have one to be properly constructed. Dr. Codd had introduced the idea of a &#8220;candidate key&#8221; but it was awhile before someone asked why should we have to anoint one of the candidates as the primary? The <i>physical<\/i> reason was the <i>sequential file systems <\/i>that early RDBMS products used under the cover. Our enemy, the punch card shows up again! But there was no mathematical reason and Dr. Codd later dropped the concept.<\/p>\n<p>But SQL did not drop it and we still have it today. It is one of many legacies that punch cards have given us. At one point, ANSI X3H2 considered making it a requirement in SQL, but did not. This lets us declare a &#8220;staging table&#8221; inside a schema without a key. Instead, we picked up the<code> UNIQUE<\/code> constraint. The bad news is that<code> UNIQUE<\/code> is not well-understood. <\/p>\n<h2>Equivalence Relations, UNIQUEs and NULLs<\/h2>\n<p>The definition of a<code> PRIMARY KEY<\/code> constraint is that it is logically equivalent to &#8220;<code>UNIQUE NOT NULL<\/code>&#8221; on the column or columns involved. But a table can have only one <code>PRIMARY KEY<\/code>, as per our &#8220;punch card legacy&#8221; tradition. While a little redundant, you can write &#8220;<code>NOT NULL PRIMARY KEY<\/code>&#8221; for documentation and safety. If the primary key constraint is dropped, then the<code> NOT NULL<\/code> is left behind and still holds true. <\/p>\n<p>In mathematics, an equivalence relation takes a set of values and partitions them into disjoint subsets or equivalence classes. The most common example is, of course, plain old vanilla equality. Each number is its own class. A bit fancier are the<code> MOD(x, n) <\/code>functions that put an infinite number of integers into classes whose smallest member is 0 thru (n-1).<\/p>\n<p>SQL has &#8220;plain old vanilla equality&#8221;, but with a twist.<code> NULL<\/code>s do not compare under equality. Each <code>NULL<\/code> becomes its own class. By this time, you have done a query with a <code>GROUP BY <\/code>clause and see that all the<code> NULL<\/code>s are put into the same grouping. What is happening here? Grouping is not equality; it is another equivalence relation. Some parts of SQL use equality and use grouping. The current ANSI\/ISO Standard added the verbose &#8220;<code>IS [NOT] DISTINCT FROM<\/code>&#8221; comparison operator to test for equal values or for a pair of <code>NULL<\/code>s &#8212; a test based on grouping rather than equality. <\/p>\n<p><code>UNIQUE <\/code>is based on grouping. Here is a quick test <\/p>\n<pre>CREATE TABLE Foobar\n(foo INTEGER UNIQUE);\nINSERT INTO Foobar (foo) VALUES (1);&#160; -- just fine! \nINSERT INTO Foobar (foo) VALUES (1); --&#160; unique key violation, expected\nINSERT INTO Foobar (foo) VALUES (NULL); -- just fine\nINSERT INTO Foobar (foo) VALUES (NULL); -- unique key violation, not so expected\n<\/pre>\n<p>Remember that while<code> NULL<\/code>s cannot compare to anything, even themselves, they do group together. Let&#8217;s change this around a bit and use a <i>two column <\/i>key. This means that the <i>combination of the columns<\/i> is unique, which is not quite the same. <\/p>\n<pre>CREATE TABLE Foobar\n(foo INTEGER, \n&#160;bar INTEGER, \n&#160;UNIQUE (foo, bar));\nINSERT INTO Foobar (foo, bar) VALUES (1, 1);&#160; -- just fine\nINSERT INTO Foobar (foo, bar) VALUES (1, NULL); --&#160; just fine\nINSERT INTO Foobar (foo, bar) VALUES (NULL, 1); -- just fine\nINSERT INTO Foobar (foo, bar) VALUES (NULL, NULL); -- just fine\n&#160;\nSELECT bar, COUNT(foo) AS foo_cnt, COUNT(*) AS foo_row_cnt, SUM(foo) AS foo_tot\n&#160; FROM Foobar\n&#160;GROUP BY bar;\n<\/pre>\n<\/p>\n<table class=\"MsoNormalTable\">\n<tbody>\n<tr>\n<td valign=\"top\">\n<p>NULL<\/p>\n<\/td>\n<td valign=\"top\">\n<p>1<\/p>\n<\/td>\n<td valign=\"top\">\n<p>2<\/p>\n<\/td>\n<td valign=\"top\">\n<p>1<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p>1<\/p>\n<\/td>\n<td valign=\"top\">\n<p>1<\/p>\n<\/td>\n<td valign=\"top\">\n<p>2<\/p>\n<\/td>\n<td valign=\"top\">\n<p>1<\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Uniqueness is a handy thing, but its real power comes from Declarative Referential Integrity constraints.&#160; <b><\/b><\/p>\n<h2>REFERENCES Constraints  <\/h2>\n<p>Now for a quick lesson on data modeling. A strong entity is one that exists on its own merit. A weak entity is one that exists because of a strong entity. The classic example is that of a sales order; the order header is strong and the order details are weak. If the order is dropped, then all the order details should disappear. Referential integrity is a relational property that assures that weak entities (the referencing table) has a strong entity (the referenced table). <\/p>\n<p>In the original SQL implementations, we had to use <code>TRIGGER<\/code>s to assure that referential integrity was maintained. This did not work so well, but it is what we had. This was the way we had always done it, in the world of punch cards and tape files. In fact, we could load <i>a lot<\/i> code into a trigger and have it fire when a &#8220;database event&#8221; (insertion, update or deletion) occurred. There is no way for an optimizer to read procedural code and do anything with it. Yet, when we looked at the way we were using them, it was for maintaining referential integrity in over 90% of the triggers. <\/p>\n<p>When you forgot to write the triggers, the weak entities without a strong entity were called &#8220;orphans&#8221; and they could accumulate. This wasted space, obviously, but orphans have to be cleaned before you can use the data. An improperly designed schema can choke as it tries to get back its integrity. <\/p>\n<p>By now, every SQL programmer has seen a <code>REFERENCES <\/code>constraint. The <code>REFERENCES<\/code> clause can be optimized, unlike the triggers! My analogy is that <code>TRIGGER<\/code>s were like controlling a vehicle with a different, independent steering device on every wheel, but the best way to drive is to lock the back wheels and synchronize the two front wheels with a single control. Less flexibility can mean more power and control; if it does 90% or more of your work, then life is good. <\/p>\n<p><code>REFERENCES<\/code> is under-used and two of its little-known features are:<\/p>\n<ol>\n<li><code>REFERENCES<\/code> can refer to the same table as it appears. This is a little tricky, but useful. This is why we use the terms &#8220;referenced&#8221; and &#8220;referencing&#8221; tables. In the old days of network database we had &#8220;parent&#8221; and &#8220;child&#8221; files, which had to be different, used pointers that linked in only one direction and some other differences.  <\/li>\n<li><code>REFERENCES<\/code> can refer to a <code>UNIQUE<\/code> constraint, not just a<code> PRIMARY KEY<\/code>. The columns have to be &#8220;union compatible&#8221; (i.e. columns in both tables are of the same data types in the same positions in the list). <\/li>\n<\/ol>\n<p>Perhaps some dummy code will help:<\/p>\n<pre>CREATE TABLE Sales_Orders\n(order_nbr CHAR(5) NOT NULL PRIMARY KEY);\nINSERT INTO Sales_Orders (order_nbr)\nVALUES ('12345'), ('12346'), ('12347'); \n<\/pre>\n<p>Now the weak entity table. <\/p>\n<pre>CREATE TABLE Order_Details\n(order_nbr CHAR(5) NOT NULL \n&#160;&#160; REFERENCES Sales_Orders(order_nbr)\n&#160; ON UPDATE CASCADE\n&#160; ON DELETE CASCADE, \n&#160;order_item VARCHAR(10) NOT NULL, \n&#160;PRIMARY KEY (order_nbr, order_item));\nINSERT INTO Order_Details\nVALUES \n('12345', 'Hat'), ('12345', 'Pants'), ('12345', 'Shoes'), \n('12346', 'Hat'), ('12346', 'Pants'), ('12346', 'Shoes'), \n('12347', 'Shirt'); \n<\/pre>\n<p>If I reference a table, it is assumed that I am using the primary key of a table by default. It is a better programming practice to explicitly state the column names. This example will look to changes in <code>Sales_Orders<\/code> and do updates or deletes to the <code>Order_Details<\/code> table. If I did not have these clauses, an <code>UPDATE<\/code> or <code>DELETE<\/code> might fail. When I try:<\/p>\n<pre>INSERT INTO Order_Details VALUES ('12345', 'Hat'); --fails on PK violation\n<\/pre>\n<p>&#8230;it fails because of the <code>PRIMARY KEY<\/code> violation, But I can do this, without any trouble. <\/p>\n<pre>DELETE FROM Sales_Orders \nWHERE order_nbr = '12345';\n<\/pre>\n<p>&#8230; and when I try to follow it with &#8230;<\/p>\n<pre>INSERT INTO Order_Details\nVALUES ('12345', 'Hat'), ('12345', 'Pants'), ('12345', 'Shoes'); -- fails on DRI\n<\/pre>\n<p>.. it fails the Declarative Referential Integrity&#160; (DRI) check. You can update the strong entity, <code>Sales_orders<\/code>, and invisibly to you, the <code>Order_Details<\/code> will get the change to the <code>order_nbr<\/code>. By invisible, I mean that you do not get a message about the cascade. <\/p>\n<pre>UPDATE Sales_Orders\n&#160;&#160; SET order_nbr = '1234x'\n&#160;WHERE order_nbr = '12345';\n<\/pre>\n<p>This will also perform invisibly to you, as it flushed the matching <code>Order_Details<\/code> for that <code>order_nbr.<\/code> <\/p>\n<pre>DELETE FROM Sales_Orders\n&#160;WHERE order_nbr = '1234x';\n<\/pre>\n<p>Now let&#8217;s try the schema with<code> UNIQUE<\/code> and <code>NULL<\/code>-able columns. <\/p>\n<pre>CREATE TABLE Order_Details\n(order_nbr CHAR(5) NOT NULL \n&#160; REFERENCES Sales_Orders(order_nbr)\n&#160; ON UPDATE CASCADE\n&#160; ON DELETE CASCADE, \n&#160;order_item VARCHAR(10), \n&#160;UNIQUE (order_nbr, order_item));\n<\/pre>\n<p>While this would not make sense in this example (what color<code> NULL <\/code>do you want to buy?), you can load the Details table with: <\/p>\n<pre>INSERT INTO Order_Details\nVALUES \n('12345', 'Hat'), ('12345', 'Pants'), ('12345', NULL), \n('12346', 'Hat'), ('12346', NULL), ('12346', 'Shoes'), \n('12347', 'Shirt'); \n<\/pre>\n<p>And if you try this, the<code> NULL<\/code> is treated as a value would have been. <\/p>\n<pre>INSERT INTO Order_Details VALUES ('12345', NULL); -- fails on duplicate key <\/pre>\n<p>Since a <code>PRIMARY KEY <\/code>has to be <code>NOT NULL<\/code> by definition, this is a little weird. Among the options in the <code>ON [DELETE | UPDATE]<\/code> can set the referencing columns to<code> NULL<\/code> or <code>DEFAULT<\/code>. <\/p>\n<p>This model of referential integrity is deliberately simple. We rejected some elaborations that would have used a &#8220;quorum&#8221; model (you must have at least (n) out of a list of (m) referencing tables) or special syntax for self-references. This does not mean that we cannot get fancy. <\/p>\n<p>&#160;In full ANSI\/ISO Standard SQL, we have a <code>CREATE ASSERTION<\/code> statement which is a <code>CHECK()<\/code> constraint applied over the whole schema. Unlike a simple <code>CHECK()<\/code> clause, an assertion can reference multiple tables in its <code>CHECK<\/code> predicate. <\/p>\n<p>We do not have this feature in T-SQL yet. The <code>UNIQUE<\/code> constraint is the only multiple table logical tool we have. Luckily, we can do a few things with it. <\/p>\n<h2>Multiple UNIQUEs<\/h2>\n<p>While you can only have one <code>PRIMARY KEY<\/code>, there is no limit on the number of<code> UNIQUE <\/code>constraints per table. Again, it is easier to show with an example than with a narrative. You are given a table of Students and table of Faculty, which have an associated email address each. The immediate thought is a skeleton like this: <\/p>\n<pre>CREATE TABLE Students \n(student_nbr CHAR(5) NOT NULL PRIMARY KEY, \n&#160;email_address VARCHAR(255) NOT NULL);\nINSERT INTO Students (student_nbr, email_address)\nVALUES ('Stud1', 'john@foobar.com'), ('Stud2', 'mary@foobar.com');\n\nCREATE TABLE Faculty \n(faculty_id CHAR(5) NOT NULL PRIMARY KEY, \n&#160;email_address VARCHAR(255) NOT NULL);\nINSERT INTO Faculty (faculty_id, email_address)\nVALUES ('Fac01', 'Codd@school.edu'), ('Fac02', 'Date@school.edu');\n<\/pre>\n<p>But this allows an email to belong to both a student and a teacher. Let&#8217;s assume that we want to keep the emails disjoint. You might want put the emails into one table keyed on the email. <\/p>\n<pre>CREATE TABLE Email_Addresses\n(email_address VARCHAR(255) NOT NULL PRIMARY KEY, \n&#160;student_nbr CHAR(5)&#160;&#160; \n&#160; REFERENCES Students(student_nbr), \n&#160;faculty_id CHAR(5) \n&#160; REFERENCES Faculty(faculty_id), \nCHECK ((student_nbr IS NOT NULL AND faculty_id IS NULL) \n&#160;&#160;&#160;&#160;&#160;&#160;&#160; OR (student_nbr IS NULL AND faculty_id IS NOT NULL))\n);\n<\/pre>\n<p>These four insertions work just fine<\/p>\n<pre>INSERT INTO Email_Addresses VALUES ('john@foobar.com', 'Stud1', NULL);\nINSERT INTO Email_Addresses VALUES ('mary@foobar.com', 'Stud2', NULL);\nINSERT INTO Email_Addresses VALUES ('Codd@school.edu', NULL, 'Fac01'); \nINSERT INTO Email_Addresses VALUES ('Date@school.edu', NULL, 'Fac02');\n<\/pre>\n<p>But when I try to add myself and my email as either a student or faculty member, I get failures.<\/p>\n<pre>INSERT INTO Email_Addresses \nVALUES ('Celko@school.edu', 'StudX', NULL); -- fails on student FK\nINSERT INTO Email_Addresses \nVALUES ('Celko@school.edu', NULL, 'FacX'); -- fails on faculty FK\nINSERT INTO Email_Addresses \nVALUES ('Celko@school.edu', 'StudX', 'FacX'); -- fails on check() \nINSERT INTO Email_Addresses \nVALUES ('Celko@school.edu', NULL, NULL); -- fails on check() \n<\/pre>\n<p>But if I do these two insertions, we have no problems. I have no trouble becoming both a student and faculty member with the same email. Not what I meant to do. <\/p>\n<pre>INSERT INTO Students (student_nbr, email_address) VALUES ('StudX', 'Celko@school.edu');\nINSERT INTO Faculty (faculty_id, email_address) VALUES ('FacX', 'Celko@school.edu');\n<\/pre>\n<p>Here is a second shot at it. First, we need some skeleton tables for the students and faculty; they are obvious entities. Emails are also clearly entities, too. We have decided that we have two types of emails, S = student and F = Faculty. If we decide that an email is required, then we can show the relationship of faculty to an email, and student to an email. <\/p>\n<pre>CREATE TABLE Emails\n(email_address VARCHAR(255) NOT NULL UNIQUE,\n&#160;email_type CHAR(1) NOT NULL,\n&#160;CONSTRAINT valid_email_type\n&#160; CHECK (email_type IN ('S', 'F')),\n&#160;PRIMARY KEY (email_address, email_type)); \n<\/pre>\n<p>We need to load the email directory first. <\/p>\n<pre>INSERT INTO Emails (email_address, email_type)\nVALUES ('john@foobar.com', 'S'),\n&#160;&#160;&#160;&#160;&#160;&#160; ('mary@foobar.com', 'S'),\n&#160;&#160;&#160;&#160;&#160;&#160; ('Codd@school.edu', 'F'), \n&#160;&#160;&#160;&#160;&#160;&#160; ('Date@school.edu', 'F');\n<\/pre>\n<p>The Students have a hard-wired email type of &#8216;S&#8217; in its rows and the Faculty uses &#8216;F&#8217; to keep the emails in two disjoint subsets. <\/p>\n<pre>CREATE TABLE Students \n(student_nbr CHAR(5) NOT NULL PRIMARY KEY,\n&#160;email_address VARCHAR(255) NOT NULL,\n&#160;email_type CHAR(1) DEFAULT 'S' NOT NULL\n&#160;&#160; CHECK (email_type = 'S'),\n&#160;CONSTRAINT valid_student_email\n&#160;FOREIGN KEY (email_address, email_type)\n&#160;&#160; REFERENCES Emails (email_address, email_type)); \n<\/pre>\n<p>Here is some insertions. The second fails from a bad email. When we correct it, it an be inserted. <\/p>\n<pre class=\"theme:ssms2012 lang:tsql highlight:0 decode:true\">INSERT INTO Students (student_nbr, email_address)\nVALUES ('Stud1', 'john@foobar.com');&#160; -- fine\nINSERT INTO Students (student_nbr, email_address)\nVALUES ('Stud2', 'mary@school.edu'); -- bad email!\nINSERT INTO Students (student_nbr, email_address)\nVALUES ('Stud2', 'mary@foobar.com'); -- fine\n<\/pre>\n<p>This same pattern will hold for Faculty emails, so i will not show it here. In practice, you would probably create a view with the available email addresses for the type of user trying to get an account. For multiple emails, we just add another 1:M relationship table <\/p>\n<pre>CREATE TABLE Student_Emails\n(student_nbr CHAR(10) NOT NULL\n&#160; REFERENCES Students(student_nbr)\n&#160; ON UPDATE CASCADE\n&#160; ON DELETE CASCADE,\n&#160;email_address VARCHAR(255) NOT NULL,\n&#160; PRIMARY KEY (student_nbr, email_address),\n&#160;&#160; email_type CHAR(1) DEFAULT 'S' NOT NULL\nCHECK (email_type = 'S'),\nFOREIGN KEY (email_address, email_type)\nREFERENCES Email_Assignments (email_address)\nON UPDATE CASCADE);\n\n--same pattern for multiple faculty emails, too \n\nCREATE TABLE Faculty_Emails\n(faculty_id CHAR(10) NOT NULL\nREFERENCES buildings(faculty_id)\nON UPDATE CASCADE\nON DELETE CASCADE,\nemail_address VARCHAR(255) NOT NULL,\nPRIMARY KEY (faculty_id, email_address),\nemail_type CHAR(1) DEFAULT 'F' NOT NULL\nCHECK (email_type = 'F'),\nFOREIGN KEY (email_address, email_type)\nREFERENCES Email_Assignments (email_address, email_type)\nON UPDATE CASCADE);\n\n<\/pre>\n<h2><b>Self-References <\/b><\/h2>\n<p>It is possible that the referenced table and the referencing tables can be the same. This is not how the &#8220;parent-child&#8221; mode from old network databases worked with pointer chain structures.<\/p>\n<p>It is again easier to give an example than to talk theory. Let&#8217;s model a history of some generic events and we want to be sure that there are no gaps in the history. We will use the ISO half-open interval model of time. An on-going event uses a <code>NULL<\/code> as its ending time. <\/p>\n<pre>CREATE TABLE Events\n(event_id CHAR(10) NOT NULL,\nprevious_event_end_date DATE NOT NULL\nCONSTRAINT Prior_Date\nREFERENCES Events (event_end_date),\nevent_start_date DATE NOT NULL,\nevent_end_date DATE UNIQUE, -- null means event in progress\nPRIMARY KEY (event_id, event_start_date),\nCONSTRAINT Event_Order_Valid\nCHECK (event_start_date &lt;= event_end_date),\nCONSTRAINT Chained_Dates \nCHECK (DATEADD(DAY, 1, previous_event_end_date) = event_start_date)\n); \n<\/pre>\n<p>Create this table and try to add a new event. You will fail! The previous date comes into existence as part of the whole row. This is part of the set-oriented nature of SQL. I need to disable that constraint so I can put a &#8220;starter row&#8221; in the table. You can hide this in a stored procedure: <\/p>\n<pre>-- disable the constraints\nALTER TABLE Events NOCHECK CONSTRAINT Chained_Dates;\nALTER TABLE Events NOCHECK CONSTRAINT Prior_Date;\n-- insert a starter row\nINSERT INTO Events(event_id, previous_event_end_date, event_start_date, event_end_date)\nVALUES ('Foo Fest', '2013-01-01', '2013-01-02', '2013-01-05');\n-- enable the constraints in the table\nALTER TABLE Events CHECK CONSTRAINT Chained_Dates;\nALTER TABLE Events CHECK CONSTRAINT Prior_Date;\n-- this works\nINSERT INTO Events(event_id, previous_event_end_date, event_start_date, event_end_date)\nVALUES ('Glob Week', '2013-01-05', '2013-01-06', '2013-01-10');\n-- this fails\nINSERT INTO Events(event_id, previous_event_end_date, event_start_date, event_end_date)\nVALUES ('Snoob Day', '2013-01-09', '2013-01-11', '2013-01-15'); \n<\/pre>\n<h2><b>Overlapping Keys <\/b><\/h2>\n<p><b><\/b>One of the basic tricks in SQL is representing a many-to-many relationship. You create a third table that references the two (or more) tables involved by their primary keys. This third table has quite a few popular names taken from older network databases, such as &#8220;junction table&#8221; or &#8220;links&#8221; that are incorrect. It is a relationship. For example, here are two skeleton tables:<\/p>\n<pre>CREATE TABLE Boys\n(boy_name VARCHAR(20) NOT NULL PRIMARY KEY\n..);\nCREATE TABLE Girls\n(girl_name VARCHAR(20) NOT NULL PRIMARY KEY,\n.. );\n<\/pre>\n<p>Yes, I know using names for a key is a bad practice, but doing so will make my examples easier to read. A relationship has a degree, which is the number of sets involved; participation by a set can be mandatory or optional; the cardinality can be one to one, one to many, or many to many. The relationship table is a table and therefore needs a key of its own. <\/p>\n<p>The &#8220;<code>Dance<\/code>&#8221; table is a proper table, with a two-column key. <\/p>\n<pre>CREATE TABLE Dance\n(boy_name INTEGER NOT NULL\nREFERENCES Boys (boy_name),\ngirl_name INTEGER NOT NULL,\nREFERENCES Girls(girl_name),\nPRIMARY KEY (boy_name, girl_name)); -- compound key\n<\/pre>\n<p>The only restriction on the rows is that a pairing appears only once. Now, I want to make a rule that guys can dance with as many gals as they want, but the gals have to stick to one guy. The way I do this is to use a <code>NOT NULL<\/code> <code>UNIQUE<\/code> constraint on the <code>girl_name<\/code> column, which makes it a key. It&#8217;s a simple key because it is only one column, but it is also a nested key because it appears as a subset of the compound <code>PRIMARY KEY<\/code>.<\/p>\n<pre>CREATE TABLE Danceboys\n(boy_name INTEGER NOT NULL\nREFERENCES Boys (boy_name),\ngirl_name INTEGER NOT NULL UNIQUE, -- nested key\nREFERENCES Girls(girl_name),\nPRIMARY KEY (boy_name, girl_name)); -- compound key\n('Joe Celko', 'Sally Jones')\n('Joe Celko', 'Jane Doe')\n<\/pre>\n<p>The ladies might want the same freedom and dance with a series of men:<\/p>\n<pre>CREATE TABLE Dancegirls\n(boy_name INTEGER NOT NULL UNIQUE -- nested key\nREFERENCES Boys (boy_name),\ngirl_name INTEGER NOT NULL,\nREFERENCES Girls(girl_name),\nPRIMARY KEY (boy_name, girl_name)); -- compound key\n<\/pre>\n<p>The <code>Dancegirls<\/code> table would permit these rows from our original set:<\/p>\n<pre>('Joe Celko', 'Jane Doe')\n('John Smith', 'Jane Doe')\n<\/pre>\n<p>If we want to stop them changing dancing partners, we can use three keys. <\/p>\n<pre>CREATE TABLE Dancingpartners\n(boy_name INTEGER NOT NULL UNIQUE -- nested key\nREFERENCES Boys (boy_name),\ngirl_name INTEGER NOT NULL UNIQUE -- nested key,\nREFERENCES Girls(girl_name),\nPRIMARY KEY(boy_name, girl_name)); -- compound key\n<\/pre>\n<p>Think about this table for a minute. The <code>PRIMARY KEY <\/code>is now redundant. If each boy appears only once in the table and each girl appears only once in the table, then each (<code>boy_name<\/code>, <code>girl_name<\/code>) pair can appear only once. From a theoretical viewpoint, I could drop the compound key and make either <code>boy_name<\/code> or <code>girl_name<\/code> the new primary key, or I could just leave them as <code>UNIQUE NOT NULL<\/code> keys. From a practical viewpoint, each of these keys will be an index. You could be trading storage space for access speed, but you should be careful about this<\/p>\n<p>SQL products and theory do not always match. Many products make the assumption that the <code>PRIMARY KEY<\/code> is in some way special in the data model and will be the way to access the table most of the time.<\/p>\n<p>Many programmers are not aware that a <code>FOREIGN KEY<\/code> constraint can also reference any<code> UNIQUE<\/code> constraint in the same table or in another table. The following nightmare will give you an idea of the possibilities. The multiple column versions follow the same syntax.<\/p>\n<p>My favorite example of a overlapping <code>UNIQUE<\/code> constraints is a teacher&#8217;s schedule kept in a table like this:<\/p>\n<pre>CREATE TABLE Schedule\n(teacher_name VARCHAR(15) NOT NULL,\n&#160;class_name CHAR(15) NOT NULL,\n&#160;room_nbr INTEGER NOT NULL,\n&#160;class_period INTEGER NOT NULL,\n-- bad primary key! \n&#160;PRIMARY KEY (teacher_name, class_name, room_nbr, class_period));\n<\/pre>\n<p>That choice of a primary key is the most obvious one &#8212; use all the columns. Typical rows would look like this: <\/p>\n<pre>('Mr. Celko', 'Database 101', 222, 6) <\/pre>\n<p>The rules you want to enforce are:<\/p>\n<ul>\n<li>A teacher is in only one room each period  <\/li>\n<li>A teacher teaches only one class each period  <\/li>\n<li>A room has only one class each period  <\/li>\n<li>A room has only one teacher in it each period. <\/li>\n<\/ul>\n<p>Stop reading and see what you come up with for an answer.<\/p>\n<pre>CREATE TABLE Schedule\n(teacher_name VARCHAR(15) NOT NULL,\nclass_name CHAR(15) NOT NULL,\nroom_nbr INTEGER NOT NULL,\nclass_period INTEGER NOT NULL,\nUNIQUE (teacher_name, room_nbr, class_period), -- rule #1\nUNIQUE (teacher_name, class_name, class_period), -- rule #2\nUNIQUE (class_name, room_nbr, class_period), -- rule #3\nUNIQUE (teacher_name, room_nbr, class_period) -- rule #4);\n<\/pre>\n<p>You know that you have 24 ways to pick three objects from a set of four in an ordered sequence (permutation). If order does not matter, then you have a combination and only four subsets, all of which I have used in the <code>UNIQUE<\/code> constraints. Although column order is important in creating an index, you can ignore it for now and then worry about index tuning later.<\/p>\n<p>I probably want to drop the first <code>PRIMARY KEY<\/code> as redundant if I have all four of these constraints in place. But what happens if I drop the <code>PRIMARY KEY<\/code> and then one of the constraints?<\/p>\n<pre>CREATE TABLE Schedule-2\n(teacher_name VARCHAR(15) NOT NULL,\nclass_name CHAR(15) NOT NULL,\nroom_nbr INTEGER NOT NULL,\nclass_period INTEGER NOT NULL,\nUNIQUE (teacher_name, room_nbr, class_period), -- rule #1\nUNIQUE (teacher_name, class_name, class_period), -- rule #2\nUNIQUE (class_name, room_nbr, class_period)); -- rule #3\n<\/pre>\n<p>I can now insert these rows in the second version of the table:<\/p>\n<pre>('Mr. Celko', 'Database 101', 222, 6)\n('Mr. Celko', 'Database 102', 223, 6)\n<\/pre>\n<p>This gives me a very tough sixth period class load because I have to be in two different rooms at the same time. Things can get even worse when another teacher is added to the schedule:<\/p>\n<pre>('Mr. Celko', 'Database 101', 222, 6)\n('Mr. Celko', 'Database 102', 223, 6)\n('Ms. Cannon', 'Database 101', 223, 6)\n<\/pre>\n<p>Ms. Cannon and I are both in room 223, trying to teach different classes at the same time. I think you get the idea that a relationship table is not simple. But if I know a <code>class_period<\/code> and<code> teacher_name<\/code>, then I can find a unique <code>room_nbr<\/code>. The same logic holds for <code>room_nbr<\/code> and <code>class_period<\/code> to determine the <code>class_name<\/code>. <\/p>\n<pre class=\"theme:ssms2012 lang:tsql highlight:0 decode:true\">CREATE TABLE Schedule\n(teacher_name VARCHAR(15) NOT NULL,\n&#160;class_name CHAR(15) NOT NULL,\n&#160;room_nbr INTEGER NOT NULL,\n&#160;class_period INTEGER NOT NULL,\n&#160;UNIQUE (teacher_name, class_period), -- rules #1 &amp; #2\n&#160;UNIQUE (room_nbr, class_period)); -- rules #3 &amp; #4 \n\n<\/pre>\n<p>The good news \/ bad news is that the <code>UNIQUE<\/code> constraints will have to create an index. We will gain both the extra data integrity and improved access speed. We will lose storage space and slow down inserts, updates and deletes.<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Here is an in-depth  look at an underused constraint, UNIQUE,  that can increase the performance of queries and protect data integrity. &hellip;<\/p>\n","protected":false},"author":96214,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[143531],"tags":[4150,4252],"coauthors":[],"class_list":["post-1553","post","type-post","status-publish","format-standard","hentry","category-t-sql-programming-sql-server","tag-sql","tag-t-sql-programming"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/1553","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/96214"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=1553"}],"version-history":[{"count":4,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/1553\/revisions"}],"predecessor-version":[{"id":40851,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/1553\/revisions\/40851"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=1553"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=1553"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=1553"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=1553"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}