A VIEW
is a virtual table, defined by a query, that does not exist until it is invoked by name in an SQL statement. This may sound simple enough, but some developers have difficulties with the concept. Because of this, they tend to leave out VIEW
s because they do not appreciate their value. It is easy to write a query and not bother to put it into a VIEW
, because there is no performance boost. ‘If I am going to save code,’ they reason, ‘I want it in a stored procedure that can take a parameter list instead.’
In fact, a VIEW
definition can be copied as in-line text, just like a CTE. But with a good optimizer, the SQL engine can decide that enough sessions are using the same VIEW
and materialize it as a shared table. The CTE, in contrast, is strictly local to the statement in which it is declared.
VIEW
s are often named incorrectly. A VIEW
is a table, so it is named just like any other table. The name tells us what set of things it represents in the data model. The most common offender is the “Volkswagen” coder who prefixes or suffixes the VIEW
name with “vw_” or “VIEW
_” in violation of ISO-11179 rules. We do not mix data and meta data in a name. This is as silly as prefixing every noun in a novel with “n_” so the reader will know that the word is a noun in English grammar.
The ANSI Standard SQL syntax for the VIEW
definition is
1 2 3 4 |
CREATE VIEW <table name> [(<VIEW column list>)] AS <query expression> [WITH [<levels clause>] CHECK OPTION] <levels clause> ::= CASCADED | LOCAL |
the WITH CHECK OPTION
is a little known and less used feature that has been around since the SQL-86 Standards. The <levels clause>
option in the WITH CHECK OPTION
did not exist in Standards before SQL-92. It is not implemented in T-SQL dialect, so I will skip it. If you see it, just remember T-SQL defaults to CASCADED
behavior.
A VIEW
has no physical existence in the database until it is invoked. You cannot put constraints on a VIEW
for that reason. The name of the VIEW
must be unique within the entire database schema, like a base table name. The VIEW
definition cannot reference itself, since it does not exist yet. Nor can the definition reference only other VIEW
s; the nesting of VIEW
s must eventually resolve to underlying base tables. This only makes sense; if no base tables were involved, what would you be VIEW
ing?
You can either build a column name list in the VIEW
header or inherit the column names from the SELECT
statement. Building this list is usually just one quick “cut & paste” and well worth it. This is why we do not ever use “SELECT
*” in a VIEW
definition in production code. When the columns of a base tables change, the definition of the “star” will also change. If you are lucky, you will get an error when the VIEW
has too many or too few columns when it is invoked. If you are not so lucky, the VIEW
will run and give you unexpected results. If you are unlucky, the VIEW
will run and give you wrong answers that you use.
Every few months, someone will post to a SQL forum asking how to use a parameter in a VIEW
. They would never ask how to use a parameter in a base table. The sight of a SELECT
statement instead of a list of column declarations throws their mindset in the wrong direction.
Mullins Heuristics for VIEWS
Programmers have rules and standards for creating base tables. The data element names should follow the ISO-11179 rules. We have to have a key. We can have all kinds of constraints. We can have Declarative Referential Integrity actions among other base tables. But how do you design a VIEW
?
Craig Mullins, a DB2 expert and author, gave the following rule to ensure that VIEW
s are created in a responsible and useful manner. Simply stated, the VIEW
creation strategy should be goal-oriented. VIEW
s should be created only when they achieve a specific, reasonable goal. Each VIEW
should have a specific application or business requirement that it fulfills before it is created. That requirement should be documented somewhere, preferably in a data dictionary.
Although this rule seems obvious, VIEW
s are implemented at some shops without much thought as to how they will be used. This can cause the number of VIEW
s that must be supported and maintained to increase until so many VIEW
s exist that it is impossible to categorize their uses. Nobody wants to take a chance and drop a VIEW
, so they just write a new one. Whenever a base table used by a VIEW
definition is changed, then all those VIEW
s have to be re-compiled and checked. Since VIEW
s can be built on top of VIEW
s, this can be tricky.
Unlike other virtual tables, a VIEW
is defined in the schema information tables and its definition (not its content!) is persisted. This implies some privileges are needed to use, create, alter and drop VIEW
s. The first question is do you need to have privileges on the base tables that build a VIEW
? Yes, but not full privileges. The minimal privileges would be to use the base tables, so you can build the VIEW
. But that does not mean that the user needs to be able to directly query or modify the base tables.
The ideal design should give each user a set of VIEW
s that make it look as if the schema was designed for just his or her use, without regard to the rest of the enterprise.
This is most often done for security and privacy. The payroll clerk can see the salaries of other personnel and change them. But he cannot give himself a pay raise and try to get out of town before the police find out. He can see the minimum, maximum and average salary in each department, but not who is making which salary.
The Data Control Language (DCL) is the third sub-language in SQL after DDL and DML. This is where the DBA can GRANT
, REVOKE
or DENY
all kinds of schema object privileges. We spend almost no time on it in training classes, and failure to do it right can destroy your enterprise. As a generalization, the DBA ought to start with a list of roles users can play in the enterprise and create a script for the privileges each role needs. A new user can then be assigned a role and you do not have to repeat the script over and over.
Do not grant VIEW
creation privileges to everyone. The “nearly the same” VIEW
s are a special problem. One user might have read the spec “Employees must be over 21 years of age to serve alcohol” to mean strictly over 21 as of today or can they pour a drink on their 21-st birthday? If VIEW
creation had been left to just one data modeler, only one of these VIEW
s would exist and it would have the correct business rule.
Tricky Queries and Computations
Not all programers are equal, so you can make sure that the VIEW
s preserve the best work in your shop. The other advantage is that if someone finds a better query for the current state of the database, you keep the VIEW
header, drop the SELECT
statement in the body, replace it and then re-compile your code. The programmer needs no knowledge of how the VIEW
works. This technique becomes more useful as the SQL becomes more complex.
In T-SQL, we used to write complicated code to get sequence numbers and pure dates without time. This code was often hidden in VIEW
s. The numbering can now be done with ROW_NUMBER() and we have a DATE data type since SQL Server 2008. In many cases, procedures and functions that used loops and fancy string manipulations can be replaced with VIEW
s.
Updatable and Read-Only VIEW
s
Unlike base tables, VIEW
s are either updatable or read-only, but not both. INSERT
, UPDATE
, and DELETE
operations are allowed on updatable VIEW
s and base tables, subject to other constraints. INSERT
, UPDATE
, and DELETE
are not allowed on read-only VIEW
s, but you can change their base tables, as you would expect.
An updatable VIEW
is one that can have each of its rows associated with exactly one row in an underlying base table. When the VIEW
is changed, the changes pass through the VIEW
to that underlying base table unambiguously. Updatable VIEW
s in Standard SQL are defined only for queries that meet these criteria
- They are built on only one table
- No
GROUP BY
clause - No
HAVING
clause - No aggregate functions
- No calculated columns
- No
UNION
,INTERSECT
orEXCEPT
- No
SELECT DISTINCT
clause - Any columns excluded from the
VIEW
must beNULL
-able or have aDEFAULT
clause in the base table, so that a whole row can be constructed for insertion.
By implication, the VIEW
must also contain a key of the table. In short, we are absolutely sure that each row in the VIEW
maps back to one and only one row in the base table. The major advantage of this limited definition is that it is based on syntax and not semantics. For example, these VIEW
s are logically identical:
1 2 3 4 |
CREATE VIEW Foo1 (a, b, ..) -- updatable, has a key! AS SELECT (a, b, ..) FROM Foobar WHERE x IN (1,2); |
1 2 3 4 5 6 7 8 |
CREATE VIEW Foo2 (a, b, ..)-- not updateable! AS SELECT (a, b, ..) FROM Foobar WHERE x = 1 UNION ALL SELECT (a, b, ..) FROM Foobar WHERE x = 2; |
But Foo1 is updateable and Foo2 is not. While I know of no formal proof, I suspect that determining if a complex query resolves to an updatable query for allowed sets of data values possible in the table is an NP-complete problem.
The INSTEAD OF
trigger was the ANSI Standards Committee letting the Data Modeler decide on how to resolve the VIEW
updating problem. These triggers are added to a VIEW
and are executed on base tables that make up the VIEW
. The user never sees them fire and work their magic.
As an example, consider a VIEW
that builds the total compensation for each employee by joining the personnel, employee stock holdings, bonuses and salary_amt
tables in one VIEW
. An INSTEAD OF
trigger can update the total compensation using a hidden formula and complex business rules that the user never sees.
The use of INSTEAD OF
triggers gives the user the effect of a single table, but there can still be surprises. Think about three tables; A, B and C. Table C is disjoint from the other two. Tables A and B overlap. So I can always insert into C and may or may not be able to insert into A and B if I hit overlapping rows.
Going back to my Y2K consulting days, I ran into a version of such a partition by calendar periods. Their Table C was set up on Fiscal quarters and got leap year wrong because one of the fiscal quarters ended on the last day of February.
Nested VIEWs
A point that is often missed, even by experienced SQL programmers, is that a VIEW
can be built on other VIEW
s. The only restrictions are that circular references within the query expressions of the VIEW
s are illegal and that a VIEW
must ultimately be built on base tables. One problem with nested VIEW
s is that different updatable VIEW
s can reference the same base table at the same time. If these VIEW
s then appear in another VIEW
, it becomes hard to determine what has happened when the highest-level VIEW
is changed. As an example, consider a table with two keys:
1 2 3 4 5 6 |
CREATE TABLE CanadianDictionary (english_id INTEGER UNIQUE, french_id INTEGER UNIQUE, eng_word CHAR(30), french_word CHAR(30), CHECK (COALESCE (english_id, french_id) IS NOT NULL); |
The table declaration is a bit strange. It allows an English-only or French-only word to appear in the table. But the CHECK()
constraint requires that a word must fall into one or both type codes.
1 2 3 4 5 6 7 8 9 10 11 |
INSERT INTO CanadianDictionary VALUES (1, 2, 'muffins', 'croissants'), (2, 1, 'fish bait', 'escargots'); CREATE VIEW EnglishWords AS SELECT english_id, eng_word FROM CanadianDictionary WHERE eng_word IS NOT NULL; CREATE VIEW FrenchWords AS SELECT french_id, french_word FROM CanadianDictionary WHERE french_word IS NOT NULL); |
We have now tried the escargots and decided that we wish to change our opinion of them:
1 2 3 |
UPDATE EnglishWords SET eng_word = 'appetizer' WHERE english_id = 2; |
Our French user has just tried Haggis and decided to insert a new row for his experience:
1 2 3 |
UPDATE FrenchWords SET french_word = 'tripoux' WHERE french_id = 3; |
The row that is created is (NULL, 3, NULL, 'tripoux')
, since there is no way for the VIEW
FrenchWords
to get to the VIEW
EnglishWords
columns. Likewise, the English VIEW
user can construct a row to insert his translation, (3, NULL, 'Haggis', NULL),
but neither of them can consolidate the two rows into a meaningful piece of data.
To delete a row is also to destroy data; the French-speaker who drops ‘croissants’ from the table also drops ‘muffins’ from VIEW
EnglishWords.
WITH CHECK OPTION Clause
If WITH CHECK OPTION
is specified, the VIEW
ed table has to be updatable. This is actually a fast way to check how your particular SQL implementation handles updatable VIEW
s. Try to create a version of the VIEW
in question using the WITH CHECK OPTION
and see if your product will allow you to create it. The WITH CHECK OPTION
was part of the SQL-89 Standard, but nobody seems to know about it! Consider this skeleton:
1 2 3 4 |
CREATE VIEW V1 AS SELECT key_col, col1, col2 FROM Foobar WHERE col1 = 'A'; |
and now UPDATE
it with
1 |
UPDATE V1 SET col1 = 'B'; |
The UPDATE
will take place without any trouble, but the rows that were previously seen now disappear when we use V1 again. They no longer meet the WHERE
clause condition! Likewise, an INSERT INTO
statement with VALUES (col1 = 'B')
would insert just fine, but its rows would never be seen again in this VIEW
. This might be the desired behavior. For example, you can set up a VIEW
of rows in a jobs table with a status code of ‘to be done’, work on them, and change a status code to ‘finished’, and the rows will disappear from your VIEW
. The important point is that the WHERE
clause condition was checked only at the time when the VIEW
was invoked.
The WITH CHECK OPTION
makes the system check the WHERE
clause condition upon INSERT
and UPDATE
. If the new or changed row fails the test, the change is rejected and the VIEW
remains the same. The WITH CHECK OPTION
clause does not work like a CHECK constraint.
1 2 3 4 5 6 7 8 9 |
CREATE TABLE Foobar (col_a INTEGER); CREATE VIEW TestView (col_a) AS SELECT col_a FROM Foobar WHERE col_a > 0 WITH CHECK OPTION; INSERT INTO TestView VALUES (NULL); -- This fails! CREATE TABLE Foobar_2 (col_a INTEGER CHECK (col_a > 0)); INSERT INTO Foobar_2(col_a) VALUES (NULL); -- This succeeds! |
The WITH CHECK OPTION
must be TRUE
while the CHECK constraint can be either TRUE
or UNKNOWN
. This is an example of the differences in DDL and DML in SQL. Once more, you need to watch out for NULL
s.
T-SQL checks all the underlying levels that built the VIEW
, as well as the WHERE
clause condition in the VIEW
itself. If anything causes a row to disappear from the VIEW
, the UPDATE
is rejected. Consider two VIEW
s built on each other from the Personnel table:
1 2 3 4 5 6 7 8 |
CREATE VIEW Low_Paid_Personnel (emp_id, salary_amt) AS SELECT emp_id, salary_amt FROM Personnel WHERE salary_amt <= 250.00; CREATE VIEW Medium_Paid_Personnel (emp_id, salary_amt) AS SELECT emp_id, salary_amt FROM Low_Paid_Personnel WHERE salary_amt >= 100.00; |
If neither VIEW
has a WITH CHECK OPTION
, the effect of updating Medium_Paid_Personnel
by increasing every salary_amt
by $1,000 will be passed without any check to Low_Paid_Personnel
. Low_Paid_Personnel
will pass the changes to the underlying Personnel table. The next time Medium_Paid_Personnel
is used, Low_Paid_Personnel
will be rebuilt in its own right and Medium_Paid_Personnel
rebuilt from it, and all the employees will disappear from Medium_Paid_Personnel
.
If only Medium_Paid_Personnel
has a WITH CHECK OPTION
on it, the UPDATE
will fail. Medium_Paid_Personnel
has no problem with such a large salary_amt
, but it would cause a row in Low_Paid_Personnel
to disappear, so Medium_Paid_Personnel
will reject it. However, if only Medium_Paid_Personnel
has a WITH LOCAL CHECK OPTION
on it, the UPDATE
will succeed. Medium_Paid_Personnel
has no problem with such a large salary_amt
, so it passes the change along to Low_Paid_Personnel
. Low_Paid_Personnel
, in turn, passes the change to the Personnel table and the UPDATE
occurs. If both VIEW
s have a WITH CHECK OPTION
, the effect is a set of conditions, all of which have to be met. The Personnel table can accept UPDATE
s or INSERT
s only where the salary_amt
is between $100 and $250.
WITH CHECK OPTION as Constraints
Lothar Flatz, an instructor for Oracle Software Switzerland made the observation that while Oracle cannot put subqueries into CHECK()
() constraints and triggers would not be possible because of the mutating table problem, you can use a VIEW
that has a WITH CHECK OPTION
to enforce subquery constraints.
For example, consider a hotel registry that needs to have a rule that you cannot add a guest to a room that another is or will be occupying. Instead of writing the constraint directly, like this:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
CREATE TABLE Hotel (room_nbr INTEGER NOT NULL, arrival_date DATE NOT NULL, departure_date DATE NOT NULL, guest_name CHAR(30) NOT NULL, CONSTRAINT valid_stay_dates CHECK (H1.arrival_date <= H1.departure_date), CONSTRAINT no_overlaps CHECK (NOT EXISTS (SELECT * FROM Hotel AS H1, Hotel AS H2 WHERE H1.room_nbr = H2.room_nbr AND H2.arrival_date < H1.arrival_date AND H1.arrival_date < H2.departure_date))); |
The valid_stay_dates
constraint is fine, since it has no subquery, but will choke on the no_overlaps
constraint. Leaving the no_overlaps
constraint off the table, we can construct a VIEW
on all the rows and columns of the Hotel base table and add a WHERE
clause which will be enforced by the WITH CHECK OPTION
.
1 2 3 4 5 6 7 8 9 10 11 |
CREATE VIEW Valid_Hotel (room_nbr, arrival_date, departure_date, guest_name) AS SELECT H1.room_nbr, H1.arrival_date, H1.departure_date, H1.guest_name FROM Hotel AS H1 WHERE NOT EXISTS (SELECT * FROM Hotel AS H2 WHERE H1.room_nbr = H2.room_nbr AND H2.arrival_date < H1.arrival_date AND H1.arrival_date < H2.departure_date) AND H1.arrival_date <= H1.departure_date WITH CHECK OPTION; |
For example,
1 2 3 4 5 |
INSERT INTO Valid_Hotel VALUES (1, '2012-06-01', '2012-06-03', 'Ron Coe'); GO INSERT INTO Valid_Hotel VALUES (1, '2012-06-03', '2012-06-05', 'John Doe'); |
will give a WITH CHECK OPTION
clause violation on the second INSERT INTO
statement, as we wanted.
Dropping VIEW
s
VIEW
s, like tables, can be dropped from the schema. The T-SQL syntax for the statement is:
1 |
DROP VIEW <table name list>; |
The use of the <table name list> is dialect and it gives you a shorthand for repeating drop statements. The drop behavior depends on your vendor. The usual way of storing VIEW
s was in a schema information table is to keep the VIEW
name, the text of the VIEW
, but dependencies. When you drop a VIEW
, the engine usually removes the appropriate row from the schema information tables. You find out about dependencies when you try to use something that wants the dropped VIEW
s. Dropping a base table could cause the same problem when the VIEW
was accessed. But the primary key/foreign key dependencies among base tables will prevent dropping some base tables.
Table Expression VIEW
s
An old usage for VIEW
s was to do the work of CTEs when there were no CTEs. The programmers created VIEW
s, and then used them. Of course they wasted space, caused disk reads and were only used in one statement. It might be worth looking at old code for VIEW
s that are not shared.
Today the reverse is true. Programmers create the same CTE code over and over in different queries and give it local names for each appearance. Those local names are seldom the same and are often “place markers” like “X” or “CTE_1” and give no hint as to what the table expression means in the data model.
It can be hard to factor out common table expressions across multiple queries. One query uses an infixed JOIN operators and another uses a <span class=”mono”>FROM</span> list, the predicates are equivalent but written slightly different and so forth.
I recommend that you sit down and think of useful VIEW
s, write them and then see if you can find places where they would make the code easier to read and maintain. As an example, our hotel application will probably need to find vacant rooms by calendar date, compute an occupancy ratio by calendar date and other basic facts.
Another bad use is the one VIEW
per Base Table myth that was poplar with DB2 programmers years ago. The reasoning behind this myth was the applaudable desire to insulate application programs from database changes. All programs were to be written against VIEW
s instead of base tables. When a change is made to the base table, the programs would not need to be modified because they access a VIEW
, not the base table.
This does not work in the long run. All you do is accumulate weird orphaned VIEW
s. Consider the simplest type of database change – adding a column to a table. If you do not add the column to the VIEW
, no programs can access that column unless another VIEW
is created that contains the new column. But if you create a new VIEW
every time you add a new column it will not take long for your schema to be swamped with VIEW
s. Even more troublesome is the question of which VIEW
should be used by which program. Similar arguments can be made for any structural change to the tables.
Load comments