The VALUES clause or building tables out of nothing

Comments 0

Share to social media

The VALUES clause is probably one of the most misused features in SQL. If you look at SQL forums online, you’ll see people use it as the second clause in an insertion statement, but they only use it to construct a single row at a time, thus:

Each insertion statement ends with a semi-colon, so they will be executed separately and in the order presented. An optimizer doesn’t dare combine them because there might be a forward reference to previous insertions.

I think people write this kind of code because that this is how you would read punch cards. Each card goes into a card reader, gets buffered, and written in the order presented to the magnetic tape or disk file. Welcome to 1960! Stop mimicking old programming languages like FORTRAN or BASIC that had WRITE statements and put one record at a time into a file. Start thinking of working with entire sets.

Using VALUES to insert rows

The VALUES clause is more appropriately called a table constructor. Each row constructor within the table is a comma-separated list enclosed in parentheses. Officially, there is an optional keyword ROW. They can be placed at the start of each list. Nobody does this, and it is a bit redundant, but was required in MySQL.

One of the worst ways of constructing a table is to use the CREATE or DECLARE construct to build a temporary table, load it with insertion statements, and finally insert the table into the desired destination. This leads to multiple statements with no way to really optimize the insertion and shows that you are really not thinking in sets yet.

The entire zodiac can be inserted with a single statement like this:

Given a whole set of rows, the optimizer can deal with a single atomic statement. Not only does it save execution time as compared to the row-at-a-time model of insertion, but it presents the optimizer with an opportunity to improve things. The insertion statement can rearrange the list of new rows and pick an optimal ordering. It also means that if one of my rows had an error in it, I wouldn’t have to back out all of the other rows. If I wanted a proper ACID transaction model, I would’ve had to back out each individual insert up until I came to the insertion that gave me the error.

Here’s the basic syntax. Please note that besides including an expression of the proper data type, you can use the keywords DEFAULT or NULL in a row constructor. Obviously, those values must make sense in relation to the declaration of the table into which you are inserting.

Please remember that an expression is not always a simple constant. In fact, it’s very handy to use the CAST (<exp> AS <data type>) function as a way to assure that a column in the constructed virtual table has a known data type:

The AS keyword can also be used, to give the constructed tables each a name. Here is skeleton:

MERGE with VALUES Clause

The MERGE statement was added to Standard SQL several years ago. It was based on a proposal by ANSI representatives from Oracle and IBM, but forms of it had already existed in other products, though under a different name. The most common one was UPSERT from Postgres. Let’s jump right into it.

One table expression is the target, the table you are trying to modify. The other table expression is the source, the table that provides the modifications. Presumably, you want the target to persist, but you don’t need the source to persist after the updates and insertions are done.

The MERGE clause defines the target; the USING clause defines the source. And the ON clause matches the two tables. The WHEN [NOT] MATCHED ...THEN clauses determine the action to be taken

 

It’s easier to think of the MERGE statement as a program, written as a single statement Instead of having IF–THEN–ELSE logic or CASE expressions in multiple statements.

Obviously, updating makes sense only when there is a match, and inserting makes sense only when there is not a match. The standards allow for either of these clauses to include an optional … AND <search condition>, so you can add quite a bit of logic to this one statement. Technically, the WHEN clauses list can finish with the ELSE IGNORE; it acts as a placeholder just as the ELSE clause did in the CASE expression. Microsoft has more extensions to the syntax, and there have been some performance issues. If you are using it in SQL Server, I strongly suggest checking both the syntax and current performance in whichever version of SQL Server you’re running.

Conclusion

The VALUES clause is ANSI standard and implemented by many relational database vendors. The VALUES clause can save typing, and it’s also the rare case when easier can mean better performance because the rows will be treated as a set.

 

About the author

Joe Celko is one of the most widely read of all writers about SQL, and was the winner of the DBMS Magazine Reader's Choice Award four consecutive years. He is an independent consultant living in Austin, TX. He has taught SQL in the US, UK, the Nordic countries, South America and Africa.
He served 10 years on ANSI/ISO SQL Standards Committee and contributed to the SQL-89 and SQL-92 Standards.
He has written over 800 columns in the computer trade and academic press, mostly dealing with data and databases. He is the author of eight books on SQL for Morgan-Kaufmann, including the best selling SQL FOR SMARTIES.
Joe is a well-known figure on Newsgroups and Forums, and he is famous for his his dry wit. He is also interested in Science Fiction.

Joe's contributions