In the last article, I tried to show you how those programmers who are learning SQL tend to carry on trying to solve database problems in a procedural way. There is a big leap from a procedural mindset to a declarative one for most programmers. Most of them don’t quite make that leap all at once, but make a gradual step-wise transition from procedural to semi-procedural programming styles.

Procedural code can appear in procedures, but is this necessarily always true? Procedures can be no more than a BEGIN-END block with a sequence of SQL statements without any IF-THEN-ELSE or WHILE-DO loop logic in it. Is such a block procedural or declarative when all it has is one declarative statement in it? I would say it was declarative. Is a block procedural or declarative when it has IF-THEN-ELSE or WHILE-DO loop control logic? I would say ‘procedural’.

You can get rid of a lot of IF-THEN-ELSE control logic with CASE expressions. Before the CASE expression, there were unexpected pitfalls in trying to apply procedural logic to SQL. The classic example is an UPDATE statement that was part of Sybase/SQL Server classes for decades. You have a bookstore and want to change the prices of the books. Any book over $25 will be discounted by 10% (we will advertise that) and books under $25 will be increased by 15% (we will not advertise that). The immediate solution is to write this:

1 2 3 4 5 6 7 8 |
BEGIN UPDATE Books SET price = price * 1.10 WHERE price < 25.00; UPDATE Books SET price = price * 0.85 WHERE price >= 25.00; END; |

But it does not work! Look at a book that sells for $24.95 currently. Its price jumps to $27.45 when the first UPDATE is done. But when we do the second UPDATE, the price goes down to $23.33 finally. That is not what we meant to do. Flipping the updates does not help.

This was the classic argument for cursors. Hang in a loop and use an `IF-THEN-ELSE`

statement to do the update of the current row in the cursor, just like a magnetic tape file. But this is not required today. We have the CASE expression, which is declarative.

1 2 3 4 5 6 7 |
UPDATE Books SET price = CASE WHEN price < 25.00 THEN price * 1.10 ELSE price * 0.85 END; |

Loops can be replaced with various constructs most of which apply set-oriented operations to the table involved, instead of doing RBAR (pronounced “re-bar”, like the steel rods used in concrete construction; RBAR, is an acronym for ‘Row By Agonizing Row’ coined by Jeff Moden). But another common change is to use the `ROW_NUMBER() `

and other ordinal functions to replace a counting loop in procedural code.

## Procedural, Semi-Procedural and Declarative solutions: an example

If you look at Part I of this series of articles, you will see recursion being used to create a table of sequential numbers. It was replaced with a declarative look-up table. But let’s take a similar problem, the calculation of the Fibonacci series, and look at the Procedural, Semi-Procedural and Declarative approaches to it. I’m not trying to suggest that this is a practical problem: If you really needed this data, then you’d download it into a table (from The Fibonacci numbers). The problem is to build a table of Fibonacci numbers with n and the n-th Fibonacci number. I do not want to talk about the Fibonacci series. No, that is a lie. I would love to write a whole book on it, but a lot of other people beat me to it (see the references). Darn! The usual definition for the series is recursive:

1 2 3 4 5 6 7 8 9 |
FUNCTION fib(n)IF n = 0 THEN RETURN 0; ELSE IF n = 1 THEN RETURN 1; ELSE IF n > 1 THEN RETURN (fib(n-2) + fib(n-1)); END IF; END IF; END IF; |

## Using Computation to Replace a Look-up

The most extreme example I can remember of using computation to replace a look up was decades ago at Georgia Tech when we had CDC Cyber series computer. The hardware had the best floating point hardware and speed in its days. To give you an idea of what I mean, CDC had a COBOL compiler that converted COBOL picture data to floating point numbers, did the math and converted back to COBOL picture formats. It out-performed IBM machines on campus.

Rather than do table look-up in FORTRAN, one of our sales team members had just had a course on Chebyshev polynomials and fitting data to curves (Wikipedia: Chebyshev polynomials) and had a software package to create these polynomials. The formula that was impossible for a human being to understand. But it was faster than reading a disk and the data had a pattern that worked well with polynomials. Do not do this on anything but a supercomputer. I will now avoid the issue of performance versus maintenance.

There is a little debate about whether to start at (n = 0) or at (n =1), but the idea is that fib(n) = (fib(n-2) + fib(n-1)), so it is 0,1,1,2,3,5,8, etc. Here is a completely procedural loop to compute a table of Fibonacci numbers:

1 2 3 4 5 6 7 8 9 10 11 12 |
BEGIN DECLARE @n INTEGER; SET @n = 1; DECLARE @f0 INTEGER; SET @f0 = 0; DECLARE @f1 INTEGER; SET @f1 = 1; WHILE (@n < 93) BEGIN SET @f1 = @f0 + @f1 SET @f0 = @f1; SET @n = @n + 1; INSERT INTO Fibonacci (@n, @f0); END; |

Since the Fibonacci series has a recursive definition, it makes the semi-procedural programmer feel good to use it in a recursive CTE than can be used in an INSERT INTO statement.

1 2 3 4 5 6 7 8 9 10 |
WITH Fibonacci(n, f0, f1) AS (SELECT n, f, f1 --fixpoint step FROM (VALUES (CAST(1 AS BIGINT), CAST(0 AS BIGINT), CAST(1 AS BIGINT))) UNION ALL SELECT n + 1, f0 + f1, f0 -- recursion step FROM Fibonacci WHERE n < 93) -- upper limit for BIGINT SELECT n, f0 -- results FROM Fibonacci; |

But recursion is actually a procedural technique. It is also expensive since it is really a cursor under the covers. Another way to do this is with a mathematical trick called a closed form. Iteration and recursion are both avoided by a computation in a simple expression. In the case of the Fibonacci, you can use the Golden Ratio (Wikipedia: The Golden Ratio) or phi, in a formula.

1 2 3 4 5 6 |
INSERT INTO Fibonacci(n, fib) SELECT seq, ROUND (((POWER (1.6190339887, @n)- POWER (1.0 - 1.6190339887, @n)) /SQRT (5.0)), 0) FROM Series WHERE seq < 93; |

## Avoiding Integrity Triggers

Procedural code can appear in triggers and here is where you will find problems. Triggers are the real semi-procedural code in SQL. They are procedural code out of the users immediate control, attached to a table for the purpose of data integrity, computing special columns and external actions like audits. Data integrity used to be done with triggers because that is all we had.

Today, most of those integrity triggers can be replaced by declarative DRI Actions. They perform a simple action for DELETE and UPDATE action. The actions are option clauses on the DDL. The syntax is:

1 |
[ON UPDATE | ON DELETE][NO ACTION | CASCADE | SET NULL | SET DEFAULT] |

**NO ACTION:**An error message tells the user that the action is not allowed and we get a ROLLBACK.**CASCADE:**Deletes all rows containing data involved in the foreign key relationship.**SET NULL:**Sets the value to NULL if all foreign key columns for the table can accept NULLs.**SET DEFAULT:**Sets the value to the default value defined for the column if all foreign key columns for the table have defaults defined for them.

## INSTEAD OF Triggers: The Good, the Bad and the Ugly.

But some triggers cannot be converted this way. Oops! Data integrity can be more complicated. The INSTEAD OF trigger was invented to solve a problem that has no other solution. In SQL, a VIEW can be updatable if it meets certain conditions. These conditions are pretty limited. The VIEW has to be built on one base table or on views that resolve to one base table. The VIEW has to include a key for the base. The columns that are in the base table, but not exposed in the VIEW, have to have DEFAULT values. None of the columns in the VIEW can be computed.

Are more general VIEWs updatable? Yes, in theory. Can we determine if an arbitrary general VIEW is updatable? No, in theory. Standard SQL and most products played it safe; we look for the easiest case. Look at these two VIEWs, assuming that employees are assigned to one and only one department:

1 2 3 4 5 |
CREATE VIEW Personnel_In (last_name, first_name) AS SELECT last_name, first_name FROM Personnel WHERE dept_nbr IN (10, 20); |

and

1 2 3 4 5 6 7 8 9 |
CREATE VIEW Personnel_Unioned (last_name, first_name) AS SELECT last_name, first_name FROM Personnel WHERE dept_nbr = 10 UNION ALL SELECT last_name, first_name FROM Personnel WHERE dept_nbr = 20; |

The `Personnel_In`

VIEW is updatable, but the `Personnel_Union`

VIEW is not. Never mind that they are logically identical. Here is another situation:

1 2 3 |
CREATE VIEW Foobar (a, b, ab) AS SELECT a, b, (a+b) FROM Barfoo; |

It is easy to see that this makes sense and would work:

1 2 |
UPDATE Foobar SET a = 6, b = 2; |

The computation would then give us 12 for ab. But this statement will fail without a computed column construct or constraint.

1 2 |
UPDATE Foobar SET ab = 12; |

We have no rules for splitting up the computed “ab” value into columns a and b.

If you like reading a bit of theory, look at the references at the end of this article.

The solution was the `INSTEAD OF`

trigger. Instead of doing an insert, update or delete action (all of which might be ambiguous), we perform the procedural code in the underlying base tables or updatable VIEWS that make up the VIEW with the `INSTEAD OF`

trigger. The` INSTEAD OF`

trigger is a `BEFORE`

trigger, in spite of the fact that the database event that it precedes never happens.

I have a theory that you should not write more than five triggers in your career. But writing a trigger does not mean you should forget about table constraints. If you have ever worked with sales commissions, particularly in the Insurance industry, you know that they can be elaborate. We had an` INSTEAD OF`

trigger on a complicated multi-table view that did a lot of computations and enforced a lot of business rules. The rules were driven by considerations like the salesman’s level, the type of policy sold, how his sales team was doing and other things. You could often compute a commission several ways and we tried to optimize it or standardize it in the VIEW.

The users did not know what was happening under the covers, and we did not want them to know. The answer would appear in the VIEW by magic. If a rule changed before a payday, the VIEW would reflect the new rules for everyone. The previous system depended on the front end programmers doing the changed computations, and coordinating their programs was a real pain and slow. We were so proud of this trigger.

However, we spent so much time and effort on the VIEW that we forgot to look at the tables which build it. The lack of a simple check on the upper limit of one variable let an oversized value get into a base table. That lead to commissions that were greater than the price of the policy – nice for the salesmen but not for the company. The front end users could not do anything since they did not have access to the base tables.

The algebra in the VIEW was correct, so we knew we had to go through the base tables for bad data. This was harder than it sounds. We would flush out the bad data and feel good. But without a CHECK(), it would creep back in. As Graeme Simsion says, “mop the floor, then fix the leak” and is what we were failing to do.

## References:

- http://goldennumber.net/
- “The Golden Ratio: The Story of PHI, the World’s Most Astonishing Number” by Mario Livio.
- “The Golden Section: Nature’s Greatest Secret” by Scott Olsen.
- “The Divine Proportion” by Herbert Edwin Huntley.
- “The Fabulous Fibonacci Numbers” by Alfred S. Posamentier.
- “A Mathematical History of the Golden Number” by Roger Herz-Fischler.
- “The Golden Section (Spectrum)” by Hans Walser.

Load comments