Temporal Tables – Part 4 – Synchronizing changes across tables

So way back in June of last year, when I started this series on Temporal Tables: Part 1 – Simple Single Table Example, Part 2 – Changing history; and even in Part 3 – Synchronizing Multiple Modifications; I only referenced one table. In this entry, I want to get down to what will actually be a common concern. I want my objects to be consistent, not just at the current point in time, but throughout all points in time.  I won’t even try to mix this concern with changing history, but I imagine that it could be a major undertaking depending on the data you already have from any change log tables you have created, if you were not concerned with viewing data at previous points in time.

In part 3, I looked at what happens when the same row, or two rows in the same table would behave. 2 tables will behave quite the same, and there is a compelling reason to be cognizant of the temporal timestamps when you are dealing with multiple tables and want to see all tables as they existed at a given point in time.

If you were writing you own versioning, you might use a trigger and write the change. Each change you would write would have a timestamp as of the time the change was written. I won’t cover the code (in this blog, but I will in the book) that you need for keeping history of changes here, but it is a widely used pattern. When a row changes, make a copy of the deleted rows in the trigger in a table that looks like the primary table, and set the time when the change was made.

A problem with this, if you want to see how the database looks at any point in time, you would see the progression of changes over time, which could lead to inconsistent views of the data at any given point in time. So say you have a SalesOrder and SalesOrderLineItem table, and you have a ControlTotal on the SalesOrder that needs to match the sum of the line item values for the SalesOrder. There really is no way to enforce a relationship between columns in different tables as this because SQL Server does not have delayed constraints or triggers. I will use this (what could be deemed a denormalization, depending on how it is viewed in the requirements) scenario, but not all issues will be quite so obvious.

While we must trust the client to get the math correct or reject it, if we are going to keep temporal versions of data, it is highly important that we make sure that the views across time are consistent (even down to the microsecond if we are using a very high granularity in our start and end time columns,) in order to make coding easier. By making sure that our modifications are packaged in transactions, we can do this.

First, let’s create our tables (and I will drop the SalesORder table we had previously and add the ControlTotal column. Note that you can’t just drop a table with versioning on, you need to turn off versioning and drop both tables.):

ALTER TABLE Sales.SalesOrder
    SET (SYSTEM_VERSIONING = OFF);
go
DROP TABLE Sales.SalesOrder;
DROP TABLE Sales.SalesOrderHistory;
GO

Then I will recreate the table with a change from Data to ControlTotal, and then a LineItem table that has the line total, as well as foreign key constraint to the SalesOrder table:

SET NOCOUNT ON;
CREATE TABLE Sales.SalesOrder

    SalesOrderId int NOT NULL CONSTRAINT PKSalesOrder PRIMARY KEY,
    ControlTotal DECIMAL(10,2) NOT NULL,
    ValidStartTime datetime2 (7) GENERATED ALWAYS AS ROW START,
    ValidEndTime datetime2 (7) GENERATED ALWAYS AS ROW END,
    PERIOD FOR SYSTEM_TIME (ValidStartTime, ValidEndTime)

WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = Sales.SalesOrderHistory)); –Another future thought, put in a different schema? I think not, but, it intrigues me nonetheless.
GO

CREATE TABLE Sales.SalesOrderLineItem

    SalesOrderLineItemId INT NOT NULL CONSTRAINT PKSalesOrderLineItem PRIMARY KEY,
    SalesOrderId int NOT NULL CONSTRAINT FKSalesOrderLineItem$ContainsDetailsFor$SalesSalesOrder REFERENCES Sales.SalesOrder(SalesOrderId),
    LineItemNumber INT NOT NULL,
    LineItemTotal DECIMAL(10,2) NOT NULL,
    CONSTRAINT AKSalesORderLineItem UNIQUE (SalesOrderId, LineItemNumber),
    ValidStartTime datetime2 (7) GENERATED ALWAYS AS ROW START,
    ValidEndTime datetime2 (7) GENERATED ALWAYS AS ROW END,
    PERIOD FOR SYSTEM_TIME (ValidStartTime, ValidEndTime)

WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = Sales.SalesOrderLineItemHistory));  GO

Now let’s do a progression of data, leaving a two second gap between operations, slowing down time a little bit like can happen in reality.

INSERT  INTO Sales.SalesOrder (SalesOrderId, ControlTotal)
VALUES  (1, 100);

WAITFOR DELAY ’00:00:02′;

INSERT  INTO Sales.SalesOrderLineItem (SalesOrderLineItemId, SalesOrderId, LineItemNumber, LineItemTotal)
VALUES  (1, 1, 1, 50);

WAITFOR DELAY ’00:00:02′;

INSERT  INTO Sales.SalesOrderLineItem (SalesOrderLineItemId, SalesOrderId, LineItemNumber, LineItemTotal)
VALUES  (2, 1, 2, 50);

Now let’s take a look at the data, just after the insert:

SELECT  *
FROM    Sales.SalesOrder
WHERE   SalesOrderId = 1;
SELECT  *
FROM    Sales.SalesOrderHistory
WHERE   SalesOrderId = 1;

SELECT  *
FROM    Sales.SalesOrderLineItem
WHERE   SalesOrderId = 1;
SELECT  *
FROM    Sales.SalesOrderLineItemHistory
WHERE   SalesOrderId = 1;

We have 3 rows in the base tables, starting at three different times, and no history yet:

SalesOrderId ControlTotal                            ValidStartTime              ValidEndTime
———— ————————————— ————————— —————————
1            100.00                                  2016-05-01 20:36:51.1670200 9999-12-31 23:59:59.9999999

SalesOrderId ControlTotal                            ValidStartTime              ValidEndTime
———— ————————————— ————————— —————————

SalesOrderLineItemId SalesOrderId LineItemNumber LineItemTotal                           ValidStartTime              ValidEndTime
——————– ———— ————– ————————————— ————————— —————————
1                    1            1              50.00                                   2016-05-01 20:36:53.2000286 9999-12-31 23:59:59.9999999
2                    1            2              50.00                                   2016-05-01 20:36:55.2452606 9999-12-31 23:59:59.9999999

SalesOrderLineItemId SalesOrderId LineItemNumber LineItemTotal                           ValidStartTime              ValidEndTime
——————– ———— ————– ————————————— ————————— —————————

Now, even before we have any version history, the timestamps start to matter. Looking at current data, no problem

SELECT ControlTotal, LineItemTotal
FROM   Sales.SalesOrder
            JOIN Sales.SalesOrderLineItem
                ON SalesOrderLineItem.SalesOrderId = SalesOrder.SalesOrderId

 
The results look just like you expect. But what if we are doing temporal queries on the table?

ControlTotal                            LineItemTotal
————————————— —————————————
100.00                                  50.00
100.00                                  50.00

At the time the first operation:

DECLARE @asOfTime datetime2(7) = ‘2016-05-01 20:36:52’;

SELECT ControlTotal, LineItemTotal
FROM   Sales.SalesOrder FOR SYSTEM_TIME AS OF @asOfTime
            LEFT OUTER JOIN Sales.SalesOrderLineItem FOR SYSTEM_TIME  AS OF @asOfTime
                ON SalesOrderLineItem.SalesOrderId = SalesOrder.SalesOrderId;

The data looks wrong:

ControlTotal                            LineItemTotal
————————————— —————————————
100.00                                  NULL

And at 20:36:54:

DECLARE @asOfTime datetime2(7) = ‘2016-05-01 20:36:54’;

SELECT ControlTotal, LineItemTotal
FROM   Sales.SalesOrder FOR SYSTEM_TIME AS OF @asOfTime
            LEFT OUTER JOIN Sales.SalesOrderLineItem FOR SYSTEM_TIME  AS OF @asOfTime
                ON SalesOrderLineItem.SalesOrderId = SalesOrder.SalesOrderId;

Things are better, but not quite right:

ControlTotal                            LineItemTotal
————————————— —————————————
100.00                                  50.00

Naturally, the likelihood of coming up against such a condition in the case of one row over two seconds is pretty slim, But the more concurrent operations using temporal, the more likely that you get back some weird totals that you don’t want. And even more of a concern is that users often need to fix some data that is broken when you have these inter-table dependencies. If you are fixing data, you may need to fix history as well now (fixing history is an interesting topic that I may discuss some day. The central question will be based on requirements. It really depends if you want to see the data through time, or the information through time. Information should be correct. Data is what it was.

This time, for the second SalesOrder,  I will make sure that all of the data is inserted and updated in the same transaction to ensure that the view of the data remains consistent:

BEGIN TRANSACTION;

INSERT  INTO Sales.SalesOrder (SalesOrderId, ControlTotal)
VALUES  (2, 100);

WAITFOR DELAY ’00:00:02′;

INSERT  INTO Sales.SalesOrderLineItem (SalesOrderLineItemId, SalesOrderId, LineItemNumber, LineItemTotal)
VALUES  (3, 2, 1, 50);

WAITFOR DELAY ’00:00:02′;

INSERT  INTO Sales.SalesOrderLineItem (SalesOrderLineItemId, SalesOrderId, LineItemNumber, LineItemTotal)
VALUES  (4, 2, 2, 50);

COMMIT TRANSACTION

Then checking the data:

SELECT  *
FROM    Sales.SalesOrder
WHERE   SalesOrderId = 2;
SELECT  *
FROM    Sales.SalesOrderHistory
WHERE   SalesOrderId = 2;

SELECT  *
FROM    Sales.SalesOrderLineItem
WHERE   SalesOrderId = 2;
SELECT  *
FROM    Sales.SalesOrderLineItemHistory
WHERE   SalesOrderId = 2;

 

It is clear to see that no matter what the AS OF time used, you will not have the issue with illogical results, since all of the start and end times are the same to 7 decimal places:

SalesOrderId ControlTotal                            ValidStartTime              ValidEndTime
———— ————————————— ————————— —————————
2            100.00                                  2016-05-01 20:48:30.0154948 9999-12-31 23:59:59.9999999

SalesOrderId ControlTotal                            ValidStartTime              ValidEndTime
———— ————————————— ————————— —————————

SalesOrderLineItemId SalesOrderId LineItemNumber LineItemTotal                           ValidStartTime              ValidEndTime
——————– ———— ————– ————————————— ————————— —————————
3                    2            1              50.00                                   2016-05-01 20:48:30.0154948 9999-12-31 23:59:59.9999999
4                    2            2              50.00                                   2016-05-01 20:48:30.0154948 9999-12-31 23:59:59.9999999

SalesOrderLineItemId SalesOrderId LineItemNumber LineItemTotal                           ValidStartTime              ValidEndTime
——————– ———— ————– ————————————— ————————— —————————

Finally, let’s update the line item 1 row to 25, and the total to be 75

BEGIN TRANSACTION;

UPDATE Sales.SalesOrder
SET  ControlTotal = 75
WHERE SalesOrderId = 2;

UPDATE Sales.SalesOrderLineItem
SET LineItemTotal = 25
WHERE SalesOrderId = 2
AND LineItemNumber = 1;

COMMIT TRANSACTION;

Looking at the data and history:

SELECT  *
FROM    Sales.SalesOrder
WHERE   SalesOrderId = 2;
SELECT  *
FROM    Sales.SalesOrderHistory
WHERE   SalesOrderId = 2;

SELECT  *
FROM    Sales.SalesOrderLineItem
WHERE   SalesOrderId = 2;
SELECT  *
FROM    Sales.SalesOrderLineItemHistory
WHERE   SalesOrderId = 2;

We see that the SalesOrder and line time 1 start times match, but not the 4th one, as you would expect since we did not apply any modification statement to that row:

SalesOrderId ControlTotal                            ValidStartTime              ValidEndTime
———— ————————————— ————————— —————————
2            75.00                                   2016-05-01 20:52:41.6210321 9999-12-31 23:59:59.9999999

SalesOrderId ControlTotal                            ValidStartTime              ValidEndTime
———— ————————————— ————————— —————————
2            100.00                                  2016-05-01 20:48:30.0154948 2016-05-01 20:52:41.6210321

SalesOrderLineItemId SalesOrderId LineItemNumber LineItemTotal                           ValidStartTime              ValidEndTime
——————– ———— ————– ————————————— ————————— —————————
3                    2            1              25.00                                  2016-05-01 20:52:41.6210321 9999-12-31 23:59:59.9999999
4                    2            2              50.00                                   2016-05-01 20:48:30.0154948 9999-12-31 23:59:59.9999999

SalesOrderLineItemId SalesOrderId LineItemNumber LineItemTotal                           ValidStartTime              ValidEndTime
——————– ———— ————– ————————————— ————————— —————————
3                    2            1              50.00                                   2016-05-01 20:48:30.0154948 2016-05-01 20:52:41.6210321

Now we can check the data as of a few times, and see that things are consistent:

At the original time of insert:

DECLARE @asOfTime datetime2(7) = ‘2016-05-01 20:48:30.0154948’;

SELECT ControlTotal, LineItemTotal
FROM   Sales.SalesOrder FOR SYSTEM_TIME AS OF @asOfTime
            LEFT OUTER JOIN Sales.SalesOrderLineItem FOR SYSTEM_TIME  AS OF @asOfTime
                ON SalesOrderLineItem.SalesOrderId = SalesOrder.SalesOrderId
WHERE SalesOrder.SalesOrderId = 2;

Two rows returned, total matches line item totals:

ControlTotal                            LineItemTotal
————————————— —————————————
100.00                                  50.00
100.00                                  50.00

At the time of the update:

DECLARE @asOfTime datetime2(7) = ‘2016-05-01 20:52:41.6210321’;

SELECT ControlTotal, LineItemTotal
FROM   Sales.SalesOrder FOR SYSTEM_TIME AS OF @asOfTime
            LEFT OUTER JOIN Sales.SalesOrderLineItem FOR SYSTEM_TIME  AS OF @asOfTime
                ON SalesOrderLineItem.SalesOrderId = SalesOrder.SalesOrderId
WHERE SalesOrder.SalesOrderId = 2;

This returns:

ControlTotal                            LineItemTotal
————————————— —————————————
75.00                                   25.00
75.00                                   50.00

I will leave it to you to try other times for yourself.

One quick note, if I had updated SalesOrderLineItemId = 4, even to the same value, I would get version rows. Be really careful not to just update rows repeatedly if there is
no change. You will want to do what you can to avoid it, or you could get this to occur:

UPDATE Sales.SalesOrder
SET    SalesOrderId = SalesOrderId
WHERE  SalesOrderId = 2
GO 10

Beginning execution loop
Batch execution completed 10 times.

So now the row has been updated 10 times with no change to the data. The version history is now considerably larger:

SELECT  *
FROM    Sales.SalesOrderHistory
WHERE   SalesOrderId = 2;

SalesOrderId ControlTotal                            ValidStartTime              ValidEndTime
———— ————————————— ————————— —————————
2            100.00                                  2016-05-01 20:48:30.0154948 2016-05-01 20:52:41.6210321
2            75.00                                   2016-05-01 20:52:41.6210321 2016-05-01 20:59:53.9903127
2            75.00                                   2016-05-01 20:59:53.9903127 2016-05-01 20:59:54.0059626
2            75.00                                   2016-05-01 20:59:54.0059626 2016-05-01 20:59:54.0215896
2            75.00                                   2016-05-01 20:59:54.0215896 2016-05-01 20:59:54.0372406
2            75.00                                   2016-05-01 20:59:54.0372406 2016-05-01 20:59:54.0372406
2            75.00                                   2016-05-01 20:59:54.0372406 2016-05-01 20:59:54.0528342
2            75.00                                   2016-05-01 20:59:54.0528342 2016-05-01 20:59:54.0528342
2            75.00                                   2016-05-01 20:59:54.0528342 2016-05-01 20:59:54.0528342
2            75.00                                   2016-05-01 20:59:54.0528342 2016-05-01 20:59:54.0528342
2            75.00                                   2016-05-01 20:59:54.0528342 2016-05-01 20:59:54.0684602

Very little change in the start times, but some never the less (you can see the start and end times do change a little bit over time.) This could be a horrible feature to turn on if you have such an interface
(as most of us probably do) where if nothing has changed and the user can press save over and over, causing update after update. So definitely watch your history tables after you turn this feature on to a new table to make sure of what is occurring.