Today’s blog post is in response to Bert Wagner’s invitation here: https://bertwagner.com/2018/07/03/code-youd-hate-to-live-without-t-sql-tuesday-104-invitation/. For me, the answer was pretty simple. There is one piece of code that I have written that I find very useful, and even more… it was my (as far as I know, my own idea!) In my life, I have obviously learned a tremendous amount about building and designing databases on a relational engine. I have written books about it, as you may know. While I am proud about this, all it generally means is that I took the work of so many other people and arranged it into 700 pages.
The solution is part of my calendar/date dimension code, and it is used to do relative positioning over date periods. For example, say you have the need to get data from the 10 days. You can definitely use a simple between to filter the rows, and a bunch of date functions to group by year, month, etc., generally all of the “normal” groupings. But using a calendar table allows you to prebuild a set of date calculations that make the standard values easier to get, and non-standard groupings possible. The technique I will cover makes moving around in the groupings more easily accessible. Like if you want data from the last 3 complete months. The query to do this isn’t rocket science, but it isn’t exactly straightforward either.
For the example, I will use the calendar table that I have on my website here: https://drsql.org/code in the download SimpleDateDimensionCreateAndLoad, and will load it with data up until 2020. Here is that structure:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 |
CREATE TABLE Tools.Calendar ( CalendarId int NOT NULL CONSTRAINT PKdate_dim PRIMARY KEY , DateValue date NOT NULL CONSTRAINT AKdate_dim__DateValue UNIQUE , DayName varchar(10) NOT NULL, MonthName varchar(10) NOT NULL, Year varchar(60) NOT NULL, Day tinyint NOT NULL, DayOfTheYear smallint NOT NULL, Month smallint NOT NULL, Quarter tinyint NOT NULL, WeekendFlag bit NOT NULL, DayInMonthCount tinyint NOT NULL, --start of fiscal year configurable in the load process, currently --only supports fiscal months that match the Calendar months. FiscalYear smallint NOT NULL, FiscalMonth tinyint NULL, FiscalQuarter tinyint NOT NULL, --used to give Relative positioning, such AS the previous 10 months --which can be annoying due to month boundries RelativeDayCount int NOT NULL, RelativeWeekCount int NOT NULL, RelativeMonthCount int NOT NULL ); |
You can join this table to your data, either on the CalendarId (which is a smart key like YYYYMMDD), or the actual date value (and you can change the key structure to meet your needs very simply. This is code I could not live without because I, and people I work with, do a lot of queries where we compare time periods, such as “compare this year’s sales period with 1, 2, and/or 3 years ago.” and sales period does not correspond to any calendar that any other organization has created. We also commonly need to do this via a selectable parameter, where the user can decide the number of periods to compare.
Now say you want to get the dates for the last 5 days. A typical (and definitely not wrong) thing you can do, is do date math on the current date and time using SYSDATETIME() or GETDATE(), though SYSDATETIME() is the newer alternative with extra precision that means nothing in this example:
1 2 3 |
SELECT DateValue FROM Tools.Calendar WHERE DateValue BETWEEN DATEADD(DAY, -5, SYSDATETIME()) AND SYSDATETIME(); |
This returns:
dateValue
----------
2018-07-01
2018-07-02
2018-07-03
2018-07-04
2018-07-05
Which works well enough (though it is a bit concerning in a quick look that DateValue is a date and we are comparing it do datetime2(7) values, as shown, it will work as expected and include the current day and 4 previous days.) Now you can join to your table, such as a sales table:
1 2 3 4 5 6 |
SELECT * FROM YourTable JOIN Tools.Calendar ON YourTable.DateColumn = Calendar.DateColumn --or, if all you have is a column with a point in time ON CAST(YourTable.PointInTime as date) = Calendar.DateColumn |
It’s a small table, so joining on the function, while not ideal, will be better than the alternative of trying to calculate the various factors in the Calendar table row by row. Now you can filter and group by the year, month, fiscal month, etc.
But now, what if you want something more complex. Like the previous last 3 full months? Using EOMONTH(), we could put together some criteria to make that happen for sure. Now consider any period of time where “real world” date math does not work at all. Like your corporate fiscal year, which doesn’t start on Jan 1. Or perhaps a fiscal month, which doesn’t start/end on the first and last day of a month. You might even have time periods that are not contiguous, like sales periods (which is what I will give as example code).
This is where my bit of code comes into play. Notice in the code it loaded:
1 2 3 |
RelativeDayCount int NOT NULL, RelativeWeekCount int NOT NULL, RelativeMonthCount int NOT NULL |
Looking at the past 14 days, which cross a month and week boundaries because of when I am writing this (July 5):
1 2 3 |
SELECT DateValue, RelativeDayCount, RelativeWeekCount, RelativeMonthCount FROM Tools.Calendar WHERE DateValue BETWEEN DATEADD(DAY, -14, SYSDATETIME()) AND SYSDATETIME(); |
This returns:
DateValue RelativeDayCount RelativeWeekCount RelativeMonthCount
---------- ---------------- ----------------- ------------------
2018-06-22 96961 13851 3185
2018-06-23 96962 13851 3185
2018-06-24 96963 13852 3185
2018-06-25 96964 13852 3185
2018-06-26 96965 13852 3185
2018-06-27 96966 13852 3185
2018-06-28 96967 13852 3185
2018-06-29 96968 13852 3185
2018-06-30 96969 13852 3185
2018-07-01 96970 13853 3186
2018-07-02 96971 13853 3186
2018-07-03 96972 13853 3186
2018-07-04 96973 13853 3186
2018-07-05 96974 13853 3186
Notice the RelativeDayCount is a unbroken sequence. The week counts include 7 days for 13852 (this started counting from 1753-01-01 for some very arbitrary, long forgotten, reason), and the month boundary changes on the first. By using these groupings, which are sortable, we can use simple integer comparisons, typically using equality or between. Like if we want the days of the previous week:
1 2 3 4 5 6 7 8 9 10 |
--get the current focal point (note that this value doesn't change in my examples. You could update the --calendar constantly such that 0 is always the current value, if desired DECLARE @currentWeek int = (SELECT RelativeWeekCount FROM Tools.Calendar WHERE dateValue = CAST(SYSDATETIME() AS date)) --then get the rows where the values are 1 less than the current SELECT DayName, DateValue FROM Tools.Calendar WHERE RelativeWeekCount = @currentWeek - 1 |
The current date is July 5, and the first day of this month was July 1. So you can see the previous week ended on the 30th.
DayName DateValue
---------- ----------
Sunday 2018-06-24
Monday 2018-06-25
Tuesday 2018-06-26
Wednesday 2018-06-27
Thursday 2018-06-28
Friday 2018-06-29
Saturday 2018-06-30
But as I said, the beauty of this technique is dealing with periods that cannot be done with simple date math. Lets set up a scenario of a company that needs to report on activity during given sales periods. I will add a couple of columns to set up this scenario. This company has a weekly sale that regularly lasts from Tues-Thurs, most weeks (on the week of June 17, the sale went from Thursday-Sunday for Father’s day instead.) You want to join to your data table to see sales on the day of the sale, as well as the days between each sale to see if a sale influenced following sales.
So I am going to add 3 columns, the first is the relative positioning, and the last two are the kind of columns you would regularly wish to have in a calendar table to indicate a time period.
1 2 3 4 |
ALTER TABLE Tools.Calendar ADD RelativeSalePeriodCount int NOT NULL CONSTRAINT DFLTCalendar__RelativeSalePeriodCount DEFAULT (0), ActualSaleDayFlag bit NOT NULL CONSTRAINT DFLTCalendar__ActualSaleDayFlag DEFAULT (0), SaleName nvarchar(30) NOT NULL --You might make this not null, and default to 'No Current Sale' |
Now, I will load the data, in a very manual manner, but if there was an algorithm that could easily be formulated, this technique probably would never have been born.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
UPDATE Tools.Calendar SET SaleName = CASE WHEN DateValue BETWEEN '2018-05-05' AND '2018-05-07' THEN 'Sale 1' WHEN DateValue BETWEEN '2018-05-14' AND '2018-05-17' THEN 'Sale 2' WHEN DateValue BETWEEN '2018-05-19' AND '2018-05-21' THEN 'Sale 3' WHEN DateValue BETWEEN '2018-05-26' AND '2018-05-28' THEN 'Sale 4' ELSE SaleName END, --Real world, you may make this 'Sale # After Period', but not germane to discussion ActualSaleDayFlag = CASE WHEN DateValue BETWEEN '2018-05-05' AND '2018-05-07' THEN 1 WHEN DateValue BETWEEN '2018-05-14' AND '2018-05-17' THEN 1 WHEN DateValue BETWEEN '2018-05-19' AND '2018-05-21' THEN 1 WHEN DateValue BETWEEN '2018-05-26' AND '2018-05-28' THEN 1 ELSE 0 END, RelativeSalePeriodCount = CASE WHEN DateValue BETWEEN '2018-05-05' AND '2018-05-13' THEN 1 WHEN DateValue BETWEEN '2018-05-14' AND '2018-05-18' THEN 2 WHEN DateValue BETWEEN '2018-05-19' AND '2018-05-25' THEN 3 WHEN DateValue >= '2018-05-26' THEN 4 ELSE 0 END WHERE DateValue >= '2018-05-01'; |
So now, with this data you can query the data pretty easily and get the most recent sale by getting the value that is currently set, or get previous sale periods by moving to the previous boundary. For example, show me the dates for the previous two sales periods:
1 2 3 4 5 6 7 8 9 |
--Fetch the current period count DECLARE @RelativeSalePeriodCount int = (SELECT RelativeSalePeriodCount FROM Tools.Calendar WHERE dateValue = CAST(SYSDATETIME() AS date)); --then query for the past few by getting the previous 2 sales periods SELECT DateValue, DayName, SaleName, ActualSaleDayFlag, RelativeSalePeriodCount FROM Tools.Calendar WHERE RelativeSalePeriodCount BETWEEN @RelativeSalePeriodCount -2 AND @RelativeSalePeriodCount -1; |
This returns the following:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
DateValue DayName SaleName ActualSaleDayFlag RelativeSalePeriodCount ---------- ---------- ------------------------------ ----------------- ----------------------- 2018-05-14 Monday Sale 2 1 2 2018-05-15 Tuesday Sale 2 1 2 2018-05-16 Wednesday Sale 2 1 2 2018-05-17 Thursday Sale 2 1 2 2018-05-18 Friday NULL 0 2 2018-05-19 Saturday Sale 3 1 3 2018-05-20 Sunday Sale 3 1 3 2018-05-21 Monday Sale 3 1 3 2018-05-22 Tuesday NULL 0 3 2018-05-23 Wednesday NULL 0 3 2018-05-24 Thursday NULL 0 3 2018-05-25 Friday NULL 0 3 |
From here, you should be able to pick out the data you want in your summaries. You just want to sum sales for days of the sale: ActualSaleDayFlag = 1. Non-sales days: ActualSaleDayFlag = 0. But getting a number of periods back is done simply by varying an integer factor.
Load comments