Some time ago, I published the article “Find Missing Date Ranges in SQL”: . In that article, I showed how to find missing dates, using different SQL techniques.
A few months later, I received a message from a Simple-Talk reader, who thought he had a similar problem, and asked me for help. Here is the reader’s problem:
‘There is a table, where two of the columns are DateFrom and DateTo. Both columns contain date and time values. How does one find the missing date ranges or, in other words, all the date ranges that are not covered by any of the entries in the table’.
At first glance, the reader’s problem and the one on my article look similar; they both involve finding missing, or unused, date ranges. However, they are not the same. In the article, a hypothetical application registers the moments (time) of events and stores them in a database (table). In the reader’s problem, some imaginary application records the beginning and the end of the processes, representing them as continuous date/time ranges.
The second task may become more complicated, if the processes can overlap each other in time.
The following diagram illustrates the reader’s problem (see Fig.1):
Fig.1 The reader’s problem
As shown on the diagram, there is only one missing date range: from Jan 24 to Jan 31.
Without the process that started on Jan 01 and completed on Jan 22, shown in green in the diagram, there would be three missing date ranges:
- Jan 04 – Jan 04;
- Jan 11 – Jan 19;
- Jan 24 – Jan 31;
Notice, that a special case, when the process started, but did not finish yet, can be treated as a process with the end time equal to the current time.
For example, if the process started on June 10 and still running on June 14 (when you execute the query), you can consider that process as a process that started on June 10 and finished on June 14. In other words, dates in the range from June 10 to June 14 are covered (exist).
Solution with Auxiliary Table
Before we explore the fist solution, let us create and load sample table (see Listing1):
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 |
IF EXISTS(SELECT * FROM sysobjects WHERE ID = (OBJECT_ID('dateRanges')) AND xtype = 'U') DROP TABLE dateRanges; GO CREATE TABLE dateRanges(dateFrom DATETIME, dateTo DATETIME) GO INSERT INTO dateRanges VALUES('Jan 01, 2000','Jan 22, 2000') INSERT INTO dateRanges VALUES('Jan 01, 2000','Jan 03, 2000') INSERT INTO dateRanges VALUES('Jan 05, 2000','Jan 10, 2000') INSERT INTO dateRanges VALUES('Jan 20, 2000','Jan 23, 2000') INSERT INTO dateRanges VALUES('Feb 01, 2000','Feb 06, 2000') INSERT INTO dateRanges VALUES('Feb 02, 2000','Feb 05, 2000') INSERT INTO dateRanges VALUES('Mar 01, 2000','Mar 02, 2000') INSERT INTO dateRanges VALUES('Mar 20, 2000','Mar 23, 2000') INSERT INTO dateRanges VALUES('Mar 25, 2000','Mar 28, 2000') INSERT INTO dateRanges VALUES('Mar 29, 2000','Apr 03, 2000') INSERT INTO dateRanges VALUES('Apr 01, 2000','Apr 05, 2000') INSERT INTO dateRanges VALUES('Apr 20, 2000','Apr 23, 2000') INSERT INTO dateRanges VALUES('Apr 25, 2000','Apr 28, 2000') INSERT INTO dateRanges VALUES('May 19, 2000','May 23, 2000') INSERT INTO dateRanges VALUES('May 20, 2000','May 23, 2000') INSERT INTO dateRanges VALUES('May 24, 2000','May 25, 2000') INSERT INTO dateRanges VALUES('May 28, 2000','May 29, 2000') INSERT INTO dateRanges VALUES('May 28, 2000','Jun 10, 2000') INSERT INTO dateRanges VALUES('May 29, 2000','Jun 10, 2000') INSERT INTO dateRanges VALUES('Jun 05, 2000','Jun 15, 2000') INSERT INTO dateRanges VALUES('Jun 20, 2000','Jun 28, 2000') INSERT INTO dateRanges VALUES('Jul 02, 2000','Jul 28, 2000') |
Listing1. Create and load sample table
Fig.2 The Test data
Then, using an auxiliary table you can easily find missing dates (see Listing2):
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 |
SET NOCOUNT ON; DECLARE @minDateFrom DATETIME, @maxDateTo DATETIME; SELECT @minDateFrom = MIN(dateFrom), @maxDateTo = MAX(dateTo) FROM dateRanges; -- create auxiliary table dateSequence IF EXISTS(SELECT * FROM sysobjects WHERE ID = (OBJECT_ID('dateSequence')) AND xtype = 'U') DROP TABLE dateSequence; CREATE TABLE dateSequence(dt DATETIME NOT NULL PRIMARY KEY); -- load table dateSequence WITH dates AS ( SELECT @minDateFrom AS initDate UNION ALL SELECT DATEADD(dd, 1, initDate) FROM dates WHERE initDate <@maxDateTo ) INSERT INTO dateSequence SELECT initDate FROM dates OPTION (MAXRECURSION 0); -- get missing dates SELECT CONVERT(VARCHAR(20),dt, 107) AS missingDates FROM dateSequence t1 WHERE NOT EXISTS (SELECT * FROM dateRanges t2 WHERE dt BETWEEN t2.dateFrom AND t2.dateTo) Results: missingDates -------------------- Jan 24, 2000 Jan 25, 2000 Jan 26, 2000 Jan 27, 2000 Jan 28, 2000 Jan 29, 2000 Jan 30, 2000 Jan 31, 2000 Feb 07, 2000 Feb 08, 2000 . . . . . . . Feb 28, 2000 Feb 29, 2000 Mar 03, 2000 Mar 04, 2000 . . . . . . . Mar 18, 2000 Mar 19, 2000 Mar 24, 2000 Apr 06, 2000 Apr 07, 2000 . . . . . . . Apr 18, 2000 Apr 19, 2000 Apr 24, 2000 Apr 29, 2000 Apr 30, 2000 May 01, 2000 May 02, 2000 . . . . . . . May 17, 2000 May 18, 2000 May 26, 2000 May 27, 2000 Jun 16, 2000 Jun 17, 2000 Jun 18, 2000 Jun 19, 2000 Jun 29, 2000 Jun 30, 2000 Jul 01, 2000 |
Listing2. Find missing dates, using an auxiliary table
The same approach can be used, if you need to find missing hours, minutes or seconds. For example, for missing minutes, you need to make a small change in DATEADD() function in common table expression (see Listing3):
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 |
SET NOCOUNT ON; DECLARE @minDateFrom DATETIME, @maxDateTo DATETIME; SELECT @minDateFrom = MIN(dateFrom), @maxDateTo = MAX(dateTo) FROM dateRanges; -- create table dateSequence IF EXISTS(SELECT * FROM sysobjects WHERE ID = (OBJECT_ID('dateSequence')) AND xtype = 'U') DROP TABLE dateSequence; CREATE TABLE dateSequence(dt DATETIME NOT NULL PRIMARY KEY); -- load table dateSequence WITH dates AS ( SELECT @minDateFrom AS initDate UNION ALL SELECT DATEADD(mi, 1, initDate) FROM dates WHERE initDate <@maxDateTo ) INSERT INTO dateSequence SELECT initDate FROM dates OPTION (MAXRECURSION 0); -- get missing dates SELECT dt AS missingDates FROM dateSequence t1 WHERE NOT EXISTS (SELECT * FROM dateRanges t2 WHERE dt BETWEEN t2.dateFrom AND t2.dateTo) |
Listing3. Solution for missing minutes with auxiliary table
However, the solutions, as shown in Listing2 and Listing3, are not the answers for the reader’s problem. This will require finding the missing, or unused, date ranges, but not missing, or unused, dates.
One possible way to solve that problem is to transform already found missing dates into missing date ranges.
In order to test that solution, you will need to:
- Create and load an auxiliary table dateSequence as shown in the Listing2
- Run following query (see Listing 4)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 |
WITH datesCTE(missingDates) AS (SELECT missingDates FROM (SELECT dt AS missingDates FROM dateSequence t1 WHERE NOT EXISTS (SELECT * FROM dateRanges t2 WHERE dt BETWEEN t2.dateFrom AND t2.dateTo)) tbl1 ) SELECT CONVERT(VARCHAR(20), t1.missingDates, 107) missingFrom, CONVERT(VARCHAR(20), MIN(t2.missingDates), 107) missingTo FROM (SELECT missingDates FROM datesCTE tbl1 WHERE NOT EXISTS(SELECT * FROM datesCTE tbl2 WHERE DATEDIFF(dd, tbl2.missingDates, tbl1.missingDates) = 1)) t1 INNER JOIN (SELECT missingDates FROM datesCTE tbl1 WHERE NOT EXISTS(SELECT * FROM datesCTE tbl2 WHERE DATEDIFF(dd, tbl1.missingDates, tbl2.missingDates) = 1)) t2 ON DATEDIFF(dd, t1.missingDates, t2.missingDates) >= 0 GROUP BY t1.missingDates ORDER BY t1.missingDates Results: missingFrom missingTo ------------ ------------ Jan 24, 2000 Jan 31, 2000 Feb 07, 2000 Feb 29, 2000 Mar 03, 2000 Mar 19, 2000 Mar 24, 2000 Mar 24, 2000 Apr 06, 2000 Apr 19, 2000 Apr 24, 2000 Apr 24, 2000 Apr 29, 2000 May 18, 2000 May 26, 2000 May 27, 2000 Jun 16, 2000 Jun 19, 2000 Jun 29, 2000 Jul 01, 2000 |
Listing4. Find missing date ranges, using an auxiliary table
There is, however, at least one way to solve the reader’s problem in one query and without an auxiliary table.
The ‘Missing Date Ranges’ Puzzle
Try to find the solution for the reader’s problem. Consider it as a puzzle with the following requirements:
- You have a sample table dateRanges with two date/time columns dateFrom and dateTo (see Listing1)
- You need to find the missing date ranges (that were not used in any rows), where the format of result and the result should be the same as in the Listing4.
- The solution should be one-query solution without an auxiliary table.
- It should work in SQL Server 2005 and in SQL Server 2000.
The best solution will be awarded with a $50 Amazon voucher.
I will post my solution as comments in a couple of weeks.
Load comments