5.10 Querying Periods and Respecting Granularity
5.10.1 Problem
As in the previous recipe, you want to
find all periods that include a particular
date. However, you do not entirely trust the data within your
database. Programmers are supposed to ensure that DATETIME values
contain only dates (not times), but your experience has been that
many DATETIME values do contain time-of-day components. You wish to
write your query to avoid problems from time-of-day values.
With respect to our example, the date you are interested in is
January 11, 2001, and you wish to find all contractors who are
currently scheduled to work on that date.
5.10.2 Solution
Use the following query, which carefully avoids applying any
functions to the two database columns, JobStart and JobEnd:
SELECT
JobId, ContractorId, JobStart, JobEnd
FROM ContractorsSchedules
WHERE JobStart < (CAST('2001-1-11' AS DATETIME)+1)
AND JobEnd >= CAST('2001-1-11' AS DATETIME)
JobId ContractorId JobStart JobEnd
---------- ------------ --------------------------- ---------------------------
RF10001 Alex 2001-01-11 00:00:00.000 2001-01-20 00:00:00.000
RF10003 Bob 2001-01-05 00:00:00.000 2001-01-15 00:00:00.000
5.10.3 Discussion
Why such a convoluted query? And why the need to avoid applying
functions to database columns? Let's answer one
question at a time. First, let's talk about the
complexity. Your first thought when faced with the task of writing
such a query that involves date ranges might be to use the following,
simpler approach:
SELECT
JobId, ContractorId, JobStart, JobEnd
FROM ContractorsSchedules
WHERE JobStart <= '2001-1-11'
AND JobEnd >= '2001-1-11'
This approach is perfectly fine except for the implicit assumption
that the time-of-day components of JobStart and JobEnd are always
00:00:00.000. But, if you had a JobStart date such as the one in the
following record:
RF10001 Alex 2001-01-11 15:30:00.000 2001-01-20 00:00:00.000
the WHERE condition that SQL Server would evaluate would be:
JobStart <= '2001-1-11'
'2001-01-11 15:30:00.000' <= '2001-01-11 00:00:00.000'
The first line represents the condition as you wrote it, while the
second line represents the actual values, after the implicit
conversion of your input date, that SQL Server evaluates. Clearly,
this condition will fail, resulting in the exclusion of job RF10001
from the query results. Yet, the job does begin on January 11, 2001,
and you really do want it to appear in the list of jobs underway as
of that date. What then, do you do?
Now you might think to apply a function to the JobStart date in the
database to truncate any time-of-day component. For example:
CONVERT(CHAR(10),JobStart,120) <= CAST('2001-1-11' AS DATETIME)
This approach will work, but it has the unfortunate side effect of
applying a function (CONVERT in this case) to a database column. That
potentially precludes the use of any index that happens to exist on
the column in question. For large tables, this can be a significant
performance issue. If you happen to index the JobStart column and you
wish to allow SQL Server the possibility of using that index,
you'll need to take an approach such as the
following:
JobStart < (CAST('2001-1-11' AS DATETIME)+1)
Here, the JobStart column is left untouched. Instead, we manipulate
the date value that we are supplying to the query. First, we CAST our
date into a DATETIME value. Then, we add 1 to the result. Because we
have been sure not to include any time-of-day component in our
constant, the result will be the first instant of the following day:
(CAST('2001-1-11' AS DATETIME)+1) = '2001-1-12 00:00:00.000'
The maximum granularity of a DATETIME value is in milliseconds. There
can be no earlier time recorded for our DATETIME value than
00:00:00.000. This means that any DATETIME value that is less than
(but not less than or equal to) the value returned by our expression
must, in fact, fall sometime during or before January 11, 2001. Thus,
we use the less-than operator (<) in our WHERE condition.
For the JobEnd date, we still use greater-than-or-equal (>=)
operator:
AND JobEnd >= CAST('2001-1-11' AS DATETIME)
The comparison that SQL Server makes in this case is:
AND JobEnd >= '2001-1-11 00:00:00.000'
Again, our expression evaluates to the earliest possible time of day
on January 11, 2001. Any time-of-day component associated with a
JobEnd date of January 11, 2001 must, by necessity, be equal to or
greater than the result of our expression. Hence, we use the
greater-than-or-equal (>=) operator in this case.
The result of our careful approach is a query that ignores any
time-of-day values for JobStart and JobEnd, does not preclude the use
of any indices on those columns, but still returns any jobs underway
as of our specified date.
|