3.8 Working with Runs
3.8.1 Problem
You want to find runs in your
table.
In our example, you want to find any increasing (arithmetically and
nonarithmetically) sequences of purity values.
3.8.2 Solutions
Use the following query:
SELECT
p1.ContainerId SeqBeg, p2.ContainerId SeqEnd
FROM ProductionLine p1, ProductionLine p2
WHERE
(p1.ContainerId < p2.ContainerId) AND
NOT EXISTS(SELECT * FROM ProductionLine p3, ProductionLine p4
WHERE (
p3.Purity<=p4.Purity AND
p4.ContainerId=p3.ContainerId-1 AND
p3.ContainerId BETWEEN p1.ContainerId+1 AND p2.ContainerId)
OR (p3.ContainerId=p1.ContainerId-1 AND p3.Purity<p1.Purity)
OR (p3.ContainerId=p2.ContainerId+1 AND p3.Purity>p2.Purity))
SeqBeg SeqEnd
----------- -----------
2 5
6 9
3.8.3 Discussion
This query uses a framework similar to that which
you've seen many times before in this chapter.
Unlike a sequence, a run is a continuously increasing, though not
necessarily monotonically increasing, series of values. Unlike the
previous recipe in which we were looking for monotonically increasing
sequences, we do not have a constant difference between ContainerId
and Purity values. Consequently, we need a fourth table, p4 in this
instance, to check for rows in the middle of a candidate interval
that do not comply with the run requirement. This p4 table comes into
play in the subquery, where we join it to p3.
For every element between p1 and p2, p3 and its predecessor are
compared to see if their values are increasing:
p3.Purity<=p4.Purity AND
p4.ContainerId=p3.ContainerId-1 AND
p3.ContainerId BETWEEN p1.ContainerId+1 AND p2.ContainerId
The BETWEEN clause limits the scope to rows between the borders (p1
and p2) of the candidate run in question. The p1 border is increased
by 1, which covers all pairs within the scope. Note that there is
always one less pair than the number of rows.
In a manner similar to other queries for regions and sequences, the
last two conditions in the subquery's WHERE clause
ensure that the borders of the candidate run cannot be extended:
(p3.ContainerId=p1.ContainerId-1 AND p3.Purity<p1.Purity) OR
(p3.ContainerId=p2.ContainerId+1 AND p3.Purity>p2.Purity)
If a row can be returned to satisfy these conditions, then the run
can be extended and should be rejected in favor of the larger run.
The common framework that this solution shares with earlier recipes
allows you to take techniques presented earlier for regions and
sequences and apply them to runs.
|