Team LiB   Previous Section   Next Section

3.3 Finding Regions

3.3.1 Problem

Find a region in a list. In our example, you must find all containers in the production line that have a purity index of 100. These represent normal production output. Furthermore, you want only cases where at least two containers in succession have a purity of 100. An odd container with a purity of 100 in the midst of a number of containers with bad purity levels is not to be reported as normal production output.

3.3.2 Solution

Look at the problem as a region finding problem and use the following code:

SELECT DISTINCT p1.ContainerID
FROM ProductionLine p1, ProductionLine p2
WHERE
   p1.Purity=100 AND p2.Purity=100 AND
   abs(p1.ContainerId-p2.ContainerId)=1

ContainerID 
----------- 
1
2
10
11
12

3.3.3 Discussion

Obviously, if it weren't for the requirement to have at least two containers in a row with normal purity before reporting a container as normal, the result could be obtained by finding all containers with a purity level of 100:

SELECT * FROM ProductionLine
WHERE Purity=100

To return the correct result, we have to use a technique to find regions in the list. To find neighboring rows with the same value, we need two copies of the same table. We name them p1 and p2:

SELECT p1.ContainerID
FROM ProductionLine p1, ProductionLine p2

Then, we filter out all rows that do not match the criterion of having one neighbor of the same value. The trick here to finding neighbors is calculating the distance between the p1.ContainerId and p2.ContainerId. If the distance is 1, the two elements are neighbors. If they have the same value, they should be included in the result:

SELECT p1.ContainerID
FROM ProductionLine p1, ProductionLine p2
WHERE 
   abs(p1.ContainerId-p2.ContainerId)=1

We then add another condition to the WHERE clause to restrict the results further to only those cases where the two neighboring containers have a purity of 100:

SELECT p1.ContainerID
FROM ProductionLine p1, ProductionLine p2
WHERE 
   p1.Purity=100 AND p2.Purity=100 AND
   abs(p1.ContainerId-p2.ContainerId)=1

Finally, in the SELECT clause, we use DISTINCT to eliminate repeated references to the same container:

SELECT DISTINCT p1.ContainerID
FROM ProductionLine p1, ProductionLine p2
WHERE
   p1.Purity=100 AND p2.Purity=100 AND
   abs(p1.ContainerId-p2.ContainerId)=1

You can try to run the query without the DISTINCT clause, and, as you'll see using our sample data, it will return container ID 11 twice. This is because the 11th row has two neighbors with a purity of 100 (10 and 12) and, thus, is reported twice.

    Team LiB   Previous Section   Next Section