Team LiB   Previous Section   Next Section

3.13 Comparing Two Rows in an Array

3.13.1 Problem

You want to check to see whether two rows in an array are equal. In our example, you want to check if two production lines in the ProductionFacility table are equal.

3.13.2 Solution

To check rows in the table for equality, use the following code. The result will be a list of rows in the array that are equivalent:

SELECT p1.Line p1_Line, 'is equal to', p2.Line p2_Line
FROM ProductionFacility p1, ProductionFacility p2
WHERE p1.Purity=p2.Purity AND p1.ContainerId=p2.ContainerId AND
   p1.Line<p2.Line
GROUP BY p1.Line, p2.Line
HAVING 
   COUNT(*)=(SELECT COUNT(*) FROM ProductionFacility p3 WHERE p3.Line=p1.Line) 
   AND
   COUNT(*)=(SELECT COUNT(*) FROM ProductionFacility p4 WHERE p4.Line=p2.Line) 

p1_Line                 p2_Line     
----------- ----------- ----------- 
0           is equal to 3

3.13.3 Discussion

This query ends up being quite expensive, using four table instances; as a result, it is a good demonstration of how SQL is not very efficient in working with arrays. However, expensive as it is, it does allow you to get results using only one query.

The FROM clause creates a cross-join between the two instances of the ProductionFacility. We name the two instances p1 and p2 for easier reference. We define in the SELECT statement that the result will report one line for each pair of rows that are equal. Since the cross-join produces many rows, we use the GROUP BY statement to limit the result to just one row of output per row in the array.

The WHERE clause specifies three conditions:

  • Purity levels must be equal.

  • Container IDs must be equal.

  • Production-line numbers from p1 must be less than those from p2.

If you work with multidimensional arrays, simply add additional comparison clauses to the WHERE clause to compare parameters for equality. To compare for full equality between two rows, you must have one comparison expression for each dimension in your array. In our example, the two comparison clauses involve the ContainerId and Line columns. The comparison expression involving the Purity columns is what we use to determine whether two array elements are equal. So a match on ContainerId and Line defines two elements that need to be compared, and the test of equality involves the Purity column.

The intermediate results at this point, without the GROUP BY clause, are as follows:

SELECT p1.ContainerId, p1.Purity, p1.Line, p2.Line
FROM ProductionFacility p1, ProductionFacility p2
WHERE p1.Purity=p2.Purity AND p1.ContainerId=p2.ContainerId AND
   p1.Line<p2.Line

ContainerId Purity      Line        Line        
----------- ----------- ----------- ----------- 
3           100         0           1
1           100         0           3
2           100         0           3
3           100         0           3
3           100         1           3

Add in the GROUP BY clause and we get:

SELECT COUNT(*) ContainerCount, p1.Line, p2.Line
FROM ProductionFacility p1, ProductionFacility p2
WHERE p1.Purity=p2.Purity AND p1.ContainerId=p2.ContainerId AND
   p1.Line<p2.Line
GROUP BY p1.Line, p2.Line

ContainerCount Line        Line        
-------------- ----------- ----------- 
1              0           1
3              0           3
1              1           3

The HAVING clause is the expensive one. It compares the number of matched pairs from the WHERE clause to the number of columns in both rows. The first subquery checks for the number of rows in p1, and the second, for the number of rows in p2. The HAVING clause makes sure that only lines of equal size are reported in the final result. In our example, each production line has produced three containers. Looking at the intermediate results shown here, you can see that the only two production lines with a container count of three are lines 0 and 3. The HAVING clause ensures that those are reported as the final output from the query.

    Team LiB   Previous Section   Next Section