3.9 Cumulative Aggregates in Lists

3.9.1 Problem

You need to report cumulative totals and averages. With respect to our example, assume that the Purity value is, instead, a measure of weight, say kilograms. For packaging purposes, you want to see at which container the total weight of a production line's output rises above 1,000. Likewise, you are interested to see how each additional container affects the average weight in a shipment.

3.9.2 Solution

Use the following query to calculate both a cumulative total and a running average weight in one pass:

SELECT 
   p1.ContainerId, SUM(p2.Purity) Total, AVG(p2.Purity) Average 
FROM ProductionLine p1, ProductionLine p2
WHERE 
   p1.ContainerId >= p2.ContainerId
GROUP BY p1.ContainerId 

ContainerId Total       Average     
----------- ----------- ----------- 
1           100         100
2           200         100
3           301         100
4           403         100
5           506         101
6           606         101
7           709         101
8           817         102
9           926         102
10          1026        102
11          1126        102
12          1226        102

3.9.3 Discussion

The code uses an old SQL trick for ordering. You take two instances of the ProductionLine table, named p1 and p2, and you cross-join them. Then you group the results by p1.ContainerId, and you limit the second table's (p2's) rows so that they have ContainerId values smaller than the p1 row to which they are joined. This forces the server to produce an intermediate result set that looks as follows:

p1_Id       p1_Purity   p2_Id       p2_Purity   
----------- ----------- ----------- ----------- 
1           100         1           100
2           100         1           100
2           100         2           100
3           101         1           100
3           101         2           100
3           101         3           101
4           102         1           100
4           102         2           100
4           102         3           101
4           102         4           102
5           103         1           100
...

Each group, identified by p1.ContainerId, includes all rows from p2 with lower or equivalent ContainerId values. The AVG and SUM functions are then applied to the p2_Purity column. The two functions work on p2 rows in each group and, thus, calculate cumulative results.