8.10 Using a Simple Moving Average8.10.1 ProblemYou want to develop a moving-average tool for your analysis. For example, you want to smooth out monthly deviations in light-bulb sales using a six-month moving average. This will allow you to get a good handle on the long-term trend. 8.10.2 SolutionCalculating a moving average turns out to be a fairly easy task in SQL. With respect to the example we are using in this chapter, the query shown in the following example returns a six-month moving average of light-bulb sales: SELECT x.Id, AVG(y.Sales) moving_average FROM BulbSales x, BulbSales y WHERE x.Id>=6 AND x.Id BETWEEN y.Id AND y.Id+5 GROUP BY x.Id ORDER BY x.Id id moving_average ----------- --------------- 6 9425.83 7 9532.17 8 9613.00 9 9649.50 10 9562.00 11 9409.33 12 9250.67 13 9034.83 14 8812.83 15 8609.83 16 8447.83 17 8386.33 18 ... The query shown in this example was used to generate the chart shown earlier in this chapter in Figure 8-2. Please refer to that chart to see the graphical representation of the query results that you see here. 8.10.3 DiscussionIn our solution, the moving average is calculated through the use of a self-join. This self-join allows us to join each sales record with itself and the sales records from the preceding six periods. For example: SELECT x.Id xid, y.Id yid, y.Sales FROM BulbSales x, BulbSales y WHERE x.Id>=6 AND x.Id BETWEEN y.Id AND y.Id+5 ORDER BY x.Id xid yid Sales ----------- ----------- ----------------------------------------------------- 6 1 9536.0 6 2 9029.0 6 3 8883.0 6 4 10227.0 6 5 9556.0 6 6 9324.0 7 2 9029.0 7 3 8883.0 7 4 10227.0 7 5 9556.0 7 6 9324.0 7 7 10174.0 8 3 8883.0 8 4 10227.0 8 5 9556.0 8 6 9324.0 8 7 10174.0 8 8 9514.0 To compute a 6-month moving average, we need at least 6 months worth of data. Thus, the WHERE clause specifies x.Id>=6. Period 6 represents the first period for which we can access 6 months of data. We cannot compute a 6-month moving average for periods 1 through 5. The WHERE clause further specifies x.Id BETWEEN y.Id AND y.Id+5. This actually represents the join condition and results in each x row being joined with the corresponding y row, as well as the five prior y rows. You can see that for x.Id=6, the query returns sales data from periods 1 through 6. For x.Id=7, the query returns sales data from periods 2 through 7. For x.Id=8, the window shifts again to periods 3 through 8. To compute the moving average, the Id value from the first table is used as a reference — the results are grouped by the x.Id column. Each grouping represents six rows from the second table (aliased as y), and the moving average is computed by applying the AVG function to the y.Sales column. |