Team LiB   Previous Section   Next Section

2.7 Classifying Subsets

2.7.1 Problem

You want to classify aggregated results from subsets into classes of common properties. For example, you want to give each student a grade for each course they are taking. Grades are based on the average Score calculated from term papers in each course. The query has to implement the following grading rules shown in Table 2-1.

Table 2-1. Average scores required for a given grade

Grade

Average score

A

22

B

19

C

16

D

13

E

10

2.7.2 Solution

You have to calculate an average term paper Score per student and then classify that average according to the grading rules laid down in Table 2-1. The following query does this:

SELECT CourseId, StudentName, AVG(Score) Score,(
   CASE WHEN AVG(Score)>=22 THEN 'A' 
      WHEN AVG(Score)>=19 THEN 'B'
      WHEN AVG(Score)>=16 THEN 'C'
      WHEN AVG(Score)>=13 THEN 'D'
      WHEN AVG(Score)>=10 THEN 'E'
      ELSE 'F' END) Grade
FROM Students s
GROUP BY CourseId, StudentName

This query will return the following results:

CourseId  StudentName  Score grade 
--------- ------------ ----- ----- 
ACCN101   Andrew       12.33   E
MGMT120   Andrew       21.66   B
ACCN101   Bert         12.53   E
ACCN101   Cindy        14.15   D
MGMT120   Cindy        14.16   D

2.7.3 Discussion

The solution uses the CASE function to classify the results. It groups term papers together into sets based on student and course. It then calculates an average Score for the term papers present in a particular set. Then, in the CASE statement, the calculated average is compared to the grading rules, and appropriate grades are returned.

You'll notice that the AVG function occurs several times in the query. Don't worry about that. The function has to be repeated for syntactical purposes — columns not listed in the GROUP BY clause must be enclosed by an aggregate function. Rest assured, however, that the database engine will calculate the average just once for each group.

It is important to keep the grading rules in the proper order in the CASE statement used for this solution. Because the greater-than-or-equal-to operator (>=) is used, the first WHEN clause must correspond to the highest interval from Table 2-1. The second WHEN clause must correspond to the second highest interval, and so on.

If you don't want to list your grading rules in order, or if you do not have contiguous ranges, then you can modify your WHEN clauses to specify both the upper and lower limit for each range. For example, the following WHEN clause could be used to compute a D grade:

WHEN AVG(Score)>=13 AND AVG(Score)<16 THEN 'D' 

As you can see, the solution described in this recipe needs to make only one pass through the table to obtain the result. An alternative solution would be to combine the result set from several queries using the UNION operator. The following union query, for example, is equivalent to the nonunion query shown earlier:

SELECT CourseId, StudentName, AVG(Score) Score, 'A' grade
FROM Students s
GROUP BY CourseId, StudentName
HAVING AVG(Score)<=25 and AVG(Score)>=22
UNION
SELECT CourseId, StudentName, AVG(Score) Score, 'B' grade
FROM Students s
GROUP BY CourseId, StudentName
HAVING AVG(Score)<22 and AVG(Score)>=19
UNION
SELECT CourseId, StudentName, AVG(Score) Score, 'C' grade
FROM Students s
GROUP BY CourseId, StudentName
HAVING AVG(Score)<19 and AVG(Score)>=16
UNION
SELECT CourseId, StudentName, AVG(Score) Score, 'D' grade
FROM Students s
GROUP BY CourseId, StudentName
HAVING AVG(Score)<16 and AVG(Score)>=13
UNION
SELECT CourseId, StudentName, AVG(Score) Score, 'E' grade
FROM Students s
GROUP BY CourseId, StudentName
HAVING AVG(Score)<13 and AVG(Score)>=10
UNION
SELECT CourseId, StudentName, AVG(Score) Score, 'F' grade
FROM Students s
GROUP BY CourseId, StudentName
HAVING AVG(Score)<10

As you can see, the union query is long and ugly. Even if there were no other issues with this query, artistic reasons might prevent you from wanting to use it. However, there are more than artistic reasons for not using a union query such as the one shown here. The UNION query represents the execution of several SELECT statements, each of which requires a complete pass through the Students table. Consequently, it will be much less efficient than the alternative query that uses a CASE statement and that only requires one pass through the table.

    Team LiB   Previous Section   Next Section