Team LiB   Previous Section   Next Section

2.8 Summarizing Classes of Sets

2.8.1 Problem

You want to calculate the number of times subsets fall into different classes, and you want to measure the sizes of those classes when classification is performed on nonaggregated data. As an example of this type of problem, let's say you want to count the number of A papers, B papers, and so forth for each student.

2.8.2 Solution

An alternative way of stating the problem is to say that you need to count the number of times each student is given each grade. The following query does this:

SELECT s.StudentName,(
   CASE WHEN s.Score>=22 THEN 'A' 
      WHEN s.Score>=19 THEN 'B'
      WHEN s.Score>=16 THEN 'C'
      WHEN s.Score>=13 THEN 'D'
      WHEN s.Score>=10 THEN 'E'
      ELSE 'F' END) Grade,
   COUNT(*) NoOfPapers
FROM Students s
GROUP BY s.StudentName,
   CASE WHEN s.Score>=22 THEN 'A' 
      WHEN s.Score>=19 THEN 'B'
      WHEN s.Score>=16 THEN 'C'
      WHEN s.Score>=13 THEN 'D'
      WHEN s.Score>=10 THEN 'E'
      ELSE 'F' END
ORDER BY s.StudentName

The results returned by this query will resemble these:

StudentName Grade NoOfPapers  
----------- ----- ----------- 
Andrew      A     1
Andrew      B     2
Andrew      D     1
Andrew      E     2
Bert        D     2
Bert        E     1
Cindy       C     2
Cindy       D     1
Cindy       E     2

2.8.3 Discussion

The code demonstrates an interesting feature of SQL that is not used very often — the use of a CASE statement within a GROUP BY clause. This not often seen, though it is a fairly powerful construct. The first CASE statement, the one in the main part of the query, assigns a grade to each term paper. Instead of aggregating the results by student and paper, the results are then aggregated by student and grade. By counting the number of records in each group, we find out how many A grades a student has, how many B grades, and so forth. The following example illustrates this grouping and counting process:

StudentName Grade NoOfPapers  
----------- ----- ----------- 
Andrew      B
Andrew      B
COUNT(*)          2

Bert        D  
Bert        D
COUNT(*)          2

If you wanted to, you could extend the query so that it also calculated the percentage that each grade represented out of the total number of term papers taken. The following SELECT list shows the extra column that you would need to add to do this:

SELECT s.StudentName,(
   CASE WHEN s.Score>=22 THEN 'A' 
      WHEN s.Score>=19 THEN 'B'
      WHEN s.Score>=16 THEN 'C'
      WHEN s.Score>=13 THEN 'D'
      WHEN s.Score>=10 THEN 'E'
      ELSE 'F' END) Grade,
   COUNT(*) NoOfPapers,
   100*count(*)/(
   SELECT count(*) FROM Students s1
      WHERE s1.StudentName=s.StudentName) Per
...

This modified query will return the following results:

StudentName Grade NoOfPapers  Per         
----------- ----- ----------- ----------- 
Andrew      A     1           16
Andrew      B     2           33
Andrew      D     1           16
Andrew      E     2           33
Bert        D     2           66
Bert        E     1           33
Cindy       C     2           40
Cindy       D     1           20
Cindy       E     2           40
    Team LiB   Previous Section   Next Section