2.8 Summarizing Classes of Sets
2.8.1 Problem
You want to calculate the number of times
subsets fall into different classes,
and you want to measure the sizes of those classes when
classification is performed on nonaggregated data. As an example of
this type of problem, let's say you want to count
the number of A papers, B papers, and so forth for each student.
2.8.2 Solution
An alternative way of stating the problem is to say that you need to
count the number of times each student is given each grade. The
following query does this:
SELECT s.StudentName,(
CASE WHEN s.Score>=22 THEN 'A'
WHEN s.Score>=19 THEN 'B'
WHEN s.Score>=16 THEN 'C'
WHEN s.Score>=13 THEN 'D'
WHEN s.Score>=10 THEN 'E'
ELSE 'F' END) Grade,
COUNT(*) NoOfPapers
FROM Students s
GROUP BY s.StudentName,
CASE WHEN s.Score>=22 THEN 'A'
WHEN s.Score>=19 THEN 'B'
WHEN s.Score>=16 THEN 'C'
WHEN s.Score>=13 THEN 'D'
WHEN s.Score>=10 THEN 'E'
ELSE 'F' END
ORDER BY s.StudentName
The results returned by this query will resemble these:
StudentName Grade NoOfPapers
----------- ----- -----------
Andrew A 1
Andrew B 2
Andrew D 1
Andrew E 2
Bert D 2
Bert E 1
Cindy C 2
Cindy D 1
Cindy E 2
2.8.3 Discussion
The code demonstrates an interesting feature of SQL that is not used
very often — the use of a CASE statement within a GROUP BY
clause. This not often seen, though it is a fairly powerful
construct. The first CASE statement, the one in the main part of the
query, assigns a grade to each term paper. Instead of aggregating the
results by student and paper, the results are then aggregated by
student and grade. By counting the number of records in each group,
we find out how many A grades a student has, how many B grades, and
so forth. The following example illustrates this grouping and
counting process:
StudentName Grade NoOfPapers
----------- ----- -----------
Andrew B
Andrew B
COUNT(*) 2
Bert D
Bert D
COUNT(*) 2
If you wanted to, you could extend the query so that it also
calculated the percentage that each grade represented out of the
total number of term papers taken. The following SELECT list shows
the extra column that you would need to add to do this:
SELECT s.StudentName,(
CASE WHEN s.Score>=22 THEN 'A'
WHEN s.Score>=19 THEN 'B'
WHEN s.Score>=16 THEN 'C'
WHEN s.Score>=13 THEN 'D'
WHEN s.Score>=10 THEN 'E'
ELSE 'F' END) Grade,
COUNT(*) NoOfPapers,
100*count(*)/(
SELECT count(*) FROM Students s1
WHERE s1.StudentName=s.StudentName) Per
...
This modified query will return the following results:
StudentName Grade NoOfPapers Per
----------- ----- ----------- -----------
Andrew A 1 16
Andrew B 2 33
Andrew D 1 16
Andrew E 2 33
Bert D 2 66
Bert E 1 33
Cindy C 2 40
Cindy D 1 20
Cindy E 2 40
|