Team LiB   Previous Section   Next Section

7.9 Joining Arrays with Tables

7.9.1 Problem

You want to join non-normalized data that has been imported and stored in an array-like format with data from one or more normalized tables. You want the results of the join to be normalized, so, in essence, this is a recipe for folding a table and doing a join at the same time.

Your task is to produce a report listing from the StudentThesis table, listing each student together with the three members of his dissertation committee.

7.9.2 Solution

To produce this report, you need to combine the techniques of folding and joining data. Since the query does not need to return professor-specific information from the Thesis table, you can use the following query:

SELECT t.StudentId,p.Name 
FROM StudentThesis t JOIN Professors p
ON 
   t.Member1=p.Id OR 
   t.Member2=p.Id OR 
   t.Member3=p.Id

This query works because for any given combination of rows from the StudentThesis and Professor tables, a maximum of only one member Id from the StudentThesis table will be relevant. Each StudentThesis row will end up being joined with three different Professor rows. The first few lines of output from this query look like this:

StudentId    Name
------------ -----
1            Baird, Frances
1            Anderson, Terry
1            Newton, Isaac
2            Newton, Isaac 

Once you've come this far, it's trivial to extend this query to also return the student names from the Students master table. Since the current query result and the master student records are both normalized, you just need to add some additional code to join the Students table to the previous results. For example:

SELECT t.StudentId, s.StudentName, p.Name 
FROM StudentThesis t JOIN Professors p 
   ON(t.Member1=p.Id OR 
      t.Member2=p.Id OR 
      t.Member3=p.Id)
   JOIN Students s
   ON t.StudentId=s.StudentId

7.9.3 Discussion

The first query shown in this recipe transforms the non-normalized StudentThesis table into a normalized result set and, at the same time, joins the StudentThesis data to the Professor table to retrieve the professors' names. The same result could be achieved by using the earlier recipe in this chapter as a basis for a subquery to fold the data, and then by joining that subquery with the Professor table to get the names. Such a query would look like this:

SELECT m.StudentId, p.Name
FROM
(SELECT t.StudentId StudentId,
      (CASE WHEN f.i=1 THEN t.Member1
            WHEN f.i=2 THEN t.Member2
            WHEN f.i=3 THEN t.Member3 END) Member
FROM StudentThesis t JOIN Pivot f
      ON f.i BETWEEN 1 AND 3) m 
   JOIN Professors p
      ON m.Member = p.Id

This query, however, requires two SELECT statements to be executed, while the solution proposed in this recipe requires only one.

We do not recommend the use of this recipe's solution for anything other than batch procedures and infrequently generated reports. If you require real-time performance, you should restructure your data so that it is normalized.

    Team LiB   Previous Section   Next Section