Team LiB   Previous Section   Next Section

4.6 Implementing General Hierarchies

4.6.1 Problem

You want to implement a project-tracking system based on a general hierarchy, and you want to run some basic, single-level hierarchical queries against that hierarchy.

4.6.2 Solution

Use the general parent/child model described earlier in the project-tracking example in Section 4.1 as the basis for the project-tracking system. In this model, a project is composed of subprojects or tasks, tasks may be broken into subtasks, and subtasks may themselves be broken up into subtasks. There is no limit to the level of nesting that may occur, and it all occurs within one database table named Projects. Conceptually, a project is simply the highest-level task. The Projects table is defined as follows:

CREATE TABLE Projects(
   Name VARCHAR(20),
   Cost INTEGER,
   Parent INTEGER,
   VertexId INTEGER,
   Primary key(VertexId)
)

With this solution in mind, let's look at a few problems that are common to hierarchical structures, such as the one used here to define projects:

  • List all leaf nodes

  • List all nodes that are not leaf nodes

  • Find the most expensive vertices

  • List the immediate children of a node

  • Find the root

The next few sections present SQL Server solutions to each of these problems.

4.6.2.1 List all leaf nodes

Leaf nodes are nodes without children. The subquery in the following SELECT statement checks each node to see if any other nodes claim it as a parent. If none do, then that node is a leaf node:

SELECT Name FROM Projects p 
WHERE NOT EXISTS(
   SELECT * FROM Projects 
   WHERE Parent=p.VertexId)

The result of executing this query is a list of all subprojects (or tasks if you prefer) that have no children:

Name                 
-------------------- 
Interviews
Drafts
Consolidations
Presentation
UI Design
Correctness Testing
Database
UI Implementation
Coding
Initial Testing
Final Adjustments
Production Testing
4.6.2.2 List all nodes that are not leaf nodes

To list all nodes that are not leaf nodes — in other words, all nodes that have children — begin with the same query used to list leaf nodes and just convert the NOT EXISTS into an EXISTS:

SELECT Name FROM Projects p 
WHERE EXISTS(
   SELECT * FROM Projects 
   WHERE Parent=p.VertexId)

This query lists all nodes with children:

Name                 
-------------------- 
New SW
Specifications
Final Document
Prototype
Calculating
Development
Beta Testing
4.6.2.3 Find the most expensive vertices

Each task in the project has a cost associated with it. Each task is also a vertex. To find the N most expensive tasks, simply query the table, order the results by cost, and use the TOP operator to limit the results to the top N rows. For example:

SELECT TOP 5 Name, Cost 
FROM Projects 
ORDER BY cost DESC

Name                 Cost        
-------------------- ----------- 
Inital testing       40
Beta testing         40
Development          30
Coding               20
Production testing   20

This is a good example of a case where it is more productive to issue straight SQL queries, rather than attempt to navigate the hierarchy.

4.6.2.4 Find the immediate children of a node

Given a particular node, you can write a query to return that node's immediate children. The following query, for example, returns the subprojects of the project named Specifications:

SELECT Name FROM Projects 
WHERE Parent=(
   SELECT VertexId FROM Projects 
   WHERE Name='Specifications' )

Name                 
-------------------- 
Interviews
Drafts
Consolidations
Final document

Only immediate children are returned. The parent record for each of the four records shown here is the record for the "Specifications" task. These records may, themselves, have children, but those children won't show up in the results for this query.

4.6.2.5 Find the root

The root is the vertex with no parents. The query to return it is simple, because all you need is to return the one node with no parent pointer. For example:

SELECT Name FROM Projects WHERE Parent=0

Name                 
-------------------- 
New SW

In our example, we use a value of zero to indicate that a node does not have a parent. You could just as easily use a NULL to indicate the same thing, in which case you would specify Parent IS NULL as your WHERE clause condition.

4.6.3 Discussion

As you can see, these queries work on a single level where, at most, they refer to one level above (to parents) or below (to children). For these types of queries, you don't need any additional algorithms or data structures to retrieve information. The queries are pretty straightforward, and you just have to remember the definition of your hierarchy and its elements to write them.

    Team LiB   Previous Section   Next Section