4.6 Implementing General Hierarchies4.6.1 ProblemYou want to implement a project-tracking system based on a general hierarchy, and you want to run some basic, single-level hierarchical queries against that hierarchy. 4.6.2 SolutionUse the general parent/child model described earlier in the project-tracking example in Section 4.1 as the basis for the project-tracking system. In this model, a project is composed of subprojects or tasks, tasks may be broken into subtasks, and subtasks may themselves be broken up into subtasks. There is no limit to the level of nesting that may occur, and it all occurs within one database table named Projects. Conceptually, a project is simply the highest-level task. The Projects table is defined as follows: CREATE TABLE Projects( Name VARCHAR(20), Cost INTEGER, Parent INTEGER, VertexId INTEGER, Primary key(VertexId) ) With this solution in mind, let's look at a few problems that are common to hierarchical structures, such as the one used here to define projects:
The next few sections present SQL Server solutions to each of these problems. 4.6.2.1 List all leaf nodesLeaf nodes are nodes without children. The subquery in the following SELECT statement checks each node to see if any other nodes claim it as a parent. If none do, then that node is a leaf node: SELECT Name FROM Projects p WHERE NOT EXISTS( SELECT * FROM Projects WHERE Parent=p.VertexId) The result of executing this query is a list of all subprojects (or tasks if you prefer) that have no children: Name -------------------- Interviews Drafts Consolidations Presentation UI Design Correctness Testing Database UI Implementation Coding Initial Testing Final Adjustments Production Testing 4.6.2.2 List all nodes that are not leaf nodesTo list all nodes that are not leaf nodes — in other words, all nodes that have children — begin with the same query used to list leaf nodes and just convert the NOT EXISTS into an EXISTS: SELECT Name FROM Projects p WHERE EXISTS( SELECT * FROM Projects WHERE Parent=p.VertexId) This query lists all nodes with children: Name -------------------- New SW Specifications Final Document Prototype Calculating Development Beta Testing 4.6.2.3 Find the most expensive verticesEach task in the project has a cost associated with it. Each task is also a vertex. To find the N most expensive tasks, simply query the table, order the results by cost, and use the TOP operator to limit the results to the top N rows. For example: SELECT TOP 5 Name, Cost FROM Projects ORDER BY cost DESC Name Cost -------------------- ----------- Inital testing 40 Beta testing 40 Development 30 Coding 20 Production testing 20 This is a good example of a case where it is more productive to issue straight SQL queries, rather than attempt to navigate the hierarchy. 4.6.2.4 Find the immediate children of a nodeGiven a particular node, you can write a query to return that node's immediate children. The following query, for example, returns the subprojects of the project named Specifications: SELECT Name FROM Projects WHERE Parent=( SELECT VertexId FROM Projects WHERE Name='Specifications' ) Name -------------------- Interviews Drafts Consolidations Final document Only immediate children are returned. The parent record for each of the four records shown here is the record for the "Specifications" task. These records may, themselves, have children, but those children won't show up in the results for this query. 4.6.2.5 Find the rootThe root is the vertex with no parents. The query to return it is simple, because all you need is to return the one node with no parent pointer. For example: SELECT Name FROM Projects WHERE Parent=0 Name -------------------- New SW In our example, we use a value of zero to indicate that a node does not have a parent. You could just as easily use a NULL to indicate the same thing, in which case you would specify Parent IS NULL as your WHERE clause condition. 4.6.3 DiscussionAs you can see, these queries work on a single level where, at most, they refer to one level above (to parents) or below (to children). For these types of queries, you don't need any additional algorithms or data structures to retrieve information. The queries are pretty straightforward, and you just have to remember the definition of your hierarchy and its elements to write them. |