8.3 Oracle SQL ExtensionsIn the last few examples, we saw how we can perform some operations on the hierarchical tree by using simple SQL techniques. Operations such as traversing a tree, finding levels, etc., require more complex SQL statements, and also require the use of features designed specifically for working with hierarchical data. Oracle provides some extensions to ANSI SQL to facilitate these operations. But before moving to the Oracle SQL extensions, let's look at how we can traverse a tree using ANSI SQL, and at the problems we'll encounter when doing that. For example, let's say we want to list each employee with his manager. Using regular Oracle SQL, we can perform self outer joins on the EMPLOYEE table, as shown here: SELECT E_TOP.LNAME, E_2.LNAME, E_3.LNAME, E_4.LNAME FROM EMPLOYEE E_TOP, EMPLOYEE E_2, EMPLOYEE E_3, EMPLOYEE E_4 WHERE E_TOP.MANAGER_EMP_ID IS NULL AND E_TOP.EMP_ID = E_2.MANAGER_EMP_ID (+) AND E_2.EMP_ID = E_3.MANAGER_EMP_ID (+) AND E_3.EMP_ID = E_4.MANAGER_EMP_ID (+); LNAME LNAME LNAME LNAME ---------- ---------- ---------- ---------- KING BLAKE ALLEN KING BLAKE WARD KING BLAKE MARTIN KING JONES SCOTT ADAMS KING BLAKE TURNER KING BLAKE JAMES KING JONES FORD SMITH KING CLARK MILLER 8 rows selected. The query returns eight rows, corresponding to the eight branches of the tree. To get those results, the query performs a self join on four instances of the EMPLOYEE table. Four EMPLOYEE table instances are needed in this statement because there are four levels to the hierarchy. Each level is represented by one copy of the EMPLOYEE table. The outer join is required because one employee (KING) has a NULL value in the MANAGER_EMP_ID column. This type query has several drawbacks. First of all, we need to know the number of levels in an organization chart when we write the query, and it's not realistic to assume that we will know that information. It's even less realistic to think that the number of levels will remain stable over time. Moreover, we need to join four instances of the EMPLOYEE table together for a four level hierarchy. Imagine an organization with 20 levels—we'd need to join 20 tables. This would cause a huge performance problem. To circumvent problems such as these, Oracle has provided some extensions to ANSI SQL. Oracle provides the following three constructs to effectively and efficiently perform hierarchical queries:
The following sections discuss these three Oracle extensions in detail. 8.3.1 START WITH...CONNECT BY and PRIORWe can extract information in hierarchical form from a table containing hierarchical data by using the SELECT statement's START WITH...CONNECT BY clause. The syntax for this clause is: [[START WITH condition1] CONNECT BY condition2] The syntax elements are:
PRIOR is a built-in Oracle SQL operator that is used with hierarchical queries only. In a hierarchical query, the CONNECT BY clause specifies the relationship between parent and child rows. When we use the PRIOR operator in an expression in the CONNECT BY condition, the expression following the PRIOR keyword is evaluated for the parent row of the current row in the query. In the following example, PRIOR is used to connect each row to its parent by connecting MANAGER_EMP_ID in the child to EMP_ID in the parent: SELECT LNAME, EMP_ID, MANAGER_EMP_ID FROM EMPLOYEE START WITH MANAGER_EMP_ID IS NULL CONNECT BY PRIOR EMP_ID = MANAGER_EMP_ID; LNAME EMP_ID MANAGER_EMP_ID -------------------- ---------- -------------- KING 7839 JONES 7566 7839 SCOTT 7788 7566 ADAMS 7876 7788 FORD 7902 7566 SMITH 7369 7902 BLAKE 7698 7839 ALLEN 7499 7698 WARD 7521 7698 MARTIN 7654 7698 TURNER 7844 7698 JAMES 7900 7698 CLARK 7782 7839 MILLER 7934 7782 14 rows selected. The PRIOR column does not need to be listed first. The previous query could be restated as: SELECT LNAME, EMP_ID, MANAGER_EMP_ID FROM EMPLOYEE START WITH MANAGER_EMP_ID IS NULL CONNECT BY MANAGER_EMP_ID = PRIOR EMP_ID; Since the CONNECT BY condition specifies the parent-child relationship, it cannot contain a loop. If a row is both parent (direct ancestor) and child (direct descendent) of another row, then we have a loop. For example, if the EMPLOYEE table had the following two rows, they would represent a loop: EMP_ID LNAME DEPT_ID MANAGER_EMP_ID SALARY HIRE_DATE ------ ---------- --------- -------------- --------- --------- 9001 SMITH 20 9002 1800 15-NOV-61 9002 ALLEN 30 9001 11600 16-NOV-61 When a parent-child relationship involves two or more columns, we need to use the PRIOR operator before each parent column. Let's take as an example an assembly in a manufacturing plant. An assembly may consist of several subassemblies, and a given subassembly may further contain one or more subassemblies. All of these are stored in a table, ASSEMBLY: DESC ASSEMBLY Name Null? Type -------------------------- -------- -------------- ASSEMBLY_TYPE NOT NULL VARCHAR2(4) ASSEMBLY_ID NOT NULL NUMBER(6) DESCRIPTION NOT NULL VARCHAR2(20) PARENT_ASSEMBLY_TYPE VARCHAR2(4) PARENT_ASSEMBLY_ID NUMBER(6) ASSEMBLY_TYPE and ASSEMBLY_ID constitute the primary key of this table, and the columns PARENT_ASSEMBLY_TYPE and PARENT_ASSEMBLY_ID together constitute the self-referential foreign key. Therefore, if we want to perform a hierarchical query on this table, we need to include both columns in the START WITH and the CONNECT BY clauses. Also, we need to use the PRIOR operator before each parent column, as shown in the following example: SELECT * FROM ASSEMBLY START WITH PARENT_ASSEMBLY_TYPE IS NULL AND PARENT_ASSEMBLY_ID IS NULL CONNECT BY PARENT_ASSEMBLY_TYPE = PRIOR ASSEMBLY_TYPE AND PARENT_ASSEMBLY_ID = PRIOR ASSEMBLY_ID; ASSE ASSEMBLY_ID DESCRIPTION PARE PARENT_ASSEMBLY_ID ---- ----------- -------------------- ---- ------------------ A 1234 Assembly A#1234 A 1256 Assembly A#1256 A 1234 B 6543 Part Unit#6543 A 1234 A 1675 Part Unit#1675 B 6543 X 9943 Repair Zone 1 X 5438 Repair Unit #5438 X 9943 X 1675 Readymade Unit #1675 X 5438 7 rows selected. 8.3.2 The LEVEL PseudocolumnIn a hierarchy tree, the term level refers to one layer of nodes. For example, in Figure 8-1, the root node (consisting of employee KING) is level 1. The next layer (employees JONES, BLAKE, CLARK) is at level 2, and so forth. Oracle provides a pseudocolumn, LEVEL, to represent these levels in a hierarchy tree. Whenever we use the START WITH...CONNECT BY clauses in a hierarchical query, we can use the pseudocolumn LEVEL to return the level number for each row returned by the query. The following example illustrates the use of the LEVEL pseudocolumn: SELECT LEVEL, LNAME, EMP_ID, MANAGER_EMP_ID FROM EMPLOYEE START WITH MANAGER_EMP_ID IS NULL CONNECT BY MANAGER_EMP_ID = PRIOR EMP_ID; LEVEL LNAME EMP_ID MANAGER_EMP_ID ---------- -------------------- ---------- -------------- 1 KING 7839 2 JONES 7566 7839 3 SCOTT 7788 7566 4 ADAMS 7876 7788 3 FORD 7902 7566 4 SMITH 7369 7902 2 BLAKE 7698 7839 3 ALLEN 7499 7698 3 WARD 7521 7698 3 MARTIN 7654 7698 3 TURNER 7844 7698 3 JAMES 7900 7698 2 CLARK 7782 7839 3 MILLER 7934 7782 14 rows selected. Note that each employee is now associated with a number, represented by the pseudocolumn LEVEL, that corresponds to its level in the organization chart (see Figure 8-1). |