Team LiB   Previous Section   Next Section

8.3 Oracle SQL Extensions

In the last few examples, we saw how we can perform some operations on the hierarchical tree by using simple SQL techniques. Operations such as traversing a tree, finding levels, etc., require more complex SQL statements, and also require the use of features designed specifically for working with hierarchical data. Oracle provides some extensions to ANSI SQL to facilitate these operations. But before moving to the Oracle SQL extensions, let's look at how we can traverse a tree using ANSI SQL, and at the problems we'll encounter when doing that.

For example, let's say we want to list each employee with his manager. Using regular Oracle SQL, we can perform self outer joins on the EMPLOYEE table, as shown here:

SELECT E_TOP.LNAME, E_2.LNAME, E_3.LNAME, E_4.LNAME
FROM EMPLOYEE E_TOP, EMPLOYEE E_2, EMPLOYEE E_3, EMPLOYEE E_4
WHERE E_TOP.MANAGER_EMP_ID IS NULL
AND E_TOP.EMP_ID = E_2.MANAGER_EMP_ID (+)
AND E_2.EMP_ID = E_3.MANAGER_EMP_ID (+)
AND E_3.EMP_ID = E_4.MANAGER_EMP_ID (+);

LNAME      LNAME      LNAME      LNAME   
---------- ---------- ---------- ----------
KING       BLAKE      ALLEN
KING       BLAKE      WARD
KING       BLAKE      MARTIN
KING       JONES      SCOTT      ADAMS
KING       BLAKE      TURNER
KING       BLAKE      JAMES
KING       JONES      FORD       SMITH
KING       CLARK      MILLER

8 rows selected.

The query returns eight rows, corresponding to the eight branches of the tree. To get those results, the query performs a self join on four instances of the EMPLOYEE table. Four EMPLOYEE table instances are needed in this statement because there are four levels to the hierarchy. Each level is represented by one copy of the EMPLOYEE table. The outer join is required because one employee (KING) has a NULL value in the MANAGER_EMP_ID column.

This type query has several drawbacks. First of all, we need to know the number of levels in an organization chart when we write the query, and it's not realistic to assume that we will know that information. It's even less realistic to think that the number of levels will remain stable over time. Moreover, we need to join four instances of the EMPLOYEE table together for a four level hierarchy. Imagine an organization with 20 levels—we'd need to join 20 tables. This would cause a huge performance problem.

To circumvent problems such as these, Oracle has provided some extensions to ANSI SQL. Oracle provides the following three constructs to effectively and efficiently perform hierarchical queries:

  • The START WITH...CONNECT BY clause

  • The PRIOR operator

  • The LEVEL pseudocolumn

The following sections discuss these three Oracle extensions in detail.

8.3.1 START WITH...CONNECT BY and PRIOR

We can extract information in hierarchical form from a table containing hierarchical data by using the SELECT statement's START WITH...CONNECT BY clause. The syntax for this clause is:

[[START WITH condition1]  CONNECT BY condition2]

The syntax elements are:

START WITH condition1

Specifies the root row(s) of the hierarchy. All rows that satisfy condition1 are considered root rows. If we don't specify the START WITH clause, all rows are considered root rows, which is usually not desirable. We can include a subquery in condition1.

CONNECT BY condition2

Specifies the relationship between parent rows and child rows in the hierarchy. The relationship is expressed as a comparison expression, where columns from the current row are compared to corresponding parent columns. condition2 must contain the PRIOR operator, which is used to identify columns from the parent row. condition2 cannot contain a subquery.

PRIOR is a built-in Oracle SQL operator that is used with hierarchical queries only. In a hierarchical query, the CONNECT BY clause specifies the relationship between parent and child rows. When we use the PRIOR operator in an expression in the CONNECT BY condition, the expression following the PRIOR keyword is evaluated for the parent row of the current row in the query. In the following example, PRIOR is used to connect each row to its parent by connecting MANAGER_EMP_ID in the child to EMP_ID in the parent:

SELECT LNAME, EMP_ID, MANAGER_EMP_ID
FROM EMPLOYEE
START WITH MANAGER_EMP_ID IS NULL
CONNECT BY PRIOR EMP_ID = MANAGER_EMP_ID;

LNAME                    EMP_ID MANAGER_EMP_ID
-------------------- ---------- --------------
KING                       7839
JONES                      7566           7839
SCOTT                      7788           7566
ADAMS                      7876           7788
FORD                       7902           7566
SMITH                      7369           7902
BLAKE                      7698           7839
ALLEN                      7499           7698
WARD                       7521           7698
MARTIN                     7654           7698
TURNER                     7844           7698
JAMES                      7900           7698
CLARK                      7782           7839
MILLER                     7934           7782

14 rows selected.

The PRIOR column does not need to be listed first. The previous query could be restated as:

SELECT LNAME, EMP_ID, MANAGER_EMP_ID
FROM EMPLOYEE
START WITH MANAGER_EMP_ID IS NULL
CONNECT BY MANAGER_EMP_ID = PRIOR EMP_ID;

Since the CONNECT BY condition specifies the parent-child relationship, it cannot contain a loop. If a row is both parent (direct ancestor) and child (direct descendent) of another row, then we have a loop. For example, if the EMPLOYEE table had the following two rows, they would represent a loop:

EMP_ID LNAME        DEPT_ID MANAGER_EMP_ID    SALARY HIRE_DATE 
------ ---------- --------- -------------- --------- --------- 
  9001 SMITH             20           9002      1800 15-NOV-61 
  9002 ALLEN             30           9001     11600 16-NOV-61 

When a parent-child relationship involves two or more columns, we need to use the PRIOR operator before each parent column. Let's take as an example an assembly in a manufacturing plant. An assembly may consist of several subassemblies, and a given subassembly may further contain one or more subassemblies. All of these are stored in a table, ASSEMBLY:

DESC ASSEMBLY
 Name                       Null?    Type
 -------------------------- -------- --------------
 ASSEMBLY_TYPE              NOT NULL VARCHAR2(4)
 ASSEMBLY_ID                NOT NULL NUMBER(6)
 DESCRIPTION                NOT NULL VARCHAR2(20)
 PARENT_ASSEMBLY_TYPE                VARCHAR2(4)
 PARENT_ASSEMBLY_ID                  NUMBER(6)

ASSEMBLY_TYPE and ASSEMBLY_ID constitute the primary key of this table, and the columns PARENT_ASSEMBLY_TYPE and PARENT_ASSEMBLY_ID together constitute the self-referential foreign key. Therefore, if we want to perform a hierarchical query on this table, we need to include both columns in the START WITH and the CONNECT BY clauses. Also, we need to use the PRIOR operator before each parent column, as shown in the following example:

SELECT * FROM ASSEMBLY
START WITH PARENT_ASSEMBLY_TYPE IS NULL
AND PARENT_ASSEMBLY_ID IS NULL
CONNECT BY PARENT_ASSEMBLY_TYPE = PRIOR ASSEMBLY_TYPE
AND PARENT_ASSEMBLY_ID = PRIOR ASSEMBLY_ID;

ASSE ASSEMBLY_ID DESCRIPTION          PARE PARENT_ASSEMBLY_ID
---- ----------- -------------------- ---- ------------------
A           1234 Assembly A#1234
A           1256 Assembly A#1256      A                  1234
B           6543 Part Unit#6543       A                  1234
A           1675 Part Unit#1675       B                  6543
X           9943 Repair Zone 1
X           5438 Repair Unit #5438    X                  9943
X           1675 Readymade Unit #1675 X                  5438

7 rows selected.

8.3.2 The LEVEL Pseudocolumn

In a hierarchy tree, the term level refers to one layer of nodes. For example, in Figure 8-1, the root node (consisting of employee KING) is level 1. The next layer (employees JONES, BLAKE, CLARK) is at level 2, and so forth. Oracle provides a pseudocolumn, LEVEL, to represent these levels in a hierarchy tree. Whenever we use the START WITH...CONNECT BY clauses in a hierarchical query, we can use the pseudocolumn LEVEL to return the level number for each row returned by the query. The following example illustrates the use of the LEVEL pseudocolumn:

SELECT LEVEL, LNAME, EMP_ID, MANAGER_EMP_ID
FROM EMPLOYEE
START WITH MANAGER_EMP_ID IS NULL
CONNECT BY MANAGER_EMP_ID = PRIOR EMP_ID;

     LEVEL LNAME                    EMP_ID MANAGER_EMP_ID
---------- -------------------- ---------- --------------
         1 KING                       7839
         2 JONES                      7566           7839
         3 SCOTT                      7788           7566
         4 ADAMS                      7876           7788
         3 FORD                       7902           7566
         4 SMITH                      7369           7902
         2 BLAKE                      7698           7839
         3 ALLEN                      7499           7698
         3 WARD                       7521           7698
         3 MARTIN                     7654           7698
         3 TURNER                     7844           7698
         3 JAMES                      7900           7698
         2 CLARK                      7782           7839
         3 MILLER                     7934           7782

14 rows selected.

Note that each employee is now associated with a number, represented by the pseudocolumn LEVEL, that corresponds to its level in the organization chart (see Figure 8-1).

    Team LiB   Previous Section   Next Section