12.3 The GROUPING FunctionROLLUP and CUBE produce extra rows in the output that contain subtotals and totals. These rows contain NULL values for one or more columns. An output containing NULLs and indicating subtotals doesn't make sense to an ordinary person who is unware of the behavior of ROLLUP and CUBE operations. Does your VP care about whether you used ROLLUP or CUBE or any other operation to get him the monthly total sales for each region? Obviously, he doesn't. That's exactly why you are reading this page and not your VP. If you know your way around the NVL function, you would probably attempt to translate each NULL value from CUBE and ROLLUP to some descriptive value, as in the following example: SELECT NVL(TO_CHAR(O.YEAR), 'All Years') YEAR, NVL(TO_CHAR(TO_DATE(O.MONTH, 'MM'), 'Month'), 'First Quarter') MONTH, NVL(R.NAME, 'All Regions') REGION, SUM(O.TOT_SALES) FROM ORDERS O, REGION R WHERE R.REGION_ID = O.REGION_ID AND O.MONTH BETWEEN 1 AND 3 GROUP BY ROLLUP (O.YEAR, O.MONTH, R.NAME); YEAR MONTH REGION SUM(O.TOT_SALES) ---------------- ------------- -------------------- ---------------- 2000 January Mid-Atlantic 1221394 2000 January New England 1018430 2000 January SouthEast US 758042 2000 January All Regions 2997866 2000 February Mid-Atlantic 857352 2000 February New England 1231492 2000 February SouthEast US 1236846 2000 February All Regions 3325690 2000 March Mid-Atlantic 1274062 2000 March New England 1132966 2000 March SouthEast US 1311986 2000 March All Regions 3719014 2000 First Quarter All Regions 10042570 2001 January Mid-Atlantic 610697 2001 January New England 509215 2001 January SouthEast US 379021 2001 January All Regions 1498933 2001 February Mid-Atlantic 428676 2001 February New England 615746 2001 February SouthEast US 618423 2001 February All Regions 1662845 2001 March Mid-Atlantic 637031 2001 March New England 566483 2001 March SouthEast US 655993 2001 March All Regions 1859507 2001 First Quarter All Regions 5021285 All Years First Quarter All Regions 15063855 27 rows selected. The NVL function works pretty well for this example. However, if the data itself contains some NULL values, it becomes impossible to distinguish whether a NULL value represents unavailable data or a subtotal row. The NVL function will cause a problem in such a case. The following data can be used to illustrate this problem: SELECT * FROM CUST_ORDER; ORDER_NBR CUST_NBR SALES_EMP_ID SALE_PRICE ORDER_DT EXPECTED_ STATUS ---------- -------- ------------ ---------- --------- --------- ---------- 1001 231 7354 99 22-JUL-01 23-JUL-01 DELIVERED 1000 201 7354 19-JUL-01 24-JUL-01 1002 255 7368 12-JUL-01 25-JUL-01 1003 264 7368 56 16-JUL-01 26-JUL-01 DELIVERED 1004 244 7368 34 18-JUL-01 27-JUL-01 PENDING 1005 288 7368 99 22-JUL-01 24-JUL-01 DELIVERED 1006 231 7354 22-JUL-01 28-JUL-01 1007 255 7368 25 20-JUL-01 22-JUL-01 PENDING 1008 255 7368 25 21-JUL-01 23-JUL-01 PENDING 1009 231 7354 56 18-JUL-01 22-JUL-01 DELIVERED 1012 231 7354 99 22-JUL-01 23-JUL-01 DELIVERED 1011 201 7354 19-JUL-01 24-JUL-01 1015 255 7368 12-JUL-01 25-JUL-01 1017 264 7368 56 16-JUL-01 26-JUL-01 DELIVERED 1019 244 7368 34 18-JUL-01 27-JUL-01 PENDING 1021 288 7368 99 22-JUL-01 24-JUL-01 DELIVERED 1023 231 7354 22-JUL-01 28-JUL-01 1025 255 7368 25 20-JUL-01 22-JUL-01 PENDING 1027 255 7368 25 21-JUL-01 23-JUL-01 PENDING 1029 231 7354 56 18-JUL-01 22-JUL-01 DELIVERED 20 rows selected. Note that the column STATUS contains NULL values. If you want the summary status of orders for each customer, and you executed the following query (note the application of NVL to the STATUS column), the output might surprise you. SELECT NVL(TO_CHAR(CUST_NBR), 'All Customers') CUSTOMER, NVL(STATUS, 'All Status') STATUS, COUNT(*) FROM CUST_ORDER GROUP BY CUBE(CUST_NBR, STATUS); CUSTOMER STATUS COUNT(*) ------------------- -------------------- ---------- 201 All Status 2 201 All Status 2 231 DELIVERED 4 231 All Status 2 231 All Status 6 244 PENDING 2 244 All Status 2 255 PENDING 4 255 All Status 2 255 All Status 6 264 DELIVERED 2 264 All Status 2 288 DELIVERED 2 288 All Status 2 All Customers DELIVERED 8 All Customers PENDING 6 All Customers All Status 6 All Customers All Status 20 18 rows selected. This output doesn't make any sense. You stand a good chance of losing your job if you send this output to your VP. The problem is that any time the STATUS column legitimately contains a NULL value, the NVL function returns the string "All Status". Obviously, NVL isn't useful in this situation. However, don't worry—Oracle8i provides a solution to this problem through the GROUPING function. The GROUPING function is used only in conjunction with either a ROLLUP or a CUBE operation. The GROUPING function takes a grouping column name as input and returns either 1 or 0. A 1 is returned if the value is NULL as the result of aggregation (ROLLUP or CUBE); otherwise, 0 is returned. The general syntax of the GROUPING function is: SELECT ... [GROUPING(grouping_column_name)] ... FROM ... GROUP BY ... {ROLLUP | CUBE} (grouping_column_name) The following example illustrates the use of GROUPING function in a simple way by returning the GROUPING function results for the three columns passed to ROLLUP: SELECT O.YEAR, TO_CHAR(TO_DATE(O.MONTH, 'MM'), 'Month') MONTH, R.NAME REGION, SUM(O.TOT_SALES), GROUPING(O.YEAR) Y, GROUPING(O.MONTH) M, GROUPING(R.NAME) R FROM ORDERS O, REGION R WHERE R.REGION_ID = O.REGION_ID AND O.MONTH BETWEEN 1 AND 3 GROUP BY ROLLUP (O.YEAR, O.MONTH, R.NAME); YEAR MONTH REGION SUM(O.TOT_SALES) Y M R ---------- --------- -------------------- ---------------- ---- ---- ---- 2000 January Mid-Atlantic 1221394 0 0 0 2000 January New England 1018430 0 0 0 2000 January SouthEast US 758042 0 0 0 2000 January 2997866 0 0 1 2000 February Mid-Atlantic 857352 0 0 0 2000 February New England 1231492 0 0 0 2000 February SouthEast US 1236846 0 0 0 2000 February 3325690 0 0 1 2000 March Mid-Atlantic 1274062 0 0 0 2000 March New England 1132966 0 0 0 2000 March SouthEast US 1311986 0 0 0 2000 March 3719014 0 0 1 2000 10042570 0 1 1 2001 January Mid-Atlantic 610697 0 0 0 2001 January New England 509215 0 0 0 2001 January SouthEast US 379021 0 0 0 2001 January 1498933 0 0 1 2001 February Mid-Atlantic 428676 0 0 0 2001 February New England 615746 0 0 0 2001 February SouthEast US 618423 0 0 0 2001 February 1662845 0 0 1 2001 March Mid-Atlantic 637031 0 0 0 2001 March New England 566483 0 0 0 2001 March SouthEast US 655993 0 0 0 2001 March 1859507 0 0 1 2001 5021285 0 1 1 15063855 1 1 1 27 rows selected. Look at the Y, M, and R columns in this output. Row 4 is a region-level subtotal for a particular month and year, and therefore, the GROUPING function results in a value of 1 for the region and a value 0 for the month and year. Row 26 (the second to last) is a subtotal for all regions and months for a particular year, and therefore, the GROUPING function prints 1 for the month and the region and 0 for the year. Row 27 (the grand total) contains 1 for all the GROUPING columns. With a combination of GROUPING and DECODE, you can produce more readable query output when using CUBE and ROLLUP, as in the following example: SELECT DECODE(GROUPING(O.YEAR), 1, 'All Years', O.YEAR) Year, DECODE(GROUPING(O.MONTH), 1, 'All Months', TO_CHAR(TO_DATE(O.MONTH, 'MM'), 'Month')) Month, DECODE(GROUPING(R.NAME), 1, 'All Regions', R.NAME) Region, SUM(O.TOT_SALES) FROM ORDERS O, REGION R WHERE R.REGION_ID = O.REGION_ID AND O.MONTH BETWEEN 1 AND 3 GROUP BY ROLLUP (O.YEAR, O.MONTH, R.NAME); YEAR MONTH REGION SUM(O.TOT_SALES) ------------ ---------- -------------------- ---------------- 2000 January Mid-Atlantic 1221394 2000 January New England 1018430 2000 January SouthEast US 758042 2000 January All Regions 2997866 2000 February Mid-Atlantic 857352 2000 February New England 1231492 2000 February SouthEast US 1236846 2000 February All Regions 3325690 2000 March Mid-Atlantic 1274062 2000 March New England 1132966 2000 March SouthEast US 1311986 2000 March All Regions 3719014 2000 All Months All Regions 10042570 2001 January Mid-Atlantic 610697 2001 January New England 509215 2001 January SouthEast US 379021 2001 January All Regions 1498933 2001 February Mid-Atlantic 428676 2001 February New England 615746 2001 February SouthEast US 618423 2001 February All Regions 1662845 2001 March Mid-Atlantic 637031 2001 March New England 566483 2001 March SouthEast US 655993 2001 March All Regions 1859507 2001 All Months All Regions 5021285 All Years All Months All Regions 15063855 27 rows selected. By using DECODE with GROUPING, you produced the same result that was produced by using NVL at the beginning of the section. However, the risk of mistreating a NULL data value as a summary row is eliminated by using GROUPING and DECODE. You will notice this in the following example, in which NULL data values in subtotal and total rows are treated differently by the GROUPING function than the NULL values in the summary rows. SELECT DECODE(GROUPING(CUST_NBR), 1, 'All Customers', CUST_NBR) CUSTOMER, DECODE(GROUPING(STATUS), 1, 'All Status', STATUS) STATUS, COUNT(*) FROM CUST_ORDER GROUP BY CUBE(CUST_NBR, STATUS); CUSTOMER STATUS COUNT(*) ---------------------------------------- -------------------- ---------- 201 2 201 All Status 2 231 DELIVERED 4 231 2 231 All Status 6 244 PENDING 2 244 All Status 2 255 PENDING 4 255 2 255 All Status 6 264 DELIVERED 2 264 All Status 2 288 DELIVERED 2 288 All Status 2 All Customers DELIVERED 8 All Customers PENDING 6 All Customers 6 All Customers All Status 20 18 rows selected.
|