9.3 DECODE and CASE ExamplesThe following sections present a variety of examples illustrating the uses of conditional logic in SQL statements. While we recommend that you use the CASE expression rather than the DECODE function, where feasible we provide both DECODE and CASE versions of each example to help illustrate the differences between the two approaches. 9.3.1 Result Set TransformationsYou may have run into a situation where you are performing aggregations over a finite set of values, such as days of the week or months of the year, but you want the result set to contain one row with N columns rather than N rows with two columns. Consider the following query, which aggregates sales data for each quarter of 2001: SELECT TO_CHAR(order_dt, 'Q') sales_quarter, SUM(sale_price) tot_sales FROM cust_order WHERE order_dt >= TO_DATE('01-JAN-2001','DD-MON-YYYY') AND order_dt < TO_DATE('01-JAN-2002','DD-MON-YYYY') GROUP BY TO_CHAR(order_dt, 'Q') ORDER BY 1; S TOT_SALES - ---------- 1 9739328 2 10379833 3 9703114 4 9772633 In order to transform this result set into a single row with four columns, we need to fabricate a column for each quarter of the year and, within each column, sum only those records whose order date falls in the desired quarter. We can do that with DECODE: SELECT SUM(DECODE(TO_CHAR(order_dt, 'Q'), '1', sale_price, 0)) Q_1, SUM(DECODE(TO_CHAR (order_dt, 'Q'), '2', sale_price, 0)) Q_2, SUM(DECODE(TO_CHAR (order_dt, 'Q'), '3', sale_price, 0)) Q_3, SUM(DECODE(TO_CHAR (order_dt, 'Q'), '4', sale_price, 0)) Q_4 FROM cust_order WHERE order_dt >= TO_DATE('01-JAN-2001','DD-MON-YYYY') AND order_dt < TO_DATE('01-JAN-2002','DD-MON-YYYY'); Q_1 Q_2 Q_3 Q_4 ---------- ---------- ---------- ---------- 9739328 10379833 9703114 9772633 Each of the four columns in the previous query are identical, except for the quarter being checked by the DECODE function. For the Q_1 column, for example, a value of 0 is returned unless the order falls in the first quarter, in which case the sale_price column is returned. When the values from all orders in 2001 are summed, only the first quarter orders are added to the total (for Q_1), which has the effect of summing all first quarter orders while ignoring orders for quarters 2, 3, and 4. The same logic is used for Q_2, Q_3, and Q_4 to sum orders for quarters 2, 3, and 4 respectively. The CASE version of this query is as follows: SELECT SUM(CASE WHEN TO_CHAR(order_dt, 'Q') = '1' THEN sale_price ELSE 0 END) Q_1, SUM(CASE WHEN TO_CHAR(order_dt, 'Q') = '2' THEN sale_price ELSE 0 END) Q_2, SUM(CASE WHEN TO_CHAR(order_dt, 'Q') = '3' THEN sale_price ELSE 0 END) Q_3, SUM(CASE WHEN TO_CHAR(order_dt, 'Q') = '4' THEN sale_price ELSE 0 END) Q_4 FROM cust_order WHERE order_dt >= TO_DATE('01-JAN-2001','DD-MON-YYYY') AND order_dt < TO_DATE('01-JAN-2002','DD-MON-YYYY'); Q_1 Q_2 Q_3 Q_4 ---------- ---------- ---------- ---------- 9739328 10379833 9703114 9772633 Obviously, such transformations are only practical when the number of values is relatively small. Aggregating sales for each quarter or month works fine, but expanding the query to aggregate sales for each week, with a column for each week, would quickly become tedious. 9.3.2 Selective Function ExecutionImagine you're generating an inventory report. Most of the information resides in your local database, but a trip across a gateway to an external, non-Oracle database is required to gather information for parts supplied by Acme Industries. The round trip from your database through the gateway to the external server and back takes 1.5 seconds on average. There are 10,000 parts in your database, but only 100 require information via the gateway. You create a user-defined function called get_resupply_date to retrieve the resupply date for parts supplied by ACME, and include it in your query: SELECT s.name supplier_name, p.name part_name, p.part_nbr part_number p.inventory_qty in_stock, p.resupply_date resupply_date, my_pkg.get_resupply_date(p.part_nbr) acme_resupply_date FROM part p, supplier s WHERE p.supplier_id = s.supplier_id; You then include logic in your reporting tool to use the acme_resupply_date instead of the resupply_date column if the supplier's name is Acme Industries. You kick off the report, sit back, and wait for the results. And wait. And wait... Unfortunately, the server is forced to make 10,000 trips across the gateway when only 100 are required. In these types of situations, it is far more efficient to call the function only when necessary, instead of always calling the function and discarding the results when not needed: SELECT s.name supplier_name, p.name part_name, p.part_nbr part_number, p.inventory_qty in_stock, DECODE(s.name, 'Acme Industries', my_pkg.get_resupply_date(p.part_nbr), p.resupply_date) resupply_date FROM part p, supplier s WHERE p.supplier_id = s.supplier_id; The DECODE function checks if the supplier name is 'Acme Industries'. If so, it calls the function to retrieve the resupply date via the gateway; otherwise, it returns the resupply date from the local part table. The CASE version of this query is as follows: SELECT s.name supplier_name, p.name part_name, p.part_nbr part_number, p.inventory_qty in_stock, CASE WHEN s.name = 'Acme Industries' THEN my_pkg.get_resupply_date(p.part_nbr) ELSE p.resupply_date END resupply_date FROM part p, supplier s WHERE p.supplier_id = s.supplier_id; Now the user-defined function is only executed if the supplier is Acme, reducing the query's execution time drastically. For more information on calling user-defined functions from SQL, see Chapter 11. 9.3.3 Conditional UpdateIf your database design includes denormalizations, you may run nightly routines to populate the denormalized columns. For example, the part table contains the denormalized column status, the value for which is derived from the inventory_qty and resupply_date columns. To update the status column, you could run four separate UPDATE statements each night, one for each of the four possible values for the status column For example: UPDATE part SET status = 'INSTOCK' WHERE inventory_qty > 0; UPDATE part SET status = 'ENROUTE' WHERE inventory_qty = 0 AND resupply_date < SYSDATE + 5; UPDATE part SET status = 'BACKORD' WHERE inventory_qty = 0 AND resupply_date > SYSDATE + 5; UPDATE part SET status = 'UNAVAIL' WHERE inventory_qty = 0 and resupply_date IS NULL; Given that columns such as inventory_qty and resupply_date are unlikely to be indexed, each of the four UPDATE statements would require a full table-scan of the part table. By adding conditional expressions to the statement, however, the four UPDATE statements can be combined, resulting in a single scan of the part table: UPDATE part SET status = DECODE(inventory_qty, 0, DECODE(resupply_date, NULL, 'UNAVAIL', DECODE(LEAST(resupply_date, SYSDATE + 5), resupply_date, 'ENROUTE', 'BACKORD')), 'INSTOCK'); The CASE version of this UPDATE is as follows: UPDATE part SET status = CASE WHEN inventory_qty > 0 THEN 'INSTOCK' WHEN resupply_date IS NULL THEN 'UNAVAIL' WHEN resupply_date < SYSDATE + 5 THEN 'ENROUTE' WHEN resupply_date > SYSDATE + 5 THEN 'BACKORD' ELSE 'UNKNOWN' END; The readability advantage of the CASE expression is especially apparent here, since the DECODE version requires three nested levels to implement the same conditional logic handled by a single CASE expression. 9.3.4 Optional UpdateIn some situations, you may need to modify data only if certain conditions exist. For example, you have a table that records information such as the total number of orders and the largest order booked during the current month. Here's the table definition: describe mtd_orders; Name Null? Type ----------------------------------------- -------- ------------ TOT_ORDERS NOT NULL NUMBER(7) TOT_SALE_PRICE NOT NULL NUMBER(11,2) MAX_SALE_PRICE NOT NULL NUMBER(9,2) Each night, the table is updated with that day's order information. While most of the columns will be modified each night, the column for the largest order, which is called max_sale_price, will only change if one of the day's orders exceeds the current value of the column. The following PL/SQL block shows how this might be accomplished using a procedural language: DECLARE tot_ord NUMBER; tot_price NUMBER; max_price NUMBER; prev_max_price NUMBER; BEGIN SELECT COUNT(*), SUM(sale_price), MAX(sale_price) INTO tot_ord, tot_price, max_price FROM cust_order WHERE cancelled_dt IS NULL AND order_dt >= TRUNC(SYSDATE); UPDATE mtd_orders SET tot_orders = tot_orders + tot_ord, tot_sale_price = tot_sale_price + tot_price RETURNING max_sale_price INTO prev_max_price; IF max_price > prev_max_price THEN UPDATE mtd_orders SET max_sale_price = max_price; END IF; END; After calculating the total number of orders, the aggregate order price, and the maximum order price for the current day, the tot_orders and tot_sale_price columns of the mtd_orders table are modified with today's sales data. After the update is complete, the maximum sale price is returned from mtd_orders so that it can be compared with today's maximum sale price. If today's max_sale_price exceeds that stored in the mtd_orders table, a second UPDATE statement is executed to update the field. Using DECODE or CASE, however, we can update the tot_orders and tot_sale_price columns and optionally update the max_sale_price column in the same UPDATE statement. Additionally, since we now have a single UPDATE statement, we can aggregate the data from the cust_order table within a subquery and eliminate the need for PL/SQL: UPDATE mtd_orders mtdo SET (mtdo.tot_orders, mtdo.tot_sale_price, mtdo.max_sale_price) = (SELECT mtdo.tot_orders + day_tot.tot_orders, mtdo.tot_sale_price + NVL(day_tot.tot_sale_price, 0), DECODE(GREATEST(mtdo.max_sale_price, NVL(day_tot.max_sale_price, 0)), mtdo.max_sale_price, mtdo.max_sale_price, day_tot.max_sale_price) FROM (SELECT COUNT(*) tot_orders, SUM(sale_price) tot_sale_price, MAX(sale_price) max_sale_price FROM cust_order WHERE cancelled_dt IS NULL AND order_dt >= TRUNC(SYSDATE)) day_tot); In this statement, the max_sale_price column is set equal to itself unless the value returned from the subquery is greater than the current column value, in which case the column is set to the value returned from the subquery. The next statement uses CASE to perform the same optional update: UPDATE mtd_orders mtdo SET (mtdo.tot_orders, mtdo.tot_sale_price, mtdo.max_sale_price) = (SELECT mtdo.tot_orders + day_tot.tot_orders, mtdo.tot_sale_price + day_tot.tot_sale_price, CASE WHEN day_tot.max_sale_price > mtdo.max_sale_price THEN day_tot.max_sale_price ELSE mtdo.max_sale_price END FROM (SELECT COUNT(*) tot_orders, SUM(sale_price) tot_sale_price, MAX(sale_price) max_sale_price FROM cust_order WHERE cancelled_dt IS NULL AND order_dt >= TRUNC(SYSDATE)) day_tot); One thing to keep in mind when using this approach is that setting a value equal to itself is still seen as a modification by the database and may trigger an audit record, a new value for the last_modified_date column, etc. 9.3.5 Selective AggregationTo expand on the mtd_orders example in the previous section, imagine that you also want to store total sales for particular regions such as Europe and North America. You could modify the mtd_orders table to look as follows. Note the addition of three columns for European sales, and three columns for North American Sales. Name Null? Type ----------------------------------------- -------- ------------ TOT_ORDERS NOT NULL NUMBER(7) TOT_SALE_PRICE NOT NULL NUMBER(11,2) MAX_SALE_PRICE NOT NULL NUMBER(9,2) EUROPE_TOT_ORDERS NOT NULL NUMBER(7) EUROPE_TOT_SALE_PRICE NOT NULL NUMBER(11,2) EUROPE_MAX_SALE_PRICE NOT NULL NUMBER(9,2) NORTHAMERICA_TOT_ORDERS NOT NULL NUMBER(7) NORTHAMERICA_TOT_SALE_PRICE NOT NULL NUMBER(11,2) NORTHAMERICA_MAX_SALE_PRICE NOT NULL NUMBER(9,2) For the new columns, individual orders will affect one set of columns or the other, but not both. An order will either be for a European or North American customer, but not for both at the same time. To populate these new columns, you could generate two more update statements, each targeted to a particular region, as in: /* Europe buckets */ UPDATE mtd_orders mtdo SET (mtdo.europe_tot_orders, mtdo.europe_tot_sale_price, mtdo.europe_max_sale_price) = (SELECT mtdo.europe_tot_orders + eur_day_tot.tot_orders, mtdo.europe_tot_sale_price + nvl(eur_day_tot.tot_sale_price, 0), CASE WHEN eur_day_tot.max_sale_price > mtdo.europe_max_sale_price THEN eur_day_tot.max_sale_price ELSE mtdo.europe_max_sale_price END FROM (SELECT COUNT(*) tot_orders, SUM(co.sale_price) tot_sale_price, MAX(co.sale_price) max_sale_price FROM cust_order co, customer c WHERE co.cancelled_dt IS NULL AND co.order_dt >= TRUNC(SYSDATE) AND co.cust_nbr = c.cust_nbr AND c.region_id IN (SELECT region_id FROM region START WITH name = 'Europe' CONNECT BY PRIOR region_id = super_region_id)) eur_day_tot); /* North America buckets */ UPDATE mtd_orders mtdo SET (mtdo.northamerica_tot_orders, mtdo. northamerica_tot_sale_price, mtdo.northamerica_max_sale_price) = (SELECT mtdo.northamerica_tot_orders + na_day_tot.tot_orders, mtdo.northamerica_tot_sale_price + nvl(na_day_tot.tot_sale_price, 0), CASE WHEN na_day_tot.max_sale_price > mtdo.northamerica_max_sale_price THEN na_day_tot.max_sale_price ELSE mtdo.northamerica_max_sale_price END FROM (SELECT COUNT(*) tot_orders, SUM(co.sale_price) tot_sale_price, MAX(co.sale_price) max_sale_price FROM cust_order co, customer c WHERE co.cancelled_dt IS NULL AND co.order_dt >= TRUNC(SYSDATE) - 60 AND co.cust_nbr = c.cust_nbr AND c.region_id IN (SELECT region_id FROM region START WITH name = 'North America' CONNECT BY PRIOR region_id = super_region_id)) na_day_tot); However, why not save yourself a trip through the cust_order table and aggregate the North American and European totals at the same time? The trick here is to put conditional logic within the aggregation functions so that only the appropriate rows influence each calculation. This approach is similar to Section 9.3.1 earlier in the chapter, in that it selectively aggregates data based on data stored in the table: UPDATE mtd_orders mtdo SET (mtdo.northamerica_tot_orders, mtdo. northamerica_tot_sale_price, mtdo.northamerica_max_sale_price, mtdo.europe_tot_orders, mtdo.europe_tot_sale_price, mtdo.europe_max_sale_price) = (SELECT mtdo.northamerica_tot_orders + nvl(day_tot.na_tot_orders, 0), mtdo.northamerica_tot_sale_price + nvl(day_tot.na_tot_sale_price, 0), CASE WHEN day_tot.na_max_sale_price > mtdo.northamerica_max_sale_price THEN day_tot.na_max_sale_price ELSE mtdo.northamerica_max_sale_price END, mtdo.europe_tot_orders + nvl(day_tot.eur_tot_orders, 0), mtdo.europe_tot_sale_price + nvl(day_tot.eur_tot_sale_price, 0), CASE WHEN day_tot.eur_max_sale_price > mtdo.europe_max_sale_price THEN day_tot.eur_max_sale_price ELSE mtdo.europe_max_sale_price END FROM (SELECT SUM(CASE WHEN na_regions.region_id IS NOT NULL THEN 1 ELSE 0 END) na_tot_orders, SUM(CASE WHEN na_regions.region_id IS NOT NULL THEN co.sale_price ELSE 0 END) na_tot_sale_price, MAX(CASE WHEN na_regions.region_id IS NOT NULL THEN co.sale_price ELSE 0 END) na_max_sale_price, SUM(CASE WHEN eur_regions.region_id IS NOT NULL THEN 1 ELSE 0 END) eur_tot_orders, SUM(CASE WHEN eur_regions.region_id IS NOT NULL THEN co.sale_price ELSE 0 END) eur_tot_sale_price, MAX(CASE WHEN eur_regions.region_id IS NOT NULL THEN co.sale_price ELSE 0 END) eur_max_sale_price FROM cust_order co, customer c, (SELECT region_id FROM region START WITH name = 'North America' CONNECT BY PRIOR region_id = super_region_id) na_regions, (SELECT region_id FROM region START WITH name = 'Europe' CONNECT BY PRIOR region_id = super_region_id) eur_regions WHERE co.cancelled_dt IS NULL AND co.order_dt >= TRUNC(SYSDATE) AND co.cust_nbr = c.cust_nbr AND c.region_id = na_regions.region_id (+) AND c.region_id = eur_regions.region_id (+)) day_tot); This is a fairly robust statement, so let's break it down. Within the day_tot inline view, you are joining the cust_order table to the customer table, and then outer-joining from customer.region_id to each of two inline views (na_regions and eur_regions) that perform hierarchical queries on the region table. Thus, orders from European customers will have a non-null value for eur_regions.region_id, since the outer join would find a matching row in the eur_regions inline view. Six aggregations are performed on this result set; three check for a join against the na_regions inline view (North American orders), and three check for a join against the eur_regions inline view (European orders). The six aggregations are then used to modify the six columns in mtd_orders. This statement could (and should) be combined with the statement from the previous example (which updated the first three columns) to create an UPDATE statement that touches every column in the mtd_orders table via one pass through the cust_order table. For data warehouse applications, where large data sets must be manipulated each night within tight time constraints, such an approach can often make the difference between success and failure. 9.3.6 Division by Zero ErrorsAs a general rule, you should write your code so unexpected data values are handled gracefully. One of the more common arithmetic errors is ORA-01476: divisor is equal to zero. Whether the value is retrieved from a column, passed in via a bind variable, or returned by a function call, always wrap divisors with DECODE or CASE, as illustrated by the following example: SELECT p.part_nbr, SYSDATE + (p.inventory_qty / DECODE(my_pkg.get_daily_part_usage(p.part_nbr), NULL, 1, 0, 1, my_pkg.get_daily_part_usage(p.part_nbr))) anticipated_shortage_dt FROM part p WHERE p.inventory_qty > 0; The DECODE function ensures that the divisor is something other than zero. Here is the CASE version of the statement: SELECT p.part_nbr, SYSDATE + (p.inventory_qty / CASE WHEN my_pkg.get_daily_part_usage(p.part_nbr) > 0 THEN my_pkg.get_daily_part_usage(p.part_nbr) ELSE 1 END) anticipated_shortage_dt FROM part p WHERE p.inventory_qty > 0; Of course, if you are bothered by the fact that the get_daily_part_usage function is called a second time for each part that yields a positive response, simply wrap the function call in an inline view, as in: SELECT parts.part_nbr, SYSDATE + (parts.inventory_qty / CASE WHEN parts.daily_part_usage > 0 THEN parts.daily_part_usage ELSE 1 END) anticipated_shortage_dt FROM (SELECT p.part_nbr part_nbr, p.inventory_qty inventory_qty, my_pkg.get_daily_part_usage(p.part_nbr) daily_part_usage FROM part p WHERE p.inventory_qty > 0) parts; 9.3.7 State TransitionsIn certain cases, the order in which the values may be changed is constrained as well as the allowable values for a column. Consider the diagram shown in Figure 9-1, which shows the allowable state transitions for an order. Figure 9-1. Order processing state transitionsAs you can see, an order currently in the Processing state should only be allowed to move to either Delayed or Filled. Rather than allowing each application to implement logic to change the state of an order, write a user-defined function that returns the appropriate state depending on the current state of the order and the transition type. In this example, two transition types are defined: positive (POS) and negative (NEG). For example, an order in the Delayed state can make a positive transition to Processing or a negative transition to Cancelled. If an order is in one of the final states (Rejected, Cancelled, Shipped), the same state is returned. Here is the DECODE version of our PL/SQL function: FUNCTION get_next_order_state(ord_nbr in NUMBER, trans_type in VARCHAR2 DEFAULT 'POS') RETURN VARCHAR2 is next_state VARCHAR2(20) := 'UNKNOWN'; BEGIN SELECT DECODE(status, 'REJECTED', status, 'CANCELLED', status, 'SHIPPED', status, 'NEW', DECODE(trans_type, 'NEG', 'AWAIT_PAYMENT', 'PROCESSING'), 'AWAIT_PAYMENT', DECODE(trans_type, 'NEG', 'REJECTED', 'PROCESSING'), 'PROCESSING', DECODE(trans_type, 'NEG', 'DELAYED', 'FILLED'), 'DELAYED', DECODE(trans_type, 'NEG', 'CANCELLED', 'PROCESSING'), 'FILLED', DECODE(trans_type, 'POS', 'SHIPPED', 'UNKNOWN'), 'UNKNOWN') INTO next_state FROM cust_order WHERE order_nbr = ord_nbr; RETURN next_state; EXCEPTION WHEN NO_DATA_FOUND THEN RETURN next_state; END get_next_order_state; As of Oracle8i Version 8.1.7, the PL/SQL language does not include the CASE expression in its grammar, so you would need to be running Oracle9i to use the CASE version of the function: FUNCTION get_next_order_state(ord_nbr in NUMBER, trans_type in VARCHAR2 DEFAULT 'POS') RETURN VARCHAR2 is next_state VARCHAR2(20) := 'UNKNOWN'; BEGIN SELECT CASE WHEN status = 'REJECTED' THEN status WHEN status = 'CANCELLED' THEN status WHEN status = 'SHIPPED' THEN status WHEN status = 'NEW' AND trans_type = 'NEG' THEN 'AWAIT_PAYMENT' WHEN status = 'NEW' AND trans_type = 'POS' THEN 'PROCESSING' WHEN status = 'AWAIT_PAYMENT' AND trans_type = 'NEG' THEN 'REJECTED' WHEN status = 'AWAIT_PAYMENT' AND trans_type = 'POS' THEN 'PROCESSING' WHEN status = 'PROCESSING' AND trans_type = 'NEG' THEN 'DELAYED' WHEN status = 'PROCESSING' AND trans_type = 'POS' THEN 'FILLED' WHEN status = 'DELAYED' AND trans_type = 'NEG' THEN 'CANCELLED' WHEN status = 'DELAYED' AND trans_type = 'POS' THEN 'PROCESSING' WHEN status = 'FILLED' AND trans_type = 'POS' THEN 'SHIPPED' ELSE 'UNKNOWN' END INTO next_state FROM cust_order WHERE order_nbr = ord_nbr; RETURN next_state; EXCEPTION WHEN NO_DATA_FOUND THEN RETURN next_state; END get_next_order_state; This example only handles the simple case where there are just two paths out of each state, but it does demonstrate one strategy for managing state transitions in your database. To demonstrate how the previous function could be used, here is the UPDATE statement used to change the status of an order once it has made a successful state transition: UPDATE cust_order SET status = my_pkg.get_next_order_state(order_nbr, 'POS') WHERE order_nbr = 1107; |