10.5 Extracting DDL and Data

Oracle supplies the Export (exp) and Import (imp) utilities to export Oracle database objects and then import them back into Oracle databases. The Export utility extracts not only the data for these objects but also the DDL (Data Definition Language) needed to recreate them.
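
For example, a typical export and re-import of the SCOTT schema looks something like this (an illustrative sketch; the exact options vary by Oracle version):

$ exp scott/tiger file=scott.dmp owner=scott
$ imp scott/tiger file=scott.dmp fromuser=scott touser=scott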

These venerable utilities work well enough, but sometimes fall short in a number of important ways:

Performance

Exports are fairly quick, but imports can be unbearably slow. Importing more than 10 gigabytes of data can be too time-consuming to consider.

Portability

The Import file format is proprietary to Oracle. This makes it virtually impossible to load other databases with exported data. We suppose that's fair enough, because Oracle Corporation is, after all, in the business of Oracle databases. What's really frustrating, though, is Import's inability to work with Oracle's own superb high-speed data handler, SQL*Loader.

Limited DDL extraction

It's possible to extract most (but not all) of the DDL from an export file via the indexfile feature of imp. The following command, for example, extracts most of the DDL but fails to retrieve stored procedures:

$ imp userid=scott/tiger file=mydata.dmp indexfile=myddl.sql

Compatibility

The Oracle export utility is highly version-dependent. Trying to export data from an 8.0.5 database with an 8.1.7 export utility results in the error message:

EXP-00037: Export views not compatible with export version.

We've included two Perl/DBI scripts in the toolkit to help fill these gaps. Using sqlunldr.pl and ddl_oracle.pl, you can dump all schema data to comma-delimited files, generate SQL*Loader control and parameter files, and then generate DDL for all user tablespaces and schemas.

10.5.1 Extracting Data with sqlunldr.pl

Sometimes you need raw portable data — to populate another database, build a customer's spreadsheet, or perform some other data operation. Unfortunately, Oracle's Export utility is the wrong mousetrap. One popular solution to such problems is to build SQL*Plus dump scripts. This approach works for single tables, but grows cumbersome when dumping entire schemas or even a handful of selected tables. What's needed is a single dump utility that creates portable output. It would also be nice if the data field separators were configurable and if enclosing quote characters were both configurable and optional. The sqlunldr.pl script fits the bill on all counts. Here are some of its main features. In the following sections we'll include several examples that show you how to use this script.

SQL*Loader support

Generates parameter and control files for SQL*Loader.

Configurable characters

The default for field separation is a comma. The default for enclosing fields is the double quote. Each is configurable via the command line.

LONG column support

LONG columns of arbitrary length are supported.

Binary data support (with limitations)

Binary data of LONG, CHAR, VARCHAR, and VARCHAR2 types can be dumped as hexadecimal by the script and reloaded into binary format. This feature is limited to data of 32K in length. Note that within the generated SQL*Loader control script, the Oracle function UTL_RAW.CAST_TO_VARCHAR2 converts the data from hex back to binary; this function is likewise limited to strings of 32K.

Binary data limits

The sqlunldr.pl script only partially supports LOB (large object) data. Both CLOB (character large object) and BLOB (binary large object) columns may be dumped to output files, but you will need to manually edit the generated SQL*Loader control (.ctl) script to load the data. Binary data is subject to the 32K limit predefined by the Oracle software. This is probably fine for 95% of systems, however.

The script's command-line options are summarized in Table 10-8.

Table 10-8. Command-line options — sqlunldr.pl

Option         Description
-------------  ------------------------------------------------------------
-machine       Server where the target database resides.
-database      Target database.
-username      DBA account.
-password      DBA password (optional if password server in use).
-owner         Owner of the tables to dump.
-directory     Directory in which to unload data. Defaults to <owner>.dump.
-dateformat    NLS_DATE_FORMAT — for example, -dateformat 'mm/dd/yyyy'.
-header        Includes the column names as the first line of output.
-noheader      Outputs without column names.
-table         Dumps tables. May be repeated as often as necessary — for
               example, -table emp -table dept -table salary.
-schemadump    Dumps the entire schema. Makes sqlunldr.pl ignore -table.
-rowlimit      Limits the number of rows output for each table to N rows.
-fieldsep      Field separator; defaults to a comma. If used, you probably
               need to escape the character — for example, -fieldsep \|.
-quotechar     Character used to enclose each field. Defaults to a double
               quote. A literal value of none disables quotes.
-longlen       Maximum length of LONG datatypes you expect to encounter.
               Defaults to 65535.
-bincol        Columns of binary data that should be translated to hex
               format before dumping. Maximum length is 32767 bytes.
               Specified as <table>=<column1,column2,...>.
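
For instance, to dump just the EMP table pipe-delimited, with quoting disabled and an explicit date format, you might combine the options like this (reusing the machine and database names from our examples):

$ sqlunldr.pl -machine watson -database ts99 -username system \
     -owner scott -table emp -fieldsep \| -quotechar none \
     -dateformat 'mm/dd/yyyy'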

10.5.1.1 Dumping and reloading SCOTT's schema

In this first example, we'll use sqlunldr.pl to dump the entire scott schema. This time we can ignore the -dateformat option, as we'll simply reload the data straight back into the same database. However, we'd need this option if the data were to be loaded into a database with a different NLS_DATE_FORMAT. Here's the command to dump the scott schema:

$ sqlunldr.pl -machine watson -database ts99 -username system \
     -owner scott -noheader -schemadump

Once this command completes, you'll find all of the output, in entirely portable form, in the scott.dump directory. Example 10-19 displays the output on a Unix system.

Example 10-19. Dumping the SCOTT schema with sqlunldr.pl
%oramon> sqlunldr.pl -machine watson -database ts99 -username system \
  -owner scott -noheader -schemadump
-rw-r--r--    1 jkstill  dba            56 May 20 01:17 bincol_test.txt
-rw-r--r--    1 jkstill  dba           129 May 20 01:17 bonus.ctl
-rw-r--r--    1 jkstill  dba            67 May 20 01:17 bonus.par
-rw-r--r--    1 jkstill  dba             0 May 20 01:17 bonus.txt
-rw-r--r--    1 jkstill  dba           123 May 20 01:17 dept.ctl
-rw-r--r--    1 jkstill  dba            64 May 20 01:17 dept.par
-rw-r--r--    1 jkstill  dba           104 May 20 01:17 dept.txt
-rw-r--r--    1 jkstill  dba           167 May 20 01:17 emp.ctl
-rw-r--r--    1 jkstill  dba            61 May 20 01:17 emp.par
-rw-r--r--    1 jkstill  dba        661709 May 20 01:17 emp.txt
-rw-r--r--    1 jkstill  dba           132 May 20 01:17 salgrade.ctl
-rw-r--r--    1 jkstill  dba            76 May 20 01:17 salgrade.par
-rw-r--r--    1 jkstill  dba            89 May 20 01:17 salgrade.txt
%oramon>

Now let's delete the data and reload it from the files we just generated:

  1. We'll delete all the table rows from SCOTT's schema in our test database:

    SQL> DELETE FROM BONUS;
    SQL> DELETE FROM SALGRADE;
    SQL> DELETE FROM EMP;
    SQL> DELETE FROM DEPT;
    SQL> COMMIT;
  2. We're now ready to reload the data via SQL*Loader using the control and parameter files generated by sqlunldr.pl. Here's how to reload DEPT:

    $ cd scott.dump
    $ sqlldr parfile=dept.par
  3. We're asked for SCOTT's password. After we supply it, SQL*Loader reloads the DEPT table and generates dept.log. The contents of this log should be similar to what's shown in Example 10-20. We can now reload the other tables:

    $ sqlldr parfile=emp.par
    $ sqlldr parfile=salgrade.par
    $ sqlldr parfile=bonus.par

Example 10-20. SQL*Loader log file — dept.log
Table DEPT, loaded from every logical record.
  
   Column Name                  Position   Len  Term Encl Datatype
------------------------------ ---------- ----- ---- ---- ----------------
DNAME                                NEXT     *   ,  O(") CHARACTER            
LOC                                  NEXT     *   ,  O(") CHARACTER            
  
Table DEPT:
  4 Rows successfully loaded.
  0 Rows not loaded due to data errors.
  0 Rows not loaded because all WHEN clauses were failed.
  0 Rows not loaded because all fields were null.
  
Space allocated for bind array:                  49536 bytes(64 rows)
Space allocated for memory besides bind array:        0 bytes
  
Total logical records skipped:          0
Total logical records read:             4
Total logical records rejected:         0
Total logical records discarded:        0
  
Run began on Sun Feb 24 17:18:17 2002
Run ended on Sun Feb 24 17:18:24 2002
  
Elapsed time was:     00:00:07.40
CPU time was:         00:00:00.10

10.5.1.2 Dumping binary data

Now we'll look at a more complex example. We'll create a table with one column of plain text, convert it to unreadable binary form, and then dump the table with sqlunldr.pl. We'll then delete all of the data from the table, reload it with the generated SQL*Loader scripts, and then validate it. Example 10-21 shows a test of this operation.

Example 10-21. Binary data test
  1    DROP TABLE BINCOL_TEST;
  2    CREATE TABLE BINCOL_TEST 
  3       (CLEAR_TEXT VARCHAR2(10), BINARY_DATA VARCHAR2(10));
  4    
  5    INSERT INTO BINCOL_TEST(CLEAR_TEXT) VALUES('Post-Dated');
  6    INSERT INTO BINCOL_TEST(CLEAR_TEXT) VALUES('Check');
  7    INSERT INTO BINCOL_TEST(CLEAR_TEXT) VALUES('Loan');
  8    COMMIT;
  9    
 10    VAR xorstr VARCHAR2(10)
 11    
 12    BEGIN
 13       :xorstr := RPAD(CHR(127),10,CHR(127));
 14    END;
 15    /
 16    
 17    UPDATE BINCOL_TEST
 18       SET BINARY_DATA = 
 19           UTL_RAW.CAST_TO_VARCHAR2(
 20              UTL_RAW.BIT_XOR(
 21                 UTL_RAW.CAST_TO_RAW(CLEAR_TEXT),
 22                 UTL_RAW.CAST_TO_RAW(SUBSTR(:xorstr,1,LENGTH(CLEAR_TEXT)))
 23              )
 24           );
 25    COMMIT;
 26    SET TERM OFF
 27    SPOOL BINCOL_TEST.LOG
 28    SELECT * FROM BINCOL_TEST;
 29    SPOOL OFF
 30    SET TERM ON
 31    
 32    ED bincol_test.log
 33    SELECT UTL_RAW.CAST_TO_VARCHAR2(
 34              UTL_RAW.BIT_XOR(
 35                UTL_RAW.CAST_TO_RAW(BINARY_DATA),
 36                UTL_RAW.CAST_TO_RAW(:XORSTR)
 37              )
 38           )
 39      FROM BINCOL_TEST;

You'll find the example scripts bincol_test.sql and bincol_test2.sql in the following locations, so you can run these tests yourself if you wish:

Unix (assuming a build directory of /u01/build)

/u01/build/PDBA-1.00/routine_tasks

Win32

c:\Perl\site\lib\PDBA\sql

Let's see what's going on in this example:

  1. In lines 2-3 in Example 10-21 we create BINCOL_TEST.

  2. In lines 5-7 we insert the phrase "Post-Dated Check Loan" across three rows.

  3. In lines 10-15 we build a 10-character string, :xorstr, composed entirely of ASCII character 127. In lines 17-24, :xorstr is used with the Oracle built-in functions UTL_RAW.CAST_TO_VARCHAR2, UTL_RAW.CAST_TO_RAW, and UTL_RAW.BIT_XOR to turn the clear text into binary data that is unreadable by humans.[8]

    [8] Except half-Vulcans and machine-code gods, of course.

  4. In line 27 the data is spooled to a file, and in line 32 the output is sent to our favorite vi text editor. (Notice that in line 26 the console output was turned off. This stops the binary data from displaying in our SQL*Plus session, where it could garble the terminal.)

  5. Here is the result of this edit, as it appears when viewed safely in vi:

    CLEAR_TEXT  BINARY_DATA  
    ----------  -----------
    Post-Dated  /^P^L^KR;^^^K^Z^[
    Check       <^W^Z\^T
    Loan        3^P^^^Q
  6. After closing the editor, the SQL script continues. Lines 33-39 convert the newly created binary data back into human-readable form, and if you're running this test yourself, the output should be similar to that shown here:

    PL/SQL procedure successfully completed.
    3 rows updated.
    Commit complete.
    UTL_RAW.CAST_TO_VARCHAR2(UTL_RAW.BIT_XOR(UTL_RAW.CAST_TO_RAW(BINARY_DATA),UTL_RA
    --------------------------------------------------------------------------------
    Post-Dated
    Check
    Loan
    SQL>
  7. We're now ready to dump the data, delete the rows from BINCOL_TEST, reload from the output of sqlunldr.pl, and then validate the results. Use sqlunldr.pl to dump the table:

    $ sqlunldr.pl -machine watson -database ts99 -username system \   
         -owner scott -noheader -table bincol_test \
         -bincol bincol_test=binary_data
  8. Now we'll delete the test rows from BINCOL_TEST:

    $ SQLPLUS SCOTT/TIGER
    SQL> DELETE FROM BINCOL_TEST;
    SQL> COMMIT;
    SQL> EXIT
  9. We can now reload the table from the sqlunldr.pl dump. The data will be in directory scott.dump, one level below your current directory. Go there and examine the files bincol_test.par, bincol_test.ctl, and bincol_test.txt. You can see how they appeared in our tests in Example 10-22. Notice that the binary data contained in BINCOL_TEST.BINARY_DATA has been converted to hexadecimal format. When the data is loaded back, the Oracle built-in function UTL_RAW.CAST_TO_VARCHAR2 will convert it back into binary.

    Example 10-22. Files generated by sqlunldr.pl
    %oramon > cat bincol_test.par
    userid = scott
    control = bincol_test.ctl
    log = bincol_test.log
    bad = bincol_test.bad
      
    %oramon > cat bincol_test.ctl
    load data
    infile 'bincol_test.txt'
    into table BINCOL_TEST
    fields terminated by ','  optionally enclosed by '"'
    (
    CLEAR_TEXT,
    BINARY_DATA "utl_raw.cast_to_varchar2(:BINARY_DATA)"
    )
      
    %oramon > cat bincol_test.txt
    "Post-Dated","2F100C0B523B1E0B1A1B"
    "Check","3C171A1C14"
    "Loan","33101E11"
    %oramon >
  10. We can now reload the data using SQL*Loader and the parameter files generated by sqlunldr.pl:

    $ cd scott.dump
    $ sqlldr parfile=bincol_test.par
  11. The most important line to observe in bincol_test.log is the one saying 3 Rows successfully loaded:

    SQL*Loader: Release 8.1.7.0.1 - Production on Sun Feb 24 16:43:54 2002
    (c) Copyright 2000 Oracle Corporation.  All rights reserved.
    ...
    Table BINCOL_TEST:
      3 Rows successfully loaded.
      0 Rows not loaded due to data errors.
      0 Rows not loaded because all WHEN clauses were failed.
      0 Rows not loaded because all fields were null.
  12. Now log back into SQL*Plus and run the script bincol_test2.sql. The output should be identical to that shown earlier. We can now drop the test table from SCOTT's account:

    $ SQLPLUS SCOTT/TIGER
    SQL> DROP TABLE BINCOL_TEST;
    SQL> EXIT

10.5.2 Extracting DDL with ddl_oracle.pl

Although Perl-based Oracle tools like Orac and Oracletool supply options to regenerate DDL, and although the DDL::Oracle module was designed explicitly for this purpose (we describe all of these applications in Chapter 3 and Chapter 4), it would sometimes be convenient to extract DDL from a database in one easy operation. We could then recreate all the objects in user and application schemas. Oracle's Export utility does extract all of this information, but often in unusable form.

It is possible to use the indexfile=myddl.sql construct with the Import utility to extract the DDL for tables, indexes, and constraints from an Oracle export file, but this approach fails to cover packages, procedures, functions, and triggers. To fill the gap, we've developed the ddl_oracle.pl script to generate the DDL needed to recreate the following database elements; in the list we've noted any exceptions to what can be generated:

Tablespaces

Generates DDL to recreate all tablespaces except the SYSTEM tablespace.

Rollback segments

Generates DDL to recreate all rollback segments.

Public database links

Generates DDL to recreate all public database links.

Public synonyms

Generates DDL to recreate all public synonyms, except those referring to objects owned by accounts in a configurable list. This exception prevents the inclusion of public synonyms created as part of a standard database.

User profiles

Generates DDL to recreate all user profiles with the exception of DEFAULT.

Roles

Generates DDL to recreate all database roles with the exception of a configurable list of roles. This exception prevents the script from recreating roles created as part of a standard database.

User accounts

Generates DDL to recreate all user accounts with the exception of those found in a configurable list of user accounts.

Schemas

Generates DDL to recreate all schema objects, including PL/SQL, with the exception of objects belonging to users in a configurable list.

Grants from schemas

Generates DDL to recreate all grants made from all user accounts.

Before going any further, let's examine the configuration file exp_exclude.conf. This file is simply a list of user accounts and roles that we don't care to preserve in DDL files, often because they are created for us as part of a new database.

The file included in the PDBA distribution is shown in Example 10-23. Most of the generic Oracle database users and roles are included.

Example 10-23. exp_exclude.conf
package expexclude;
use vars qw{ @users @roles };
  
@users = qw{ SYS                        SYSTEM
             OUTLN                      DBSNMP
             TRACESVR                   ORDSYS
             ORDPLUGINS                 MDSYS
             AURORA$JIS$UTILITY$        OSE$HTTP$ADMIN
             AURORA$ORB$UNAUTHENTICATED };
  
@roles = qw{ CONNECT                  RESOURCE
              DBA                      SELECT_CATALOG_ROLE
              EXECUTE_CATALOG_ROLE     DELETE_CATALOG_ROLE
              EXP_FULL_DATABASE        IMP_FULL_DATABASE
              RECOVERY_CATALOG_OWNER   AQ_ADMINISTRATOR_ROLE
              AQ_USER_ROLE             SNMPAGENT
              OEM_MONITOR              HS_ADMIN_ROLE
              JAVAUSERPRIV             JAVAIDPRIV
              JAVASYSPRIV              JAVADEBUGPRIV
              JAVA_ADMIN               JAVA_DEPLOY
              PLUSTRACE                TIMESERIES_DEVELOPER
              TIMESERIES_DBA           CTXAPP };
1;

Make sure that the exp_exclude.conf file is in your PDBA_HOME directory, as with the other configuration files described earlier in this chapter. There's no need to edit exp_exclude.conf unless you want to change the user or role lists. For instance, if you want OUTLN included in DDL generation, remove it from @users.

Most of the DDL generation is accomplished via Richard Sutherland's DDL::Oracle module, which we discussed in Chapter 3. If that module is not available on your system, you will need to install it (making sure to use at least Version 1.10). Here's a Perl one-liner to determine what version of DDL::Oracle you are using:

perl -e "use DDL::Oracle 1.10; print qq{OK!\n}"

If the proper version is installed, you'll see OK! printed on the screen; otherwise, you'll need to install the latest version. Installation is easiest for Win32 users. Simply start the ActiveState PPM package manager and install directly as follows:

C:\> ppm
PPM> install DDL::Oracle

Unix users will need to download the latest version and install it. If you want to do a manual install (as we described in Chapter 2), you can get the file at http://search.cpan.org/search?dist=DDL-Oracle. For a direct CPAN install, do the following:

$ perl -MCPAN -e "shell"
cpan> install DDL::Oracle
...
Running make for R/RV/RVSUTHERL/DDL-Oracle-1.10.tar.gz
...

Although we make use of DDL::Oracle to generate much of the DDL output from ddl_oracle.pl, we don't use it for all of the DDL. Let's see why.

One of the goals of the script was to be able to generate a single file of all object grants made on a schema's objects from the perspective of the grantor, or owner, of the objects. If the SCOTT schema owns 10 tables and SCOTT has issued SELECT grants on all of his tables to the JONES, ADAMS, and CLARK accounts, we wanted a single script to contain all of those grants. This way, the DBA needs to issue only a single Oracle logon to the SCOTT account so that the DDL script containing those grants can be run.

The DDL::Oracle module generates grants from the perspective of the grantee, or recipient, of the granted privileges. Generating the DDL required to grant SELECT privileges to the JONES, ADAMS, and CLARK accounts would therefore produce three separate scripts, each requiring its own logon to the SCOTT account to create those grants.

In our script, we crafted our own DDL generation for certain aspects to get just the output we wanted. For example, we did this in generating the DDL to create users and roles, and in generating the GRANT statements by grantor rather than grantee.[9]

[9] If you want to see the details of this, examine the %ddl hash and the ddl, _userPrivs, _rolePrivs, and _grantorPrivs methods in the PDBA::DBA toolkit module.
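
To illustrate the grantor-oriented approach, here's a minimal Perl/DBI sketch of the idea. It is not the toolkit's actual code (see the PDBA::DBA methods mentioned in the footnote for that), and the connection values are placeholders. It simply queries DBA_TAB_PRIVS by grantor and prints one GRANT statement per privilege, yielding a single script that SCOTT can run in one logon:

#!/usr/bin/perl
# Minimal sketch of grantor-oriented grant generation.
# Not the PDBA::DBA implementation; connection values are placeholders.
use strict;
use DBI;

my $dbh = DBI->connect(
   'dbi:Oracle:ts99', 'system', 'manager',
   { RaiseError => 1, AutoCommit => 0 }
);

my $sth = $dbh->prepare(q{
   SELECT privilege, table_name, grantee
     FROM dba_tab_privs
    WHERE grantor = ?
    ORDER BY table_name, grantee
});

$sth->execute('SCOTT');

# Each generated line can be run in a single SQL*Plus session as SCOTT
while ( my ( $priv, $table, $grantee ) = $sth->fetchrow_array ) {
   print qq{GRANT $priv ON $table TO $grantee;\n};
}

$dbh->disconnect;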

The command line for ddl_oracle.pl is rather basic, with the usual command-line options summarized in Table 10-9.

Table 10-9. Command-line options — ddl_oracle.pl

Option         Description
-------------  ------------------------------------------------------------
-machine       Server where the target database resides.
-database      Target database.
-username      DBA account.
-password      DBA password (optional if password server in use).
-conf          Configuration file. The default is exp_exclude.conf.

Running ddl_oracle.pl is very simple, as shown here:

$ ddl_oracle.pl -machine watson -database ts99 -username system
Building List
working on profiles
working on public database links
working on public synonyms
working on rollback segments
working on tablespaces
working on users
working on SCOTT
working on COMMON
working on JKSTILL
working on PDBA_ROLE
...

The output consists of a number of SQL scripts; each script name is prefixed with a number, which indicates the order in which the scripts need to be run (assuming that all are to be used). An abbreviated version of the 1_create.sql script is shown in Example 10-24. See Table 10-10 for all of the script names.

Table 10-10. SQL scripts generated by ddl_oracle.pl

SQL script            Description
--------------------  -----------------------------------------------------
1_create.sql          Calls all of the other scripts. See Example 10-24.
2_tbs_ddl.sql         DDL for all tablespaces other than SYSTEM.
3_rbs_ddl.sql         DDL for all rollback segments.
4_pub_db_link.sql     DDL for all public database links.
5_pub_synonyms.sql    DDL for all public synonyms, except those referring
                      to objects owned by accounts in the @users array of
                      exp_exclude.conf.
6_user_profiles.sql   DDL for all user profiles except DEFAULT.
7_role_ddl.sql        DDL for all database roles except those listed in
                      the @roles array of exp_exclude.conf.
8_user_ddl.sql        DDL to create all accounts not listed in @users.
                      Includes grants for all privileges, roles, profiles,
                      and quotas.
9_schema_<USER>.sql   One file generated for each account. Includes the
                      DDL for all database objects owned by the account:
                      tables, indexes, constraints, views, sequences,
                      stored procedures, stored functions, packages, etc.
10_grant_<USER>.sql   One file generated for each account. Includes all
                      grants made by that account, as grantor, to other
                      accounts and roles.

The output is designed so that it can all be run sequentially via 1_create.sql, but you'll rarely do things this way. Grouping objects and privileges by owner makes it a bit easier for you to recreate a single schema, and it also reduces the number of files you must deal with.

Example 10-24. The 1_create.sql script
@@2_tbs_ddl.sql
@@3_rbs_ddl.sql
@@4_pub_db_link.sql
@@5_pub_synonyms.sql
@@6_user_profiles.sql
@@7_role_ddl.sql
@@8_user_ddl.sql
  
PROMPT connecting to SCOTT - please enter the password
CONNECT SCOTT
@@9_schema_scott.sql
@@10_grant_scott.sql
  
PROMPT connecting to PDBAREP - please enter the password
CONNECT PDBAREP
@@9_schema_pdbarep.sql
@@10_grant_pdbarep.sql
...