13.2 Modifying a Module in the ToolkitIn Chapter 9 we introduced the supporting modules included in the Perl DBA Toolkit and described briefly what they do. Just as you might wish to modify or extend the toolkit's scripts, as we described earlier in this chapter, you might also find a good reason to modify the modules. In an effort to anticipate the kinds of changes you might want to make to these modules, in the following sections we'll provide a quick guide on how to modify the modules in the toolkit. We'll show two examples here and hope you can extrapolate to many more:
13.2.1 Modifying the PDBA Module to Add a MethodIn this section we'll essentially modify the PDBA module in order to add a method that will help us in doing our work. In reality, though, rather than modifying the existing code, we're going to create our own parallel, modified module. We'll explain why we've taken this approach as we work through the example. We've created a separate downloadable module, called PDBAx, for "PDBA Extensions," that contains the code we describe in this section. If you want to do so, you can download and install this code in the same manner as you would the ordinary PDBA module. There's no absolute need to download and install PDBAx, but you may wish to do so to help follow the rest of this chapter or simply for your own experimentation. You can download PDBAx-1.00.tar.gz, or its latest derivative, from our book's page on the O'Reilly site: Installation is straightforward, as we describe in the following sections. 13.2.1.1 Installing PDBAx on UnixRun the following to install PDBAx on Unix systems: $ gunzip -c PDBAx-1.00.tar.gz | tar xvf - $ cd PDBAx-1.00 $ perl Makefile.PL $ make install There are no tests to run for this module. 13.2.1.2 Installing PDBAx on Win32Download the PDBAx.ppd PPM file from the web site and save it in a location such as C:\TEMP. You probably know the rest of the drill: DOS> ppm PPM> install --location c:\temp PDBAx 13.2.2 Adding a Usage MethodMost of the scripts included in our toolkit employ a usage subroutine that is called for various reasons. Perhaps the -help option was included on the command line, or perhaps required options were missing. The usage routine generally looks something like this: sub usage { my $exitVal = shift; use File::Basename; my $basename = basename($0); print qq/ $basename usage: $basename -machine database_server -database database_instance -username account /; exit $exitVal; }; A common PDBA.pm method would eliminate the need for this subroutine in other scripts. Writing such a method is one approach to solving your problem. Some suggested code for such a method is shown in Example 13-4. Example 13-4. The PDBA usage methodsub usage { my ($exitVal,$helpStrRef) = @_; use File::Basename; my $basename = basename($0); print qq{ usage: $basename ${$helpStrRef} }; exit $exitVal; }; Here's how you might use it in a script: use PDBA; my $help = q{ -database database to connect to -username database account -password password for the account }; ... if ( $optctl{help} ) { PDBA::usage(1,\$help) } ... So now you place your new usage method in the PDBA.pm file and try it out in a few scripts. And it works great. However, there's one small problem. What happens if you install a newer version of the PDBA Toolkit module library? That's right — your carefully crafted usage method will no longer be in the PDBA module, and all of your scripts calling PDBA::usage will break. Ouch! Rather than modifying the PDBA module, why not create your own subclassed module? Doing so will allow you to extend the PDBA module without fear of breaking scripts that use it in its current incarnation. Creating your own module also eliminates the problems that would occur if you download a new version of the PDBA module and it overwrites your carefully crafted extensions. Perl lets you do this with relative ease, and we'll show you how.[3] Let's call our new subclassed module PDBAx. This module will take the place of PDBA in your scripts. The full code for PDBAx appears in Example 13-5.
Example 13-5. The entire PDBAx modulepackage PDBAx; our $VERSION=1.00; use PDBA; our @ISA = qw{PDBA}; sub usage { my ($exitVal,$helpStrRef) = @_; use File::Basename; my $basename = basename($0); print qq{ usage: $basename ${$helpStrRef} }; exit $exitVal; }; 1; You may be surprised at how little code there is in Example 13-5. Yet all the features of the PDBA module are available through PDBAx. That's because of the magic of the @ISA array. The methods and attributes of modules placed in @ISA are inherited by calling modules — in this case, PDBAx. Try running the code shown in Example 13-6. The PDBAx osname method is inherited directly from the PDBA module. Example 13-6. Testing PDBAx#!/usr/bin/perl -w
use warnings;
use strict;
use PDBAx;
print "$PDBA::VERSION\n";
print PDBAx->osname, "\n";
my $help = q{
-database database to connect to
-username user to connect as
-password password for user
};
PDBAx::usage(1,\$help);
The benefit of extending PDBA in this way is that when your intrepid authors release the latest version of the PDBA Toolkit, your usage method will be safely encapsulated within its own PDBAx module. Even if we come up with similarly named methods, yours will override them. It's a kind of magic. 13.2.3 Modifying the PDBA::GQ Module to Deal with NULL ColumnsIn the following sections we'll describe how you can modify the PDBA::GQ module to deal with Oracle NULL values. First, though, let's take a look at the problems involved in using NULLs. 13.2.3.1 Oracle and NULL valuesWhen you first start using the Oracle database, NULL values may take a little getting used to. A NULL is never equivalent to any other value, including another NULL. The truth table in Table 13-1 sums up the results of comparing NULL to NULL, with various SQL operators. Note that only one True is returned with the special IS NULL comparison.
One problem you discover when dealing with NULLs in an Oracle database is that Oracle treats empty strings and NULL values the same way. This is different from other databases, and is readily apparent when you use NULLs with Perl. We'll see this in the next section. 13.2.3.2 Testing the use of NULLsThe null_test.pl script in Example 13-7 builds a table, NULL_TEST, and populates it with two rows of data. Example 13-7. null_test.pl01: #!/usr/bin/perl 02: 03: use warnings; 04: use strict; 05: use PDBA::CM; 06: use PDBA::GQ; 07: use PDBA::DBA; 08: 09: my ( $database, $username, $password ) = qw{ts01 scott tiger}; 10: 11: my $dbh = new PDBA::CM( 12: DATABASE => $database, 13: USERNAME => $username, 14: PASSWORD => $password 15: ); 16: 17: eval { 18: local $dbh->{PrintError} = 0; 19: $dbh->do(q{drop table null_test}); 20: }; 21: 22: $dbh->do(q{create table null_test 23: ( 24: first_name varchar2(20) not null, 25: middle_initial varchar2(1) null, 26: last_name varchar2(20) not null 27: ) 28: }); 29: 30: my $insHandle = $dbh->prepare(q{insert into null_test values(?,?,?)}); 31: 32: $insHandle->execute('Alfred','E','Neuman'); 33: $insHandle->execute('Peter',undef,'Parker'); 34: $insHandle->execute('Clark','','Kent'); 35: $dbh->commit; 36: 37: my $gq = new PDBA::GQ ( $dbh, 'null_test'); 38: 39: while ( my $row = $gq->next ) { 40: printf("Last: %-20s First: %-20s MI: %1s\n", 41: $row->{LAST_NAME}, 42: $row->{FIRST_NAME}, 43: $row->{MIDDLE_INITIAL} 44: ); 45: } 46: $dbh->disconnect; Note the following about this example:
Using the following SQL, we can now prove that Oracle treats both the empty string and undef as NULL values: SQL> select * from null_test where middle_initial is null; FIRST_NAME M LAST_NAME -------------------- - -------------------- Peter Parker Clark Kent 2 rows selected. This problem becomes apparent in Perl when a NULL column is retrieved from a database and an attempt is made to reference the value in a statement. You can see this in lines 40 to 44 of Example 13-7, which we reproduce here: 40: printf("Last: %-20s First: %-20s MI: %1s\n", 41: $row->{LAST_NAME}, 42: $row->{FIRST_NAME}, 43: $row->{MIDDLE_INITIAL} 44: );
Whoa! The second row returned a NULL for MIDDLE_INITIAL, which is treated by Perl as an undef. This threw out a warning because of the use warnings pragma at the top of the script. 13.2.3.3 Considering changes to the scriptThe warning output we saw in the previous section could be eliminated by turning off the warning mechanism temporarily as follows: 38: no warnings; 39: while ( my $row = $gq->next ) { 40: printf("Last: %-20s First: %-20s MI: %1s\n", 41: $row->{LAST_NAME}, 42: $row->{FIRST_NAME}, 43: $row->{MIDDLE_INITIAL} 44: ); 45: } 46: use warnings However, this sweeps potentially difficult problems under the rug, which will almost always re-surface to bite us later. A better solution would ensure that all of the row elements get a guaranteed value before they get pumped into printf: 39: while ( my $row = $gq->next ) { foreach my $key ( keys %$row ) { $row->{$key} = '' unless defined $row->{$key} } 40: printf("Last: %-20s First: %-20s MI: %1s\n", 41: $row->{LAST_NAME}, 42: $row->{FIRST_NAME}, 43: $row->{MIDDLE_INITIAL} 44: ); 45: } The foreach loop added between lines 39 and 40 assigns empty strings to any undefined values in the $row hash reference, thereby preventing warnings. 13.2.3.4 Modifying the PDBA::GQ moduleAn even better solution would extend the PDBA::GQ (Generic Query) module so your scripts would automatically deal with NULL column data. However, as we learned earlier, modifying a module presents its own problems. What we need is a modified version of the PDBA::GQ module. More specifically, we should modify the PDBA::GQ->next method. Example 13-8 shows one way to do this. Example 13-8. PDBAx::GQ01: package PDBAx::GQ; 02: 03: our $VERSION=1.0; 04: 05: use Carp; 06: use warnings; 07: use strict; 08: 09: use PDBA::GQ; 10: our @ISA = qw{PDBA::GQ}; 11: 12: sub next { 13: my $self = shift; 14: my ( $ref ) = @_; 15: $ref ||= []; 16: 17: my $refType = ref $ref; 18: 19: my $data; 20: 21: $data = $self->SUPER::next($ref); 22: return unless $data; 23: 24: # transform NULL columns to a defined value 25: # to avoid problems with undefined values 26: if ( 'ARRAY' eq $refType ) { 27: foreach(@$data){ $_ = '' unless defined } 28: } elsif ( 'HASH' eq $refType ) { 29: foreach my $key ( keys %$data ) 30: { $data->{$key} = '' unless defined $data->{$key} } 31: } else { 32: croak "invalid ref type of $refType " . 33: "used to call PDBAx::GQ->next\n" 34: } 35: return $data; 36: } 37: 1;
We need to change just two of the lines in the null_test.pl script we first encountered in Example 13-8: ... 06: use PDBAx::GQ; ... 37: my $gq = new PDBAx::GQ ( $dbh, 'null_test'); ... All of the functionality of the PDBA::GQ module is still available, but your modifications allow you to stop thinking about referencing NULL values. 13.2.3.5 Taking one more stepThe PDBAx::GQ extension still has a shortcoming. It assigns an empty string to numeric columns that are NULL, and this may be unsuitable for some purposes. Financial reporting may require that these NULL columns be assigned a numeric zero. While this can be done, the complexity of the code required to do it increases significantly, as we'll see in the code supplied via the downloaded PDBAx::GQ module in Example 13-9. Example 13-9. Assigning zero to NULL numeric columns in PDBAx::GQ01: package PDBAx::GQ; 02: 03: our $VERSION=1.00; 04: 05: use Carp; 06: use warnings; 07: use strict; 08: 09: use PDBA::GQ; 10: our @ISA = qw{PDBA::GQ}; 11: 12: my @columnTypes; 13: my %columnTypes; 14: 15: sub new { 16: my $self = shift; 17: my ( $dbh ) = $_[0]; 18: my $qobj = $self->SUPER::new(@_); 19: 20: # get column types for array refs 21: @columnTypes = map {scalar $dbh->type_info($_)->{TYPE_NAME}} @{$qobj->{TYPE}}; 22: 23: # get column types for hash refs 24: 25: # get an array of data type numbers 26: my @types = @{$qobj->{TYPE}}; 27: 28: # get a hash ref of column names and position 29: my $nameHash = $qobj->{NAME_uc_hash}; 30: 31: # create a reverse hash with the column number as the key and 32: # the column name as the value 33: my %colnumHash = map { $nameHash->{$_} => $_ } keys %$nameHash; 34: 35: # create an array of the type names ( VARCHAR2, DATE, etc ) from the 36: # type info method 37: my @columnTypeNames = map { scalar $dbh->type_info($_)->{TYPE_NAME} } @types; 38: 39: # create a hash with column name as the key and data type as the value 40: %columnTypes = map { $colnumHash{$_} => $columnTypeNames[$_] } 0..$#columnTypeNames; 41: 42: $qobj->{private_PDBA_DATA_TYPES_ARRAY} = \@columnTypes; 43: $qobj->{private_PDBA_DATA_TYPES_HASH} = \%columnTypes; 44: return $qobj; 45: } 46: 47: sub next { 48: my $self = shift; 49: my ( $ref ) = @_; 50: $ref ||= []; 51: 52: my $refType = ref $ref; 53: 54: my $data; 55: 56: $data = $self->SUPER::next($ref); 57: return unless $data; 58: 59: # transform NULL columns to a defined value 60: # to avoid problems with undefined values 61: if ( 'ARRAY' eq $refType ) { 62: foreach my $el ( 0..$#{$data} ) { 63: unless ( defined $data->[$el] ) { 64: if ( $self->{private_PDBA_DATA_TYPES_ARRAY}[$el] =~ /CHAR/ ) { 65: $data->[$el] = ''; 66: } elsif ( $self->{private_PDBA_DATA_TYPES_ARRAY}[$el] =~ /DOUBLE|NUMBER/ ) { 67: $data->[$el] = 0; 68: } else { $data->[$el] = '' } 69: } 70: } 71: } elsif ( 'HASH' eq $refType ) { 72: foreach my $key ( keys %$data ) { 73: unless (defined $data->{$key} ) { 74: if ( $self->{private_PDBA_DATA_TYPES_HASH}{$key} =~ /CHAR/ ) { 75: $data->{$key} = ''; 76: } elsif ( $self->{private_PDBA_DATA_TYPES_HASH}{$key} =~ /DOUBLE|NUMBER/ ) { 77: $data->{$key} = 0; 78: } else { $data->{$key} = '' } 79: } 80: } 81: } else { croak "invalid ref type of $refType used to call PDBAx::GQ->next\n"} 82: 83: return $data; 84: } 85: 86: 1; In this example we subclass the new method of PDBA::GQ. We do this so we can determine the datatypes for each column selected in a query.
The script in Example 13-10 uses the all-new PDBAx::GQ module. Both numeric and character columns are inserted into a test table with NULL values, and then later printed out without any need to check to see if they're undefined. The script is stored in the PDBAx distribution as pdba_ext2.pl. Example 13-10. Using the PDBAx:GQ module with numeric and character values#!/usr/bin/perl use warnings; use strict; use PDBA::CM; use PDBAx::GQ; use PDBA::OPT; use Getopt::Long; use PDBAx; my %optctl=( ); my $help=q{ -machine database server -database database SID -username account name -password password for account }; # passthrough allows additional command line options # to be passed to PDBA::OPT if needed Getopt::Long::Configure(qw{pass_through}); GetOptions( \%optctl, "help!", "machine=s", "database=s", "username=s", "password=s", ); if ( $optctl{help} ) { PDBAx::usage(1,\$help) } # lookup the password if not on the command line my $password = ''; if ( defined( $optctl{password} ) ) { $password = $optctl{password}; } else { if ( ! defined($optctl{machine}) || ! defined($optctl{database}) || ! defined($optctl{username}) ) { PDBAx::usage(1,\$help) } $password = PDBA::OPT->pwcOptions ( INSTANCE => $optctl{database}, MACHINE => $optctl{machine}, USERNAME => $optctl{username} ); } my $dbh = new PDBA::CM( DATABASE => $optctl{database}, USERNAME => $optctl{username}, PASSWORD => $password, ); # drop test table eval { $dbh->do(q{drop table star_trek}); }; $dbh->do(q{create table star_trek( title varchar2(50) , year_released varchar2(4) , viewings number(4) )} ); my $insHandle = $dbh->prepare(q{ insert into star_trek values(?,?,?) }); $insHandle->execute('Star Trek - The Motion Picture','1979',1); $insHandle->execute('Star Trek II - The Wrath of Khan','1982',4); $insHandle->execute('Star Trek III - The Search for Spock','1984',undef); $insHandle->execute('Star Trek IV - The Voyage Home','1986',8); $insHandle->execute('Star Trek V - The Final Frontier','1989',1); $insHandle->execute('Star Trek VI - The Undiscovered Country','1991',3); $insHandle->execute('Star Trek Generations','1994',1); $insHandle->execute('Star Trek - First Contact','1996',4); $insHandle->execute('Star Trek - Insurrection','1998',2); $insHandle->execute('Star Trek: Nemesis',undef,undef); $dbh->commit; my $gq = new PDBAx::GQ($dbh, 'star_trek', {ORDER_BY=>'year_released'}); my $colHash = $gq->getColumns; while ( my $row = $gq->next ) { print "TITLE: $row->[$colHash->{TITLE}]\n"; print "\tYEAR: $row->[$colHash->{YEAR_RELEASED}]\n"; print "\tVIEWINGS: $row->[$colHash->{VIEWINGS}]\n"; } $dbh->disconnect; Previously, printing values returned from NULL columns would have required checking the return values within each script; now, we can safely ignore them: $ pdba_ext2.pl -machine sherlock -database ts01 -username jkstill TITLE: Star Trek - The Motion Picture YEAR: 1979 VIEWINGS: 1 TITLE: Star Trek II - The Wrath of Khan YEAR: 1982 VIEWINGS: 4 TITLE: Star Trek III - The Search for Spock YEAR: 1984 VIEWINGS: 0 TITLE: Star Trek IV - The Voyage Home YEAR: 1986 VIEWINGS: 8 TITLE: Star Trek V - The Final Frontier YEAR: 1989 VIEWINGS: 1 TITLE: Star Trek VI - The Undiscovered Country YEAR: 1991 VIEWINGS: 3 TITLE: Star Trek Generations YEAR: 1994 VIEWINGS: 1 TITLE: Star Trek - First Contact YEAR: 1996 VIEWINGS: 4 TITLE: Star Trek - Insurrection YEAR: 1998 VIEWINGS: 2 TITLE: Star Trek: Nemesis YEAR: VIEWINGS: 0 Notice that one of the entries has a blank year of release, and two of them have zero viewings. These values are actually stored as NULL in the database, but now, because of our implementation of PDBAx::GQ, they may be referenced with impunity — without invoking the wrath of Perl and use warnings. Revenge is a dish best eaten without undefined values! |