4.8 Autovivification and Hashes
Autovivification also works for
hash references. If a variable containing undef is
dereferenced as if it were a hash reference, a reference to an empty
anonymous hash is inserted, and the operation continues.
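As a minimal sketch of that rule in isolation (the variable name $totals is made up for illustration), you can watch an undefined scalar turn into a hash reference the moment it is used as one:

```perl
use strict;
use warnings;

my $totals;                            # holds undef, not a reference yet
$totals->{"professor.hut"} = 1250;     # used as a hash reference: Perl creates {} first

print ref($totals), "\n";              # prints "HASH"
print join(", ", keys %$totals), "\n"; # prints "professor.hut"
```

Note that nothing here had to assign an empty anonymous hash by hand, even under use strict.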
One place this comes in very handy is in a typical data reduction
task. For example, let's say the Professor gets an
island-area network up and running (perhaps using Coco-Net or maybe
Vines), and now wants to track the traffic from host to host. He
begins logging each transfer to a file, giving
the source host, the destination host, and the number of transferred
bytes:
professor.hut gilligan.crew.hut 1250
professor.hut lovey.howell.hut 910
thurston.howell.hut lovey.howell.hut 1250
professor.hut lovey.howell.hut 450
professor.hut laser3.copyroom.hut 2924
ginger.girl.hut professor.hut 1218
ginger.girl.hut maryann.girl.hut 199
...
Now the Professor wants to produce a summary of the source host, the
destination host, and the total number of transferred bytes for the
day. Tabulating the data is as simple as:
my %total_bytes;
while (<>) {
  my ($source, $destination, $bytes) = split;
  $total_bytes{$source}{$destination} += $bytes;
}
Let's see how this works on the first line of data.
You'll be executing:
$total_bytes{"professor.hut"}{"gilligan.crew.hut"} += 1250;
Because %total_bytes is initially empty, the first
key of professor.hut is not found, so the lookup
yields undef, which is then dereferenced
as a hash reference. (Keep in mind that an implicit arrow is between
the two sets of curly braces here.) Perl sticks a reference to an
empty anonymous hash into that element, and that hash is immediately
extended to include an element with a key of
gilligan.crew.hut. Its initial value is
undef, which acts like a zero when you add 1250 to
it, and the result of 1250 is stored back into the hash.
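To make those implicit steps visible, here is a rough longhand equivalent of that one statement. This is only a sketch of what Perl does for you behind the scenes, not code you would normally write; the helper variables $seen_before and $inner are introduced purely for illustration:

```perl
use strict;
use warnings;

my %total_bytes;
my ($source, $destination, $bytes)
  = ("professor.hut", "gilligan.crew.hut", 1250);

# Step 1: the outer element does not exist yet.
my $seen_before = exists $total_bytes{$source} ? "yes" : "no";
print "source seen before: $seen_before\n";   # prints "source seen before: no"

# Step 2: what autovivification does implicitly, written out by hand.
$total_bytes{$source} = {} unless defined $total_bytes{$source};

# Step 3: the inner element starts out undef, which counts as zero.
my $inner = $total_bytes{$source};            # the new anonymous hash
$inner->{$destination} = 0 unless defined $inner->{$destination};
$inner->{$destination} += $bytes;

print "$total_bytes{$source}{$destination}\n";   # prints "1250"
```

The single += statement in the loop collapses all three steps into one.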
Any later data line that contains this same source host and
destination host will re-use that same value, adding more bytes to
the running total. But each new destination host extends a hash to
include a new initially undef byte count, and each
new source host uses autovivification to create a destination host
hash. In other words, Perl does the right thing, as always.
Once you've processed the file,
it's time to display the summary. First, you
determine all the sources:
for my $source (keys %total_bytes) {
...
Next, you need all the destinations for each source. The syntax for
this is a bit tricky: you want all keys of the hash that results from
dereferencing the value of the hash element in the first structure:
for my $source (keys %total_bytes) {
  for my $destination (keys %{ $total_bytes{$source} }) {
    ....
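If the %{ ... } form looks dense, it may help to see it split into two steps. The following sketch uses a small hand-built hash as stand-in data (the totals are assumed for illustration) and a hypothetical $dest_ref variable to hold the inner reference:

```perl
use strict;
use warnings;

# Stand-in data: one source with two destinations.
my %total_bytes = (
  "professor.hut" => {
    "gilligan.crew.hut" => 1250,
    "lovey.howell.hut"  => 1360,
  },
);

my $source = "professor.hut";

# One step: dereference the element directly inside %{ ... }.
my @direct = sort keys %{ $total_bytes{$source} };

# Two steps: copy the hash reference out first, then dereference it.
my $dest_ref = $total_bytes{$source};
my @via_ref  = sort keys %$dest_ref;

print "@direct\n";    # prints "gilligan.crew.hut lovey.howell.hut"
print "@via_ref\n";   # prints the same list
```

The braces are needed in the one-step form because the thing being dereferenced is a hash element rather than a simple scalar variable.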
For good measure, you should probably sort both lists to be
consistent:
for my $source (sort keys %total_bytes) {
  for my $destination (sort keys %{ $total_bytes{$source} }) {
    print "$source => $destination:",
      " $total_bytes{$source}{$destination} bytes\n";
  }
  print "\n";
}
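Putting the two halves together, here is one way the whole thing might look as a self-contained script. As an assumption made to keep the sketch runnable without a log file, the seven sample lines shown earlier are inlined in an array named @log instead of being read with <>:

```perl
use strict;
use warnings;

# The seven sample log lines, inlined (assumption: no real log file).
my @log = (
  "professor.hut gilligan.crew.hut 1250",
  "professor.hut lovey.howell.hut 910",
  "thurston.howell.hut lovey.howell.hut 1250",
  "professor.hut lovey.howell.hut 450",
  "professor.hut laser3.copyroom.hut 2924",
  "ginger.girl.hut professor.hut 1218",
  "ginger.girl.hut maryann.girl.hut 199",
);

# Tabulate: autovivification creates each inner hash on first use.
my %total_bytes;
for (@log) {
  my ($source, $destination, $bytes) = split;
  $total_bytes{$source}{$destination} += $bytes;
}

# Report: walk both levels in sorted order.
for my $source (sort keys %total_bytes) {
  for my $destination (sort keys %{ $total_bytes{$source} }) {
    print "$source => $destination:",
      " $total_bytes{$source}{$destination} bytes\n";
  }
  print "\n";
}
```

With this input, the two transfers from professor.hut to lovey.howell.hut (910 and 450 bytes) collapse into a single report line of 1360 bytes.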
This is a typical data-reduction report generation strategy. Simply
create a hash-of-hashrefs (perhaps nested even deeper, as
you'll see later), using autovivification to fill in
the gaps in the upper data structures as needed, and then walk
through the resulting data structure to display the
results.