Recipe 14.19 Program: ggh (Grep Netscape Global History)

This program divulges the contents of Netscape's history.db file. It can be called with full URLs or with a single pattern. If called without arguments, it displays every entry in the history file. The ~/.netscape/history.db file is used unless the -database option is given.

Each output line shows the access time followed by the URL. The time is converted into localtime representation with -localtime (the default) or gmtime representation with -gmtime, or left in raw form with -epochtime, which is useful for sorting by date. Getopt::Long accepts unambiguous abbreviations of option names, so the examples below write -epoch for -epochtime.

To specify a pattern to match against, give a single argument without a ://.

To look up one or more URLs, supply them as arguments:

    % ggh http://www.perl.com/index.html

To find a link you don't quite recall, use a regular expression (a single argument without a :// is a pattern):

    % ggh perl

To find out everyone you've mailed:

    % ggh mailto:

To find out the FAQ sites you've visited, use a snazzy Perl pattern with an embedded /i modifier:

    % ggh -regexp '(?i)\bfaq\b'

If you don't want the internal date converted to localtime, use -epoch:

    % ggh -epoch http://www.perl.com/perl/

If you prefer gmtime to localtime, use -gmtime:

    % ggh -gmtime http://www.perl.com/perl/

To look at the whole file, give no arguments (but perhaps redirect to a pager):

    % ggh | less

If you want the output sorted by date, use the -epoch flag:

    % ggh -epoch | sort -rn | less

If you want it sorted by date into your local time zone format, use a more sophisticated pipeline:

    % ggh -epoch | sort -rn | perl -pe 's/\d+/localtime $&/e' | less

The Netscape release notes claim that they're using NDBM format. This is misleading: they're actually using Berkeley DB format, which is why we require DB_File (not supplied standard with all systems Perl runs on) instead of NDBM_File (which is). The program is shown in Example 14-7.

Example 14-7. ggh

#!/usr/bin/perl -w
# ggh -- grovel global history in netscape logs
$USAGE = <<EO_COMPLAINT;
usage: $0 [-database dbfilename] [-help]
           [-epochtime | -localtime | -gmtime]
           [ [-regexp] pattern | href ... ]
EO_COMPLAINT

use Getopt::Long;

($opt_database, $opt_epochtime, $opt_localtime,
 $opt_gmtime,   $opt_regexp,    $opt_help,
 $pattern,                                  ) = (0) x 7;

usage() unless GetOptions qw{ database=s
                              regexp=s
                              epochtime localtime gmtime
                              help
                            };

if ($opt_help) { print $USAGE; exit; }

usage("only one of localtime, gmtime, and epochtime allowed")
    if $opt_localtime + $opt_gmtime + $opt_epochtime > 1;

# an explicit -regexp wins; otherwise a lone argument without :// is a pattern
if ( $opt_regexp ) {
    $pattern = $opt_regexp;
} elsif (@ARGV && $ARGV[0] !~ m(://)) {
    $pattern = shift;
}

usage("can't mix URLs and explicit patterns")
    if $pattern && @ARGV;

# compile the pattern once against an empty string to catch syntax errors early
if ($pattern && !eval { '' =~ /$pattern/; 1 } ) {
    $@ =~ s/ at \w+ line \d+\.//;
    die "$0: bad pattern $@";
}

require DB_File; DB_File->import();  # delay loading until runtime
$| = 1;                              # feed the hungry PAGERs

$dotdir  = $ENV{HOME}    || $ENV{LOGNAME};

$HISTORY = $opt_database || "$dotdir/.netscape/history.db";

die "no netscape history dbase in $HISTORY: $!" unless -e $HISTORY;
die "can't dbmopen $HISTORY: $!" unless dbmopen %hist_db, $HISTORY, 0666;

# the next line is a hack because the C programmers who did this
# didn't understand strlen vs strlen+1.  jwz told me so. :-)
$add_nulls   = (ord(substr(each %hist_db, -1)) == 0);

# XXX: should now do scalar keys to reset but don't
#      want cost of full traverse, required on tied hashes.
#      better to close and reopen?

$nulled_href = "";
$byte_order  = "V";         # PC people don't grok "N" (network order)

if (@ARGV) {
    # look up each URL given on the command line
    foreach $href (@ARGV) {
        $nulled_href = $href . ($add_nulls && "\0");
        unless ($binary_time = $hist_db{$nulled_href}) {
            warn "$0: No history entry for HREF $href\n";
            next;
        }
        $epoch_secs = unpack($byte_order, $binary_time);
        $stardate   = $opt_epochtime ? $epoch_secs
                                     : $opt_gmtime ? gmtime    $epoch_secs
                                                   : localtime $epoch_secs;
        print "$stardate $href\n";
    }
} else {
    # no URLs: walk the whole file, filtering by the pattern if there is one
    while ( ($href, $binary_time) = each %hist_db ) {
        chop $href if $add_nulls;
        next unless defined $href && defined $binary_time;
        # gnat reports some binary times are missing
        $binary_time = pack($byte_order, 0) unless $binary_time;
        $epoch_secs = unpack($byte_order, $binary_time);
        $stardate   = $opt_epochtime ? $epoch_secs
                                     : $opt_gmtime ? gmtime    $epoch_secs
                                                   : localtime $epoch_secs;
        print "$stardate $href\n" unless $pattern && $href !~ /$pattern/o;
    }
}

sub usage {
    print STDERR "@_\n" if @_;
    die $USAGE;
}

14.19.1 See Also

The Introduction to this chapter; Recipe 6.18
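The record layout that ggh depends on (a URL key that may carry a trailing NUL byte, and a value holding the access time packed in "V" little-endian order) can be tried out without a real history.db. What follows is only a minimal sketch under stated assumptions: %fake_db, keys_are_nulled, and decode_entry are hypothetical names that do not appear in the program above, and the URLs and timestamps are invented for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the tied %hist_db hash. The URLs and timestamps
# are invented; a real history.db would be opened through DB_File instead.
my %fake_db = (
    "http://www.perl.com/index.html\0" => pack("V", 915_148_800),
    "mailto:someone\@example.com\0"    => pack("V", 946_684_800),
);

# Return 1 if the keys carry a trailing NUL byte (the strlen+1 quirk that
# ggh probes for), 0 otherwise.
sub keys_are_nulled {
    my ($db) = @_;
    my ($first) = keys %$db;    # cheap on a plain hash, costly on a tied one
    return (defined $first && substr($first, -1) eq "\0") ? 1 : 0;
}

# Turn one raw record into (url, epoch_seconds).
sub decode_entry {
    my ($key, $value, $nulled) = @_;
    my $url = $key;
    $url =~ s/\0\z// if $nulled;        # strip the trailing NUL, if any
    my $epoch = unpack("V", $value);    # "V" = unsigned 32-bit, little-endian
    return ($url, $epoch);
}

my $nulled = keys_are_nulled(\%fake_db);

# Newest first, formatted in UTC so the output does not depend on the
# local time zone.
my @rows = sort { $b->[1] <=> $a->[1] }
           map  { [ decode_entry($_, $fake_db{$_}, $nulled) ] } keys %fake_db;

printf "%s %s\n", scalar gmtime($_->[1]), $_->[0] for @rows;
```

Sorting on the decoded epoch value inside the script plays the same role as piping ggh -epoch through sort -rn. Unlike ggh, the sketch probes the first key with keys rather than each, which is harmless on an ordinary hash but would force a full traversal of a tied one.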