
Recipe 21.9 Customizing Apache's Logging

21.9.1 Problem

You want to change how Apache logs requests. For example, you want a database of URLs and access counts, or per-user logs.

21.9.2 Solution

Install a handler with PerlLogHandler:

PerlModule Apache::MyLogger
PerlLogHandler Apache::MyLogger

Within the handler, methods on the request object return information about the completed request. The following list shows each call with a sample return value; $r is the request object and $c is the connection object obtained from $r->connection:

$r->the_request               GET /roast/chickens.html HTTP/1.1
$r->uri                       /roast/chickens.html
$r->header_in("User-Agent")   Mozilla-XXX
$r->header_in("Referer")      http://gargle.com/?search=h0t%20chix0rz
$r->bytes_sent                1648
$c->get_remote_host           208.201.239.56
$r->status_line               200 OK
$r->server_hostname           www.myserver.com

21.9.3 Discussion

Apache calls logging handlers after sending the response to the client. You have full access to the request and response parameters, such as client IP address, headers, status, and even content. Access this information through method calls on the request object.
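As a rough illustration, a minimal logging handler might look like the following sketch. It assumes the mod_perl 1 API (the same one the header_in calls in the Solution belong to). The log path /tmp/mylogger.log and the tab-separated format are hypothetical choices made for the example, and $r->status (the numeric status code) stands in for $r->status_line. The values are written unescaped to keep the sketch short.

```perl
package Apache::MyLogger;

use strict;
use warnings;
use Apache::Constants qw(OK);
use Fcntl qw(:flock);

# Hypothetical log location, chosen only for this sketch.
my $LOGFILE = "/tmp/mylogger.log";

sub handler {
    my $r = shift;

    # One tab-separated line per request: URI, numeric status, bytes sent.
    my $line = join("\t", $r->uri, $r->status, $r->bytes_sent) . "\n";

    if (open(my $fh, ">>", $LOGFILE)) {
        flock($fh, LOCK_EX);    # several server children may write at once
        print $fh $line;
        close($fh);             # closing the handle releases the lock
    } else {
        $r->log_error("Apache::MyLogger: can't append to $LOGFILE: $!");
    }

    return OK;
}

1;
```

The handler returns OK even when the file cannot be opened, so a logging problem is reported in the server's error log rather than affecting the request.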

You'll probably want to escape values before writing them to a text file because spaces, newlines, and quotes could spoil the formatting of the files. Two useful functions are:

# return string with newlines and double quotes escaped
sub escape {
  my $str = shift;    # avoid $a, which sort uses as a special variable
  $str =~ s/([\n"])/sprintf("%%%02x", ord($1))/ge;
  return $str;
}

# return string with newlines, spaces, and double quotes escaped
sub escape_plus {
  my $str = shift;
  $str =~ s/([\n "])/sprintf("%%%02x", ord($1))/ge;
  return $str;
}
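If you later need to read the log back, the encoding has to be reversible. The sketch below is one possible way to do that. The names escape_strict and unescape are hypothetical helpers, not part of any module. escape_strict uses the same technique as escape_plus but also escapes the percent sign, on the assumption that logged values may themselves contain "%"; without that, a literal "%0a" in a value could not be told apart from an escaped newline.

```perl
use strict;
use warnings;

# Like escape_plus, but also escapes "%" so that every %xx in the
# output is known to be an escape sequence.
sub escape_strict {
    my $str = shift;
    $str =~ s/([\n "%])/sprintf("%%%02x", ord($1))/ge;
    return $str;
}

# Reverse the encoding: turn each %xx back into the character it names.
sub unescape {
    my $str = shift;
    $str =~ s/%([0-9a-fA-F]{2})/chr(hex($1))/ge;
    return $str;
}

my $agent   = qq{Mozilla "XXX" 100%\n};
my $encoded = escape_strict($agent);

print "$encoded\n";    # prints Mozilla%20%22XXX%22%20100%25%0a
print unescape($encoded) eq $agent ? "round trip ok\n" : "mismatch\n";
```

Because the substitution in unescape makes a single left-to-right pass, a decoded "%" is never rescanned, so the round trip returns the original string.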

Two prebuilt logging modules on CPAN are Apache::Traffic and Apache::DBILogger. Apache::Traffic lets you assign owner strings (either usernames, UIDs, or arbitrary strings) to your web server's directories in httpd.conf. Apache::Traffic builds a DBM database as Apache serves files from these directories. For each owner, the database records the number of hits their directories received each day and the total number of bytes transferred by those hits.

Apache::DBILogger is a more general interface, logging each hit as a new entry in a table. The table has columns for data such as which virtual host delivered the data, the client's IP address, the user agent (browser), the date, the number of bytes transferred, and so on. Using this table and suitable indexes and queries, you can answer almost any question about traffic on your web site.

The logging handler runs before Apache has closed the connection to the client, so a slow logging operation keeps that connection open. If your logging is slow, don't use this phase. Instead, install the handler with PerlCleanupHandler so that it runs after the connection is closed.
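Assuming the same hypothetical Apache::MyLogger module as in the Solution, the configuration for the cleanup phase would differ only in the directive name:

```apache
PerlModule Apache::MyLogger
PerlCleanupHandler Apache::MyLogger
```

The module's handler subroutine itself would not need to change for this sketch; only the phase in which Apache invokes it does.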

21.9.4 See Also

Writing Apache Modules with Perl and C; Chapter 16 of mod_perl Developer's Cookbook; documentation for the Apache::Traffic and Apache::DBILogger CPAN modules; the Apache.pm manpage
