Recipe 21.9 Customizing Apache's Logging
21.9.1 Problem
You want to change how Apache logs requests. For example, you want a
database of URLs and access counts, or per-user logs.
21.9.2 Solution
Install a handler with PerlLogHandler:
PerlModule Apache::MyLogger
PerlLogHandler Apache::MyLogger
Within the handler, methods on the request object obtain information
about the completed request. In the following table, $r is the request
object and $c is the connection object obtained from $r->connection.
Each method call is shown with a sample return value:
$r->the_request                GET /roast/chickens.html HTTP/1.1
$r->uri                        /roast/chickens.html
$r->header_in("User-Agent")    Mozilla-XXX
$r->header_in("Referer")       http://gargle.com/?search=h0t%20chix0rz
$r->bytes_sent                 1648
$c->get_remote_host            208.201.239.56
$r->status_line                200 OK
$r->server_hostname            www.myserver.com
21.9.3 Discussion
Apache calls logging handlers after sending the response to the
client. You have full access to the request and response parameters,
such as client IP address, headers, status, and even content. Access
this information through method calls on the request object.
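To make this concrete, here is a minimal sketch of what Apache::MyLogger might look like. It uses only the methods listed in the Solution. The log path and the format_entry helper are invented for illustration, and OK is defined locally (it has the same value, 0, as the OK exported by Apache::Constants) only so that the file compiles outside Apache.

```perl
package Apache::MyLogger;
use strict;
use warnings;
use Fcntl qw(:flock);

# Under mod_perl you would normally say: use Apache::Constants qw(OK);
# Defined here so the sketch compiles without mod_perl installed.
use constant OK => 0;

my $LOGFILE = "/tmp/mylogger.log";    # assumed location, change to suit

# Build one log line from the request object (hypothetical helper).
sub format_entry {
    my $r = shift;
    my $c = $r->connection;
    return join(" ",
        $c->get_remote_host,
        '"' . $r->the_request . '"',
        $r->status_line,
        $r->bytes_sent,
    ) . "\n";
}

# Called by Apache during the logging phase.
sub handler {
    my $r = shift;
    # A logging failure should not turn into a failed request.
    open(my $fh, ">>", $LOGFILE) or return OK;
    flock($fh, LOCK_EX);              # several children may log at once
    print $fh format_entry($r);
    close($fh);
    return OK;
}

1;
```

Keeping the formatting in its own subroutine means it can be exercised with any object that answers the same method calls, without a running server.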
You'll probably want to escape values before writing them to a text
file because spaces, newlines, and quotes could spoil the formatting
of the files. Two useful functions are:
# return string with newlines and double quotes escaped
sub escape {
    my $str = shift;    # not $a, which is reserved for sort
    $str =~ s/([\n"])/sprintf("%%%02x", ord($1))/ge;
    return $str;
}
# return string with newlines, spaces, and double quotes escaped
sub escape_plus {
    my $str = shift;
    $str =~ s/([\n "])/sprintf("%%%02x", ord($1))/ge;
    return $str;
}
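One caveat with these functions: they leave % itself alone, so a literal %22 in the input looks the same in the log as an escaped double quote. If you need to decode the log reliably later, a variant along these lines may help. The names escape_strict and unescape are made up for this sketch.

```perl
use strict;
use warnings;

# Like escape(), but also encodes "%" so that decoding is unambiguous.
sub escape_strict {
    my $str = shift;
    $str =~ s/([%\n"])/sprintf("%%%02x", ord($1))/ge;
    return $str;
}

# Reverse the encoding when reading the log back.
sub unescape {
    my $str = shift;
    $str =~ s/%([0-9a-fA-F]{2})/chr(hex($1))/ge;
    return $str;
}

my $agent   = qq{Evil "Bot"\n100%};
my $escaped = escape_strict($agent);
print "$escaped\n";                       # Evil %22Bot%22%0a100%25
print "round trip ok\n" if unescape($escaped) eq $agent;
```

Because the substitution makes a single pass over the original string, the % signs it inserts are never escaped a second time.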
Two prebuilt logging modules on CPAN are
Apache::Traffic and Apache::DBILogger. Apache::Traffic lets you
assign owner strings (either usernames, UIDs, or arbitrary strings)
to your web server's directories in httpd.conf.
Apache::Traffic builds a DBM database as Apache serves files from
these directories. For each owner, the database records the number of
hits their directories received each day and the total number of
bytes transferred by those hits.
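The kind of per-owner, per-day accounting described above can be sketched with a tied DBM hash. This is not Apache::Traffic's actual code or file layout. The record_hit function, the key format, and the value format are all assumptions made for the example, and SDBM_File is used only because it ships with Perl.

```perl
use strict;
use warnings;
use Fcntl;                    # O_RDWR, O_CREAT
use SDBM_File;                # DBM implementation bundled with Perl
use POSIX qw(strftime);

# Add one hit and its byte count to the owner's totals for the day.
# Keys look like "owner YYYY-MM-DD"; values look like "hits,bytes".
sub record_hit {
    my ($dbfile, $owner, $bytes, $time) = @_;
    $time = time unless defined $time;
    my $day = strftime("%Y-%m-%d", localtime($time));
    my $key = "$owner $day";

    tie(my %db, "SDBM_File", $dbfile, O_RDWR | O_CREAT, 0644)
        or die "Can't open $dbfile: $!";
    my ($hits, $total) = split /,/, ($db{$key} || "0,0");
    $db{$key} = join(",", $hits + 1, $total + $bytes);
    untie %db;
}
```

A logging handler could call something like record_hit($file, $owner, $r->bytes_sent), with the owner string coming from configuration (for instance a PerlSetVar value read with $r->dir_config, under a variable name of your choosing). A production version would also need to lock the database, because several Apache children may update it at the same moment.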
Apache::DBILogger is a more general interface, logging each hit as a
new entry in a table. The table has columns for data such as which
virtual host delivered the data, the client's IP address, the user
agent (browser), the date, the number of bytes transferred, and so
on. Using this table and suitable indexes and queries, you can answer
almost any question about traffic on your web site.
Because the logging handler runs before Apache has closed the
connection to the client, don't use this phase if you have a slow
logging operation. Instead, install the handler with
PerlCleanupHandler so that it runs after the connection is closed.
21.9.4 See Also
Writing Apache Modules with Perl and C;
Chapter 16 of mod_perl Developer's Cookbook;
documentation for the Apache::Traffic and Apache::DBILogger CPAN
modules; the Apache.pm manpage