6.11 Partitioning Large Log Tables
6.11.1 Problem
Log tables are usually very large in
that they contain a large number
of rows. However, only the most recent events are usually accessed on
a frequent basis. You want to create a structure that will take
advantage of this fact, allowing the more recent records to be
queried in an efficient manner.
6.11.2 Solution
Divide your log table into two tables. They should both have the same
structure. One table will hold the most recent events — for
example, the current month's data, and the other
table will hold everything else. The table with the most recent data
can be referred to as the current log, while the other table is
referred to as the archive log. You can then combine the data from
the two tables into one row set using the following query:
SELECT p1, p2, p3 ... pn FROM log_work
UNION
SELECT p1,p2,p3 ... pn FROM log_archive
6.11.3 Discussion
With this kind of a structure, the current log will always be
relatively short compared to the archive log. This is good for insert
operations, especially if you choose not to index the table. New log
entries are always written to the current log table, and the fewer
the indices to maintain, the better your performance will be. Because
it's relatively small, you may still be able to get
decent query performance from the current log table even without an
index. The archive log table, on the other hand, should be indexed
since it's relatively large.
This combined structure is very efficient for many types of
operations, but it requires additional work on the part of your
programmers, who must develop regular batch procedures that
periodically move older log records from the current log table to the
archive table. Still, when performance is the critical goal, this
solution can be very beneficial.
The UNION query can be nicely encapsulated into a UNION view. Then
you have a view that behaves as a single table even though the
underlying implementation uses two tables. Unfortunately, not all
database vendors implement UNION views. If you are not using MS SQL
Server, you'll have to check your database
server's documentation to see if the UNION view
option is available to you.
|
The solution shown in this recipe doesn't apply only
to log tables. It can be applied to any type of large table, where
only a small fraction of the rows in that table are used consistently
and the rest are used infrequently.
|
|
|