9.6 Developing a Backup Strategy
This is a hardware book, so we don't spend much time on
software. But, in our experience, many people who buy a tape drive
have no idea how to use it effectively. We won't try
to explain how to use your backup software, because the specifics
vary and nearly any software bundled with a tape drive is sufficient
for the task, but we will devote some space to explaining how to get
the most from your tape drive and backup software.
9.6.1 File Attributes and the Archive Bit
If you have a tape drive large enough to back up your entire hard
disk and the time necessary to use only complete backups, the status
of any particular file doesn't matter. Every file
gets backed up every time, whether it was created that day or has
been sitting unchanged for a year. But if you need to use some
combination of complete and partial backups, the status of each file
becomes critical. If a file is unchanged since the last complete
backup, you want to ignore it when doing partial backups. If the file
was created or changed since the last complete backup, it needs to be
copied to the partial backup tape.
All modern operating systems maintain a file attribute for each file
called the archive bit. When a file is created or changed, the
operating system turns the archive bit on, indicating that the file
is a candidate for backup.
Backup software can manipulate the archive bit, either turning it off
after it backs up the file, or leaving it on so that file will again
be backed up the next time you do a partial backup.
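The life cycle of the archive bit can be sketched in a few lines of Python. This is only an illustrative model, not how any particular operating system or backup program is implemented: the `FileEntry` class and its method names are hypothetical, and the only real-world detail is the flag value 0x20, which is the archive attribute on DOS and Windows file systems.

```python
from dataclasses import dataclass

ARCHIVE = 0x20  # archive attribute flag value on DOS/Windows file systems


@dataclass
class FileEntry:
    """Hypothetical stand-in for a directory entry with an attribute byte."""
    name: str
    attributes: int = 0

    def write(self) -> None:
        # Models the operating system creating or changing the file:
        # the archive bit is turned on.
        self.attributes |= ARCHIVE

    def needs_backup(self) -> bool:
        # A partial backup copies the file only if the archive bit is on.
        return bool(self.attributes & ARCHIVE)

    def clear_archive(self) -> None:
        # Models backup software turning the bit off after copying the file.
        self.attributes &= ~ARCHIVE


report = FileEntry("report.doc")
report.write()                 # file is created
print(report.needs_backup())   # True
report.clear_archive()         # backup software copies it and clears the bit
print(report.needs_backup())   # False
report.write()                 # user edits the file again
print(report.needs_backup())   # True
```

On a real Windows system, Python exposes the same flag through `os.stat(path).st_file_attributes` and `stat.FILE_ATTRIBUTE_ARCHIVE`, so a script can inspect the bit without modeling it.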
9.6.2 Understanding Backup Types
Backup software can use or ignore the archive bit in determining
which files to back up, and can either turn the archive bit off or
leave it unchanged when the backup is complete. How the archive bit
is used and manipulated determines what type of backup is done, as
follows.
- Full Backup
-
A Full Backup, which Microsoft calls a
Normal Backup, backs up every selected file,
regardless of the status of the archive bit. When the backup
completes, the backup software turns off the archive bit for every
file that was backed up. Note that
"full" is a misnomer, because a
full backup backs up only the files you have
selected, which may be as little as one
directory or even a single file, so in that sense
Microsoft's terminology is actually more accurate.
Given the choice, full backup is the method to use, because all files
are on one tape, which makes it much easier to retrieve files from
tape when necessary. Relative to partial backups, full backups also
increase redundancy, because all files are on all tapes. That means
that if one tape fails, you may still be able to retrieve a given
file from another tape.
- Differential Backup
-
A Differential Backup is a partial backup that
copies a selected file to tape only if the archive bit for that file
is turned on, indicating that it has changed since the last Full
Backup. A Differential Backup leaves the archive bits unchanged on
the files it copies. Accordingly, any Differential Backup set
contains all files that have changed since the last Full Backup. A
Differential Backup set run soon after a Full Backup will contain
relatively few files. One run soon before the next Full Backup is due
will contain many files, including those contained on all previous
Differential Backup sets since the last Full Backup. When you use
Differential Backup, a complete backup set comprises only two tapes
or tape sets: the tape that contains the last Full Backup and the
tape that contains the most recent Differential Backup.
- Incremental Backup
-
An Incremental Backup is another form of partial
backup. Like Differential Backups, Incremental Backups copy a
selected file to tape only if the archive bit for that file is turned
on. Unlike the Differential Backup, however, the Incremental Backup
clears the archive bits for the files it backs up. An Incremental
Backup set therefore contains only files that have changed since the
last Full Backup or the last Incremental Backup.
If you run an Incremental Backup daily, files changed on Monday are
on the Monday tape, files changed on Tuesday are on the Tuesday tape,
and so forth. When you use an Incremental Backup scheme, a complete
backup set comprises the tape that contains the last Full Backup and
all of the tapes that contain every Incremental Backup done since the
last Full Backup. The only advantages of Incremental Backups are
that they minimize backup time and keep multiple versions of files
that change frequently. The disadvantages are that backed-up files
are scattered across multiple tapes, making it difficult to locate
any particular file you need to restore, and that there is no
redundancy. That is, each file is stored only on one tape.
- Full Copy Backup
-
A Full Copy Backup (which Microsoft calls a
Copy Backup) is identical to a Full Backup
except for the last step. The Full Backup finishes by turning off the
archive bit on all files that have been backed up. The Full Copy
Backup instead leaves the archive bits unchanged. The Full Copy
Backup is useful only if you are using a combination of Full Backups
and Incremental or Differential partial backups. The Full Copy Backup
allows you to make a duplicate
"full" backup, e.g., for storage
off-site, without altering the state of the hard drive you are
backing up, which would destroy the integrity of the partial backup
rotation.
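The four backup types described above differ in only two respects: whether the archive bit is consulted when selecting files, and whether it is cleared afterward. The following Python sketch models that under simplified assumptions. The `run_backup` and `simulate` functions, the file names, and the three-day schedule are all hypothetical, and each "tape" is just a list of file names.

```python
from enum import Enum


class BackupType(Enum):
    FULL = "full"
    FULL_COPY = "full copy"
    DIFFERENTIAL = "differential"
    INCREMENTAL = "incremental"


# (ignores archive bit when selecting files, clears archive bit afterward)
BEHAVIOR = {
    BackupType.FULL: (True, True),
    BackupType.FULL_COPY: (True, False),
    BackupType.DIFFERENTIAL: (False, False),
    BackupType.INCREMENTAL: (False, True),
}


def run_backup(archive_bits: dict, kind: BackupType) -> list:
    """Return the sorted names copied to tape; update archive_bits in place."""
    ignores_bit, clears_bit = BEHAVIOR[kind]
    copied = sorted(n for n, bit in archive_bits.items() if ignores_bit or bit)
    if clears_bit:
        for name in copied:
            archive_bits[name] = False
    return copied


def simulate(partial: BackupType) -> list:
    """One Full Backup, then three daily partial backups of the given type."""
    bits = {"a.txt": True, "b.txt": True, "c.txt": True}
    run_backup(bits, BackupType.FULL)
    tapes = []
    for changed in ("a.txt", "b.txt", "c.txt"):  # one file changes per day
        bits[changed] = True                      # the OS turns the bit on
        tapes.append(run_backup(bits, partial))
    return tapes


print(simulate(BackupType.DIFFERENTIAL))
# [['a.txt'], ['a.txt', 'b.txt'], ['a.txt', 'b.txt', 'c.txt']]
print(simulate(BackupType.INCREMENTAL))
# [['a.txt'], ['b.txt'], ['c.txt']]
```

In this model, a complete restore after the third day needs the Full Backup plus only the last Differential tape, but the Full Backup plus all three Incremental tapes.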
Some Microsoft backup software provides a bizarre backup method they
call a Daily Copy Backup. This method ignores the archive
bit entirely and instead depends on the date/time stamp of files to
determine which files should be backed up. The problem is,
it's quite possible for software to change a file
without changing the date/time stamp, or to change the date/time
stamp without changing the contents of the file. For this reason, we
regard the Daily Copy Backup as entirely unreliable and recommend you
avoid using it.
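The weakness of relying on the date/time stamp is easy to demonstrate. The short Python sketch below uses only the standard library; the file name and the pinned timestamp are arbitrary choices made for illustration. It changes a file's contents while restoring the old modification time, and then changes the modification time while leaving the contents alone.

```python
import os
import tempfile


def snapshot(path):
    """Return (modification time in nanoseconds, file contents)."""
    with open(path, "rb") as f:
        data = f.read()
    return os.stat(path).st_mtime_ns, data


ONE_DAY_NS = 86_400 * 10**9

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "ledger.dat")
    with open(path, "wb") as f:
        f.write(b"balance=100")
    # Pin the file to a known, whole-second timestamp (arbitrary value).
    pinned = 1_000_000_000 * 10**9
    os.utime(path, ns=(pinned, pinned))
    mtime_before, data_before = snapshot(path)

    # Case 1: change the contents, then put the old timestamp back.
    with open(path, "wb") as f:
        f.write(b"balance=999")
    os.utime(path, ns=(mtime_before, mtime_before))
    mtime_after, data_after = snapshot(path)
    content_changed_stamp_same = (
        data_after != data_before and mtime_after == mtime_before
    )

    # Case 2: move the timestamp forward a day without touching the contents.
    os.utime(path, ns=(mtime_after + ONE_DAY_NS, mtime_after + ONE_DAY_NS))
    mtime_touched, data_touched = snapshot(path)
    stamp_changed_content_same = (
        mtime_touched != mtime_after and data_touched == data_after
    )

print(content_changed_stamp_same)   # True
print(stamp_changed_content_same)   # True
```

A backup method that selects files by date alone would skip the file in the first case and copy it needlessly in the second.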
9.6.3 Choosing a Tape Rotation Method
A tape rotation method is a
procedure that specifies when each particular tape will be used, and
what will be backed up to it. For example, for a simple tape rotation
scheme, you might label five tapes Monday through Friday and then do
a complete Full Backup to the corresponding tape each day. Some tape
rotation methods are simple and use only a few tapes. Others are
immensely complex and use many tapes. Choosing the most appropriate
tape rotation method is a critical step in developing and
implementing your backup plan.
On one extreme, you could use the same tape
every day, but that has obvious dangers, including the risk of that
one tape being lost or damaged, the inability to retrieve a file that
was deleted or corrupted more than a day previous, and the inability
to keep an off-site copy. On the other extreme, Robert once did some
consulting for a law firm that never reuses a backup tape. Every
evening they do a complete backup and compare of their
"active" volumes to a new tape,
which is then stored indefinitely in their vault. They regard the
small daily cost of a new backup tape as trivial relative to the
benefit of being able to reconstruct their data exactly for any
specified day.
Chances are, the best tape rotation method for you falls somewhere
between those extremes. Here are some issues to think about when you
choose a tape rotation method:
- Availability
-
When you need to do a restore, whether of a single file accidentally
deleted or of an entire volume whose hard drive crashed, time is
often important. A proper tape rotation scheme ensures that the most
recent backup data is immediately available to restore.
- Archiving
-
The most recent version of your backup data may not be good enough.
Perhaps a file was accidentally deleted or a database improperly
modified some time ago, but that was only recently discovered. The
most recent backup may, for various reasons, be missing the file you
need. An ideal tape rotation method allows you to retrieve a version
of a file from days, weeks, or months previous, before the file had
been deleted or improperly modified. Tape sets created with the best
and most powerful tape rotation methods allow you to select from
multiple versions of the file, so that you can retrieve the most
recent good version. A good tape rotation method also makes
provisions for periodically removing a tape from the rotation and
archiving it for historical reasons.
- Redundancy
-
Tapes can break or be misplaced. Someone may overwrite the wrong
tape. A good tape rotation scheme recognizes these facts, and uses
redundancy to minimize the effect of such problems. If the file
can't be retrieved from one tape, it should be
retrievable from another.
- Equalized tape wear
-
Ideally, you'd like all tapes in the set to be used
equally often to distribute wear evenly across the set. The simpler
tape rotation methods usually fall down in this regard. For example,
the popular Grandfather-Father-Son rotation, described later in this
section, requires writing to some tapes in the set once a week, to
others once a month, and to still others only once a year. Although
equalizing tape wear is a less important consideration for most users
than the others described, doing so is desirable in that it minimizes
the chance that a tape will break, stretch, or otherwise become
unusable because it has been used too frequently.
Many standard tape rotation methods exist. Some are simple and use
few tapes, but fail to meet some of the goals described above. Others
meet each goal, or nearly so, but are difficult to manage and require
many tapes. Some methods use only full backups, others use both full
and partial backups, and still others may be modified to use either
only full backups or a combination of full and partial backups.
If you have a choice, use only full backups. Use partial backups only
if you are forced to do so by limited tape drive capacity or a backup
window that is too short to allow using all complete backups. When it
comes time to restore, you will find that it makes your life much
easier to have the entire data set in one place rather than
distributed among multiple tapes.
Here are the most common backup rotations:
- Daily Full
-
The simplest rotation is to do a complete Full Backup each day,
assuming you have both adequate tape drive capacity and a long enough
backup window. Most sites that use this method use 10 tapes, labeled
"Monday A" through
"Friday A" and
"Monday B" through
"Friday B." Using this method
offers the considerable advantages of simple administration and
extreme data redundancy. It's always obvious which
tape you should be backing up to. If you start a restore and your
most recent backup tape breaks, you simply use the next most recent
tape. All tapes receive equal wear, and can be replaced periodically
as a set. You can cycle each backup tape off-site as it is replaced
by today's backup, leaving your most recent backup
available on-site for easy restores, while having an off-site tape
that is only one day old. The sole disadvantage of this rotation is
that it limits you to retrieving historical data from only two weeks
prior, assuming that you use ten tapes. This problem is easily
addressed. Simply add four Quarterly tapes or twelve Monthly tapes to
the rotation, and do a duplicate backup to the appropriate archive
tape at the end of each quarter or month.
- Weekly Full with Daily Differential
-
This is probably the most commonly used rotation on PC-class systems.
In its simplest form, it requires only three tapes,
"Weekly A,"
"Weekly B," and
"Daily." On
"odd" Fridays, you do a Full Backup
to Weekly A. On "even" Fridays, you
do a Full Backup to Weekly B. Monday through Thursday, you do a
Differential Backup to the Daily tape. This rotation is simple to
manage and requires few tapes, but has the following disadvantages:
Historical data can be retrieved for a period of at most two weeks.
If you accidentally delete a file and don't realize
it for a couple of weeks, that file is gone for good.
If the Daily tape fails during a restore, your next most recent tape
is the last Weekly tape, which means you may lose as much as four
days' worth of data.
Only one current copy of the Full Backup exists, so you must either
keep it on-site for easy retrieval or off-site for safety.
Tape wear is very uneven, since the Daily tape is used eight times
more often than a Weekly tape.
Simply adding more tapes and making minor changes to the rotation
solves most of these problems. For example, add a tape to do a second
Full Backup each Friday, and store that tape off-site. Add a second
Daily tape and alternate using them, or simply use a tape for each
workday. To extend historical data, add four Quarterly tapes (or
twelve Monthly tapes), do a Full Backup to the appropriate tape on
the final day of the corresponding quarter (or month), and then store
the tape.
The Weekly Full with Daily Differential rotation described above is an
excellent choice for most people, but beware the similar-sounding
Weekly Full with Daily Incremental rotation, which is the worst
possible choice short of not backing up at all.
For some reason, this rotation is recommended in many books and even
in some tape drive manuals. Don't use it if you value your data! Its
only advantage is that it minimizes backup times, but
at the expense of data security. Because this rotation uses
Incremental Backups, each Daily tape contains a different group of
files. Restoring one file may require looking at multiple tapes to
ensure that you are restoring the most recent version. Doing a
complete restore requires that you be able to restore the most recent
Full Backup tape and all subsequent Daily tapes successfully. If
any Daily tape fails during the restore, you must either revert to
the last Full Backup, losing all subsequent changes to files, or
risk incoherent file versions caused by restoring only some of the
Daily tapes.
- The Grandfather-Father-Son Rotation
-
The Grandfather-Father-Son (GFS) tape
rotation method is more commonly used on servers than on personal
systems, but it's worth considering if your data is
very valuable and you think it's worth going to some
trouble and expense to secure it. GFS is the easiest to manage of any
of the "complex" tape rotations,
requires relatively few tapes, and is supported directly by every
backup program on the market. A typical GFS rotation tape set
requires 21 tapes, as follows:
Daily Tapes. Label four tapes Monday through Thursday. Back up each
day to the tape for the corresponding day, overwriting each tape once
a week.
Weekly Tapes. Label five tapes Friday-1 through Friday-5. Back up
each Friday to the corresponding weekly tape, using the Friday-5 tape
only in months that have 5 Fridays. Weekly tapes 1 through 4 are
overwritten once a month, with Friday-5 being overwritten less
frequently.
Monthly Tapes. Label twelve tapes January through December. Back up
the first (or last) of each month to the corresponding monthly tape.
Monthly tapes are overwritten only once per year.
GFS meets most of the goals of an ideal tape rotation method. You can
keep recent tape sets on-site, and migrate others off-site. GFS
provides weekly granularity for the preceding month and monthly
granularity for the preceding year. GFS provides numerous copies of
both recent and older data. The disadvantage to GFS is that tape wear
is uneven. Daily tapes are written once a week, weekly tapes once a
month, and monthly tapes only once a year. Uneven tape wear is a
small price to pay for the other advantages of GFS, however. Most GFS
rotations use Differential Backup for daily tapes and Full Backup for
weekly and monthly tapes, but nothing prevents you from using Full
Backup for all tapes.
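As a rough illustration of how the 21-tape GFS schedule maps dates to tapes, here is a minimal Python sketch. It rests on assumptions the text leaves open: backups run Monday through Friday only, the monthly tape is written on the first weekday of each month (the text allows either the first or the last), and the monthly backup replaces that day's daily or weekly backup. The function names are hypothetical.

```python
from datetime import date, timedelta
from typing import Optional

DAILY = ["Monday", "Tuesday", "Wednesday", "Thursday"]
WEEKLY = [f"Friday-{n}" for n in range(1, 6)]
MONTHLY = ["January", "February", "March", "April", "May", "June", "July",
           "August", "September", "October", "November", "December"]


def first_weekday_of_month(d: date) -> date:
    """First Monday-to-Friday date in d's month (assumed monthly backup day)."""
    first = date(d.year, d.month, 1)
    while first.weekday() > 4:          # 5 = Saturday, 6 = Sunday
        first += timedelta(days=1)
    return first


def gfs_tape(d: date) -> Optional[str]:
    """Return the label of the tape to write on date d, or None on weekends."""
    if d.weekday() > 4:
        return None
    if d == first_weekday_of_month(d):
        return MONTHLY[d.month - 1]     # monthly tape, overwritten once a year
    if d.weekday() == 4:
        return WEEKLY[(d.day - 1) // 7] # nth Friday of the month
    return DAILY[d.weekday()]           # Monday through Thursday


print(len(DAILY) + len(WEEKLY) + len(MONTHLY))   # 21
for day in (date(2024, 3, 1), date(2024, 3, 6),
            date(2024, 3, 8), date(2024, 3, 29)):
    print(day.isoformat(), gfs_tape(day))
# 2024-03-01 March
# 2024-03-06 Wednesday
# 2024-03-08 Friday-2
# 2024-03-29 Friday-5
```

The first weekday was chosen over the last because a fifth Friday always falls on the last weekday of its month, so a last-weekday monthly backup would leave the Friday-5 tape unused.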