11.2 Writable CD Formats
The physical and logical format used by writable CDs is defined in
the rainbow books described in the CD-ROM chapter. The following
sections provide an overview of how data is physically and logically
stored on writable CDs. For further detail, refer to the rainbow
books.
CD-R discs are manufactured with a pregroove
track, which is 600 nanometers (nm) wide with a 1,600nm pitch. The
pregroove includes an impressed timing wobble of
±3 nm radial excursion at 22.05 kHz, with an FM carrier
modulated at 1 kHz superimposed on the pregroove. This modulation
encodes an absolute clock signal (called absolute time in
pregroove, or ATIP), which provides an absolute location
reference for any sector on the CD-R disc. Absolute addresses on the
CD-R disc are specified in the form MM:SS:FF (minutes, seconds, and frames) using ATIP information.
Audio CDs are addressable in this manner with resolution of one
second (75 sectors). Data CDs are addressable to the individual
sector level.
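As a rough illustration of this addressing scheme, the following Python sketch converts between a minutes:seconds:frames time address and an absolute sector count. It relies only on the 75-sectors-per-second figure given above. The function names are ours, and any additional offsets a real drive applies are not modeled.

```python
SECTORS_PER_SECOND = 75  # one second of playing time spans 75 sectors (frames)


def msf_to_sector(minutes: int, seconds: int, frames: int) -> int:
    """Convert a minutes:seconds:frames address to an absolute sector count."""
    if not (0 <= seconds < 60 and 0 <= frames < SECTORS_PER_SECOND):
        raise ValueError("seconds must be 0-59 and frames 0-74")
    return (minutes * 60 + seconds) * SECTORS_PER_SECOND + frames


def sector_to_msf(sector: int) -> tuple[int, int, int]:
    """Convert an absolute sector count back to (minutes, seconds, frames)."""
    total_seconds, frames = divmod(sector, SECTORS_PER_SECOND)
    minutes, seconds = divmod(total_seconds, 60)
    return minutes, seconds, frames


print(msf_to_sector(74, 0, 0))   # 333000
print(sector_to_msf(333000))     # (74, 0, 0)
```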
11.2.1 Physical Formats
Because they must be readable in a standard
CD-ROM drive or CD player, writable CDs use a physical format nearly
identical to pressed CDs. The dimensions of a CD are 120.00mm in
diameter (60.00mm radius) with a 15mm diameter central hole, which
accommodates the rotating center spindle of the drive. Beginning at
the edge of the center hole (radius 7.50mm) and proceeding outwards,
a CD-R disc is divided into the following areas:
- Clamping Area
-
The Clamping Area is that portion of the disc
that the drive spindle grasps to rotate the disc. On a pressed CD,
this area extends from radius 7.50mm to 23.00mm. On a writable CD,
this area occupies radius 7.50mm to 22.35mm.
- System Use Area
-
The System Use Area (SUA) is present only on
writable discs, occupies radius 22.35mm to 23.00mm, and can be
thought of as equivalent to the boot sector of a hard disk. The SUA
contains data that tells a CD drive or player what kind of
information is stored on the disc, where it is located, and what
format it uses. The SUA lies inside the innermost radius that standard
CD-ROM drives and CD players can read, and so can be read (and written)
only by CD recorders. The SUA is divided into two sub-areas:
- Optimal Power Calibration Area
-
The Optimal Power Calibration Area (OPCA), often
called the Power Calibration Area (PCA) for
short, is used by the CD writer as a testing area to decide the best
write schema to use when writing to that disc.
Each time you insert a disc into a CD-R drive, the drive fires its
writing laser at the PCA to calibrate that disc against the drive.
Each such calibration uses one ATIP frame. At most 99 PCA ATIP frames
are available, which limits a CD-R disc to 99 or fewer
recording sessions.
Many variables determine how the drive should best write to that
disc—the type of dye and reflective backing material the disc
uses, the proposed write speed, the firmware level of the drive, and
so on. From this calibration testing, the drive decides the power
level to use when writing, and whether to use a short write schema
(typical for cyanine-based discs) or a long write schema (typical for
phthalocyanine- and azo-based discs). The PCA begins at radius 22.35mm
(ATIP -00:00:36 relative to the 23.00mm beginning of the Lead-in
Area).
- Program Memory Area
-
The Program Memory Area (PMA) begins where the
PCA ends, and extends to the beginning of the Lead-in Area at radius
23.0mm. The PMA is used to store a temporary Table of Contents (TOC)
until the disc is finalized or
closed. Closing a disc writes the temporary TOC
stored in the PMA to the Lead-in Area, described below. That makes
the TOC (and therefore the disc) readable by a CD-ROM drive or CD
player, but also means that the disc can no longer be written to by a
CD recorder. The PMA can store location information for up to 99
track numbers, including the start and stop times for each track (for
audio) or the sector addresses for data.
- Information Area
-
The Information Area (IA) occupies a width of
35.0mm to 35.5mm, beginning at radius 23.0mm and ending between
radius 58.0mm and 58.5mm. This area provides the general storage
space to which user data is written. The IA is the only area of the
CD that is visible to standard CD-ROM drives and CD players, and
includes the following sub-areas:
- Lead-in Area
-
The Lead-in Area occupies radius 23.0 to 25.0mm
on both pressed and writable CDs. This area contains digital silence
in the main channel, as well as control information in various
subcode channels, which can be used to provide additional information
to the drive or reader about the content of the disc. The most
important of the subcode channel data is the Table of Contents for
the disc, which is stored in the Q-channel. The length of the Lead-in
Area is determined by the space required to store TOC entries for the
up to 99 tracks that may be written to the
Program Area.
A CD has a main data channel—which stores audio and/or computer
data—and eight interleaved subcode channels, designated P
through W, which can store supplemental control data that can be read
by CD-ROM drives and CD players. When the CD format was originally
designed, it was intended that the main channel would contain only
data and that subcode channels would be used to store administrative
information. Nowadays, such supplemental information is usually
encoded within the main data channel, and the only subchannels that
are generally used are the P-channel, which specifies the start and
end of each track, and the Q-channel, which stores the TOC, the track
type/catalog number, and the timecodes (in minutes, seconds, and frames) used
to locate data on the disc. Subchannels R through W were formerly
sometimes used to store graphics and other supplemental data, but are
now seldom used. The DVD specification eliminates subchannel coding
as superfluous.
If you've ever wondered why a CD-R
disc that has been written to but not closed can be read in a CD
recorder but not in a standard CD-ROM drive or CD player, this is
why. Standard readers look for the TOC in the Lead-in Area, where it
has not yet been written for a disc that is not yet closed. CD
recorders can read the temporary TOC stored in the PMA, which allows
them to read that disc. The PMA is invisible to standard CD-ROM
drives and CD players, so as far as they're
concerned, that disc has no TOC.
- Program Area
-
The Program Area (PA) occupies a width of 33.0mm
to 33.5mm, beginning at radius 25.0mm and ending between radius
58.0mm and 58.5mm. The PA is where actual user data (audio or
computer data) is stored. The PA varies in capacity according to the
CD-R disc you use. Discs are available that store 63 minutes of audio
(which corresponds to about 550 MB of data), 74 minutes (~650 MB),
and 80 minutes (~700 MB). Different brands of discs also have minor
variations from nominal capacity. Some nominally 74-minute discs, for
example, can store as much as 76.5 minutes.
- Lead-out Area
-
The Lead-out Area occupies a width of 0.5mm to
1.0mm, beginning between radius 58.0mm and 58.5mm and ending between
radius 59.0mm and 59.5mm. The Lead-out Area is created when the disc
is closed, and defines the end of the Information Area.
- Edge
-
The remaining 0.5mm to 1.0mm at the outer edge of the disc is unused.
This area has no formal name that we know of, and exists simply to
protect the outer portion of the track from damage.
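The relationship between a disc's rated playing time and its data capacity, mentioned under Program Area above, can be sketched in a few lines of Python. This is only an approximation. It assumes 75 sectors per second and 2 KB (2,048 bytes) of user data per sector, and it ignores session overhead and brand-to-brand variation.

```python
SECTORS_PER_SECOND = 75
DATA_BYTES_PER_SECTOR = 2048  # assumption: 2 KB of user data per sector


def data_capacity_bytes(minutes: float) -> int:
    """Approximate data capacity of a disc rated for `minutes` of audio."""
    sectors = int(minutes * 60 * SECTORS_PER_SECOND)
    return sectors * DATA_BYTES_PER_SECTOR


for rated_minutes in (63, 74, 80):
    megabytes = data_capacity_bytes(rated_minutes) / (1024 * 1024)
    print(f"{rated_minutes}-minute disc: about {megabytes:.0f} MB")
```

For the 74- and 80-minute discs this works out to roughly 650 MB and 703 MB respectively.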
The preceding assumes that the data on the disc exists as one
session, which is nearly always true for commercial pressed CDs, as
well as for writable CDs produced using Disc-at-Once recording
(described in a later section). But Orange Book defines a concept
called multisession for CD-R discs.
With multisession recording, the overall disc layout remains the
same. As with a single-session disc, a multisession disc contains a
Lead-in Area, a Program Area, and a Lead-out Area. The difference is
that the Program Area on a multisession disc stores more than one
session, each of which contains its own session-based Lead-in Area,
Program Area, and Lead-out Area.
Like the disc itself, a session can be opened, written to, and
closed. When a session is closed, that session can no longer be
written to, but additional sessions can be added to the disc. In
fact, closing a session on a multisession disc automatically opens a
new session to which additional data can be written. Closing the
session writes the session TOC to the PMA. This session TOC includes
pointers to the start of the session Program Area for the new session
and to the start time of the last-used (outermost) Lead-out Area.
Closing the session does not close the disc, however, which means
that until the disc itself is closed, sessions on a multisession disc
can be read only by a CD recorder (which can read the temporary TOC
in the PMA) and by some recent CD-ROM drives. When the disc itself is
closed, all sessions are closed and the temporary TOC is written to
the Lead-in Area, allowing the disc to be read in any CD-ROM drive
and most CD players.
Although the PMA makes provision for 99 tracks or sessions, in
practice the number of sessions that can be recorded on a CD-R disc
is much lower because of the overhead required for each session. When
writing multiple sessions to a disc, the Lead-in Area for each
session occupies 4,500 sectors (60 seconds or 9,000 KB). The Lead-out
Area for the first session occupies 6,750 sectors (90 seconds or
13,500 KB). The Lead-out Area for the second and subsequent sessions
occupies 2,250 sectors (30 seconds or 4,500 KB).
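To see how quickly this overhead adds up, here is a small Python sketch based on the sector counts just given. The function names are ours, and the 2 KB-per-sector factor is inferred from the figures above (4,500 sectors is 9,000 KB).

```python
KB_PER_SECTOR = 2               # 4,500 sectors correspond to 9,000 KB
LEAD_IN_SECTORS = 4500          # written for every session
FIRST_LEAD_OUT_SECTORS = 6750   # first session only
LATER_LEAD_OUT_SECTORS = 2250   # second and subsequent sessions


def session_overhead_kb(session_number: int) -> int:
    """Lead-in plus lead-out overhead, in KB, for the given 1-based session."""
    if session_number < 1:
        raise ValueError("session numbers start at 1")
    if session_number == 1:
        lead_out = FIRST_LEAD_OUT_SECTORS
    else:
        lead_out = LATER_LEAD_OUT_SECTORS
    return (LEAD_IN_SECTORS + lead_out) * KB_PER_SECTOR


def total_overhead_kb(sessions: int) -> int:
    """Cumulative overhead, in KB, after writing `sessions` sessions."""
    return sum(session_overhead_kb(n) for n in range(1, sessions + 1))


print(session_overhead_kb(1))   # 22500
print(session_overhead_kb(2))   # 13500
print(total_overhead_kb(10))    # 144000
```

Ten sessions therefore consume about 144,000 KB (roughly 140 MB) in overhead alone, which is why a disc fills up long before it reaches 99 sessions.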
11.2.2 Logical Formats
The logical format of a CD specifies how
data is arranged on the CD, and largely determines how data may be
structured on the disc and what operating systems will be able to
access it. CDs commonly use one of the logical formats described in
the following sections.
11.2.2.1 ISO 9660
Most data CDs use the ISO 9660 format or one of its variants. ISO
9660 is based on the de facto standard High Sierra format, which was
developed by the CD-ROM industry as a cooperative effort because of
the lack of formal standards that then existed for writing data to
CDs. In the days before High Sierra came into use, it was quite
common to find that you could not read the data on a particular
CD-ROM because that CD was incompatible with your software.
The primary purpose of ISO 9660, which was adopted in 1988, was to
standardize a common logical data format for data CDs and, at the
same time, to facilitate data exchange among different computing
platforms. As a least-common-denominator format, the original ISO
9660 format is feature-poor because it supports only features that
are common across many platforms. For example, the MS-DOS 8.3
file-naming convention limited ISO-9660 to using 8.3 filenames.
At the time ISO 9660 was adopted, these limitations were not much of
a problem. Most people ran either MS-DOS or a Mac using floppy disks
or small hard disks, and the limitations of ISO 9660 were not onerous
in those environments. But the world soon changed, and the strict
limits enforced by ISO 9660 became a problem, particularly for those
who wanted to use deeply nested directories and long filenames.
Accordingly, the ISO 9660 specification was expanded to include three
ISO 9660 Interchange Levels for naming files and
directories on disc. From most to least restrictive, these include:
- ISO 9660 Level 1
-
ISO 9660 Level 1 is the least-common-denominator
level, developed to accommodate DOS filename limitations. Each file
must be written to disc as a single, continuous stream of bytes,
called an extent. Files may not be fragmented or
interleaved. Filenames may contain from one to eight d-characters.
Filename extensions may contain from zero to three d-characters (see
following section). Directory names may contain from one to eight
d-characters, and may not have an extension.
- ISO 9660 Level 2
-
ISO 9660 Level 2 also requires that files be
written to disc as a single extent, but relaxes the 8.3 limit: a filename,
including its extension, may be up to 31 characters long.
ISO 9660 Level 2 discs are unreadable by some operating systems,
notably DOS.
- ISO 9660 Level 3
-
ISO 9660 Level 3 allows a file to be written in
multiple extents, and so is used for packet writing. Filenames are
subject to the same rules as in ISO 9660
Level 2.
Strictly interpreted, ISO 9660 filenames must end with a semicolon
followed by the version number, e.g.,
FILENAME.TXT;1. Most operating systems ignore
these final two characters when they access files or display
directory listings. Versions of the Macintosh OS prior to 7.5 and
some versions of Unix do not suppress the semicolon and version
number, which causes problems if they attempt to access
FILENAME.TXT rather than the actual filename of
FILENAME.TXT;1.
The ISO 9660
Levels differ significantly in which characters are legal. In ISO
9660-speak, these characters are designated as follows:
- d-characters
-
For strict compliance with ISO 9660 Level 1 file and directory naming
conventions, only this character set may be used (and only in 8.3
format). d-characters include uppercase A through Z, digits 0 through
9, and the underscore character.
- a-characters
-
The character set usable for ISO Volume Descriptors (see below).
a-characters include all d-characters as well as the following
symbols: space; comma; semicolon; colon; period; question mark;
exclamation point; right and left parentheses; single and double
quotes; greater-than and less-than symbols; percent; ampersand;
equals; asterisk; plus and minus (hyphen) symbols; and forward slash.
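The Level 1 naming rules and the two character sets described above lend themselves to a simple check. The Python sketch below is one way to express them. The function names are ours, it tests the name without the trailing version suffix, and it is not taken from any particular mastering program.

```python
import re

D_CHARS = "A-Z0-9_"
# Level 1 filename: 1-8 d-characters, then an optional extension of 0-3 d-characters.
LEVEL1_FILE = re.compile(rf"[{D_CHARS}]{{1,8}}(\.[{D_CHARS}]{{0,3}})?")
# Level 1 directory name: 1-8 d-characters, no extension.
LEVEL1_DIR = re.compile(rf"[{D_CHARS}]{{1,8}}")
# a-characters: the d-characters plus the punctuation listed above.
A_CHARS = set("ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_" + " ,;:.?!()'\"<>%&=*+-/")


def is_level1_filename(name: str) -> bool:
    """True if `name` (without a ';version' suffix) is a legal Level 1 filename."""
    return LEVEL1_FILE.fullmatch(name) is not None


def is_level1_dirname(name: str) -> bool:
    """True if `name` is a legal Level 1 directory name."""
    return LEVEL1_DIR.fullmatch(name) is not None


def is_a_string(text: str) -> bool:
    """True if every character of `text` is an a-character."""
    return all(ch in A_CHARS for ch in text)


print(is_level1_filename("README.TXT"))        # True
print(is_level1_filename("LongFileName.txt"))  # False
print(is_a_string("MY DISC (VOL. 1)"))         # True
```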
ISO 9660 Volume Descriptors are optional
information fields recorded at the beginning of the data area on the
disc. Volume Descriptors were originally intended for use by CD
publishers, but may be used by anyone who creates an ISO 9660 disc,
assuming the mastering software supports assigning ISO Volume
Descriptors (some don't, or support only some of the
available volume descriptors). ISO 9660 Volume Descriptors include
the following, with allowable sizes in parentheses:
- System Name
-
The operating system for which the disc is intended. (0 to 32
a-characters)
- Volume Name
-
The disc name, displayed by the OS when the disc is mounted. (0 to 32
a-characters)
- Volume Set Name
-
Used in multi-disc sets to assign a common group name to each disc in
the set. (0 to 128 d-characters)
- Publisher's Name
-
The publisher of the disc. (0 to 128 a-characters)
- Data Preparer's Name
-
The author of the disc content. (0 to 128 a-characters)
- Application Name
-
The name of the program, if any, needed to access data on the disc.
(0 to 128 a-characters)
- Copyright File Name
-
Points to a file (which, if present, must reside in the root
directory of the disc) that contains copyright information. (Maximum
8.3 d-characters)
- Abstract File Name
-
Points to a file (which, if present, must reside in the root
directory of the disc) that contains text describing the contents of
the disc. (Maximum 8.3 d-characters)
- Bibliographic File Name
-
Points to a file (which, if present, may reside in any directory on
the disc) that contains bibliographic information, such as an
ISBN. (Maximum 8.3 d-characters)
- Date Fields
-
Four Volume Descriptor fields exist for dates: Creation Date;
Modification Date; Expiration Date; and Effective Date. Each of these
fields, if present, stores a date and time in the following format,
with size given in bytes in parentheses: Year (4); Month (2); Day
(2); Hour (2); Minute (2); Second (2); Hundredths of a second (2);
Timezone (1 byte, signed integer; specifies the offset from UTC in 15-minute
increments, from -48 West to +52 East).
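The date-field layout just described (sixteen digits followed by a one-byte signed timezone offset, 17 bytes in all) can be illustrated with a short Python sketch. It assumes the digit fields are stored as ASCII decimal characters. The function names are ours.

```python
import struct
from datetime import datetime


def encode_volume_date(moment: datetime, hundredths: int = 0, tz_quarters: int = 0) -> bytes:
    """Pack a date into 16 ASCII digits plus one signed timezone byte (17 bytes)."""
    if not 0 <= hundredths <= 99:
        raise ValueError("hundredths must be 0-99")
    if not -48 <= tz_quarters <= 52:
        raise ValueError("timezone offset must be -48 to +52 quarter-hours")
    digits = (
        f"{moment.year:04d}{moment.month:02d}{moment.day:02d}"
        f"{moment.hour:02d}{moment.minute:02d}{moment.second:02d}{hundredths:02d}"
    )
    return digits.encode("ascii") + struct.pack("b", tz_quarters)


def decode_volume_date(field: bytes) -> tuple[datetime, int, int]:
    """Unpack a 17-byte date field into (datetime, hundredths, tz_quarters)."""
    if len(field) != 17:
        raise ValueError("a volume date field is exactly 17 bytes")
    text = field[:16].decode("ascii")
    moment = datetime.strptime(text[:14], "%Y%m%d%H%M%S")
    (tz_quarters,) = struct.unpack("b", field[16:])
    return moment, int(text[14:16]), tz_quarters


# 15 January 2002, 13:45:00, five hours (20 quarter-hours) west of UTC
field = encode_volume_date(datetime(2002, 1, 15, 13, 45, 0), tz_quarters=-20)
print(len(field))                  # 17
print(decode_volume_date(field))
```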
11.2.2.2 ISO 9660 Variants
The very real limitations of ISO 9660 formatted discs gave rise to
several alternative formats, all of which were based on ISO 9660:
- Rock Ridge
-
The Rock Ridge format is an extension of the ISO
9660 format, intended for use on Unix systems, which have much more
liberal restrictions on the length of and characters used in
filenames and directory names, as well as the depth of directories.
Using Rock Ridge allows a CD to support long mixed-case filenames,
symbolic links, and other conventions common to Unix systems.
Although full Rock Ridge support is available only on Unix systems, a
system running MS-DOS, Windows, or the Mac OS can still access the
data on a Rock Ridge disc, but not the long filenames and other
extended information. The Rock Ridge standard is available at
ftp://ftp.ymi.com/pub/rockridge
if you want to learn more about it.
- Romeo
-
The Romeo format is an obsolete extension to ISO
9660, developed by Adaptec as a stopgap measure for early versions of
their Easy-CD premastering software. The raison
d'être for the cutely-named Romeo format
was that Windows NT 3.51 did not support the proprietary Microsoft
Joliet format, described below. Romeo supports filenames of up to 128
characters, including spaces. However, unlike Joliet, Romeo supports
neither the Unicode character set nor associated short (MS-DOS 8.3)
filenames. Romeo-formatted discs can be read under Windows NT 3.51
and 4.0, Windows 98/SE/Me, and Windows 2000/XP. Because there is no
associated short filename, Romeo-formatted discs cannot be read under
MS-DOS. Romeo-formatted discs can be read on a Macintosh to the
extent that they do not use filenames that exceed 31 characters. The
Romeo format was essentially overtaken by events, was seldom used
even when current, and is almost never encountered today.
- Joliet
-
Joliet is an extension of ISO 9660, developed by Microsoft to allow
CDs to support long filenames, the Unicode character set, and
associated short (MS-DOS 8.3) filenames. Joliet allows filenames up
to 64 characters, including spaces. When read on a system running
Windows 9X, Windows NT 4, Windows 2000, or recent releases of Linux,
a Joliet-formatted disc displays long filenames and directory names.
When read on a system running an operating system that does not
support Microsoft long filename standards, the Joliet-formatted disc
is recognized as a standard ISO 9660 disc. Full information about the
Joliet standard is available at http://www-plateau.cs.berkeley.edu/people/chaffee/jolspec.html.
Consider logical formatting issues carefully if you plan to use CD-R
premastering software to back up a hard disk that uses Windows
long filenames and long folder names. ISO formatting restrictions mean that
it's quite possible to have multiple subdirectories
in one directory (or multiple files in one directory) whose long
names are unambiguous, but whose truncated names are not. That means
you might be unable to copy all files to CD unless you are very
careful about using filenames and directory names that will truncate
to unambiguous short names.
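One way to spot such problems before you burn a disc is to check a list of names for collisions. The Python sketch below uses a deliberately naive truncation rule of our own (uppercase, replace illegal characters with underscores, cut to 8.3). Real premastering software may shorten names differently, so treat it only as an illustration of the problem.

```python
import re
from collections import defaultdict


def _clean(part: str) -> str:
    """Uppercase and replace anything that is not A-Z, 0-9, or '_' with '_'."""
    return re.sub(r"[^A-Z0-9_]", "_", part.upper())


def naive_short_name(long_name: str) -> str:
    """Hypothetical truncation rule: clean the name, then cut it to 8.3."""
    stem, dot, ext = long_name.rpartition(".")
    if not dot:  # no extension present
        stem, ext = long_name, ""
    short = _clean(stem)[:8]
    return f"{short}.{_clean(ext)[:3]}" if ext else short


def find_collisions(names: list[str]) -> dict[str, list[str]]:
    """Group long names by short name and keep only the ambiguous groups."""
    groups = defaultdict(list)
    for name in names:
        groups[naive_short_name(name)].append(name)
    return {short: longs for short, longs in groups.items() if len(longs) > 1}


files = ["Quarterly Report 2001.doc", "Quarterly Report 2002.doc", "Budget.xls"]
print(find_collisions(files))
# {'QUARTERL.DOC': ['Quarterly Report 2001.doc', 'Quarterly Report 2002.doc']}
```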
11.2.2.3 Universal Disk Format (UDF)
ISO 9660 and its variants were designed for duplicating or
premastering discs, but were never intended to allow incrementally
adding small amounts of data to a disc. Although ISO 9660 allows
adding data to a disc (until that disc has been closed), the only way
to do so is by opening a new session on that disc. That means that
writing even one new file incurs the overhead required for a new
session, which ranges from 13 MB to 22 MB.
In part to address these ISO 9660 limitations, OSTA defined a new
logical format for optical discs. The official designation of this
format is ISO 13346 but the common name is
Universal Disk Format
(UDF). UDF is an operating-system-independent logical formatting
standard that defines how data is written to various types of optical
discs, including CD-R, CD-RW, DVD-ROM, DVD-Video, and DVD-Audio. UDF
uses a redesigned directory structure that allows small amounts of
data (called packets) to be written
incrementally and individually to disc without incurring the large
overhead associated with writing a new session under ISO 9660.
In effect, with UDF each packet is written as a subsession within a
standard session, incurring the standard session overhead only when
that standard session is closed. Packet-writing software typically
closes the session automatically when the disc is ejected using the
eject feature of the software. As with ISO 9660, an open session on a
UDF-formatted disc can be read only by a CD recorder. Closing the
session allows the disc to be read by a standard CD-ROM drive or CD
player. It's possible, however, to subsequently open
a new session and add additional packet data to the disc.
In addition to session overhead, UDF addresses another issue that
makes ISO 9660 completely inappropriate for packet writing. ISO 9660
must know, in advance, exactly which files are to be written during a
session. It uses this information to create and write the
Path Tables and Primary Volume
Descriptors, which point to the physical locations of the
files on disc. Because packet writing allows any arbitrarily selected
file to be written to disc at any time, the information that ISO 9660
requires is not available before the write occurs.
UDF solves this problem by accumulating data about the physical
locations of files as they are written. At the end of a
packet-writing session, UDF consolidates these location pointers and
writes them to disc as the Virtual Allocation
Table (VAT). The VAT address of a file remains the same,
even if it is overwritten. At the end of each packet-writing session,
UDF creates a new VAT that includes not just the pointers for newly
created or modified files, but also the pointers stored in the old
VAT. That means the current VAT always includes pointers to every
file that has been written to the disc since it was originally
formatted.
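The idea behind the VAT can be shown with a toy model in Python. This is a conceptual sketch only. The class and method names are ours, block numbers stand in for physical addresses, and nothing here reflects the actual on-disc UDF structures.

```python
class PacketDisc:
    """Toy model of a write-once disc that tracks files through a VAT."""

    def __init__(self) -> None:
        self.next_free_block = 0                  # blocks are only ever appended
        self.vat: list[int] = []                  # index = virtual address, value = physical block
        self.vat_history: list[list[int]] = []    # one VAT per finished session

    def _append(self, size_blocks: int) -> int:
        """Reserve the next `size_blocks` blocks and return the starting block."""
        start = self.next_free_block
        self.next_free_block += size_blocks
        return start

    def write_new(self, size_blocks: int) -> int:
        """Write a new file and return its virtual address."""
        self.vat.append(self._append(size_blocks))
        return len(self.vat) - 1

    def overwrite(self, virtual_address: int, size_blocks: int) -> None:
        """Write a replacement copy; only the physical pointer changes."""
        self.vat[virtual_address] = self._append(size_blocks)

    def end_session(self) -> list[int]:
        """Record a new VAT that carries forward every earlier pointer."""
        snapshot = list(self.vat)
        self.vat_history.append(snapshot)
        return snapshot


disc = PacketDisc()
report = disc.write_new(10)    # virtual address 0, physical block 0
photo = disc.write_new(25)     # virtual address 1, physical block 10
print(disc.end_session())      # [0, 10]
disc.overwrite(report, 12)     # new copy lands at physical block 35
print(disc.end_session())      # [35, 10]
```

The file's virtual address (0) never changes. Only the physical block recorded for it in the newest VAT does, and the second VAT still carries the untouched pointer for the other file.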
The advantages of packet writing come at the cost of reduced
capacity. A typical CD-R/RW disc stores about 650 MB with ISO 9660
formatting, but stores only about 500 MB with UDF formatting. About
100 MB of that reduced capacity is accounted for by the complex UDF
directory and control structures that allow data to be added and
deleted incrementally. The remaining 50 MB or so is used to implement
various measures to distribute wear evenly across the CD-RW disc,
preventing some areas from being overused and thereby rendered
unwritable while other areas remain lightly used.
Two versions of UDF are in common use:
- UDF 1.02
-
UDF 1.02 was adopted in August 1996, and is the finalized version of
the October 1995 UDF 1.0 specification. UDF 1.02 specifies standards
for DVD and DVD-ROM, but does not support writable optical media.
Windows NT 4, Windows 98/SE/Me, and Windows 2000 include native UDF
1.02 support, which allows them to access DVD video and DVD-ROM discs
natively.
- UDF 1.5
-
UDF 1.5 was adopted in February 1997, and addresses the requirements
of sequential recorded media, including CD-R, CD-RW, and DVD-RAM. UDF
1.5 adds the Virtual Allocation Table (VAT),
which is analogous to the DOS File Allocation Table, and, optionally,
the Sparing Table, which allows bad sectors to
be marked as unusable and replaced by spare sectors. Windows 2000
includes native UDF 1.5 support, but Windows NT and Windows 9X do
not. You can download UDF 1.5 reader software for these versions of
Windows from http://www.adaptec.com/products/overview/udfreaders.html.
The UDF 2.0 and 2.01 specifications are available, but not yet
commonly used in commercial products. For more information about UDF,
see http://www.osta.org.