16.5 Should You Use XHTML?
For a document author used to HTML, XHTML is clearly a more painful and
less forgiving document markup language. Whereas at one time we
prided ourselves on being able to crank out HTML with pencil and
paper, it's much more tedious to write XHTML without
special document-preparation applications. Why should any author want
to take on that extra baggage?
16.5.1 The Dusty Deck Problem
Over just a few years, the Web has been filled with billions of
pages. It is a safe bet that many of these pages are not compliant
with any defined version of HTML. It is an even safer bet that the
vast majority of these pages are not XHTML-compliant.
The harsh reality is that these billions of pages will never be
converted to XHTML. Who has the time to go back, root out these old
pages, and tweak them to make them XHTML-compliant — especially
when the end result, as perceived by the user, will not change? Like
the dusty decks of COBOL programs that lay unchanged for decades
before Y2K forced programmers to bring them up to snuff, these dusty
decks of web pages will also lie untouched until a similarly dramatic
event forces us to update them.
However, the dusty deck problem is no excuse for not writing
compliant documents going forward. Leave those old documents alone,
but don't create a new conversion problem every time
you create a new document. A little effort now will help your
documents work across a wider range of browsers in the future.
16.5.2 Automatic Conversion
If your sense of responsibility leads you to undertake the conversion
of your existing HTML documents into XHTML, you'll find a utility
named Tidy to be exceptionally useful. Written by Dave Raggett, one of
the movers and shakers at the W3C, it automates a significant amount
of the work required to convert HTML documents into XHTML.
While Tidy's capabilities are too varied and
wonderful to be fully listed here, we can at least assure you that
case conversion, quoted attributes, and proper element nesting are
all detected and corrected by Tidy. For the complete list of features
and the latest version of Tidy for various computing platforms, visit
http://tidy.sourceforge.net.
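To see why such corrections matter, it can help to run a fragment through a strict XML parser before and after the fixes. The following is a minimal sketch, not Tidy itself: the helper name is_well_formed and both sample fragments are our own illustrations, and the check uses Python's standard xml.etree.ElementTree module. It tests only XML well-formedness, which is necessary but not sufficient for XHTML compliance; uppercase element names, for instance, would pass this check yet fail validation against the XHTML DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as well-formed XML.

    Hypothetical helper for illustration; it does not validate
    against the XHTML DTD.
    """
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Typical legacy HTML (illustrative): an unquoted attribute value,
# improperly nested <B>/<I>, and a <P> that is never closed.
legacy_html = '<P ALIGN=center><B><I>Hello</B></I>'

# The same content after case conversion, attribute quoting,
# and nesting repair.
xhtml = '<p align="center"><b><i>Hello</i></b></p>'

print(is_well_formed(legacy_html))  # False
print(is_well_formed(xhtml))        # True
```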
16.5.3 Lenient Browsers and Lazy Authors
There is a good rule of thumb regarding data
sharing, especially on the Internet: be lenient in what you accept
and strict in what you produce. This is not a commentary on social
policy, but rather a pragmatic admonition to tolerate ambiguity and
errors in data you receive while making sure that anything you send
is scrupulously correct.
Web browsers are good examples of lenient acceptors. Most current web
pages have some sort of error in them, albeit often just an error of
omission. Nonetheless, browsers accept the error and present a
reasonable document to the user. This leniency lets authors get away
with all sorts of things, often without even knowing
they've made a mistake.
Most authors stop developing a page when it looks good and works the
way they want it to. Very few take the time to run their pages
through the various HTML-compliance tools to catch potential errors.
Many of those who do try to test for compliance are so overwhelmed by
the number of minor errors they have committed that they simply give
up and continue to create bad pages that can be handled by good
browsers.
Since the number of bad pages continues to grow, browsers cannot
afford to start being strict. Any browser that tried to enforce even
the most basic rules of the HTML standard would be abandoned by users
who want to see web pages, not error messages. A vicious cycle
ensues: bad pages force the use of lenient browsers, which encourage
the creation of more bad pages. Break the cycle by vowing to create
only XHTML-compliant content whenever you can.
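The difference between a lenient acceptor and a strict one is easy to demonstrate. The sketch below assumes Python's standard library as a stand-in: html.parser.HTMLParser plays the forgiving browser, and xml.etree.ElementTree plays a strict XML processor. The class TagCollector, the two helper functions, and the sample markup are hypothetical names chosen for this illustration; real browsers recover from errors in far more elaborate ways.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class TagCollector(HTMLParser):
    """Lenient acceptor: records start tags and never rejects input."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag names in lowercase.
        self.tags.append(tag)


def lenient_tags(markup):
    """Return the start tags a forgiving parser sees in the markup."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


def strict_accepts(markup):
    """Return True only if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


# Illustrative sloppy page fragment: uppercase names, unclosed <LI> tags.
sloppy = "<UL><LI>one<LI>two</UL>"

print(lenient_tags(sloppy))    # ['ul', 'li', 'li']
print(strict_accepts(sloppy))  # False
```

The lenient parser quietly makes sense of the fragment, so its author never learns anything is wrong; the strict parser refuses it outright.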
16.5.4 Time, Money, and Standards
XHTML was developed as an XML representation of the HTML standard. It
is intended, going forward, to become the single standard everyone
should use to create content for the Web.
In a perfect world, standards are universally adopted and used. Full
compliance is required of any document before it is placed on the
Web. Conversion of legacy documents is done immediately.
In the real world, a shortage of time and money prevents the
universal use of standards. Under pressure to quickly deliver
something that works, developers turn out pages that work just well
enough. Since browsers allow second-rate content to exist on the Web,
the need to comply with a standard becomes a secondary
issue — one that is too quickly ignored in the dizzying pace of
web development.
16.5.5 Man Versus Machine
All is not lost, however. While
XHTML is painful and tedious for
humans to create, it is quite easy for machines to create. The number
of web-authoring tools continues to increase, and the pages created
by these machines should be completely XHTML-compliant. While it
doesn't make much economic sense for a web author to
spend a lot of time getting all those end tags in the right spot, it
does make sense for the programmer developing an authoring tool to
ensure that the tool generates all those correct end tags. The effort
expended by the web author is leveraged exactly once for each page;
the effort of the tool creator is leveraged over and over, each time
the tool produces a new page.
It seems that the real future of XHTML lies in the realm of
machine-generated content. XHTML is far too picky to be successfully
used by the millions of casual web authors who create small sites.
However, if those same authors use a tool to create their pages, they
could be generating XHTML-compliant pages and never even know it.
If you are among that small community of developers who create tools
that generate HTML output, you are doing a great disservice to your
many potential customers if your tool does not generate
excruciatingly correct XHTML-compliant output. There is no technical
excuse for any tool not to generate XHTML-compliant output. If there
are compatibility issues surrounding how the output might be used
(with a non-XHTML browser, perhaps), the tool should provide a switch
that lets the author select XHTML-compliant output as an option.
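As a rough illustration of how little it costs a tool to get this right, here is a minimal sketch of a page generator with such a switch. It assumes Python's standard xml.etree.ElementTree module; the function render_page, its xhtml parameter, and the sample content are hypothetical, and a production tool would need to handle far more of the language. Because the serializer, not the author, writes every tag, end tags and escaped characters come out correctly on every page.

```python
import xml.etree.ElementTree as ET

# DOCTYPE and namespace for XHTML 1.0 Strict.
XHTML_DOCTYPE = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'
)
XHTML_NS = "http://www.w3.org/1999/xhtml"


def render_page(title, paragraphs, xhtml=True):
    """Build a simple page; the xhtml flag is the output switch."""
    html = ET.Element("html")
    if xhtml:
        html.set("xmlns", XHTML_NS)
    head = ET.SubElement(html, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(html, "body")
    for text in paragraphs:
        # The serializer escapes &, <, and > in text for us.
        ET.SubElement(body, "p").text = text
        ET.SubElement(body, "hr")  # empty element
    if xhtml:
        markup = ET.tostring(html, encoding="unicode", method="xml")
        return XHTML_DOCTYPE + "\n" + markup
    # Older-style HTML serialization for non-XHTML consumers.
    return ET.tostring(html, encoding="unicode", method="html")


print(render_page("Demo", ["Fish & chips"], xhtml=True))
print(render_page("Demo", ["Fish & chips"], xhtml=False))
```

With the switch on, the empty element is written as `<hr />` and the result can be read back by any XML parser; with it off, the same element is written as `<hr>` in the older HTML style.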
16.5.6 What to Do?
We recommend that all HTML authors take the time to absorb the
differences between HTML and XHTML outlined in this chapter. Given
the resources and opportunity, you should try to create
XHTML-compliant pages wherever possible for the sites you are
creating. Certainly you should choose authoring tools that support
XHTML and give you the option of generating XHTML-compliant pages.
One day, XHTML may replace HTML as the official standard language of
the Web. Even so, the number of noncompliant pages on the Web is
overwhelming, forcing browsers to honor old HTML constructs and
features for at least the next five years. For better or worse, HTML
is here to stay as the de facto standard for web
authors for years to come.