6.1 How RDF Vocabularies Differ from XML Vocabularies
RDF is a way of recording information about resources. RDF
serialized as XML is a way of recording information about a specific
business domain using a set of elements that follow the rules of the
RDF data model (the graph) and the constraints of the RDF syntax,
vocabulary, and semantics.
RDF recorded in XML is a very powerful
tool. It has been used to document events within
a heterogeneous application environment, to describe publications, to
record an environmental thesaurus, and so on. By using XML, you have
access to a great number of existing XML applications such as parsers
and APIs, even relational and Lightweight Directory Access Protocol
(LDAP) data sources that are XML-capable. However, what do you get
when you use RDF? Why not use XML directly?
As mentioned in previous chapters, RDF adds to XML the same kind of
functionality that the relational data model adds to commercial
database systems. RDF provides a predefined grammar that can be used
to consistently record business domain information in such a way that
any business domain can have a vocabulary in RDF that can be
processed with a host of RDF-based tools and frameworks.
Consider the environmental thesaurus I just mentioned. This is a
joint effort between the California Environmental Resource Evaluation
System (CERES) and the National Biological Information Infrastructure
(NBII). This partnership was formed to create a common environmental
vocabulary and the tools necessary to work with this vocabulary. One
of the efforts of this project is to document this vocabulary using
RDF.
Within the RDF vocabulary, the project has defined a class called
Term that has several properties, such as Source, Category, and
Status, attached to it. Instead of using RDF, the project could have
recorded this information directly in XML. Had it done so, however,
it would first have had to define the concepts of "class" and
"property" in order to record relationships such as "Source is a
property of Term." The project would also have had to write code
that processes the Source element as a property of Term rather than
as an arbitrary related element that happens to be nested within the
Term element. Lastly, the group would have needed to create a schema
to support these new objects so that the XML document matches the
constraints documented in this schema.
For this last requirement, a Document Type Definition (DTD) file
won't work, as DTDs primarily control the nesting and frequency of
occurrence of elements. XML Schema won't work either, as it is
concerned more with data types and other constraints than with the
metalanguage notions of "class" and "property." RELAX NG is more
easily processed than either of those, but again it solves a
different problem.
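The Term and Source example can be made concrete with a small Python sketch. It is an illustration only, not the CERES/NBII project's actual vocabulary: the http://example.org/thesaurus# namespace, the sample values, and the properties_of helper are assumptions made for this example. The rdf: and rdfs: URIs are the standard ones. The first half shows that a plain XML parser reports nesting and nothing more. The second half shows the same facts stated as RDF triples, where "Source is a property of Term" is itself data that generic code can query.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
EX = "http://example.org/thesaurus#"  # hypothetical namespace for this sketch

# Plain XML: the parser only knows that Source and Status are nested in Term.
# Whether they are "properties" of Term is left to application code.
plain_xml = "<Term><Source>CERES</Source><Status>approved</Status></Term>"
root = ET.fromstring(plain_xml)
nested_tags = [child.tag for child in root]

# RDF: the vocabulary is a set of (subject, predicate, object) statements.
schema = [
    (EX + "Term", RDF + "type", RDFS + "Class"),
    (EX + "Source", RDF + "type", RDF + "Property"),
    (EX + "Source", RDFS + "domain", EX + "Term"),
    (EX + "Category", RDF + "type", RDF + "Property"),
    (EX + "Category", RDFS + "domain", EX + "Term"),
    (EX + "Status", RDF + "type", RDF + "Property"),
    (EX + "Status", RDFS + "domain", EX + "Term"),
]


def properties_of(class_uri, triples):
    """Return the declared properties whose rdfs:domain is class_uri."""
    declared = {s for (s, p, o) in triples
                if p == RDF + "type" and o == RDF + "Property"}
    return sorted(s for (s, p, o) in triples
                  if p == RDFS + "domain" and o == class_uri and s in declared)


print(nested_tags)
print(properties_of(EX + "Term", schema))
```

Nothing in properties_of is specific to the thesaurus. The same function would work against any vocabulary expressed with rdfs:domain, which is the kind of reuse a shared metalanguage makes possible.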
Just as you can use XML to serialize the contents of a relational
database, you can use XML to serialize the contents of an RDF-based
model. XML isn't a replacement for RDF, however, because XML
is nothing more than a syntax. You need a metalanguage vocabulary to
be able to use XML to record business domain information in such a
way that any business can be documented, and RDF provides this
capability.
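To illustrate the point that XML is only the syntax, the following sketch writes a small set of triples out as RDF/XML and reads them back with Python's standard ElementTree module. The triples, the example.org URIs, and the to_rdfxml and from_rdfxml helpers are assumptions made for this illustration. The code handles only the simplest RDF/XML form (rdf:Description elements with literal-valued properties), so it is no substitute for a real RDF parser.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX = "http://example.org/thesaurus#"  # hypothetical vocabulary namespace

# Ask ElementTree to use readable prefixes when serializing.
ET.register_namespace("rdf", RDF)
ET.register_namespace("ex", EX)

# The model: (subject, predicate, literal value) statements.
triples = [
    ("http://example.org/terms/wetland", EX + "Source", "CERES"),
    ("http://example.org/terms/wetland", EX + "Status", "approved"),
]


def to_rdfxml(statements):
    """Serialize literal-valued triples as a minimal RDF/XML string."""
    root = ET.Element("{%s}RDF" % RDF)
    nodes = {}
    for subject, predicate, value in statements:
        if subject not in nodes:
            nodes[subject] = ET.SubElement(
                root, "{%s}Description" % RDF, {"{%s}about" % RDF: subject})
        # Split the predicate URI into namespace and local name at '#'.
        namespace, local = predicate.rsplit("#", 1)
        prop = ET.SubElement(nodes[subject], "{%s#}%s" % (namespace, local))
        prop.text = value
    return ET.tostring(root, encoding="unicode")


def from_rdfxml(text):
    """Read the minimal RDF/XML produced by to_rdfxml back into triples."""
    statements = []
    for node in ET.fromstring(text):
        subject = node.get("{%s}about" % RDF)
        for prop in node:
            # ElementTree tags look like '{namespace}local'; rebuild the URI.
            predicate = prop.tag[1:].replace("}", "", 1)
            statements.append((subject, predicate, prop.text))
    return statements


xml_text = to_rdfxml(triples)
print(xml_text)
print(from_rdfxml(xml_text) == triples)
```

The XML string is just one way of writing the statements down. The model is the list of triples, and it survives the round trip unchanged.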
However, don't take my word for it; try it yourself
in the next several sections when you have a chance to see how a
vocabulary is created.