
6.1 How RDF Vocabularies Differ from XML Vocabularies

RDF is a way of recording information about resources. RDF serialized as XML is a way of recording information about a specific business domain, using a set of elements defined within the rules of the RDF data model (the graph) and the constraints of the RDF syntax, vocabulary, and semantics.

RDF recorded in XML is a very powerful tool. It has been used to document events within a heterogeneous application environment, to describe publications, to record an environmental thesaurus, and so on. By using XML, you have access to a great number of existing XML applications such as parsers and APIs, and even to relational and Lightweight Directory Access Protocol (LDAP) data sources that are XML-capable. However, what do you get when you use RDF? Why not use XML directly?

As mentioned in previous chapters, RDF adds to XML the same kind of functionality that the relational data model adds to commercial database systems. RDF provides a predefined grammar for recording business domain information consistently, so that any business domain can have an RDF vocabulary that can be processed with a host of RDF-based tools and frameworks.

Consider the environmental thesaurus I just mentioned. This is a joint effort between the California Environmental Resource Evaluation System (CERES) and the National Biological Information Infrastructure (NBII). This partnership was formed to create a common environmental vocabulary and the tools necessary to work with this vocabulary. One of the efforts of this project is to document this vocabulary using RDF.

Within the RDF vocabulary, the project has defined a class called Term that has several properties attached to it, such as Source, Category, and Status. Instead of using RDF, the project could have recorded this information directly in XML. Had it done so, however, it would have had to define the concepts of "class" and "property" in order to record relationships such as "Source is a property of Term." The project would also have had to write code that processes the Source element as a property of Term rather than as an arbitrary related element that happens to be nested within the Term element. Lastly, the group would have needed to create a schema to support these new objects, so that each XML document could be checked against the constraints documented in that schema.
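To make the contrast concrete, here is a minimal sketch in Python that uses only the standard library. It is not the thesaurus project's actual vocabulary: the XML fragment, the `http://example.org/thesaurus#` namespace, and the helper names `nesting_pairs` and `properties_of` are all assumptions made for illustration. Only the RDF and RDF Schema terms (`rdf:type`, `rdf:Property`, `rdfs:Class`, `rdfs:domain`) are real.

```python
import xml.etree.ElementTree as ET

# Hypothetical plain-XML record; the element names follow the Term example,
# the values are placeholders.
PLAIN_XML = """
<Term>
  <Source>placeholder source</Source>
  <Category>placeholder category</Category>
  <Status>placeholder status</Status>
</Term>
"""


def nesting_pairs(xml_text):
    """Return (parent, child) tag pairs, which is all a generic parser knows."""
    root = ET.fromstring(xml_text)
    return [(parent.tag, child.tag) for parent in root.iter() for child in parent]


# Real RDF and RDF Schema namespaces.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
# Hypothetical namespace for the vocabulary (an assumption, not the project's).
EX = "http://example.org/thesaurus#"

# The same vocabulary stated as (subject, predicate, object) triples.
SCHEMA_TRIPLES = [(EX + "Term", RDF + "type", RDFS + "Class")]
for name in ("Source", "Category", "Status"):
    SCHEMA_TRIPLES.append((EX + name, RDF + "type", RDF + "Property"))
    SCHEMA_TRIPLES.append((EX + name, RDFS + "domain", EX + "Term"))


def properties_of(class_uri, triples):
    """Return the properties whose rdfs:domain is the given class."""
    return sorted(s for s, p, o in triples if p == RDFS + "domain" and o == class_uri)


if __name__ == "__main__":
    print(nesting_pairs(PLAIN_XML))
    print(properties_of(EX + "Term", SCHEMA_TRIPLES))
```

The parser reports only that Source, Category, and Status are nested inside Term; nothing says whether they are properties, parts, or unrelated children. In the triple form, "Source is a property of Term" is an explicit statement made with predefined terms, so generic code such as `properties_of` can answer the question without knowing anything about the thesaurus itself.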

For this last requirement, a Document Type Definition (DTD) file won't work, as DTDs primarily control the nesting and frequency of occurrence of elements. XML Schema won't work either, as it is concerned more with data types and other constraints than with the metalanguage nature of "class" and "property." RELAX NG is more easily processed than either of those, but again it is solving a different problem.

Just as you can use XML to serialize the contents of a relational database, you can use XML to serialize the contents of an RDF-based model. XML isn't a replacement for RDF, though, because XML is nothing more than a syntax. You need a metalanguage vocabulary to be able to use XML to record business domain information in such a way that any business can be documented, and RDF provides this capability.
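The division of labor between model and syntax can be sketched in a few lines of Python using only the standard library. The statements are held as plain (subject, predicate, literal) tuples and written out as RDF/XML with `rdf:RDF`, `rdf:Description`, and `rdf:about`, which are real RDF/XML constructs. Everything else is an assumption for illustration: the `http://example.org/thesaurus#` namespace, the resource `term1`, the placeholder value, and the function names `to_rdf_xml` and `from_rdf_xml`. The sketch handles only literal-valued properties in namespaces that end in `#`; it is not a general RDF/XML serializer or parser.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX_NS = "http://example.org/thesaurus#"  # hypothetical vocabulary namespace


def to_rdf_xml(statements):
    """Serialize (subject, predicate, literal) statements as RDF/XML text.

    Assumes every predicate URI has the form <namespace>#<local name>.
    """
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("ex", EX_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    for subject, predicate, literal in statements:
        # One rdf:Description per statement, identified by rdf:about.
        desc = ET.SubElement(
            root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": subject}
        )
        namespace, local = predicate.rsplit("#", 1)
        prop = ET.SubElement(desc, f"{{{namespace}#}}{local}")
        prop.text = literal
    return ET.tostring(root, encoding="unicode")


def from_rdf_xml(xml_text):
    """Read the statements back from text produced by to_rdf_xml."""
    root = ET.fromstring(xml_text)
    statements = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree tags look like "{namespace}local".
            namespace, local = prop.tag[1:].split("}", 1)
            statements.append((subject, namespace + local, prop.text))
    return statements


if __name__ == "__main__":
    model = [(EX_NS + "term1", EX_NS + "Status", "placeholder status")]
    text = to_rdf_xml(model)
    print(text)
    print(from_rdf_xml(text) == model)
```

The statements exist independently of the XML: the same tuples come back after a round trip, and they could just as well have been written in some other syntax. What gives them meaning is the RDF model that defines subject, predicate, and object, not the angle brackets.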

However, don't take my word for it; try it yourself in the next several sections when you have a chance to see how a vocabulary is created.
