8.2 Creating and Serializing an RDF Model

Automating the creation of an RDF/XML document is fairly simple, but you first have to understand how your RDF triples relate to one another. One approach to using Jena to generate RDF/XML for a particular vocabulary is to create one or more prototype documents of the vocabulary and run them through the RDF Validator. Once the RDF/XML validates, parse it into N-Triples, and use these to build an application that can generate instances of a model of a given vocabulary, each using different data.

For the purposes of this chapter, I'm using Example 6-6 from Chapter 6 for a demonstration. This particular document, duplicated in this chapter's source, records the history and status of an article from one of my web sites. It makes a good example because it demonstrates the relationships that can appear within the PostCon vocabulary, and therefore makes a fine prototype for an application that builds new PostCon RDF/XML documents.
8.2.1 Very Quick Simple Look

At its simplest, you can create an RDF model, create a single resource, add a couple of properties, and then serialize it, all with just a few lines of code. So to get started, we'll do just that.

In Example 8-1, a new model is created, with the resource and one predicate repeated with two different objects. To create this model, an in-memory model is instantiated first, followed by an instance of an RDF resource using the Jena Resource class. A single Property instance is then created and attached to the resource twice using addProperty, each time with a different value, forming two complete RDF statements. The first parameter in the addProperty method is the Property instance, the second the actual property value. Once the model is built, it's written to standard output through a java.io.PrintWriter. For now, the values used within the model are all hardcoded into the application.

Example 8-1. Creating an RDF model with two statements, serialized to RDF/XML

    import com.hp.hpl.mesa.rdf.jena.mem.ModelMem;
    import com.hp.hpl.mesa.rdf.jena.model.*;
    import com.hp.hpl.mesa.rdf.jena.common.PropertyImpl;
    import java.io.FileOutputStream;
    import java.io.PrintWriter;

    public class pracRdfFirst extends Object {

        public static void main (String args[]) {
            String sURI = "http://burningbird.net/articles/monsters1.htm";
            String sPostcon = "http://www.burningbird.net/postcon/elements/1.0/";
            String sRelated = "related";
            try {
                // Create an empty graph
                Model model = new ModelMem( );

                // Create the resource
                Resource postcon = model.createResource(sURI);

                // Create the predicate (property)
                Property related = model.createProperty(sPostcon, sRelated);

                // Add the properties with associated values (objects)
                postcon.addProperty(related, "http://burningbird.net/articles/monsters3.htm");
                postcon.addProperty(related, "http://burningbird.net/articles/monsters2.htm");

                // Print RDF/XML of model to system output
                model.write(new PrintWriter(System.out));

            } catch (Exception e) {
                System.out.println("Failed: " + e);
            }
        }
    }
Once compiled, running the application results in the following output:

    <rdf:RDF
      xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
      xmlns:NS0='http://www.burningbird.net/postcon/elements/1.0/'
     >
      <rdf:Description rdf:about='http://burningbird.net/articles/monsters1.htm'>
        <NS0:related>http://burningbird.net/articles/monsters3.htm</NS0:related>
        <NS0:related>http://burningbird.net/articles/monsters2.htm</NS0:related>
      </rdf:Description>
    </rdf:RDF>

The generated RDF validates within the RDF Validator, producing the graph shown in Figure 8-1.

Figure 8-1. RDF model with one resource and two statements

At this point, we can continue creating and adding properties to the model directly in the application. However, the problem with creating the Property and Resource objects directly in the application that builds the models is that you have to duplicate this functionality across all applications that want to use the vocabulary. Not only is this inefficient, it adds to the overall size and complexity of an application. A better approach would be one the Jena developers demonstrated when they built their vocabulary objects: using a Java wrapper class.
8.2.2 Encapsulating the Vocabulary in a Java Wrapper Class

If you look in the source code directory of your Jena installation, you'll find several Java classes in the vocabulary directory, /com/hp/hpl/mesa/rdf/jena/vocabulary. These classes wrap Dublin Core (DC) RDF, VCARD RDF, and so on. By using a wrapper class for the properties and resources of your RDF vocabulary, you define all aspects of the vocabulary in one spot, an approach that simplifies both implementation and maintenance.
In this section, we'll create a vocabulary class for PostCon, using the existing Jena vocabulary wrapper classes as a template. The PostCon wrapper class consists of a set of static strings holding property or resource labels and a set of associated RDF properties, as shown in Example 8-2. As complex as the example RDF file is, you may be surprised by how few entries there are in this class; PostCon makes extensive use of other RDF vocabularies for much of its data collection, including Dublin Core, which has a predefined vocabulary wrapper class included with Jena (DC.java).

Example 8-2. POSTCON vocabulary wrapper class

    package com.burningbird.postcon.vocabulary;

    import com.hp.hpl.mesa.rdf.jena.common.ErrorHelper;
    import com.hp.hpl.mesa.rdf.jena.common.PropertyImpl;
    import com.hp.hpl.mesa.rdf.jena.common.ResourceImpl;
    import com.hp.hpl.mesa.rdf.jena.model.Model;
    import com.hp.hpl.mesa.rdf.jena.model.Property;
    import com.hp.hpl.mesa.rdf.jena.model.Resource;
    import com.hp.hpl.mesa.rdf.jena.model.RDFException;

    public class POSTCON extends Object {

        // URI for vocabulary elements
        protected static final String uri = "http://burningbird.net/postcon/elements/1.0/";

        // Return URI for vocabulary elements
        public static String getURI( ) {
            return uri;
        }

        // Define the property labels and objects
        static final String nbio = "bio";
        public static Property bio = null;
        static final String nrelevancy = "relevancy";
        public static Property relevancy = null;
        static final String npresentation = "presentation";
        // Declared as Property to match the PropertyImpl assigned below
        public static Property presentation = null;
        static final String nhistory = "history";
        public static Property history = null;
        static final String nmovementtype = "movementType";
        public static Property movementtype = null;
        static final String nreason = "reason";
        public static Property reason = null;
        static final String nstatus = "currentStatus";
        public static Property status = null;
        static final String nrelated = "related";
        public static Property related = null;
        static final String ntype = "type";
        public static Property type = null;
        static final String nrequires = "requires";
        public static Property requires = null;

        // Instantiate the properties and the resource
        static {
            try {
                // Instantiate the properties
                bio = new PropertyImpl(uri, nbio);
                relevancy = new PropertyImpl(uri, nrelevancy);
                presentation = new PropertyImpl(uri, npresentation);
                history = new PropertyImpl(uri, nhistory);
                related = new PropertyImpl(uri, nrelated);
                type = new PropertyImpl(uri, ntype);
                requires = new PropertyImpl(uri, nrequires);
                movementtype = new PropertyImpl(uri, nmovementtype);
                reason = new PropertyImpl(uri, nreason);
                status = new PropertyImpl(uri, nstatus);
            } catch (RDFException e) {
                ErrorHelper.logInternalError("POSTCON", 1, e);
            }
        }
    }

At the top of the example code, after the package and import declarations, is a static string holding the URI of the PostCon element vocabulary and a method to return it. Following these is a list of declarations for each property, pairing a Property object with its associated label.
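The wrapper works because every property URI is simply the namespace URI followed by a local label, so the concatenation only has to happen in one place. The following is a minimal plain-Java sketch of that idea; it uses no Jena classes, and the class name PostconNames and its fullUris method are hypothetical, chosen only to illustrate how a few of the labels in Example 8-2 expand to full URIs.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a vocabulary wrapper; no Jena classes are used.
public class PostconNames {

    // Namespace URI, copied from Example 8-2
    public static final String URI = "http://burningbird.net/postcon/elements/1.0/";

    // A few of the labels defined in Example 8-2
    public static final String[] LABELS = { "bio", "related", "history", "movementType" };

    // Map each label to its full property URI (namespace + label)
    public static Map<String, String> fullUris() {
        Map<String, String> uris = new LinkedHashMap<>();
        for (String label : LABELS) {
            uris.put(label, URI + label);
        }
        return uris;
    }

    public static void main(String[] args) {
        // Print one "label -> URI" pair per line
        fullUris().forEach((label, uri) -> System.out.println(label + " -> " + uri));
    }
}
```

If the namespace ever changes, only the URI constant needs to be edited, which is the same maintenance benefit the Jena-based wrapper provides.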
Once the properties are defined in the code, they are instantiated, and the file is saved and compiled. To import this class, use the following in your Java applications:

    import com.burningbird.postcon.vocabulary.POSTCON;

At this point, the PostCon vocabulary wrapper class is ready for use. Next, we'll rewrite the application in Example 8-1, this time using the POSTCON wrapper class, as shown in Example 8-3. In addition, we'll cascade the addProperty calls directly onto the call that creates the resource (createResource), to keep the code compact and to show a more direct connection between the resource and its properties.

Example 8-3. Using wrapper class to add properties to resource

    import com.hp.hpl.mesa.rdf.jena.mem.ModelMem;
    import com.hp.hpl.mesa.rdf.jena.model.*;
    import com.hp.hpl.mesa.rdf.jena.vocabulary.*;
    import com.burningbird.postcon.vocabulary.POSTCON;
    import java.io.FileOutputStream;
    import java.io.PrintWriter;

    public class pracRDFSecond extends Object {

        public static void main (String args[]) {

            // Resource names
            String sResource = "http://burningbird.net/articles/monsters1.htm";
            String sRelResource1 = "http://burningbird.net/articles/monsters2.htm";
            String sRelResource2 = "http://burningbird.net/articles/monsters3.htm";
            try {
                // Create an empty graph
                Model model = new ModelMem( );

                // Create the resource
                // and add the properties cascading style
                Resource article
                    = model.createResource(sResource)
                           .addProperty(POSTCON.related, model.createResource(sRelResource1))
                           .addProperty(POSTCON.related, model.createResource(sRelResource2));

                // Print RDF/XML of model to system output
                model.write(new PrintWriter(System.out));

            } catch (Exception e) {
                System.out.println("Failed: " + e);
            }
        }
    }

As you can see, using the wrapper class simplifies the code considerably. The new application is saved, compiled, and run. The output from this application is shown in Example 8-4.
Again, running it through the RDF Validator confirms that the serialized RDF/XML is valid and represents the model correctly.

Example 8-4. Generated RDF/XML from serialized PostCon submodel

    <rdf:RDF
      xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
      xmlns:NS0='http://burningbird.net/postcon/elements/1.0/'
     >
      <rdf:Description rdf:about='http://burningbird.net/articles/monsters1.htm'>
        <NS0:related rdf:resource='http://burningbird.net/articles/monsters2.htm'/>
        <NS0:related rdf:resource='http://burningbird.net/articles/monsters3.htm'/>
      </rdf:Description>
    </rdf:RDF>

You've probably noted by now that Jena generates namespace prefixes for the vocabulary elements. As you'll see later, you can change the prefix used for namespaces. However, the specific prefix used is unimportant, except perhaps for readability across models when the same vocabulary is used in multiple places, such as the Dublin Core vocabulary.

8.2.3 Adding More Complex Structures

As has been demonstrated, adding literal or simple resource properties for a specific RDF resource in a model is quite uncomplicated with Jena. However, many RDF models make use of more complex structures, including nesting resources following the RDF node-edge-node pattern. In this section, we'll demonstrate how Jena can just as easily handle more complex RDF model structures and their associated RDF/XML.
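The difference between the literal objects of Example 8-1 and the resource objects of Example 8-3 is easier to see when a statement is written as an N-Triples line. The following plain-Java sketch uses no Jena classes; the class name ObjectKinds and its two helper methods are hypothetical, and the literal helper assumes the value contains no characters that would need escaping.

```java
// Plain-Java sketch (no Jena): the same statement written as N-Triples,
// once with a literal object and once with a resource object.
public class ObjectKinds {

    static final String SUBJECT = "http://burningbird.net/articles/monsters1.htm";
    static final String RELATED = "http://burningbird.net/postcon/elements/1.0/related";
    static final String TARGET = "http://burningbird.net/articles/monsters2.htm";

    // Object written as a quoted string: a literal, as in Example 8-1.
    // Assumes the value needs no escaping.
    public static String withLiteral(String s, String p, String value) {
        return "<" + s + "> <" + p + "> \"" + value + "\" .";
    }

    // Object written in angle brackets: a resource, as in Example 8-3
    public static String withResource(String s, String p, String uri) {
        return "<" + s + "> <" + p + "> <" + uri + "> .";
    }

    public static void main(String[] args) {
        System.out.println(withLiteral(SUBJECT, RELATED, TARGET));
        System.out.println(withResource(SUBJECT, RELATED, TARGET));
    }
}
```

In the first line the related article is just a string that happens to look like a URL; in the second it is a node in the graph that other statements can use as a subject, which is what the more complex structures in this section rely on.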
The value of the pstcn:bio property is itself a resource that does not have a specific URI: a blank node, or bnode. Though not a literal, it's still added as a property using addProperty.

In Example 8-5, a new resource representing the article is created and the two related resource properties are added. In addition, a new resource is created for bio, and several properties are added to it; these properties are defined within the DC vocabulary, and I used the DC wrapper class to create them. Once the bio resource is built, I attach it to the higher-level resource using addProperty.

Example 8-5. Adding a blank node to a model

    import com.hp.hpl.mesa.rdf.jena.mem.ModelMem;
    import com.hp.hpl.mesa.rdf.jena.model.*;
    import com.hp.hpl.mesa.rdf.jena.vocabulary.*;
    import com.burningbird.postcon.vocabulary.POSTCON;
    import java.io.FileOutputStream;
    import java.io.PrintWriter;

    public class pracRDFThird extends Object {

        public static void main (String args[]) {

            // Resource names
            String sResource = "http://burningbird.net/articles/monsters1.htm";
            String sRelResource1 = "http://burningbird.net/articles/monsters2.htm";
            String sRelResource2 = "http://burningbird.net/articles/monsters3.htm";
            String sType = "http://burningbird.net/postcon/elements/1.0/Resource";
            try {
                // Create an empty graph
                Model model = new ModelMem( );

                // Create the resource
                // and add the properties cascading style
                Resource article
                    = model.createResource(sResource)
                           .addProperty(POSTCON.related, model.createResource(sRelResource1))
                           .addProperty(POSTCON.related, model.createResource(sRelResource2));

                // Create the bio bnode resource
                // and add properties
                Resource bio
                    = model.createResource( )
                           .addProperty(DC.creator, "Shelley Powers")
                           .addProperty(DC.publisher, "Burningbird")
                           .addProperty(DC.title,
                                        model.createLiteral("Tale of Two Monsters: Legends", "en"));

                // Attach to main resource
                article.addProperty(POSTCON.bio, bio);

                // Print RDF/XML of model to system output
                model.write(new PrintWriter(System.out));

            } catch (Exception e) {
                System.out.println("Failed: " + e);
            }
        }
    }

I could have used the cascade approach to add the bio directly to the document resource as it was being created. However, creating bio separately and then adding it to the top-level resource is, in my opinion, easier to read, and the resulting RDF model and serialized RDF/XML are identical.

The results of the application are shown in Example 8-6. As you can see, Jena uses rdf:nodeID and separates out the resource, rather than nesting it. This is nothing more than convenience and syntactic sugar; the resulting RDF graph is still equivalent in meaning.

Example 8-6. Generated RDF/XML demonstrating more complex structures

    <rdf:RDF
      xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
      xmlns:NS0='http://burningbird.net/postcon/elements/1.0/'
      xmlns:dc='http://purl.org/dc/elements/1.0/'
     >
      <rdf:Description rdf:nodeID='A0'>
        <dc:creator>Shelley Powers</dc:creator>
        <dc:publisher>Burningbird</dc:publisher>
        <dc:title xml:lang='en'>Tale of Two Monsters: Legends</dc:title>
      </rdf:Description>
      <rdf:Description rdf:about='http://burningbird.net/articles/monsters1.htm'>
        <NS0:related rdf:resource='http://burningbird.net/articles/monsters2.htm'/>
        <NS0:related rdf:resource='http://burningbird.net/articles/monsters3.htm'/>
        <NS0:bio rdf:nodeID='A0'/>
      </rdf:Description>
    </rdf:RDF>

The example demonstrates how to implement the striped XML quality of RDF, which has a node-edge-node-edge pattern of nesting. Another RDF pattern that PostCon supports is a container holding the resource's history, which is implemented in Section 8.2.5.

8.2.4 Creating a Typed Node

The RDF model created to this point shows the top-level resource as a basic rdf:Description node, with a given URI.
However, in the actual RDF/XML, the top-level node is what is known as a typed node, which means it is defined with a specific rdf:type property. Implementing a typed node in Jena is quite simple if you take it step by step.

First, the POSTCON wrapper class needs to be modified to add the new resource implementation. To support this, two Jena classes must be imported into the POSTCON Java code (the listing in Example 8-2 already includes them):

    import com.hp.hpl.mesa.rdf.jena.common.ResourceImpl;
    import com.hp.hpl.mesa.rdf.jena.model.Resource;

Next, the document resource definition is added. The label is capitalized so that the resulting URI matches pstcn:Resource:

    // add the one resource
    static final String nresource = "Resource";
    public static Resource resource = null;

Finally, the resource is instantiated:

    resource = new ResourceImpl(uri+nresource);

Once the wrapper class is modified, the typed node information is implemented within the Jena code, as shown in Example 8-7.

Example 8-7. Adding an rdf:type for the top-level document resource

    import com.hp.hpl.mesa.rdf.jena.mem.ModelMem;
    import com.hp.hpl.mesa.rdf.jena.model.*;
    import com.hp.hpl.mesa.rdf.jena.vocabulary.*;
    import com.burningbird.postcon.vocabulary.POSTCON;
    import java.io.FileOutputStream;
    import java.io.PrintWriter;

    public class chap1005 extends Object {

        public static void main (String args[]) {

            // Resource names
            String sResource = "http://burningbird.net/articles/monsters1.htm";
            try {
                // Create an empty graph
                Model model = new ModelMem( );

                // Create the resource
                // and add the properties cascading style
                Resource article
                    = model.createResource(sResource)
                           .addProperty(RDF.type, POSTCON.resource);

                // Print RDF/XML of model to system output
                model.write(new PrintWriter(System.out));

            } catch (Exception e) {
                System.out.println("Failed: " + e);
            }
        }
    }

The resulting RDF/XML:

    <rdf:RDF
      xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
     >
      <rdf:Description rdf:about='http://burningbird.net/articles/monsters1.htm'>
        <rdf:type rdf:resource='http://burningbird.net/postcon/elements/1.0/Resource'/>
      </rdf:Description>
    </rdf:RDF>

is equivalent to the RDF/XML used in the sample document:

    <pstcn:Resource rdf:about="monsters1.htm">
    ...
    </pstcn:Resource>

Both result in the exact same RDF model, shown in Figure 8-2.

Figure 8-2. RDF model of typed (document) node

8.2.5 Creating a Container

As discussed earlier in the book, an RDF container is a grouping of related items. There are no formalized semantics for a container other than this, though tools and applications may add additional semantics based on the type of container: Alt, Seq, or Bag.

The PostCon vocabulary uses an rdf:Seq container to group the resource history, with the application-specific implication that if tools support this concept, the contained items are sequenced in order, from top to bottom, within the container:

    <pstcn:history>
      <rdf:Seq>
        <rdf:_1 rdf:resource="http://www.yasd.com/dynaearth/monsters1.htm" />
        <rdf:_2 rdf:resource="http://www.dynamicearth.com/articles/monsters1.htm" />
        <rdf:_3 rdf:resource="http://burningbird.net/articles/monsters1.htm" />
      </rdf:Seq>
    </pstcn:history>

For tools that don't support my additional container semantics, the items can be sequenced by whatever properties are associated with each contained resource (the date, URI, or movement type), or even sequenced randomly:

    <rdf:Description rdf:about="http://www.yasd.com/dynaearth/monsters1.htm">
      <pstcn:movementType>Add</pstcn:movementType>
      <pstcn:reason>New Article</pstcn:reason>
      <dc:date>1998-01-01T00:00:00-05:00</dc:date>
    </rdf:Description>

RDF containers are just a variation of typed node and can be implemented directly using the same code shown to this point. After all, a container is nothing more than a blank node with a given rdf:type (such as http://www.w3.org/1999/02/22-rdf-syntax-ns#Seq) acting as the subject for several statements, each with a numbered membership predicate (rdf:_1, rdf:_2, and so on) and each pointing to an object that is a resource. You could emulate containers directly given previous code. However, it's a lot simpler just to use the APIs.
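To make the idea of emulating a container concrete, here is a plain-Java sketch that writes out, as N-Triples lines, the statements behind the rdf:Seq fragment shown earlier in this section: a blank node typed as rdf:Seq, followed by one numbered membership statement per member. It uses no Jena classes; the class name SeqByHand, the seqTriples method, and the blank node label hist are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java sketch (no Jena): a container written as a typed blank node.
public class SeqByHand {

    static final String RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    // Return the N-Triples lines for an rdf:Seq blank node holding the members
    public static List<String> seqTriples(String bnodeLabel, List<String> memberUris) {
        List<String> lines = new ArrayList<>();
        String node = "_:" + bnodeLabel;

        // The rdf:type statement is what makes the blank node a Seq
        lines.add(node + " <" + RDF + "type> <" + RDF + "Seq> .");

        // One membership statement per member: rdf:_1, rdf:_2, ...
        for (int i = 0; i < memberUris.size(); i++) {
            lines.add(node + " <" + RDF + "_" + (i + 1) + "> <" + memberUris.get(i) + "> .");
        }
        return lines;
    }

    public static void main(String[] args) {
        List<String> history = Arrays.asList(
            "http://www.yasd.com/dynaearth/monsters1.htm",
            "http://www.dynamicearth.com/articles/monsters1.htm",
            "http://burningbird.net/articles/monsters1.htm");
        seqTriples("hist", history).forEach(System.out::println);
    }
}
```

The sketch produces four statements for three members, which is all a container amounts to in the graph; any ordering behavior is left to whatever tool reads it.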
In Example 8-8, an RDF container, an rdf:Seq, is created and three resources are added to it. Each of the resources has properties of its own, including pstcn:movementType and pstcn:reason (both from POSTCON) and dc:date (from DC). Once completed, the rdf:Seq is added to the document resource.

Example 8-8. Adding the history container to the model

    import com.hp.hpl.mesa.rdf.jena.mem.ModelMem;
    import com.hp.hpl.mesa.rdf.jena.model.*;
    import com.hp.hpl.mesa.rdf.jena.vocabulary.*;
    import com.burningbird.postcon.vocabulary.POSTCON;
    import java.io.FileOutputStream;
    import java.io.PrintWriter;

    public class pracRDFFifth extends Object {

        public static void main (String args[]) {

            // Resource names
            String sResource = "http://burningbird.net/articles/monsters1.htm";
            String sHistory1 = "http://www.yasd.com/dynaearth/monsters1.htm";
            String sHistory2 = "http://www.dynamicearth.com/articles/monsters1.htm";
            String sHistory3 = "http://www.burningbird.net/articles/monsters1.htm";
            try {
                // Create an empty graph
                Model model = new ModelMem( );

                // Create Seq
                Seq hist = model.createSeq( )
                    .add (1, model.createResource(sHistory1)
                        .addProperty(POSTCON.movementtype, model.createLiteral("Add"))
                        .addProperty(POSTCON.reason, model.createLiteral("New Article"))
                        .addProperty(DC.date, model.createLiteral("1998-01-01T00:00:00-05:00")))
                    .add (2, model.createResource(sHistory2)
                        .addProperty(POSTCON.movementtype, model.createLiteral("Move"))
                        .addProperty(POSTCON.reason,
                                     model.createLiteral("Moved to separate dynamicearth.com domain"))
                        .addProperty(DC.date, model.createLiteral("1999-10-31T00:00:00-05:00")))
                    .add (3, model.createResource(sHistory3)
                        .addProperty(POSTCON.movementtype, model.createLiteral("Move"))
                        .addProperty(POSTCON.reason, model.createLiteral("Collapsed into Burningbird"))
                        .addProperty(DC.date, model.createLiteral("2002-11-01T00:00:00-05:00")));

                // Create the resource
                // and add the properties cascading style
                Resource article
                    = model.createResource(sResource)
                           .addProperty(POSTCON.history, hist);

                // Print RDF/XML of model to system output
                RDFWriter writer = model.getWriter( );
                writer.setNsPrefix("pstcn", "http://burningbird.net/postcon/elements/1.0/");
                writer.write(model, new PrintWriter(System.out), "http://burningbird.net/articles" );

            } catch (Exception e) {
                System.out.println("Failed: " + e);
            }
        }
    }

Another new item added with this code is the RDFWriter.setNsPrefix method, which defines the prefix so that it shows as pstcn rather than the default of NS0. This isn't necessarily important, since whatever abbreviation is used resolves to the namespace within the model, but it does make the models easier to read if you use the same prefix all the time.

As described in Chapter 4, a container is a grouping of like items, and there are no additional formal semantics attached to the concept of container. Now, the fact that I used rdf:Seq could imply that the items within the container should be processed in order, from first to last. However, beyond the formal semantics within the RDF specifications, it is up to the implementation to determine exactly how an rdf:Seq container is processed.

What's interesting is that, within Jena, a container is treated exactly as the typed node that I described earlier. This means that the generated RDF/XML, as shown in Example 8-9, shows the rdf:Seq as its typed node equivalent, rather than in the container-like syntax shown in the example source.

Example 8-9. Generated RDF/XML showing container defined as typed node

    <rdf:RDF
      xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
      xmlns:pstcn='http://burningbird.net/postcon/elements/1.0/'
      xmlns:dc='http://purl.org/dc/elements/1.0/'
     >
      <rdf:Description rdf:about='http://burningbird.net/articles/monsters1.htm'>
        <pstcn:history rdf:nodeID='A0'/>
      </rdf:Description>
      <rdf:Description rdf:about='http://www.dynamicearth.com/articles/monsters1.htm'>
        <pstcn:movementType>Move</pstcn:movementType>
        <pstcn:reason>Moved to separate dynamicearth.com domain</pstcn:reason>
        <dc:date>1999-10-31T00:00:00-05:00</dc:date>
      </rdf:Description>
      <rdf:Description rdf:about='http://www.burningbird.net/articles/monsters1.htm'>
        <pstcn:movementType>Move</pstcn:movementType>
        <pstcn:reason>Collapsed into Burningbird</pstcn:reason>
        <dc:date>2002-11-01T00:00:00-05:00</dc:date>
      </rdf:Description>
      <rdf:Description rdf:nodeID='A0'>
        <rdf:type rdf:resource='http://www.w3.org/1999/02/22-rdf-syntax-ns#Seq'/>
        <rdf:_1 rdf:resource='http://www.yasd.com/dynaearth/monsters1.htm'/>
        <rdf:_2 rdf:resource='http://www.dynamicearth.com/articles/monsters1.htm'/>
        <rdf:_3 rdf:resource='http://www.burningbird.net/articles/monsters1.htm'/>
      </rdf:Description>
      <rdf:Description rdf:about='http://www.yasd.com/dynaearth/monsters1.htm'>
        <pstcn:movementType>Add</pstcn:movementType>
        <pstcn:reason>New Article</pstcn:reason>
        <dc:date>1998-01-01T00:00:00-05:00</dc:date>
      </rdf:Description>
    </rdf:RDF>

I prefer the Jena implementation of the container because it implies nothing about container-like behavior that doesn't exist within the RDF specifications. The generated RDF/XML provides a clearer picture of a set of like resources, grouped for some reason, and then added as a property to another resource. No more, no less.

Now that we've had a chance to build RDF models and view the serialized RDF/XML from them, we'll take a look at parsing and accessing data in existing RDF/XML documents.