8.3 Parsing and Querying an RDF Document

Once an RDF/XML document is created, it serves no useful purpose unless the data in it can be parsed and queried. One advantage of RDF/XML is that the data is structured in specific, predictable ways, so the same code can access many different kinds of data. This section looks at opening an existing RDF/XML document, whether on the filesystem or on the Internet, and accessing the data it contains.

8.3.1 Just Doing a Basic Dump

You'll typically access the data within an RDF/XML document in one of two ways: retrieving specific pieces of data, or retrieving all of it for alternative presentation. For instance, most of the tools discussed in Chapter 14 and Chapter 15 are interested in all the data within an RDF/XML document, data that is then transformed in one way or another.

One of the most common ways of "dumping" the data within an RDF/XML document (outputting all the data in a new format) is to print it out in N-Triples format. This was demonstrated with ARP, the parser bundled with the Jena Toolkit. However, another way of looking at the data is to dump out a listing of objects of one type or another.

In Example 8-10, the PostCon RDF file for the demonstration article is read into a memory model using the read method, which takes the URL of the file as its parameter. Once the model is loaded, the listObjects method is called on the model and its result is assigned to a NodeIterator. This is just one of the iterators that Jena provides: NodeIterator, StmtIterator, ResIterator, and so on. Each is specialized to provide access to a specific Jena object type. In the example, once the NodeIterator is populated, it's traversed, and all of the RDF objects (the property "values") are printed out using the basic toString method.

Example 8-10. Basic dump of objects, printing out object values

    import com.hp.hpl.mesa.rdf.jena.mem.ModelMem;
    import com.hp.hpl.mesa.rdf.jena.model.*;

    public class pracRDFSixth extends Object {

        public static void main (String args[]) {
            // URL of the RDF/XML document to load
            String sUri = args[0];

            try {
                // Create memory model, read in RDF/XML document
                ModelMem model = new ModelMem();
                model.read(sUri);

                // Print out objects in model using toString
                NodeIterator iter = model.listObjects();
                while (iter.hasNext()) {
                    System.out.println(" " + iter.next().toString());
                }
            } catch (Exception e) {
                System.out.println("Failed: " + e);
            }
        }
    }

The application is run against the monsters1.rdf example file:

    java pracRDFSixth http://burningbird.net/articles/monsters1.rdf

This is probably one of the simplest Jena applications you can write, and it makes a quick test that a model is loaded correctly. Instead of objects, you could also dump out the subjects (ResIterator and listSubjects) or even the entire statements (StmtIterator and listStatements). The code is much the same, except for the iterator type and the list method called.

8.3.2 Accessing Specific Values

Instead of listing all statements or all objects, you can fine-tune the code to list only the subjects, statements, or objects matching specific properties, using the property implementations created within the wrapper classes, such as POSTCON. To access all objects that have the PostCon related property, the POSTCON wrapper class is added to the import section:

    import com.burningbird.postcon.vocabulary.POSTCON;

Next, the listObjectsOfProperty method is used instead of listObjects:

    NodeIterator iter = model.listObjectsOfProperty(POSTCON.related);

That's all it takes to access all objects for a specific property. As you can see, the wrapper class is handy for more than just creating a model.

To access all the statements for a given resource, first access the resource from the model and then list all the properties associated with that resource.
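Both lookups described here come down to matching one position of the triple: listing the objects of a property matches on the predicate, and listing a resource's properties matches on the subject. The following is a minimal sketch of that idea in plain Java, deliberately written without Jena. The TripleFilterSketch class, its Triple type, and both filter methods are hypothetical stand-ins for illustration, not part of the Jena API, and the last sample statement is invented so that the subject filter has something to exclude.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for an RDF model: a plain list of triples (not Jena).
public class TripleFilterSketch {

    // One statement: subject, predicate, and object, all held as strings here.
    static final class Triple {
        final String subject;
        final String predicate;
        final String object;

        Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    static final String NS = "http://burningbird.net/postcon/elements/1.0/";
    static final String RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";
    static final String ARTICLE = "http://burningbird.net/articles/monsters1.htm";
    static final String ARTICLE2 = "http://burningbird.net/articles/monsters2.htm";

    // Sample data; the final statement is a hypothetical addition.
    static List<Triple> sampleTriples() {
        return Arrays.asList(
            new Triple(ARTICLE, RDF_TYPE, NS + "Resource"),
            new Triple(ARTICLE, NS + "related", ARTICLE2),
            new Triple(ARTICLE, NS + "related", "http://burningbird.net/articles/monsters3.htm"),
            new Triple(ARTICLE, NS + "related", "http://burningbird.net/articles/monsters4.htm"),
            new Triple(ARTICLE2, RDF_TYPE, NS + "Resource"));
    }

    // Similar in spirit to listing the objects of a property: match on the predicate.
    static List<String> objectsOfProperty(List<Triple> triples, String predicate) {
        List<String> objects = new ArrayList<String>();
        for (Triple t : triples) {
            if (t.predicate.equals(predicate)) {
                objects.add(t.object);
            }
        }
        return objects;
    }

    // Similar in spirit to listing a resource's properties: match on the subject.
    static List<Triple> statementsAbout(List<Triple> triples, String subject) {
        List<Triple> matches = new ArrayList<Triple>();
        for (Triple t : triples) {
            if (t.subject.equals(subject)) {
                matches.add(t);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<Triple> triples = sampleTriples();

        // Objects of the "related" property
        for (String object : objectsOfProperty(triples, NS + "related")) {
            System.out.println(object);
        }

        // Every statement whose subject is the article
        for (Triple t : statementsAbout(triples, ARTICLE)) {
            System.out.println(t.subject + " " + t.predicate + " " + t.object);
        }
    }
}
```

In a real application the model and its iterators do this matching for you; the sketch only makes the underlying triple filtering visible.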
In Example 8-11, all of the statements are accessed for the top-level resource contained within the document. As the list of statements is traversed, the subject is printed out (both namespace and local name), followed by the predicate (again, namespace and local name), and finally the object.

Example 8-11. Printing out each statement triple for a given RDF/XML document

    import com.hp.hpl.mesa.rdf.jena.mem.ModelMem;
    import com.hp.hpl.mesa.rdf.jena.model.*;
    import com.burningbird.postcon.vocabulary.POSTCON;

    public class pracRDFSeventh extends Object {

        public static void main (String args[]) {
            // URL of the RDF/XML document, then URI of the resource to look up
            String sUri = args[0];
            String sResource = args[1];

            try {
                // Create memory model, read in RDF/XML document
                ModelMem model = new ModelMem();
                model.read(sUri);

                // Find resource
                Resource res = model.getResource(sResource);

                // Find properties
                StmtIterator iter = res.listProperties();

                // Print out triple - subject | property | object
                while (iter.hasNext()) {
                    // Next statement in queue
                    Statement stmt = iter.next();

                    // Get subject, print
                    Resource res2 = stmt.getSubject();
                    System.out.print(res2.getNameSpace() + res2.getLocalName());

                    // Get predicate, print
                    Property prop = stmt.getPredicate();
                    System.out.print(" " + prop.getNameSpace() + prop.getLocalName());

                    // Get object, print
                    RDFNode node = stmt.getObject();
                    System.out.println(" " + node.toString() + "\n");
                }
            } catch (Exception e) {
                System.out.println("Failed: " + e);
            }
        }
    }

Running this application outputs the triple for each statement about the resource, including application-generated object values for blank nodes:

    http://burningbird.net/articles/monsters1.htm http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://burningbird.net/postcon/elements/1.0/Resource

    http://burningbird.net/articles/monsters1.htm http://burningbird.net/postcon/elements/1.0/bio anon:a9ae05:f2ecfdc9db:-7fff

    http://burningbird.net/articles/monsters1.htm http://burningbird.net/postcon/elements/1.0/relevancy anon:a9ae05:f2ecfdc9db:-7ff7

    http://burningbird.net/articles/monsters1.htm http://burningbird.net/postcon/elements/1.0/presentation anon:a9ae05:f2ecfdc9db:-7fec

    http://burningbird.net/articles/monsters1.htm http://burningbird.net/postcon/elements/1.0/history anon:a9ae05:f2ecfdc9db:-7fde

    http://burningbird.net/articles/monsters1.htm http://burningbird.net/postcon/elements/1.0/related http://burningbird.net/articles/monsters2.htm

    http://burningbird.net/articles/monsters1.htm http://burningbird.net/postcon/elements/1.0/related http://burningbird.net/articles/monsters3.htm

    http://burningbird.net/articles/monsters1.htm http://burningbird.net/postcon/elements/1.0/related http://burningbird.net/articles/monsters4.htm

Note that the variation of getObject used in the code is the one returning an RDFNode object. The other variations work only if the object is a literal and throw exceptions if a nonliteral is found. Since some of the objects in this document are resources, the RDFNode variation works best.

As the examples show, querying the data in an RDF/XML document doesn't have to be difficult; you just have to remember the triple nature of the statements in RDF/XML.
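The caution about getObject can be illustrated without Jena. The sketch below uses a hypothetical node hierarchy (Node, ResourceNode, LiteralNode, and both helper methods are invented names, not Jena classes or methods) to show why an accessor that assumes every object is a literal fails on a resource, and how working with the general node type and checking what it holds avoids the exception. The literal value is made-up sample data.

```java
// Hypothetical model of RDF object nodes (not the Jena API).
public class ObjectNodeSketch {

    // General node type: an object may be either a resource or a literal.
    interface Node {}

    static final class ResourceNode implements Node {
        final String uri;

        ResourceNode(String uri) {
            this.uri = uri;
        }

        @Override
        public String toString() {
            return uri;
        }
    }

    static final class LiteralNode implements Node {
        final String value;

        LiteralNode(String value) {
            this.value = value;
        }

        @Override
        public String toString() {
            return value;
        }
    }

    // Literal-only accessor: throws if the object is not a literal.
    static String literalValue(Node object) {
        if (!(object instanceof LiteralNode)) {
            throw new IllegalArgumentException("Object is not a literal: " + object);
        }
        return ((LiteralNode) object).value;
    }

    // General accessor: works for both kinds of object.
    static String describe(Node object) {
        if (object instanceof LiteralNode) {
            return "literal: " + ((LiteralNode) object).value;
        }
        return "resource: " + object;
    }

    public static void main(String[] args) {
        Node related = new ResourceNode("http://burningbird.net/articles/monsters2.htm");
        Node title = new LiteralNode("sample title"); // made-up literal value

        // The general accessor handles both objects.
        System.out.println(describe(related));
        System.out.println(describe(title));

        // The literal-only accessor fails on the resource.
        try {
            literalValue(related);
        } catch (IllegalArgumentException e) {
            System.out.println("Failed: " + e.getMessage());
        }
    }
}
```

When a document mixes resource and literal objects, as the PostCon file does, starting from the general node type keeps a single loop working for every statement.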