We used XML::Parser in Chapter 4, "Event Streams," as an event generator to drive stream processing programs, but did you know that this same module can also generate tree data structures? We've modified our preference-reader program to use XML::Parser for parsing and building a tree, as shown in Example 6-4.
# initialize parser and read the file
use XML::Parser;
$parser = new XML::Parser( Style => 'Tree' );
my $tree = $parser->parsefile( shift @ARGV );

# dump the structure
use Data::Dumper;
print Dumper( $tree );
When run on the file in Example 6-4, it gives this output:
$tree = [
  'preferences',
  [
    {}, 0, '\n',
    'font',
    [
      { 'role' => 'console' }, 0, '\n',
      'size', [ {}, 0, '9' ], 0, '\n',
      'fname', [ {}, 0, 'Courier' ], 0, '\n'
    ], 0, '\n',
    'font',
    [
      { 'role' => 'default' }, 0, '\n',
      'fname', [ {}, 0, 'Times New Roman' ], 0, '\n',
      'size', [ {}, 0, '14' ], 0, '\n'
    ], 0, '\n',
    'font',
    [
      { 'role' => 'titles' }, 0, '\n',
      'size', [ {}, 0, '10' ], 0, '\n',
      'fname', [ {}, 0, 'Helvetica' ], 0, '\n',
    ], 0, '\n',
  ]
];
This structure is more complicated than the one we got from XML::Simple; it tries to preserve everything, including node type, order of nodes, and mixed text. Each node is represented by a pair of items in a list. An element is the element name followed by a list of its contents, and the tree as a whole is just such a pair for the root element. A text node is the number 0 followed by a string holding its value. All attributes for an element are stored in a hash as the first item in the element's content list; the hash is empty if the element has no attributes. Even the whitespace between elements has been saved, represented as 0, '\n'. Because lists are used to contain element content, the order of nodes is preserved. This order is important for some XML documents, such as books or animations, in which elements follow a sequence.
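To make the layout concrete, here is a minimal sketch of a recursive walk over a structure of this shape. It assumes a hand-written fragment of the dump above rather than a parsed file, so it runs without XML::Parser installed. The walk_tree subroutine is our own illustration, not something the module provides.

```perl
use strict;
use warnings;

# A hand-written fragment in the same shape as the dump above:
# [ name, [ \%attributes, then pairs of (name, \@content) or (0, 'text') ] ]
my $tree = [
    'preferences', [ {},
        0, "\n",
        'font', [ { 'role' => 'console' },
            0, "\n",
            'size',  [ {}, 0, '9' ],
            0, "\n",
            'fname', [ {}, 0, 'Courier' ],
            0, "\n",
        ],
        0, "\n",
    ],
];

# walk_tree is a hypothetical helper, not part of XML::Parser.
# It returns a list of outline lines for one element and its descendants.
sub walk_tree {
    my ( $name, $content, $depth ) = @_;

    # the first item of a content list is always the attribute hash
    my ( $attrs, @children ) = @$content;

    my $indent    = '  ' x $depth;
    my $attr_text = join '', map { " $_=\"$attrs->{$_}\"" } sort keys %$attrs;
    my @lines     = ( "$indent<$name$attr_text>" );

    # the remaining items come in pairs
    while ( @children ) {
        my ( $type, $value ) = splice @children, 0, 2;
        if ( $type eq '0' ) {
            next unless $value =~ /\S/;    # skip whitespace-only text
            push @lines, "$indent  text: $value";
        }
        else {
            push @lines, walk_tree( $type, $value, $depth + 1 );
        }
    }
    return @lines;
}

my @outline = walk_tree( @$tree, 0 );
print "$_\n" for @outline;
```

The test for 0 is safe because an XML element name can never begin with a digit. To try the same walk on a real document, replace the literal with the $tree returned by parsefile.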
Unlike XML::Simple, XML::Parser cannot turn this data structure back into XML. For a complete, bidirectional solution, you should try something object-oriented.
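Nothing stops you from writing the output side yourself, though. The following is only a rough sketch of what that might look like for a tree in this format, assuming a small hand-written tree as input. Both escape_xml and tree_to_xml are hypothetical names of our own, not part of XML::Parser. The result is not a faithful round trip: attributes come out in sorted order rather than their original order, and comments, processing instructions, and the XML declaration are not represented in the tree at all.

```perl
use strict;
use warnings;

# escape_xml and tree_to_xml are hypothetical helpers, not part of XML::Parser.

# Escape the characters that are special in text and attribute values.
sub escape_xml {
    my ( $text ) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Serialize one element, given its name and content list.
sub tree_to_xml {
    my ( $name, $content ) = @_;
    my ( $attrs, @children ) = @$content;

    my $xml = "<$name";
    $xml .= sprintf ' %s="%s"', $_, escape_xml( $attrs->{$_} )
        for sort keys %$attrs;

    # an element with no children becomes an empty-element tag
    return "$xml/>" unless @children;

    $xml .= '>';
    while ( @children ) {
        my ( $type, $value ) = splice @children, 0, 2;
        $xml .= $type eq '0'
            ? escape_xml( $value )              # text node
            : tree_to_xml( $type, $value );     # child element
    }
    return "$xml</$name>";
}

my $tree = [
    'font', [ { 'role' => 'console' },
        'size',  [ {}, 0, '9' ],
        'fname', [ {}, 0, 'Courier' ],
    ],
];

my $xml = tree_to_xml( @$tree );
print "$xml\n";
```

Maintaining code like this by hand quickly gets tedious, which is one reason the object-oriented tree modules are attractive.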
Copyright © 2002 O'Reilly & Associates. All rights reserved.