http://www.saxproject.org/
Review of Book of SAX [1]
RS: Seen another way, SAX (Simple API for XML) is a model for processing XML on the fly. A SAX parser like expat (available in tdom) walks through the XML input without keeping much state and invokes configurable callbacks, for instance at the start or end of an XML element, or when it encounters a chunk of character data.
Following is a little example of instrumenting a SAX parser. On every start tag, el is called; on every end tag, ee is called; and for all character data in an element, ch is called. These are the callbacks provided by the user.
To keep track of where in the tag hierarchy we are, a global stack ::S is maintained: el pushes the current tag name, ee pops it. ch collects the content of "grill" and "baz" elements in a global array g. When a "bar" element ends, the collected contents are formatted and output. g is reset at the start of each "bar" element, so that content from an earlier one cannot be misused.
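The stack discipline on its own can be tried without a parser. The following is only a sketch: push and pop are hypothetical helpers (not part of tdom) that do to ::S what el and ee do, and ::trace is an extra variable added here to record the path after each step. The calls are made by hand in the order a parser would report start and end tags for a small document.

```tcl
# Hypothetical helpers mirroring what el and ee do with the stack ::S.
# ::trace is an illustration-only log of the element path after each event.
proc push name {
    lappend ::S $name                 ;# start tag: push
    lappend ::trace [join $::S /]
}
proc pop {} {
    set ::S [lrange $::S 0 end-1]     ;# end tag: pop
    lappend ::trace [join $::S /]
}

set ::S {}
set ::trace {}
# Events in the order they would occur for
# <foo><bar><grill>hello</grill></bar></foo>
push foo; push bar; push grill
pop; pop; pop
puts $::trace
```

The top of the stack, [lindex $::S end], is always the innermost open element, which is what ch relies on to decide whether a piece of text belongs to "grill" or "baz".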
package require tdom

proc parse xml {
    set ::S {}
    set p [expat -elementstartcommand el \
                 -characterdatacommand ch \
                 -elementendcommand ee]
    if {[catch {$p parse $xml} res]} {
        puts "Error: $res"
    }
    $p free ;# release the parser object
}
#---- Callbacks for start, end, character data
proc el {name atts} {
    lappend ::S $name ;# push
    if {$name eq "bar"} {array unset ::g}
}
proc ee name {
    global g
    set ::S [lrange $::S 0 end-1] ;# pop
    if {$name eq "bar"} {
        puts $g(grill)=$g(baz)
    }
}
proc ch str {
    global g
    set type [lindex $::S end]
    switch -- $type {
        grill - baz {set g($type) $str}
    }
}
#-- Now to test the whole thing:
parse "<foo><bar><grill>hello</grill><baz>42</baz></bar>
<bar><grill>world</grill><baz>24</baz></bar></foo>"
Running this script displays
hello=42
world=24
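The atts argument of el is not used in the script above. As far as I know, the start-element callback receives it as a flat list of alternating attribute names and values, so it can be walked with foreach. The following is a minimal sketch under that assumption; showAtts and ::log are names invented for this illustration, and the XML snippet is made up.

```tcl
package require tdom

# Hypothetical callback: records one line per attribute in ::log.
proc showAtts {name atts} {
    # atts is assumed to be a flat list: name1 value1 name2 value2 ...
    foreach {key value} $atts {
        lappend ::log "$name: $key=$value"
    }
}

set ::log {}
set p [expat -elementstartcommand showAtts]
$p parse {<foo id="1"><bar unit="cm" size="42"/></foo>}
$p free
puts [join $::log \n]
```

Because the list alternates names and values, it could also be loaded into an array with array set if lookup by attribute name is more convenient.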