Marcus Bristav
11/9/2006 8:11:00 AM
On 11/9/06, Tomasz Wegrzanowski <tomasz.wegrzanowski@gmail.com> wrote:
> I think subtree-based parsers are a great tradeoff between
> convenience of read-everything parsers and low memory use
> of stream-based parsers. Deciding inside a block seems
> much more natural than predefining matched tags (like
> in Perl's XML::Twig).
>
Back in the world of j... there are libraries for this (Nux and dom4j,
and probably more). They let you stream-parse a document and register
callbacks for XPath expressions. Whenever the parser encounters a node
matching a registered XPath, it invokes that callback with a DOM-like
object (not W3C DOM...) holding the complete subtree. This is very
convenient and raises the abstraction a bit (the XPath part) compared
with what seems to be your approach. They don't allow full XPath, only
the parts that make sense in a streaming context.
Anyway, look into them, they're very nice.
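To make the idea concrete, here is a rough Ruby sketch of the same
pattern on top of REXML's stream parser. It is not the Nux or dom4j
API: SubtreeDispatcher and its "on" method are names made up for this
example, and plain absolute paths such as /library/book stand in for
the restricted XPath subset those libraries accept.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Hypothetical helper (not part of REXML): streams a document and hands
# each element whose path was registered to a block as a small tree.
class SubtreeDispatcher
  include REXML::StreamListener

  def initialize
    @handlers = {}  # absolute path, e.g. "/library/book" => callback
    @path = []      # names of the elements that are currently open
    @current = nil  # innermost element of the subtree being built
  end

  # Register a callback for an absolute element path (assumption:
  # simple paths only, no predicates or wildcards).
  def on(path, &block)
    @handlers[path] = block
  end

  def tag_start(name, attrs)
    @path << name
    if @current
      # Inside a matched subtree: keep growing the tree.
      @current = @current.add_element(name, attrs)
    elsif @handlers.key?(current_path)
      # A registered path just opened: start a new detached subtree.
      @current = REXML::Element.new(name)
      @current.add_attributes(attrs)
    end
  end

  def text(text)
    @current.add_text(text) if @current
  end

  def tag_end(_name)
    if @current
      if @current.parent.nil?
        # The subtree root closed: deliver it, then forget it so
        # memory use stays bounded by the largest matched subtree.
        @handlers[current_path].call(@current)
        @current = nil
      else
        @current = @current.parent
      end
    end
    @path.pop
  end

  private

  def current_path
    '/' + @path.join('/')
  end
end

xml = <<~XML
  <library>
    <book id="1"><title>Dune</title></book>
    <book id="2"><title>Emma</title></book>
  </library>
XML

titles = []
dispatcher = SubtreeDispatcher.new
dispatcher.on('/library/book') do |book|
  titles << [book.attributes['id'], book.elements['title'].text]
end
REXML::Document.parse_stream(xml, dispatcher)
```

Only one book subtree is held in memory at a time, and the callback
can use the usual tree-style accessors on it. Registered paths nested
inside an already matched subtree are ignored in this sketch.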
/Marcus
ps. I think XML processing tools suck quite a bit in Ruby (I love
Ruby...). You cannot do high-performance processing in a
cross-platform way (as far as I know). You are stuck with libxml on
*nix or MSXML on Windows (since REXML sucks performance-wise). It's
kind of sad. Is it impossible to make libxml/libxslt work on Windows?