Harry Ohlsen
10/13/2003 12:37:00 AM
Chad Fowler wrote:
> Sort of. I shouldn't have said "HTML Parser2". The right name seems to
> be ruby-htmltools. It integrates with REXML and allows you to do this:
>
> parser = HTMLTree::Parser.new(true, true)
> parser.feed(file.readlines.join)
> tree = parser.tree.html_node.as_rexml_document
> tree.elements.to_a('*/table').each do |element|
>   # do something with element
> end
I take it the reason for putting ruby-htmltools in the middle is that HTML generally isn't clean XML? So the tools do things like turn "<br>" into "<br/>" and add a "</p>" at the end of paragraphs, that sort of thing?
Could be very useful for a number of things!
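To make the question concrete, here is a minimal sketch of why raw HTML can't just be handed to REXML directly. It only assumes REXML is installed (it ships with Ruby); the helper name well_formed? is made up for illustration and isn't part of REXML or ruby-htmltools, and the sketch says nothing about how ruby-htmltools does its cleanup internally.

```ruby
require "rexml/document"

# Hypothetical helper: returns true if REXML can parse the markup
# as well-formed XML, false if the parser rejects it.
def well_formed?(markup)
  REXML::Document.new(markup)
  true
rescue REXML::ParseException
  false
end

raw_html  = "<p>one<br>two</p>"    # typical HTML: <br> is never closed
clean_xml = "<p>one<br/>two</p>"   # same content with a self-closed <br/>

puts well_formed?(raw_html)   # REXML rejects the unclosed <br>
puts well_formed?(clean_xml)  # REXML accepts the cleaned-up version
```

If that's right, anything that closes the tags first would let the tree.elements queries above work on ordinary web pages.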
Harry O.