Phlip
3/16/2008 5:59:00 PM
Junkone wrote:
> <TABLE cellSpacing=1 width="80%" align=center border=0><!--
> startingtable -->
Can you really not identify the table by an ID, or by the element that contains it?
> how can i do using search functionality in hpricot.
Here's a little experiment showing how to do that:
require 'hpricot'
html = %(<html><body>
<div><!-- yo --></div>
</body></html>)
doc = Hpricot(html)
# the comment comes back as part of the div's inner HTML
p((doc/:body/:div).innerHTML)
That spews "<!-- yo -->".
However, I wouldn't know how to bottle that up into a cute query. The ugly
way is to loop through all the 'div's that statement returns, checking the
innerHTML of each one for a '<!--' mark.
I would prefer to query HTML with XPath, but Hpricot cannot do complex
XPaths. Other libraries could, if your HTML were well-formed. If you can't
well-form it, I would pass it through tidy -asxhtml, then query it with REXML
or libxml.
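
Here's a rough sketch of the REXML route, assuming the markup has already been tidied into well-formed XHTML. The helper name tables_with_comment and the sample document are made up for illustration, and the sample leaves out the xmlns declaration that real tidy output carries, which may need extra handling. It uses a plain '//table' XPath and then checks each table's child nodes for the comment in Ruby, rather than relying on a fancier XPath predicate:

```ruby
require 'rexml/document'

# Hypothetical helper: return every <table> element that has a direct
# child comment whose text contains +marker+.
def tables_with_comment(xhtml, marker)
  doc = REXML::Document.new(xhtml)
  REXML::XPath.match(doc, '//table').select do |table|
    table.children.any? do |node|
      node.is_a?(REXML::Comment) && node.string.include?(marker)
    end
  end
end

# Made-up sample, shaped like the quoted table after tidying
# (attributes quoted, no xmlns declaration).
xhtml = <<XHTML
<html><body>
<table cellspacing="1" width="80%" align="center" border="0"><!--
startingtable -->
<tr><td>wanted</td></tr>
</table>
<table><tr><td>other</td></tr></table>
</body></html>
XHTML

tables_with_comment(xhtml, 'startingtable').each do |table|
  puts table.attributes['width']
end
```

Only the first table carries the marker comment, so only its width gets printed. The same loop-and-check idea is the "ugly way" described above, just run against a parser that exposes comments as real nodes.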
--
Phlip