Peter Szinek
12/13/2006 1:59:00 PM
Dhanasekaran Vivekanandhan wrote:
> yes, I want the text of the first <p> because it
> has an image. and reject if <p> has no image.
> thanks,
I see. Try this:
===============================================
require 'rubygems'
require 'hpricot'
doc = Hpricot %q{<p class=posted>
this is fun
<img src="" class="dhans"/>
</p>
<p class=posted>
NO FUN
</p>
<p class=posted>
fun again!
<img src=""/>
</p>
<p class=posted>
NO FUN AT ALL!
</p>
}
paragraphs = doc/'p'
good_elems = paragraphs.reject { |elem| (elem/"img").empty? }
good_elems.each { |elem| puts elem.inner_text.strip }
===============================================
output:
************
this is fun
fun again!
************
You will need hpricot 0.4.84 because the code uses inner_text. I did not
experience any difficulties installing it, so I can recommend it. If you
don't want to install it, you will have to roll your own inner_text, but
I guess this is not a big problem.
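If you do go the roll-your-own route, one rough way is to strip the tags
out of the element's HTML yourself. This is only a sketch: strip_tags is
a made-up helper name, not something hpricot provides, and I am assuming
that your hpricot version gives you elem.inner_html to feed into it. It
does not decode entities like &amp;, and it will be fooled by a '>'
inside an attribute value.

```ruby
# Hypothetical helper (not part of hpricot): removes anything that looks
# like a tag and keeps the text in between.
def strip_tags(html)
  html.gsub(/<[^>]*>/, '')
end

# Stand-in for the inner HTML of the first <p> in the example above.
sample = %q{
this is fun
<img src="" class="dhans"/>
}

puts strip_tags(sample).strip
```

In the script above, the last line would then become something like
good_elems.each { |elem| puts strip_tags(elem.inner_html).strip }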
Cheers,
Peter