comp.lang.ruby

Hpricot: match inner_text then return parent

ga rg

8/6/2008 2:27:00 PM

Hello,

Here is sample HTML:

<html><body><form><table><tbody>
<tr>
<td>abcd</td>
<td><a href="efgh.com">efgh</a></td>
<td><a href="ijkl.com">ijkl</a></td>
</tr>

<tr>
<td>mnop</td>
<td><a href="qrst.com">qrst</a></td>
<td><a href="uvwx.com">uvwx</a></td>
</tr>

</tbody></table></form></body></html>

I need to match on inner_text, for example "qrst". If I find a match, I
want to grab the enclosing row. So if a.inner_text == "qrst", I want to
walk up from the <a> to its parent <tr>, and the result I want is:

<tr>
<td>mnop</td>
<td><a href="qrst.com">qrst</a></td>
<td><a href="uvwx.com">uvwx</a></td>
</tr>


My very wrong attempt:


(doc/"html//body//form//table//tbody//tr//td").each do |row|
  result = row.search("a").select { |ele|
    if ele.inner_text.to_s == mac_addr
      puts "I'm in"
      parent = ele.nodes_at(-1)
      puts parent
    end
  }
end

I can't seem to grab just one selection, let alone the parent :(

Thank you
--
Posted via http://www.ruby-....

1 Answer

ga rg

8/6/2008 6:35:00 PM


Thanks to Ryan52 on IRC. It ended up being simpler than I thought.

For the input file:
<html><body><form><table><tbody>
<tr>
<td>abcd</td>
<td><a href="1e1f1g1h.com">efgh</a></td>
<td><a href="1i1j1k1l.com">ijkl</a></td>
</tr>

<tr>
<td>mnop</td>
<td><a href="2q2r2s2t.com">qrst</a></td>
<td><a href="2u2v2w2x.com">uvwx</a></td>
</tr>

</tbody></table></form></body></html>


The following will parse the file, find the element, and return its
parent row:

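require 'rubygems'
require 'hpricot'

doc = open("test.html") { |f| Hpricot(f) }

# Find the first <a> whose text contains 'uvwx' (same row as 'qrst'),
# then climb two levels: a -> td -> tr.
result = (doc/"tr").search("a[text()*='uvwx']").first.parent.parent
puts result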
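
The same idea can be sketched without depending on a fixed number of .parent calls. This is only an illustration, not the approach from IRC: it uses REXML from the Ruby distribution instead of Hpricot, so it assumes the markup is well-formed (closing tags corrected). The helper name row_for_link and the variable sample_html are made up for this example. It matches the link text exactly and climbs until it reaches a <tr>, returning nil when nothing matches.

```ruby
require 'rexml/document'

# Well-formed version of the sample table (an assumption; REXML is an
# XML parser and will not accept unclosed or misnested tags).
sample_html = <<~MARKUP
  <html><body><form><table><tbody>
  <tr>
  <td>abcd</td>
  <td><a href="efgh.com">efgh</a></td>
  <td><a href="ijkl.com">ijkl</a></td>
  </tr>
  <tr>
  <td>mnop</td>
  <td><a href="qrst.com">qrst</a></td>
  <td><a href="uvwx.com">uvwx</a></td>
  </tr>
  </tbody></table></form></body></html>
MARKUP

# Hypothetical helper: returns the nearest <tr> ancestor of the first <a>
# whose text equals link_text exactly, or nil when there is no such link.
def row_for_link(doc, link_text)
  link = REXML::XPath.match(doc, '//a').find { |a| a.text == link_text }
  return nil if link.nil?

  # Climb the tree until we hit a <tr> (or run out of parents).
  node = link
  node = node.parent until node.nil? || node.name == 'tr'
  node
end

doc = REXML::Document.new(sample_html)
row = row_for_link(doc, 'qrst')
puts row
```

Climbing until the tag name matches keeps working if the link is later wrapped in another element, and the nil return avoids calling .parent on a missing match, which is what .first.parent.parent would do when the search finds nothing.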


