comp.lang.ruby

How to delete a node with Hpricot?

Daniel N

5/21/2007 12:04:00 PM

Hi,

Sorry if this is not the right forum for this question.

If a node is bad, I'm trying to comment it out. If it's really bad,
I'm trying to delete it.

The way I'm trying to do this is as follows.

def this_is_a_problem
  doc = Hpricot( html )
  doc.traverse_element do |node|
    if some_bad_node_test
      if really_bad?
        node.swap( "" )
      else
        node.swap( "<!-- comment out node #{node.to_html} -->" )
      end
    end
  end
  doc.to_html
end

However I'm getting a nasty error.

TypeError: no implicit conversion from nil to integer
/usr/local/lib/ruby/gems/1.8/gems/hpricot-0.5/lib/hpricot/traverse.rb:395:in `[]='

Am I doing this the wrong way?

Thanks

Daniel
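
A note on the TypeError itself: the `[]=' frame in the trace suggests an array element assignment. One plausible explanation, and only an assumption since the thread never confirms it, is that the swap looks up the node's position in its parent's child list and gets nil because the node is no longer there. The sketch below reproduces that failure mode with a plain Ruby array. replace_child is a hypothetical helper written for illustration; it is not Hpricot's code.

```ruby
# Toy stand-in for a parent's child list; this is NOT Hpricot's implementation.
# replace_child is a hypothetical helper that swaps one child for another
# by looking up its position first.
def replace_child(children, old, new)
  idx = children.index(old)  # nil when `old` is not in the list
  children[idx, 1] = [new]   # raises TypeError when idx is nil
  children
end

children = ["<p>", "<script>", "<div>"]

# First swap succeeds: "<script>" is found at index 1.
replace_child(children, "<script>", "<!-- script -->")

# Second swap fails: "<script>" is already gone, so index returns nil.
begin
  replace_child(children, "<script>", "")
rescue TypeError => e
  puts e.class
end
```

If that is what is happening, checking whether the node is still attached before swapping it, or collecting the nodes first and changing them afterwards, would avoid the error.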

2 Answers

eden li

5/22/2007 2:36:00 AM

I can't comment on the error, but you can delete a given node in your
traverse_element block by doing node.parent.children.delete(node).

Alternatively, you can run a search first to remove the "really bad"
elements. If you can define the really bad elements with an XPath
query, this becomes very simple:

(doc/"script").remove

Then you can go through and swap things out as necessary:

(doc/"xpath to bad nodes").each do |el|
  el.inner_html = "<!-- #{el.to_html} -->"
end
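
The two-pass idea above (delete the really bad nodes first, then comment out the rest) can be sketched without Hpricot. This is a minimal illustration under stated assumptions: Node, Comment, descendants and sanitize are hypothetical names invented for the example, and the toy tree only mimics the parent/children relationship used in node.parent.children.delete(node). The point it shows is that each pass works from a snapshot of the nodes, so changing the tree never disturbs the loop.

```ruby
# Hypothetical toy element; Hpricot's real classes are richer than this.
class Node
  attr_reader :name, :children
  attr_accessor :parent

  def initialize(name, *kids)
    @name = name
    @children = []
    kids.each { |kid| add(kid) }
  end

  def add(child)
    child.parent = self
    @children << child
    self
  end

  # Depth-first snapshot of every node below this one.
  def descendants
    @children.flat_map { |c| [c, *c.descendants] }
  end

  def to_html
    "<#{name}>#{children.map(&:to_html).join}</#{name}>"
  end
end

# Hypothetical comment node that replaces a commented-out element.
class Comment
  attr_accessor :parent

  def initialize(text)
    @text = text
  end

  def name
    "#comment"
  end

  def descendants
    []
  end

  def to_html
    "<!-- #{@text} -->"
  end
end

def sanitize(root, really_bad:, bad:)
  # Pass 1: delete the really bad nodes, iterating over a snapshot.
  root.descendants.each do |node|
    node.parent.children.delete(node) if really_bad.include?(node.name)
  end

  # Pass 2: replace each remaining bad node with a comment holding its markup.
  root.descendants.each do |node|
    next unless bad.include?(node.name)
    siblings = node.parent.children
    idx = siblings.index(node)
    next if idx.nil? # node is no longer attached, nothing to replace
    comment = Comment.new(node.to_html)
    comment.parent = node.parent
    siblings[idx] = comment
  end

  root
end

doc = Node.new("body",
               Node.new("p"),
               Node.new("script"),
               Node.new("div", Node.new("iframe"), Node.new("span")))

sanitize(doc, really_bad: ["script"], bad: ["iframe"])
puts doc.to_html
# prints <body><p></p><div><!-- <iframe></iframe> --><span></span></div></body>
```

One caveat for real HTML: markup that already contains a comment or a "--" sequence cannot be wrapped in a comment safely, so such nodes may be better deleted than commented out.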


Daniel N

5/22/2007 3:10:00 AM

Sweet, that looks like a good way to go. I'll try that out.

I've actually got it working, but it's very slow. Hopefully this
will speed it up.

Thank you,
Daniel