comp.lang.ruby

A few confusing Hpricot outputs. Anyone had similar experience?

Wang Jian

4/6/2009 11:11:00 AM

[Note: parts of this message were removed to make it a legal post.]

## I wanted to work on something like the following example string

require 'hpricot'
string = '<html><a></a><a href="/123/456" title="2009-04-06">posted on April 2009</a></html>'
h = Hpricot(string)
t = "2009-04-06"

## Here it goes: confusion No.1

h.at('a[@title*="2009-04-06"]')
##=> returns the 2nd anchor element, as expected.
h.at('a[@title*=Time.now.strftime("%Y-%m-%d")]')
##=> *1st anchor element. Why is that??*
h.at("a[@title*=#{t}]")
##=> 2nd anchor. works fine
h.at('a[@title*="#{t}"]')
##=> *nil. Because of the single quote?*

## And here comes another confusion:

year = "2009"
h.at("a[@title*=#{t}][text()*='2009']")
##=> 2nd anchor, as expected.
h.at("a[@title*=#{t}][text()*=#{year}]")
##=> *nil. Why is that? Hpricot can't handle #{} more than once?*

## Hope you can fill me in on this one. Thanks!!

##Jay

2 Answers

Christopher Dicely

4/6/2009 2:55:00 PM

On Mon, Apr 6, 2009 at 4:11 AM, Wang Jian <jwang376@gmail.com> wrote:
> ## I wanted to work on something like the following example string
>
> require 'hpricot'
> string = '<html><a></a><a href="/123/456" title="2009-04-06">posted on April 2009</a></html>'
> h = Hpricot(string)
> t = "2009-04-06"
>
> ## Here it goes: confusion No.1
>
> h.at('a[@title*="2009-04-06"]')
> ##=> returns the 2nd anchor element, as expected.
> h.at('a[@title*=Time.now.strftime("%Y-%m-%d")]')
> ##=> *1st anchor element. Why is that??*

I'm not sure why it is returning {emptyelem <a>}, but I can tell you
why it's not returning the element you expect: you didn't use string
interpolation, so the call to Time.now.strftime(...) is never
evaluated and stays in the selector as literal text. This selects the
expected element:

h.at("a[@title*=#{Time.now.strftime('%Y-%m-%d')}]")
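A minimal sketch of the difference in plain Ruby, without Hpricot,
showing the two strings that h.at would receive. The variable names
literal and interpolated are only for illustration. Note that the
interpolated selector contains today's date, so it matches the sample
document only when that date equals the title value.

```ruby
# Plain Ruby only; no Hpricot needed to see which string reaches h.at.
# The names below are illustrative and not part of the original post.
literal      = 'a[@title*=Time.now.strftime("%Y-%m-%d")]'
interpolated = "a[@title*=#{Time.now.strftime('%Y-%m-%d')}]"

puts literal       # the method call is still just text inside the selector
puts interpolated  # the current date has been substituted in
```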

> h.at("a[@title*=#{t}]")
> ##=> 2nd anchor. works fine
> h.at('a[@title*="#{t}"]')
> ##=> *nil. Because of the single quote?*

Exactly, that's just Ruby's single- versus double-quoted string
behavior. With the same setup as you used:

irb(main):037:0> "#{t}"
=> "2009-04-06"
irb(main):038:0> '#{t}'
=> "\#{t}"

>
> ## And here comes another confusion:
>
> year = "2009"
> h.at("a[@title*=#{t}][text()*='2009']")
> ##=> 2nd anchor, as expected.
> h.at("a[@title*=#{t}][text()*=#{year}]")
> ##=> *nil. Why is that? Hpricot can't handle #{} more than once?*

Do you mean for these to pass different strings to h.at()? Look at the
strings you are using.

irb(main):048:0> puts [ "a[@title*=#{t}][text()*='2009']",
irb(main):049:1* "a[@title*=#{t}][text()*=#{year}]" ]
a[@title*=2009-04-06][text()*='2009']
a[@title*=2009-04-06][text()*=2009]

The only difference between the two is the quotes around 2009. So,
you are just getting unreliable results when you aren't using quotes
around the values you are searching for. This version works, where
the second one above did not:

h.at("a[@title*='#{t}'][text()*='#{year}']")

Note that I've put quotes on both values, though at least in this
example the title appears to work without them.
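One way to avoid forgetting the quotes is to build the selector
through a small helper. This is only a sketch: quoted is a
hypothetical helper, not something Hpricot provides, and it assumes
the values never contain a single quote.

```ruby
# Hypothetical helper (not part of Hpricot): wraps a value in single
# quotes for use inside an attribute or text() condition.
def quoted(value)
  s = value.to_s
  # Assumption: values with an embedded single quote are rejected
  # rather than escaped.
  raise ArgumentError, "value contains a single quote: #{s}" if s.include?("'")
  "'#{s}'"
end

t = "2009-04-06"
year = "2009"

selector = "a[@title*=#{quoted(t)}][text()*=#{quoted(year)}]"
puts selector  # a[@title*='2009-04-06'][text()*='2009']
```

The resulting string is the same one passed to h.at in the working
version above.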

Wang Jian

4/7/2009 10:06:00 AM

Great notes. Thanks a lot!

So the take-home message is: always use " on the very outside, and
write each value as (literally) '#{expression}' to ensure consistency.

It's kind of counter-intuitive at first glance, since normally #{} won't
work when placed between single quotes. But it works here, since the
outermost quotes are double. :)
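A short sketch of why this works, in plain Ruby: only the outermost
quotes decide whether #{} is interpolated, and quotes inside the
string are ordinary characters. The variable names outer_double and
outer_single are only for illustration.

```ruby
t = "2009-04-06"

# Outer double quotes: #{t} is interpolated; the inner ' are plain characters.
outer_double = "a[@title*='#{t}']"

# Outer single quotes: nothing is interpolated; the inner " are plain characters.
outer_single = 'a[@title*="#{t}"]'

puts outer_double  # a[@title*='2009-04-06']
puts outer_single  # a[@title*="#{t}"]
```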
