comp.lang.ruby

Escaping single quotes in XPath query with REXML

Francis Hwang

10/21/2004 2:49:00 AM

Anybody tried to use XPath in REXML with a single quote, only to run
into the fact that quote escaping in XPath is apparently not accounted
for? If this were in the context of XSLT I'd be able to assign some
annoying temp variable like $apos, but it's not, so I can't.

irb(main):001:0> require 'rexml/document'
=> true
irb(main):002:0> include REXML
=> Object
irb(main):003:0> xml = "<rss version='2.0'><channel><item><title>John's Doe</title></item></channel></rss>"
=> "<rss version='2.0'><channel><item><title>John's Doe</title></item></channel></rss>"
irb(main):004:0> xmldoc = Document.new xml
=> <UNDEFINED> ... </>
irb(main):005:0> XPath.first( xmldoc, "/rss/channel/item/title" ).to_s
=> "<title>John's Doe</title>"
irb(main):006:0> XPath.first( xmldoc, "/rss/channel/item/title[text()='John's Doe']" ).to_s
NoMethodError: undefined method `node_type' for "John":String
from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:124:in `internal_parse'
from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:123:in `each'
from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:123:in `internal_parse'
from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:49:in `match'
from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:402:in `Predicate'
from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:346:in `Predicate'
from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:204:in `internal_parse'
from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:199:in `times'
from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:199:in `internal_parse'
from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:49:in `match'
from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:34:in `parse'
from /usr/local/lib/ruby/1.8/rexml/xpath.rb:28:in `first'
from (irb):6
irb(main):007:0> XPath.first( xmldoc, "/rss/channel/item/title[text()='John\'s Doe']" ).to_s
NoMethodError: undefined method `node_type' for "John":String
from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:124:in `internal_parse'
from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:123:in `each'
from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:123:in `internal_parse'
from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:49:in `match'
from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:402:in `Predicate'
from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:346:in `Predicate'
from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:204:in `internal_parse'
from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:199:in `times'
from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:199:in `internal_parse'
from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:49:in `match'
from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:34:in `parse'
from /usr/local/lib/ruby/1.8/rexml/xpath.rb:28:in `first'
from (irb):7



Brian Candler

10/21/2004 8:29:00 AM

> irb(main):006:0> XPath.first( xmldoc,
> "/rss/channel/item/title[text()='John's Doe']" ).to_s

I'm no expert in XPath, but that looks like a broken XPath query because of
the three single quotes.

> irb(main):007:0> XPath.first( xmldoc,
> "/rss/channel/item/title[text()='John\'s Doe']" ).to_s

That's identical, as you'll see if you try this:

irb(main):001:0> a="text()='John\'s Doe'"
=> "text()='John's Doe'"

You've not inserted a backslash into the string, you just escaped the quote,
and the escaping was removed. You need two backslashes to insert a single
backslash into the string:

irb(main):002:0> a="text()='John\\'s Doe'"
=> "text()='John\\'s Doe'"

(Despite how it looks, there is only a single backslash in there; irb shows
it as two because it prints the string as a double-quoted Ruby literal.)

irb(main):003:0> a.each_byte { |c| print c.chr," " }
t e x t ( ) = ' J o h n \ ' s D o e ' => "text()='John\\'s Doe'"

However, I've just had a quick scan through the XPath-1.0 spec, and I don't
think that's how you do it. You can include single quotes inside a
double-quoted string, and vice versa. But probably what you want for the
general case is XML character entities: &#39; or &apos;

Try passing your string through this before constructing your XPath query:

require 'rexml/text'
a = "John's Doe"
b = REXML::Text::normalize(a)
#=> "John&apos;s Doe"

HTH,

Brian.


Brian Candler

10/21/2004 8:36:00 AM

On Thu, Oct 21, 2004 at 09:28:51AM +0100, Brian Candler wrote:
> Try passing your string through this before constructing your XPath query:
>
> require 'rexml/text'
> a = "John's Doe"
> b = REXML::Text::normalize(a)
> #=> "John&apos;s Doe"

Hmm, that doesn't work.

irb(main):007:0> XPath.first( xmldoc, "/rss/channel/item/title[text()='John&apos;s Doe']" ).to_s
=> ""
irb(main):008:0> XPath.first( xmldoc, "/rss/channel/item/title[text()='John&#39;s Doe']" ).to_s
=> ""
irb(main):009:0> XPath.first( xmldoc, "/rss/channel/item/title[text()=\"John's Doe\"]" ).to_s
=> "<title>John's Doe</title>"

You might want to raise that with the REXML author. In the meantime, if you
know the string only contains single quotes, then you can surround it with
double quotes in the XPath query, as per the third line above.
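
For strings that might contain both kinds of quote, one common workaround is
to build the literal with the XPath 1.0 concat() function, since XPath 1.0
string literals have no escape character. A rough sketch follows; the helper
name xpath_literal is made up for illustration and is not part of REXML, and
whether a given REXML version evaluates concat() correctly inside a predicate
is an assumption worth checking on your install.

```ruby
# Hypothetical helper (not part of REXML): turn a Ruby string into an
# XPath 1.0 string literal. XPath 1.0 literals cannot escape their own
# delimiter, so the quoting style depends on what the string contains.
def xpath_literal(str)
  # No single quote: plain single-quoted literal.
  return "'#{str}'" unless str.include?("'")
  # Single quotes only: wrap in double quotes instead.
  return "\"#{str}\"" unless str.include?('"')
  # Both kinds present: single-quote each piece between the single quotes
  # and glue the pieces back together with a double-quoted "'" via concat().
  parts = str.split("'", -1).map { |part| "'#{part}'" }
  "concat(#{parts.join(%q{, "'", })})"
end

puts xpath_literal("John's Doe")
puts xpath_literal(%q{say "hi"})
puts xpath_literal(%q{John's "Doe"})
```

The result would be interpolated into the query, for example
"/rss/channel/item/title[text()=#{xpath_literal(name)}]", where name is
whatever string you are searching for.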

Regards,

Brian.