comp.lang.ruby
XML, WebService and Character Encoding issue
Gonzalo Rubio
1/31/2006 7:18:00 PM
I created a Ruby proxy for a FoxPro app that needs to fetch data from a
WebService (which returns it in XML) and read it in CSV format (for
which I use the REXML parser and output the CSV by hand).
To do this, the WebService returns a Base64-encoded XML document that I
then decode and process.
Everything works OK until the XML data contains non-ASCII characters
(anything outside 7 bits, e.g. Western European accented characters),
at which point the REXML parser dies complaining about a closing tag
not found. I looked for an entity processor or a character encoding
converter in the standard library and I couldn't find one.
I ended up with an ugly hack: I fill a Hash with the accented character
as the key and the entity as the value, and then replace back and forth
in the returned data.
My function looks like this:
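# Replace accented characters with their entities, or entities with
# characters when inverse is true. Modifies str in place and returns it.
def iso2entities(str, inverse)
  rep = {}
  rep['á'] = '&aacute;'
  # ... snipped code ...
  rep['©'] = '&copy;'

  if inverse
    rep.each { |code, entity| str.gsub!(entity, code) }
  else
    rep.each { |code, entity| str.gsub!(code, entity) }
  end
  str
end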
It works, but filling the Hash by hand is time-consuming and the code
obviously looks like an ugly workaround... is there a "Ruby standard"
way to do it?
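One way the hand-built Hash could be avoided is to generate numeric character references from each character's code point. The following is only a sketch under assumptions the post does not confirm: that the decoded payload is ISO-8859-1, and that a Ruby with String#encode is available (1.9 or later, which postdates this thread). The method names non_ascii_to_entities and entities_to_chars are hypothetical.

```ruby
# Sketch only: escape every non-ASCII character as a numeric character
# reference such as &#233;, so no lookup table has to be typed by hand.
# ASSUMPTION: the input bytes are ISO-8859-1 unless told otherwise.
def non_ascii_to_entities(str, source_encoding = "ISO-8859-1")
  # Work on a copy, label the bytes with their assumed encoding,
  # then transcode to UTF-8 so each character has a Unicode code point.
  utf8 = str.dup.force_encoding(source_encoding).encode("UTF-8")
  utf8.gsub(/[^\x00-\x7F]/) { |ch| format("&#%d;", ch.ord) }
end

# Inverse direction: turn decimal character references back into
# UTF-8 characters. Named entities such as &amp; are left untouched.
def entities_to_chars(str)
  str.gsub(/&#(\d+);/) { $1.to_i.chr(Encoding::UTF_8) }
end
```

Unlike the Hash version, this returns a new string instead of modifying its argument, and it produces numeric references rather than named entities, which XML parsers accept without a DTD.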
--
Posted via
http://www.ruby-...
1 Answer
Russell R. Rutledge
2/25/2006 12:47:00 PM
Hey Gonzalo. I was having the same problem. I don't have a final
solution yet, but part of what worked for me was changing the character
encoding declared in my XML (you'll find it in the first line of the
XML file) from UTF-8 to ISO-8859-1. For some reason REXML can parse
characters that need more than 7 bits (like é) when that encoding is
declared. Hope that helps.
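The change described above could be applied in code to the decoded payload before it is handed to the parser. This is a sketch, assuming the service really sends ISO-8859-1 bytes under a declaration that says UTF-8, which would explain the parse failure but is not confirmed in the thread. The method name relabel_declaration is hypothetical.

```ruby
# Sketch only: rewrite the encoding named in the XML declaration so it
# matches the bytes actually present in the document.
# ASSUMPTION: the payload is ISO-8859-1 but labelled as something else.
def relabel_declaration(xml, actual = "ISO-8859-1")
  # Treat the data as raw bytes so invalid UTF-8 sequences cannot
  # raise errors during the substitution.
  raw = xml.dup.force_encoding(Encoding::BINARY)
  # Replace only the encoding value inside the leading <?xml ... ?>.
  raw.sub(/\A(<\?xml[^>]*?encoding=["'])[^"']+/) { "#{$1}#{actual}" }
end
```

Only the declaration is touched; the document body is passed through byte for byte. A document without an encoding declaration is returned unchanged.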
Russ
--
Posted via
http://www.ruby-...