comp.lang.ruby

Converting escaped html to utf-8

Chris Worrall

7/25/2007 11:31:00 PM

Hi everyone,

I've looked around online for a solution, but I'm pretty new to Ruby
and programming in general, so I feel like I'm hitting a wall here.

I'm retrieving data from Hpricot that I'd like to store in UTF-8, but
I can't find a function to convert hex NCRs like:

&#xe1;

Surely somebody who has had to do this before could point me in
the right direction? Thanks!

2 Answers

Chris Worrall

7/26/2007 4:40:00 PM

Well, after some more googling, I found a solution, in case anyone is curious:

require 'cgi'
require 'iconv'

n = "&#xe1;"
n = CGI.unescapeHTML(n)
n = Iconv.conv("UTF-8", "ISO-8859-1", n)
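
For anyone reading this on a current Ruby: iconv was dropped from the standard library in Ruby 2.0, and as far as I can tell the ISO-8859-1 round trip above only helps for code points below 256. A minimal sketch, assuming Ruby 1.9 or later and a UTF-8 input string, in which case CGI.unescapeHTML decodes numeric references directly and no conversion step is needed:

```ruby
require 'cgi'

# Assumption: Ruby 1.9+ and a UTF-8 input string. The explicit encode
# guards against the literal being tagged US-ASCII in some setups.
escaped = "&#xe1; &#x100;".encode("UTF-8")

# Numeric character references are turned into the characters themselves.
decoded = CGI.unescapeHTML(escaped)

puts decoded
puts decoded.encoding
```

The result stays in the encoding of the input string, so data that is already UTF-8 can be stored as-is.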



Daniel DeLorme

7/26/2007 10:55:00 PM

Chris Worrall wrote:
> Well, after some more googling, I found a solution. If anyone was
> curious --
>
> require 'cgi'
> require 'iconv'
>
> n = "&#xe1;"
> n = CGI.unescapeHTML(n)
> n = Iconv.conv("UTF-8", "ISO-8859-1", n)

I'm surprised no one mentioned it, but you could use the htmlentities gem:

require "rubygems"
require "htmlentities"
puts HTMLEntities.decode_entities("&#x100; &#x108; &#x10E;")
=> Ā Ĉ Ď

Daniel
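
If installing a gem is not an option, numeric references on their own can also be decoded with plain Ruby. This is only a sketch: decode_ncrs is a hypothetical helper name, it handles decimal and hex NCRs but not named entities such as &amp;, and it does not validate the code point range.

```ruby
# decode_ncrs is a hypothetical helper, not part of any library.
# It decodes only numeric character references (decimal or hex);
# named entities such as &amp; are left untouched.
def decode_ncrs(text)
  text.gsub(/&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));/) do
    # $1 is set for the hex form, $2 for the decimal form.
    codepoint = $1 ? $1.hex : $2.to_i
    # pack("U") produces the UTF-8 encoding of the code point.
    [codepoint].pack("U")
  end
end

puts decode_ncrs("&#x100; &#x108; &#x10E; &#225;")
```

Unlike the ISO-8859-1 approach, this is not limited to code points below 256, since each reference is packed straight into UTF-8.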