comp.lang.ruby

Re: Bug in CGI::unescapeHTML?

Yukihiro Matsumoto

7/5/2007 4:42:00 PM

Hi,

In message "Re: Bug in CGI::unescapeHTML?"
on Thu, 5 Jul 2007 13:00:02 +0900, Esad Hajdarevic <esad.talks@esse.at> writes:

|I think there's a bug in CGI::unescapeHTML. Or am I doing something wrong?
|
|$KCODE='u'
|CGI::unescapeHTML("&#xE3;")
|
|will return "\343", which according to my screaming mysql utf-8 encoded
|database is not a valid utf-8 sequence

Not a bug, unfortunately. Since your client sent a binary sequence
"\343" in URL encoding, unescapeHTML() decoded it back. Specifying
$KCODE='u' does not affect the encoding your clients send. You have to
check (or convert) input from your clients explicitly, anyway.

matz.
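
As an illustration of the "check (or convert) input explicitly" advice above, here is a minimal sketch. It assumes Ruby 1.9 or later (String#force_encoding and String#valid_encoding? do not exist in the Ruby 1.8 this thread is about), and it assumes that bytes which are not valid UTF-8 are ISO-8859-1. The helper name ensure_utf8 and the fallback choice are hypothetical, not something from the thread.

```ruby
# Hypothetical helper: return a UTF-8 string for arbitrary client input.
# Assumption: input that is not valid UTF-8 is treated as ISO-8859-1.
def ensure_utf8(raw, fallback_encoding = Encoding::ISO_8859_1)
  # First check whether the bytes are already valid UTF-8.
  utf8 = raw.dup.force_encoding(Encoding::UTF_8)
  return utf8 if utf8.valid_encoding?

  # Otherwise reinterpret the bytes in the fallback encoding and transcode.
  raw.dup.force_encoding(fallback_encoding).encode(Encoding::UTF_8)
end

single_byte = "\xE3".b              # the "\343" byte from the original post
puts ensure_utf8(single_byte).bytes.inspect  # => [195, 163]
```

The fallback encoding is a guess about the client; if the real encoding is known (for example from a charset header), it should be passed in instead.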

1 Answer

Lionel Bouton

7/6/2007 4:31:00 PM

Paul Battley wrote the following on 05.07.2007 23:10 :
>
> If I understand HTML correctly, it is pretty much a bug, although it's
> perhaps more of a reflection of Ruby's limited encoding support (which
> has already been well discussed on this list!).
>

I tend to agree. I just fixed a bug in one of my apps where I blindly
used CGI.unescapeHTML which, as the original poster mentioned,
generates output that isn't welcomed by a system configured to use UTF-8
all the way, especially the database (PostgreSQL in my case)...

Thanks for htmlentities, it saved my day.

Lionel
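
For readers who only need numeric character references such as "&#xE3;" turned into UTF-8, the idea can be sketched without any library. This is not the htmlentities API and not how CGI.unescapeHTML is implemented; decode_numeric_entities is a hypothetical name, named entities like "&amp;" are deliberately left alone, and the sketch assumes Ruby 1.9 or later with UTF-8 source strings.

```ruby
# Hypothetical helper: decode &#NNN; and &#xHHH; references to UTF-8.
# Named entities (&amp;, &lt;, ...) are not handled here.
def decode_numeric_entities(text)
  text.gsub(/&#(?:x([0-9a-fA-F]+)|([0-9]+));/) do
    original = Regexp.last_match(0)
    hex = Regexp.last_match(1)
    dec = Regexp.last_match(2)
    codepoint = hex ? hex.hex : dec.to_i

    # Leave references outside the Unicode range, or in the surrogate
    # range, untouched rather than emitting invalid UTF-8.
    if codepoint > 0x10FFFF || (0xD800..0xDFFF).cover?(codepoint)
      original
    else
      [codepoint].pack("U")   # pack("U") encodes a code point as UTF-8
    end
  end
end

puts decode_numeric_entities("&#xE3;").bytes.inspect  # => [195, 163]
```

The result is the two-byte UTF-8 sequence for U+00E3 rather than the single byte "\343", which is what a UTF-8 database expects.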