comp.lang.python

Re: How to Encode String of Raw UTF-8 into Unicode?

Gabriel Genellina

3/7/2008 12:53:00 AM

On Thu, 06 Mar 2008 22:43:58 -0200, Henry Chang <goldspin@gmail.com>
wrote:

> Suppose I start out with a raw string of utf-8 code points.

"utf-8 code points"???
That looks like a UTF-8 encoded string written out in hex.

> raw_string = "68656E727963"
>
> I can coerce it into proper unicode format by slicing out two
> characters at a time.
>
> unicode_string = u"\x68\x65\x6E\x72\x79\x63"
>
> >>> print unicode_string
> henryc
>
> My question: is there an existing function that can do this (without
> having to manually slice the raw text string)?

Two steps: first decode the hex digits into a byte string, and then decode
that UTF-8 byte string into unicode:

py> raw_string = "68656E727963"
py> raw_string.decode("hex")
'henryc'
py> raw_string.decode("hex").decode("utf8")
u'henryc'
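
The "hex" codec on str.decode() used above is a Python 2 feature; in
Python 3, str has no decode() method. A minimal sketch of the same two
steps that should also run on Python 3, using binascii.unhexlify from the
standard library (the helper name hex_utf8_to_text is made up for
illustration, it is not an existing function):

```python
import binascii


def hex_utf8_to_text(hex_string):
    """Turn a hex dump of UTF-8 bytes into a text (unicode) string.

    hex_utf8_to_text is a hypothetical helper name used for this sketch.
    """
    # Step 1: hex digits -> raw bytes, e.g. "6865" -> b"he"
    raw_bytes = binascii.unhexlify(hex_string)
    # Step 2: UTF-8 encoded bytes -> text
    return raw_bytes.decode("utf-8")


print(hex_utf8_to_text("68656E727963"))  # henryc
```

On Python 3 only, bytes.fromhex(raw_string).decode("utf-8") does the same
thing in one expression.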

--
Gabriel Genellina