strftime return value encoding (mbcs, locale, etc.)

Giovanni Bajo

1/27/2008 8:16:00 PM

Hello,

I am trying to find a good way to portably get the output of strftime()
and put it onto a dialog (I'm using PyQt, but it doesn't really matter).
The problem is that I need to decode the byte stream returned by
strftime() into Unicode.

This old bug:
http://mail.python.org/pipermail/python-bugs-...
November/020983.html

(last comment) mentions that it is "byte string in the locale's encoding".

The comment also suggests using "mbcs" on Windows (which, AFAIK, is a
sort of "alias" encoding for the current Windows code page), and
finding out the exact encoding using locale.getpreferredencoding().

Thus, I was hoping that something like:

strftime("%#c", localtime()).decode(locale.getpreferredencoding())

would work... but alas, this exception was reported to me:

LookupError: unknown encoding: cp932

So: what is the correct code to achieve this? Will something like this
work:

import locale
import os
from time import localtime, strftime

data = strftime("%#c", localtime())
if os.name == "nt":
    data = data.decode("mbcs")
else:
    data = data.decode(locale.getpreferredencoding())

Is this the correct way of doing it? (Yes, it sucks).

Shouldn't Python automatically alias whatever is returned by
locale.getpreferredencoding() to "mbcs", so that my original code works
portably?

Thanks in advance!
--
Giovanni Bajo

2 Answers

Mark Tolonen

1/27/2008 8:58:00 PM

Odd, what version of Python are you using? Python 2.5 works:

>>> import time, locale
>>> time.strftime('%#c').decode(locale.getpreferredencoding())  # cp1252 on my system
u'Sunday, January 27, 2008 12:56:30'
>>> time.strftime('%#c').decode('cp932')
u'Sunday, January 27, 2008 12:56:40'
>>> time.strftime('%#c').decode('mbcs')
u'Sunday, January 27, 2008 12:56:48'
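
If the LookupError shows up on a particular interpreter, one way to confirm whether that interpreter can find the codec at all is a check along these lines. This is only a sketch: codec_available is a hypothetical helper name, not something from the standard library, and it says nothing about why a codec might be missing.

```python
import codecs
import sys


def codec_available(name):
    """Return True if this interpreter can look up a codec called `name`."""
    try:
        codecs.lookup(name)
    except LookupError:
        return False
    return True


# Print the interpreter version next to the result, since codec
# availability depends on the Python build being used.
print(sys.version_info[:2], codec_available("cp932"))
```

Running this under the Python that raised the exception should show whether cp932 is really unknown there.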

--Mark

Martin v. Loewis

1/27/2008 10:25:00 PM

> LookupError: unknown encoding: cp932

What Python version are you using? cp932 is supported cross-platform
since Python 2.4.

> So: what is the correct code to achieve this? Will something like this
> work:
>
> data = strftime("%#c", localtime())
> if os.name == "nt":
>     data = data.decode("mbcs")
> else:
>     data = data.decode(locale.getpreferredencoding())
>
> Is this the correct way of doing it?

Not necessarily. On some systems, and in some locales, Python will not
have any codec that converts the locale's encoding to Unicode.

In such a case, using ASCII with replacement characters might be the
best bet, as long as the locale's charset is an ASCII superset (i.e.
you don't work on an EBCDIC machine).
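
A minimal sketch of that fallback, assuming the input is a byte string (as strftime() returns in Python 2; in Python 3 it already returns text, so no decoding is needed there). The name decode_locale_bytes and the choice to pass "replace" in both branches are assumptions made for illustration.

```python
import codecs
import locale


def decode_locale_bytes(data, encoding=None):
    """Decode bytes in the locale's encoding, falling back to ASCII.

    If Python has no codec for `encoding`, decode as ASCII and replace
    every non-ASCII byte with U+FFFD instead of raising.
    """
    if encoding is None:
        encoding = locale.getpreferredencoding()
    try:
        codecs.lookup(encoding)
    except LookupError:
        # No codec for the locale's charset: assume an ASCII superset.
        return data.decode("ascii", "replace")
    # Codec exists; "replace" keeps a stray bad byte from raising.
    return data.decode(encoding, "replace")
```

For a dialog label, losing a few characters to U+FFFD is usually preferable to an exception, which is why the sketch never decodes strictly; code that must round-trip the data would want strict decoding instead.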

> Shouldn't Python automatically alias whatever is returned by
> locale.getpreferredencoding() to "mbcs", so that my original code works
> portably?

No. The "mbcs" codec has slightly different semantics from the cp932
codec on your system. Specifically, the "mbcs" codec might map
characters as approximations, whereas the cp932 codec will give errors
if a certain Unicode character is not supported in the target character
set.
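
The strict half of that difference can be seen on any platform with a small sketch like the one below. The sample character is an arbitrary choice of something outside cp932, and the "mbcs" side is deliberately not exercised because that codec exists only on Windows and its result depends on the active code page and the Python version.

```python
# ARABIC LETTER ALEF, chosen as a character with no mapping in cp932.
text = u"\u0627"

try:
    text.encode("cp932")  # strict error handling is the default
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# An explicit error handler makes the lossy conversion visible in the code.
lossy = text.encode("cp932", "replace")

print(strict_failed, lossy)
```

With cp932 the loss of information has to be requested explicitly through the error handler, whereas an approximating codec can lose it silently.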

Regards,
Martin

comp.lang.python
