Vinay Sajip
1/14/2008 9:32:00 PM
On Jan 14, 5:46 pm, Karsten Hilbert <Karsten.Hilb...@gmx.net> wrote:
> Dear all,
>
> I have a problem with logging an exception.
>
> environment:
>
> Python 2.4, Debian testing
>
> ${LANGUAGE} not set
> ${LC_ALL} not set
> ${LC_CTYPE} not set
> ${LANG}=de_DE.UTF-8
>
> activating user-default locale with <locale.setlocale(locale.LC_ALL, '')> returns: [de_DE.UTF-8]
>
> locale.getdefaultlocale() - default (user) locale: ('de_DE', 'utf-8')
> encoding sanity check (also check "locale.nl_langinfo(CODESET)" below):
> sys.getdefaultencoding(): [ascii]
> locale.getpreferredencoding(): [UTF-8]
> locale.getlocale()[1]: [utf-8]
> sys.getfilesystemencoding(): [UTF-8]
>
> _logfile = codecs.open(filename = _logfile_name, mode = 'wb', encoding = 'utf8', errors = 'replace')
>
> logging.basicConfig (
>     format = fmt,
>     datefmt = '%Y-%m-%d %H:%M:%S',
>     level = logging.DEBUG,
>     stream = _logfile
> )
>
> I am using psycopg2 which in turn uses libpq. When trying to
> connect to the database and providing faulty authentication
> information:
>
> try:
>     ... try to connect ...
> except StandardError, e:
>     _log.error(u"login attempt %s/%s failed:", attempt+1, max_attempts)
>
>     print "exception type :", type(e)
>     print "exception dir :", dir(e)
>     print "exception args :", e.args
>     msg = e.args[0]
>     print "msg type :", type(msg)
>     print "msg.decode(utf8):", msg.decode('utf8')
>     t,v,tb = sys.exc_info()
>     print "sys.exc_info() :", t, v
>     _log.exception(u'exception detected')
>
> the following output is generated:
>
> exception type : <type 'instance'>
> exception dir : ['__doc__', '__getitem__', '__init__', '__module__', '__str__', 'args']
> exception args : ('FATAL: Passwort-Authentifizierung f\xc3\xbcr Benutzer \xc2\xbbany-doc\xc2\xab fehlgeschlagen\n',)
> msg type : <type 'str'>
> msg.decode(utf8): FATAL: Passwort-Authentifizierung für Benutzer »any-doc« fehlgeschlagen
>
> sys.exc_info() : psycopg2.OperationalError FATAL: Passwort-Authentifizierung für Benutzer »any-doc« fehlgeschlagen
>
> Traceback (most recent call last):
>   File "/usr/lib/python2.4/logging/__init__.py", line 739, in emit
>     self.stream.write(fs % msg.encode("UTF-8"))
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 191: ordinal not in range(128)
>
> Now, the string "FATAL: Passwort-Auth..." comes from libpq
> via psycopg2. It is translated to German via gettext within
> libpq (at the C level). As we can see it is of type string.
> I know from the environment that it is likely encoded in
> utf8; applying that decoding manually (see the decode call) succeeds.
>
> On _log.exception() the logging module wants to output the
> message encoded as utf8 (that's what the log file is set
> up as). So it'll look at the string, decide it is of type
> "str" and decode with the *Python default encoding* to get
> to type "unicode". Following which it'll re-encode with utf8
> to get back to type "str" ready for outputting to the log
> file.
>
> However, since the Python default encoding is "ascii" that
> conversion fails.
>
> Changing the Python default encoding isn't really an option
> as it is advocated against and would have to be made to work
> reliably on other users' machines.
>
> One could, of course, write code to specifically check for
> this condition and manually pre-convert the message string
> to unicode, but that does not seem to be how things should work.
>
> How can I cleanly handle this situation?
>
> Should the logging module internally use an encoding obtained
> from the locale module rather than the default string encoding?
>
> Karsten
> --
> GPG key ID E4071346 @ wwwkeys.pgp.net
> E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
Please reduce to a minimal program which demonstrates the issue and
log an issue on bugs.python.org.
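
For what it's worth, a minimal program might look roughly like the sketch
below. It is only an illustration under some assumptions: it is written for
Python 3 (the report above concerns Python 2.4), so the implicit ASCII decode
that Python 2 performs when a byte string meets unicode is spelled out
explicitly. The helper names (implicit_ascii_decode_fails, to_text,
log_message) and the logger name 'encoding_demo' are hypothetical; the byte
string is copied from the exception args quoted above.

```python
import io
import logging

# Bytes as shown in "exception args" in the quoted post (UTF-8 encoded German).
RAW_MSG = (b'FATAL: Passwort-Authentifizierung f\xc3\xbcr Benutzer '
           b'\xc2\xbbany-doc\xc2\xab fehlgeschlagen\n')


def implicit_ascii_decode_fails(raw):
    """Mimic the ASCII decode Python 2 applies implicitly to byte strings."""
    try:
        raw.decode('ascii')
    except UnicodeDecodeError:
        return True
    return False


def to_text(value, encoding='utf-8'):
    """Hypothetical helper: decode byte strings explicitly before logging."""
    if isinstance(value, bytes):
        # 'replace' keeps logging alive even if the assumed encoding is wrong.
        return value.decode(encoding, 'replace')
    return value


def log_message(raw, encoding='utf-8'):
    """Log the pre-converted message to an in-memory stream; return the text."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    logger = logging.getLogger('encoding_demo')  # hypothetical logger name
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    logger.addHandler(handler)
    try:
        logger.error('login attempt failed: %s', to_text(raw, encoding))
    finally:
        logger.removeHandler(handler)
    return stream.getvalue()


if __name__ == '__main__':
    print('ASCII decode fails:', implicit_ascii_decode_fails(RAW_MSG))
    print(log_message(RAW_MSG))
```

Decoding with an explicit encoding before handing the message to the logger
sidesteps the default-encoding coercion. Which encoding to pass (for example
the one reported by locale.getpreferredencoding()) depends on the environment.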
Best regards,
Vinay Sajip