comp.lang.python

problem with logging exceptions with non-ASCII __str__ result

Karsten Hilbert

1/14/2008 5:47:00 PM

Dear all,

I have a problem with logging an exception.

environment:

Python 2.4, Debian testing

${LANGUAGE} not set
${LC_ALL} not set
${LC_CTYPE} not set
${LANG}=de_DE.UTF-8

activating user-default locale with <locale.setlocale(locale.LC_ALL, '')> returns: [de_DE.UTF-8]

locale.getdefaultlocale() - default (user) locale: ('de_DE', 'utf-8')
encoding sanity check (also check "locale.nl_langinfo(CODESET)" below):
sys.getdefaultencoding(): [ascii]
locale.getpreferredencoding(): [UTF-8]
locale.getlocale()[1]: [utf-8]
sys.getfilesystemencoding(): [UTF-8]
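
For anyone wanting to compare their own setup, values like the ones above can be gathered with a few standard-library calls. This is only a sketch, not the script that produced the output above; the helper name `encoding_report` is made up for illustration.

```python
import locale
import sys


def encoding_report():
    """Collect the encoding-related settings listed above (hypothetical helper)."""
    return {
        "sys.getdefaultencoding()": sys.getdefaultencoding(),
        "locale.getpreferredencoding()": locale.getpreferredencoding(),
        "locale.getlocale()[1]": locale.getlocale()[1],
        "sys.getfilesystemencoding()": sys.getfilesystemencoding(),
    }


try:
    # Activate the user's default locale first, as in the post.
    locale.setlocale(locale.LC_ALL, "")
except locale.Error:
    # The environment names a locale this system does not provide.
    pass

for name, value in sorted(encoding_report().items()):
    print("%s: [%s]" % (name, value))
```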

    _logfile = codecs.open(filename = _logfile_name, mode = 'wb', encoding = 'utf8', errors = 'replace')

    logging.basicConfig (
        format = fmt,
        datefmt = '%Y-%m-%d %H:%M:%S',
        level = logging.DEBUG,
        stream = _logfile
    )

I am using psycopg2 which in turn uses libpq. When trying to
connect to the database and providing faulty authentication
information:

    try:
        ... try to connect ...
    except StandardError, e:
        _log.error(u"login attempt %s/%s failed:", attempt+1, max_attempts)

        print "exception type :", type(e)
        print "exception dir :", dir(e)
        print "exception args :", e.args
        msg = e.args[0]
        print "msg type :", type(msg)
        print "msg.decode(utf8):", msg.decode('utf8')
        t,v,tb = sys.exc_info()
        print "sys.exc_info() :", t, v
        _log.exception(u'exception detected')

the following output is generated:

exception type : <type 'instance'>
exception dir : ['__doc__', '__getitem__', '__init__', '__module__', '__str__', 'args']
exception args : ('FATAL: Passwort-Authentifizierung f\xc3\xbcr Benutzer \xc2\xbbany-doc\xc2\xab fehlgeschlagen\n',)
msg type : <type 'str'>
msg.decode(utf8): FATAL: Passwort-Authentifizierung für Benutzer »any-doc« fehlgeschlagen

sys.exc_info() : psycopg2.OperationalError FATAL: Passwort-Authentifizierung für Benutzer »any-doc« fehlgeschlagen

Traceback (most recent call last):
  File "/usr/lib/python2.4/logging/__init__.py", line 739, in emit
    self.stream.write(fs % msg.encode("UTF-8"))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 191: ordinal not in range(128)

Now, the string "FATAL: Passwort-Auth..." comes from libpq
via psycopg2. It is translated to German via gettext within
libpq (at the C level). As we can see, it is of type "str".
I know from the environment that it is most likely encoded
in UTF-8, and decoding it manually as UTF-8 (see the decode
call) succeeds.

On _log.exception() the logging module wants to write the
message encoded as UTF-8 (that is what the log file is set
up for). So it looks at the string, sees that it is of type
"str", and decodes it with the *Python default encoding* to
get to type "unicode". It then re-encodes the result as
UTF-8 to get back to type "str", ready for writing to the
log file.

However, since the Python default encoding is "ascii" that
conversion fails.
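
The failing step can be shown without logging or psycopg2 at all. The sketch below takes the bytes from the exception args printed above and decodes them once as ASCII and once as UTF-8. It only illustrates the decoding step described here, not the internals of the logging module, and it uses the `b'...'` literal of a current Python, which Python 2.4 does not have.

```python
# The bytes as they appear in e.args in the output above.
raw = b'FATAL: Passwort-Authentifizierung f\xc3\xbcr Benutzer \xc2\xbbany-doc\xc2\xab fehlgeschlagen\n'

try:
    # Roughly what an implicit conversion with the default "ascii" encoding does.
    raw.decode('ascii')
    ascii_ok = True
except UnicodeDecodeError:
    # 0xc3 is outside the ASCII range, so the decode is rejected.
    ascii_ok = False

# Decoding explicitly as UTF-8 succeeds, because the bytes really are UTF-8.
text = raw.decode('utf8')
```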

Changing the Python default encoding isn't really an option,
as it is advised against and would have to be made to work
reliably on other users' machines.

One could, of course, write code to specifically check for
this condition and manually pre-convert the message string
to unicode, but that does not seem to be how things should
work.
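
Such a pre-conversion could look roughly like the sketch below. The helper name `to_text` and the choice of `locale.getpreferredencoding()` as the fallback encoding are assumptions made for illustration; nothing in psycopg2 or the logging module provides this. It is written for a current Python where byte strings are `bytes`; on Python 2.4 the check would be against `str` and the result would be `unicode`.

```python
import locale


def to_text(value, encoding=None):
    """Return value as text, decoding byte strings explicitly (hypothetical helper)."""
    if isinstance(value, bytes):
        # Assumption: the bytes are in the locale's preferred encoding
        # unless the caller names another one.
        encoding = encoding or locale.getpreferredencoding()
        # errors='replace' keeps the conversion from raising on undecodable bytes.
        return value.decode(encoding, 'replace')
    return value


# Usage with part of the message bytes from the post, naming the encoding explicitly:
msg = to_text(b'f\xc3\xbcr Benutzer \xc2\xbbany-doc\xc2\xab', 'utf8')
```

In the except block the converted message could then be handed to `_log.error()` as text, so that no implicit conversion with the default encoding is needed for it.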

How can I handle this situation cleanly?

Should the logging module internally use an encoding obtained
from the locale module rather than the default string encoding?

Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
1 Answer

Vinay Sajip

1/14/2008 9:32:00 PM

Please reduce this to a minimal program that demonstrates the
issue, and file a report on bugs.python.org.

Best regards,

Vinay Sajip