Asp Forum - Re: Short confusing example with unicode, print, and __str__

Gerard Brunick

3/6/2008 7:45:00 AM

Gary Herron wrote:
> Gerard Brunick wrote:
>> I really don't understand the following behavior:
>>
>> >>> class C(object):
>> ...     def __init__(self, s): self.s = s
>> ...     def __str__(self): return self.s
>> ...
>> >>> cafe = unicode("Caf\xe9", "Latin-1")
>> >>> c = C(cafe)
>> >>> print "Print using c.s:", c.s
>> Print using c.s: Café
>> >>> print "Print using just c:", c
>> Print using just c: Traceback (most recent call last):
>>   File "<stdin>", line 1, in <module>
>> UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in
>> position 3: ordinal not in range(128)
>> >>> str(c)
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in <module>
>> UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in
>> position 3: ordinal not in range(128)
>>
>> Why would "print c.s" work but the other two cases throw an exception?
>> Any help understanding this would be greatly appreciated.
>>
>> Thanks in advance,
>> Gerard
>>
> It's the difference between how __str__ and __repr__ act on strings.
>
> Here's a simpler example:
>
> >>> d=unicode("Caf\xe9", "Latin-1")
> >>> repr(d)
> "u'Caf\\xe9'"
> >>> str(d)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in
> position 3: ordinal not in range(128)
>
> Gary Herron
It seems the question is more about what print does. Let's extend
your example:

>>> d=unicode("Caf\xe9", "Latin-1")
>>> repr(d)
"u'Caf\\xe9'"
>>> print d
Café
>>> str(d)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in
position 3: ordinal not in range(128)

Why doesn't the print statement raise a UnicodeEncodeError? I assumed
that print calls str and then prints the result, but this doesn't seem
to be the case. What the heck does print do?

1 Answer

Peter Otten

3/6/2008 8:14:00 AM

Gerard Brunick wrote:

> It seems the question is more about what print does. Let's extend
> your example:
>
> >>> d=unicode("Caf\xe9", "Latin-1")
> >>> repr(d)
> "u'Caf\\xe9'"
> >>> print d
> Café
> >>> str(d)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in
> position 3: ordinal not in range(128)
>
> Why doesn't the print statement raise a UnicodeEncodeError? I assumed
> that print calls str and then prints the result, but this doesn't seem
> to be the case. What the heck does print do?

Something like

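import sys

d = ...
if type(d) is unicode:
    # unicode objects are encoded with the stream's own encoding
    sys.stdout.write(d.encode(sys.stdout.encoding))
else:
    # everything else goes through str()
    sys.stdout.write(str(d))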
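
The two branches differ only in which codec does the encoding, and that can be checked without a terminal. A minimal sketch, assuming "utf-8" as a stand-in for sys.stdout.encoding (the real value depends on the terminal) and assuming the interpreter's default encoding is "ascii", as in a stock Python 2. It is written to run unchanged on Python 2 and Python 3:

```python
# Sketch only: "utf-8" is an assumed stand-in for sys.stdout.encoding.
cafe = u"Caf\xe9"

# Roughly what print does with a unicode object:
# encode it with the output stream's encoding.
encoded = cafe.encode("utf-8")

# Roughly what str() does with a unicode object in Python 2:
# encode it with the default "ascii" codec, which has no e-acute.
ascii_failed = False
failure_position = None
try:
    cafe.encode("ascii")
except UnicodeEncodeError as exc:
    ascii_failed = True
    failure_position = exc.start  # index of the offending character
```

The failure position matches the "position 3" reported in the tracebacks in this thread.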

Unfortunately you can't make that work smoothly with arbitrary objects as
you have to throw in an explicit conversion to unicode:

>>> class C(object):
...     def __unicode__(self): return u"Caf\xe9"
...
>>> print C()
<__main__.C object at 0x2b1da33e0bd0>
>>> print unicode(C())
Café

Or, even less intuitive:

>>> class D(object):
...     def __str__(self): return u"Caf\xe9"
...
>>> print unicode(D())
Café
>>> print D()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position
3: ordinal not in range(128)
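
One way to avoid depending on print's special case for unicode objects is to do both the conversion and the encoding explicitly. The sketch below uses a hypothetical helper, encode_for_stream, which is not part of Python or of this thread. It prefers __unicode__ when the object defines it and otherwise falls back to %-formatting into a unicode string. It is meant to run on both Python 2 and Python 3:

```python
class C(object):
    def __unicode__(self):
        return u"Caf\xe9"


class D(object):
    def __str__(self):
        return u"Caf\xe9"


def encode_for_stream(obj, encoding):
    """Hypothetical helper: return obj's text encoded for a stream."""
    if hasattr(obj, "__unicode__"):
        # Explicit conversion to unicode, as in print unicode(C()).
        text = obj.__unicode__()
    else:
        # Formatting into a unicode string avoids the implicit
        # ASCII encoding step that str() performs in Python 2.
        text = u"%s" % (obj,)
    return text.encode(encoding)


c_bytes = encode_for_stream(C(), "utf-8")
d_bytes = encode_for_stream(D(), "latin-1")
```

On Python 2 the resulting byte string can be passed to sys.stdout.write; on Python 3 it would go to sys.stdout.buffer.write instead.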

Peter

comp.lang.python
