comp.lang.python

Re: Programmatically discovering encoding types supported by codecs module

python

3/28/2010 10:49:00 AM

Gabriel,

Thank you for your analysis - very interesting. Enjoyed your fromlist
choice of names. I'm still in my honeymoon phase with Python so I only
know the first part :)

Regards,
Malcolm


----- Original message -----
From: "Gabriel Genellina" <gagsl-py2@yahoo.com.ar>
To: python-list@python.org
Date: Wed, 24 Mar 2010 19:50:11 -0300
Subject: Re: Programmatically discovering encoding types supported by
codecs module

En Wed, 24 Mar 2010 14:58:47 -0300, <python@bdurham.com> escribió:

>> After looking at how things are done in codecs.c and
>> encodings/__init__.py I think you should enumerate all modules in the
>> encodings package that define a getregentry function. Aliases come from
>> encodings.aliases.aliases.
>
> Thanks for looking into this for me. Benjamin Kaplan made a similar
> observation. My reply to him included the snippet of code we're using to
> generate the actual list of encodings that our software will support
> (thanks to Python's codecs and encodings modules).

I was curious as to whether both methods would give the same results:

py> import glob, os, encodings, encodings.aliases
py> modules = set()
py> for name in glob.glob(os.path.join(encodings.__path__[0], "*.py")):
...     name = os.path.basename(name)[:-3]   # strip the ".py" suffix
...     try:
...         # a non-empty fromlist makes __import__ return the submodule
...         mod = __import__("encodings." + name,
...                          fromlist=['ilovepythonbutsometimesihateit'])
...     except ImportError:
...         continue
...     if hasattr(mod, 'getregentry'):
...         modules.add(name)
...
py> fromalias = set(encodings.aliases.aliases.values())
py> fromalias - modules
set(['tactis'])
py> modules - fromalias
set(['charmap',
'cp1006',
'cp737',
'cp856',
'cp874',
'cp875',
'idna',
'iso8859_1',
'koi8_u',
'mac_arabic',
'mac_centeuro',
'mac_croatian',
'mac_farsi',
'mac_romanian',
'palmos',
'punycode',
'raw_unicode_escape',
'string_escape',
'undefined',
'unicode_escape',
'unicode_internal',
'utf_8_sig'])

The alias table points to a 'tactis' encoding that has no module (?), and
about twenty codec modules have no alias at all.
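
For anyone repeating the comparison above on Python 3, here is a rough sketch of the same idea as a script. It is not the code from the session: it assumes Python 3.6 or later, swaps glob and __import__ for pkgutil.iter_modules and importlib.import_module, and the helper names codec_modules and resolves are made up for illustration. The resulting sets will differ from the 2.x output listed above, since the codecs shipped with Python change between versions.

```python
import codecs
import importlib
import pkgutil

import encodings
from encodings.aliases import aliases


def codec_modules():
    """Return names of encodings submodules that define getregentry()."""
    found = set()
    for info in pkgutil.iter_modules(encodings.__path__):
        try:
            mod = importlib.import_module("encodings." + info.name)
        except ImportError:
            # Some codecs are platform specific (e.g. mbcs outside Windows)
            # and cannot be imported everywhere; skip them.
            continue
        if hasattr(mod, "getregentry"):
            found.add(info.name)
    return found


def resolves(name):
    """Return True if codecs.lookup() can find a codec under this name."""
    try:
        codecs.lookup(name)
    except LookupError:
        return False
    return True


modules = codec_modules()
from_alias = set(aliases.values())

# Same two set differences as in the interactive session.
alias_without_module = sorted(from_alias - modules)
module_without_alias = sorted(modules - from_alias)

print("alias targets with no module:", alias_without_module)
print("modules with no alias:", module_without_alias)

# Alias targets that the codec registry cannot resolve on this interpreter.
print("unresolvable alias targets:",
      [name for name in alias_without_module if not resolves(name)])
```

The last line is an extra check that the session did not do: an alias target with no matching module is only a real problem if codecs.lookup cannot find it either.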

--
Gabriel Genellina
