comp.lang.python

Re: Learning Python via a little word frequency program

Fredrik Lundh

1/9/2008 11:34:00 AM

Andrew Savige wrote:


> Here's my first attempt:
>
> names = "freddy fred bill jock kevin andrew kevin kevin jock"
> freq = {}
> for name in names.split():
>     freq[name] = 1 + freq.get(name, 0)
> deco = zip([-x for x in freq.values()], freq.keys())
> deco.sort()
> for v, k in deco:
>     print "%-10s: %d" % (k, -v)
>
> I'm interested to learn how more experienced Python folks would solve
> this little problem. Though I've read about the DSU Python sorting idiom,
> I'm not sure I've strictly applied it above ... and the -x hack above to
> achieve a descending sort feels a bit odd to me, though I couldn't think
> of a better way to do it.

sort takes a reverse flag in recent versions, so you can do a reverse
sort as:

deco.sort(reverse=True)

in older versions, just do:

deco.sort()
deco.reverse() # this is fast!
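
As a quick check that the two spellings above agree, here is a minimal sketch. It is written for a current Python 3 interpreter (the thread itself uses Python 2 syntax), and the sample list is made up for illustration: it holds plain (count, name) pairs rather than the negated counts from the original post.

```python
# Hypothetical sample data: (count, name) pairs, not taken from the thread.
deco = [(1, "fred"), (3, "kevin"), (2, "jock"), (1, "bill")]

# Newer spelling: ask sort() for descending order directly.
newer = list(deco)
newer.sort(reverse=True)

# Older spelling: sort ascending, then reverse the list in place.
older = list(deco)
older.sort()
older.reverse()

print(newer)
print(older)
```

The results match here because no two tuples compare equal. When items do tie (for example under a key function), reverse=True keeps tied items in their original order, while sort-then-reverse flips them.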

also note that recent versions provide a "sorted" function that
returns a new sorted list, and both "sort" and "sorted" now allow you to
pass in a "key" function that's used to generate a sort key for each
item. taking that into account, you can simply write:

# sort items on descending count
deco = sorted(freq.items(), key=lambda x: -x[1])

simplifying the print statement is left as an exercise.
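
One possible answer to that exercise, as a minimal sketch of the whole program: it assumes Python 3 (so print is a function, unlike in the quoted Python 2 code), and the "lines" variable is introduced here only for illustration.

```python
names = "freddy fred bill jock kevin andrew kevin kevin jock"

# Count occurrences of each name.
freq = {}
for name in names.split():
    freq[name] = 1 + freq.get(name, 0)

# Sort (name, count) pairs on descending count; no negated copies needed.
deco = sorted(freq.items(), key=lambda x: -x[1])

# With (name, count) pairs, the count can be formatted without un-negating it.
lines = ["%-10s: %d" % (k, v) for k, v in deco]
for line in lines:
    print(line)
```

Note that names with equal counts now keep the dictionary's iteration order instead of being sorted alphabetically. If the alphabetical tie-break of the original tuple sort matters, a key such as lambda x: (-x[1], x[0]) would restore it.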

> I also have a few specific questions. Instead of:
>
> for name in names.split():
>     freq[name] = 1 + freq.get(name, 0)
>
> I might try:
>
> for name in names.split():
>     try:
>         freq[name] += 1
>     except KeyError:
>         freq[name] = 1
>
> Which is preferred?

for simple scripts and small datasets, always the former.

for performance-critical production code, it depends on how often you
expect "name" to be present in the dictionary (setting up a try/except
is cheap, but raising and catching an exception is relatively costly).
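
To make the trade-off concrete, here is a small sketch of both counting styles side by side, in Python 3. The function names count_with_get and count_with_try are made up for this illustration. The try/except version takes the costly exception path once per distinct name, so it tends to look better when most names repeat; actual timings depend on the data and the interpreter, so measure (for instance with the timeit module) before choosing.

```python
def count_with_get(words):
    """Count words using dict.get with a default of 0."""
    freq = {}
    for word in words:
        freq[word] = 1 + freq.get(word, 0)
    return freq


def count_with_try(words):
    """Count words, handling the first occurrence via KeyError."""
    freq = {}
    for word in words:
        try:
            freq[word] += 1      # succeeds when the word was seen before
        except KeyError:
            freq[word] = 1       # raised only on the first occurrence
    return freq


names = "freddy fred bill jock kevin andrew kevin kevin jock".split()
print(count_with_get(names) == count_with_try(names))
```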

> Ditto for:
>
> deco = zip([-x for x in freq.values()], freq.keys())
>
> versus:
>
> deco = zip(map(operator.neg, freq.values()), freq.keys())

using zip/keys/values to emulate items is a bit questionable. if you
need to restructure the contents of a dictionary, I usually prefer items
(or iteritems, where suitable) and tuple indexing/unpacking in a list
comprehension (or generator expression, where suitable).
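
As an illustration of that preference, the decorated list from the original post can be built from items() with tuple unpacking in a list comprehension. This is a sketch in Python 3, where items() returns a view and iteritems no longer exists; the small freq dictionary is sample data, not the thread's full input.

```python
# Sample data standing in for the word counts built earlier in the thread.
freq = {"kevin": 3, "jock": 2, "fred": 1}

# Unpack each (name, count) pair and emit (-count, name) for sorting.
deco = [(-count, name) for name, count in freq.items()]
deco.sort()

for neg_count, name in deco:
    print("%-10s: %d" % (name, -neg_count))
```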

> Finally, I might replace:
>
> for v, k in deco:
>     print "%-10s: %d" % (k, -v)
>
> with:
>
> print "\n".join("%-10s: %d" % (k, -v) for v, k in deco)

why?

</F>