Fredrik Lundh
1/9/2008 11:34:00 AM
Andrew Savige wrote:
> Here's my first attempt:
>
> names = "freddy fred bill jock kevin andrew kevin kevin jock"
> freq = {}
> for name in names.split():
>     freq[name] = 1 + freq.get(name, 0)
> deco = zip([-x for x in freq.values()], freq.keys())
> deco.sort()
> for v, k in deco:
>     print "%-10s: %d" % (k, -v)
>
> I'm interested to learn how more experienced Python folks would solve
> this little problem. Though I've read about the DSU Python sorting idiom,
> I'm not sure I've strictly applied it above ... and the -x hack above to
> achieve a descending sort feels a bit odd to me, though I couldn't think
> of a better way to do it.
sort takes a reverse flag in recent versions, so you can do a reverse
sort as:

    deco.sort(reverse=True)

in older versions, just do:

    deco.sort()
    deco.reverse() # this is fast!
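
a minimal sketch of the two spellings side by side, assuming a small hand-made list of (count, name) pairs (the sample data is made up for illustration). for items that are all distinct, as here, both give the same order; with a "key" function, items that compare equal may come out in a different relative order.

```python
# hypothetical sample data: (count, name) pairs with positive counts
deco = [(1, "fred"), (3, "kevin"), (2, "jock")]

# recent versions: a single call with the reverse flag
newer = list(deco)
newer.sort(reverse=True)

# older versions: sort ascending, then reverse the list in place
older = list(deco)
older.sort()
older.reverse()

# both lists now run from the highest count to the lowest
print(newer == older)
```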
also note that recent versions provide a "sorted" function that
returns a new sorted list, and both "sort" and "sorted" now allow you to
pass in a "key" function that's used to generate a sort key for each
item. taking that into account, you can simply write:

    # sort items on descending count
    deco = sorted(freq.items(), key=lambda x: -x[1])

simplifying the print statement is left as an exercise.
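
one possible way to put the pieces together, as a sketch rather than the definitive answer. two things here are assumptions that are not in the one-liner above: the name is added to the sort key as a tie-breaker so that equal counts come out alphabetically, and print is written as a function call so the sketch runs on current interpreters (the thread itself uses the older print statement).

```python
names = "freddy fred bill jock kevin andrew kevin kevin jock"

# count how often each name occurs
freq = {}
for name in names.split():
    freq[name] = 1 + freq.get(name, 0)

# sort (name, count) pairs on descending count; the name is an added
# tie-breaker (an assumption, not part of the original one-liner)
ranked = sorted(freq.items(), key=lambda item: (-item[1], item[0]))

# no negation to undo when printing, since the counts were never negated
for name, count in ranked:
    print("%-10s: %d" % (name, count))
```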
> I also have a few specific questions. Instead of:
>
> for name in names.split():
>     freq[name] = 1 + freq.get(name, 0)
>
> I might try:
>
> for name in names.split():
>     try:
>         freq[name] += 1
>     except KeyError:
>         freq[name] = 1
>
> Which is preferred?
for simple scripts and small datasets, always the former.
for performance-critical production code, it depends on how often you
expect "name" to be present in the dictionary (setting up a try/except
is cheap, but raising and catching an exception is relatively costly).
> Ditto for:
>
> deco = zip([-x for x in freq.values()], freq.keys())
>
> versus:
>
> deco = zip(map(operator.neg, freq.values()), freq.keys())
using zip/keys/values to emulate items is a bit questionable. if you
need to restructure the contents of a dictionary, I usually prefer items
(or iteritems, where suitable) and tuple indexing/unpacking in a list
comprehension (or generator expression, where suitable).
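
a small sketch of the difference, assuming a hand-made "freq" dictionary as sample data. it uses items rather than iteritems so that it also runs on interpreters where iteritems is not available.

```python
# hypothetical sample data
freq = {"kevin": 3, "jock": 2, "fred": 1}

# zip/keys/values version from the question
deco_zip = list(zip([-x for x in freq.values()], freq.keys()))

# items plus tuple unpacking in a list comprehension
deco_items = [(-count, name) for name, count in freq.items()]

# generator expression passed straight to sorted, no intermediate list
deco_sorted = sorted((-count, name) for name, count in freq.items())

print(deco_zip == deco_items)
print(deco_sorted)
```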
> Finally, I might replace:
>
> for v, k in deco:
>     print "%-10s: %d" % (k, -v)
>
> with:
>
> print "\n".join("%-10s: %d" % (k, -v) for v, k in deco)
why?
</F>