[lnkForumImage]
TotalShareware - Download Free Software

Confronta i prezzi di migliaia di prodotti.
Asp Forum
 Home | Login | Register | Search 


 

Forums >

comp.lang.python

why not bisect options?

Robert Bossy

2/29/2008 2:32:00 PM

Hi all,

I thought it would be useful if insort and consorts* could accept the
same options than list.sort, especially key and cmp.

The only catch I can think of is that nothing prevents a crazy developer
to insort elements using different options to the same list. I foresee
two courses of actions:
1) let the developer be responsible for the homogeneity of successive
insort calls on the same list (remember that the developer is already
responsible for giving a sorted list), or
2) make bisect a class which keeps the key and cmp options internally
and always use them for comparison, something like:


class Bisect:
def __init__(self, lst = [], key = None, cmp = None):
self.key = key
self.cmp = cmp
self.lst = lst
self.lst.sort(key = key, cmp = cmp)

def compare_elements(self, a, b):
if self.cmp is not None:
return self.cmp(a, b)
if self.key is not None:
return cmp(self.key(a), self.key(b))
return cmp(a,b)

def insort_right(self, elt, lo = 0, hi = None):
"""Inspired from bisect in the python standard library"""
if hi is None:
hi = len(self.lst)
while lo < hi:
mid = (lo + hi) / 2
if self.compare_elements(elt, self.lst[mid]) < 0:
hi = mid
else:
lo = mid + 1
self.lst.insert(lo, elt)
...


Any thoughts about this?

RB

* at this point you should smile...
7 Answers

Raymond Hettinger

2/29/2008 7:15:00 PM

0

[Robert Bossy]
> I thought it would be useful if insort and consorts* could accept the
> same options than list.sort, especially key and cmp.

If you're going to do many insertions or searches, wouldn't it be
*much* more efficient to store your keys in a separate array?

The sort() function guarantees that it calls the key function exactly
once for each member of the list. With and bisect/insort, successive
searches can call the key function over and over again with the same
value.


Raymond



Aaron Brady

2/29/2008 9:36:00 PM

0

On Feb 29, 1:15 pm, Raymond Hettinger <pyt...@rcn.com> wrote:
> [Robert Bossy]
>
> > I thought it would be useful if insort and consorts* could accept the
> > same options than list.sort, especially key and cmp.
>
> If you're going to do many insertions or searches, wouldn't it be
> *much* more efficient to store your keys in a separate array?
>
> The sort() function guarantees that it calls the key function exactly
> once for each member of the list.  With and bisect/insort, successive
> searches can call the key function over and over again with the same
> value.
>
> Raymond

Since sort time is at least linear, sort ( key, obj ) pairs; return
obj's.

rbossy

3/1/2008 1:52:00 PM

0

Selon Raymond Hettinger <python@rcn.com>:

> [Robert Bossy]
> > I thought it would be useful if insort and consorts* could accept the
> > same options than list.sort, especially key and cmp.
>
> If you're going to do many insertions or searches, wouldn't it be
> *much* more efficient to store your keys in a separate array?
>
> The sort() function guarantees that it calls the key function exactly
> once for each member of the list. With and bisect/insort, successive
> searches can call the key function over and over again with the same
> value.

Yeah, sure. Thanks for pointing that out.

RB

Paul Rubin

3/1/2008 7:36:00 PM

0

Raymond Hettinger <python@rcn.com> writes:
> The sort() function guarantees that it calls the key function exactly
> once for each member of the list.

That is a time-space tradeoff though, and it presupposes that it's
possible to write a key function. Depending on the objects involved,
it could be that it's easier to write a comparison function than a
key function.

rbossy

3/4/2008 3:10:00 PM

0

Quoting Raymond Hettinger <python@rcn.com>:

> [Robert Bossy]
> > I thought it would be useful if insort and consorts* could accept the
> > same options than list.sort, especially key and cmp.
>
> If you're going to do many insertions or searches, wouldn't it be
> *much* more efficient to store your keys in a separate array?
>
> The sort() function guarantees that it calls the key function exactly
> once for each member of the list. With and bisect/insort, successive
> searches can call the key function over and over again with the same
> value.

I factored your remark into the code. I actually don't have any particular
question about it, that's just an acknowledgment. Though I feel that if one
uses the default key, then an awkward list of twins is stored...



from operator import itemgetter


def identity(x): return x


class Bisect:
def __init__(self, it, key=identity, cmp=cmp):
self.lst = [ (key(x),x,) for x in it ]
self.lst.sort(cmp=cmp, key=itemgetter(0))
self.key = key
self.cmp = cmp

def insort_right(self, x, lo = 0, hi = None):
if hi is None:
hi = len(self.lst)
xk = self.key(x)
while lo < hi:
mid = (lo + hi) // 2
if self.cmp(xk, self.lst[mid][0]) < 0:
hi = mid
else:
lo = mid + 1
self.lst.insert(lo, (kx,x,))

# skipped insort_left, bisect_right, bisect_left
# also skipped the container methods __len__, __getitem__, __iter__


Cheers,
RB

Aaron Watters

3/4/2008 5:04:00 PM

0

On Feb 29, 9:31 am, Robert Bossy <Robert.Bo...@jouy.inra.fr> wrote:
> Hi all,
>
> I thought it would be useful if insort and consorts* could accept the
> same options than list.sort, especially key and cmp.....

Wouldn't this make them slower and less space efficient? It would
be fine to add something like this as an additional elaboration, but
I want bisect to scream as fast as possible in the default streamlined
usage.
-- Aaron Watters
===
dueling bumper stickers:
use java/get rich
use perl/get laid -- both equally sensible

http://www.xfeedme.com/nucular/pydistro.py/go?FREE...

Robert Bossy

3/4/2008 5:36:00 PM

0

Aaron Watters wrote:
> On Feb 29, 9:31 am, Robert Bossy <Robert.Bo...@jouy.inra.fr> wrote:
>
>> Hi all,
>>
>> I thought it would be useful if insort and consorts* could accept the
>> same options than list.sort, especially key and cmp.....
>>
>
> Wouldn't this make them slower and less space efficient? It would
> be fine to add something like this as an additional elaboration, but
> I want bisect to scream as fast as possible in the default streamlined
> usage.
Yes it is slower and bigger, so I agree that the canonical
implementation for default values should be kept. Also because the
original bisect functions are actually written in C, the speed
difference is even more noticeable.

Though, I needed custom ordering bisects since I was implementing
interval trees (storing intervals by startpoint/endpoint).

Cheers,
RB