[lnkForumImage]
TotalShareware - Download Free Software

Confronta i prezzi di migliaia di prodotti.
Asp Forum
 Home | Login | Register | Search 


 

Forums >

comp.lang.python

common problem - elegant solution sought

Helmut Jarausch

1/15/2008 10:34:00 AM

Hi,

I'm looking for an elegant solution of the following tiny but common problem.

I have a list of tuples (Unique_ID,Date) both of which are strings.
I want to delete the tuple (element) with a given Unique_ID, but
I don't known the corresponding Date.

My straight forward solution is a bit lengthy, e.g.

L=[("a","070501"),("b","080115"),("c","071231")]
pos=-1
found=-1
for (Key,Date) in L :
pos+= 1
if Key == "b" :
found= pos
break

if found >= 0 :
del L[found]

print L

Most probably there are much more elegant solutions.
Unfortunately, the index-list-method doesn't take an
additional function argument for the comparisons.

Many thanks for your hints,

Helmut Jarausch

Lehrstuhl fuer Numerische Mathematik
RWTH - Aachen University
D 52056 Aachen, Germany
14 Answers

cokofreedom

1/15/2008 10:50:00 AM

0

> I have a list of tuples (Unique_ID,Date) both of which are strings.
> I want to delete the tuple (element) with a given Unique_ID, but
> I don't known the corresponding Date.
>
> My straight forward solution is a bit lengthy, e.g.
>
> L=[("a","070501"),("b","080115"),("c","071231")]

Do they have to be tuples? Seems to me if you are using a Unique_ID
that a dictionary would be perfect. Indeed when you are looking to
delete an entry you can simply look for the Unique_ID in the
dictionary and it will be a very fast look up.

if key in "my_dictionary":

However, if you must use a list of tuples, your current method is very
inefficient. Why not use the del within the if Key == "b":?

I cannot provide extremely elegant solutions with your current system,
but I can tell you with a large enough list your way is going to take
a very long time since you will iterate over the whole list
sequentially to find an entry...

Tim Golden

1/15/2008 10:55:00 AM

0

Helmut Jarausch wrote:
> I'm looking for an elegant solution of the following tiny but common problem.
>
> I have a list of tuples (Unique_ID,Date) both of which are strings.
> I want to delete the tuple (element) with a given Unique_ID, but
> I don't known the corresponding Date.
>
> My straight forward solution is a bit lengthy, e.g.
>
> L=[("a","070501"),("b","080115"),("c","071231")]
> pos=-1
> found=-1
> for (Key,Date) in L :
> pos+= 1
> if Key == "b" :
> found= pos
> break
>
> if found >= 0 :
> del L[found]
>
> print L
>
> Most probably there are much more elegant solutions.
> Unfortunately, the index-list-method doesn't take an
> additional function argument for the comparisons.

Probably the most common solution to this in Python
is to produce a second list which has all the items
in the first except for the one(s) you wish to delete:

<code>
L=[("a","070501"),("b","080115"),("c","071231")]
L2 = [(uniqid, date) for (uniqid, date) in L if not uniqid == 'b']
</code>

It might look a little wasteful, but since Python lists
are supremely fast and since the tuples themselves aren't
copied, only their references, the result is probably what
you need.

Obviously you've given us a toy example, which is fine for
demonstrating the problem. Suggestions might vary if, for
example, your data set were much bigger or if the tuples
were more complex.

TJG

Diez B. Roggisch

1/15/2008 11:03:00 AM

0

Helmut Jarausch wrote:

> Hi,
>
> I'm looking for an elegant solution of the following tiny but common
> problem.
>
> I have a list of tuples (Unique_ID,Date) both of which are strings.
> I want to delete the tuple (element) with a given Unique_ID, but
> I don't known the corresponding Date.
>
> My straight forward solution is a bit lengthy, e.g.
>
> L=[("a","070501"),("b","080115"),("c","071231")]
> pos=-1
> found=-1
> for (Key,Date) in L :
> pos+= 1
> if Key == "b" :
> found= pos
> break
>
> if found >= 0 :
> del L[found]
>
> print L
>
> Most probably there are much more elegant solutions.
> Unfortunately, the index-list-method doesn't take an
> additional function argument for the comparisons.
>
> Many thanks for your hints,

Several solutions:

- use a different datastructure, as others suggested

- use a database. SQLite for example.

- replace the list in-place like this:

L[:] = [(k, d) for k, d in L if k != "b"]

Diez

Helmut Jarausch

1/15/2008 11:20:00 AM

0

Thanks to you all for your help.

The solution to regenerate the list skipping
the one to be deleted is fine for me since my
lists are of moderate size and the operation
is infrequent.

Helmut.


--
Helmut Jarausch

Lehrstuhl fuer Numerische Mathematik
RWTH - Aachen University
D 52056 Aachen, Germany

Bruno Desthuilliers

1/15/2008 11:20:00 AM

0

Helmut Jarausch a écrit :
> Hi,
>
> I'm looking for an elegant solution of the following tiny but common
> problem.
>
> I have a list of tuples (Unique_ID,Date) both of which are strings.
> I want to delete the tuple (element) with a given Unique_ID, but
> I don't known the corresponding Date.

If you don't care about ordering, you could use a dict:

L=[("a","070501"),("b","080115"),("c","071231")]
d = dict(L)
del d['b']
L = d.items()

But I guess you care about ordering !-)

Anyway, you can still use a dict to find out the date part:

L=[("a","070501"),("b","080115"),("c","071231")]
d = dict(L)
t = ('b', d['b'])
L.remove(t)

Or you could build an 'index' list to find out the appropriate index:
L=[("a","070501"),("b","080115"),("c","071231")]
index = [key for key, val in L]
pos = index.index('b')
del L[pos]

> My straight forward solution is a bit lengthy, e.g.
>
> L=[("a","070501"),("b","080115"),("c","071231")]
> pos=-1
> found=-1
> for (Key,Date) in L :
> pos+= 1
> if Key == "b" :
> found= pos
> break
>
> if found >= 0 :
> del L[found]
>
> print L

You could have used the enumerate(seq) function here. And since you're
breaking out once the element deleted, you could as well delete it from
within the loop:

for pos, (key, date) in enumerate(L):
if key == 'b':
del L[pos]
break


Benchmarking left as an exercice to the reader !-)

Also note that the "best" solution may depend on your real use case and
dataset.

Steven D'Aprano

1/15/2008 11:56:00 AM

0

On Tue, 15 Jan 2008 11:33:36 +0100, Helmut Jarausch wrote:

> Hi,
>
> I'm looking for an elegant solution of the following tiny but common
> problem.
>
> I have a list of tuples (Unique_ID,Date) both of which are strings. I
> want to delete the tuple (element) with a given Unique_ID, but I don't
> known the corresponding Date.
>
> My straight forward solution is a bit lengthy, e.g.
>
> L=[("a","070501"),("b","080115"),("c","071231")] pos=-1
> found=-1
> for (Key,Date) in L :
> pos+= 1
> if Key == "b" :
> found= pos
> break
>
> if found >= 0 :
> del L[found]
>
> print L
>
> Most probably there are much more elegant solutions. Unfortunately, the
> index-list-method doesn't take an additional function argument for the
> comparisons.


Your code mixes two distinct steps into one, and therefore becomes a
little messy. I suggest you change your problem to the two step problem
(1) find an item with the given key; and (2) delete it.

Here's a solution to the first part:

def find_by_key(L, key, where=0):
"""Search the 'where'th position of elements of L."""
for i, x in enumerate(L):
if x[where] == key:
return i
return -1 # or raise an exception


And the second:

pos = find_by_key(L, 'b')
if pos != -1:
del L[pos]

This too could be made into a function, if necessarily.



How else could we solve this problem?

class Tester(object):
def __init__(self, key, where=0):
self.key = key
self.where = where
def __eq__(self, other):
return other[self.where] == self.key


L = [("a", "070501"), ("b", "080115"), ("c", "071231")]
L.index(Tester("c"))


But beware, this technique might not generalize to arbitrary elements in
the list L. It should work if the elements are tuples, but may not work
if they are a custom class.


--
Steven

Neil Cerutti

1/15/2008 1:14:00 PM

0

On Jan 15, 2008 5:33 AM, Helmut Jarausch <jarausch@igpm.rwth-aachen.de> wrote:
> Hi,
>
> I'm looking for an elegant solution of the following tiny but common problem.
>
> I have a list of tuples (Unique_ID,Date) both of which are strings.
> I want to delete the tuple (element) with a given Unique_ID, but
> I don't known the corresponding Date.
>
> My straight forward solution is a bit lengthy, e.g.
>
> L=[("a","070501"),("b","080115"),("c","071231")]

If the data is truly sorted by Unique_ID, then binary search may be
feasible (though not actually recommended for such small list).

import bisect

i = bisect.bisect_left(L, ('b', '000000'))
if i[0] == 'b':
del L[i]

--
Neil Cerutti <mr.cerutti+python@gmail.com>

Helmut Jarausch

1/15/2008 2:55:00 PM

0

Neil Cerutti wrote:
> On Jan 15, 2008 5:33 AM, Helmut Jarausch <jarausch@igpm.rwth-aachen.de> wrote:
>> Hi,
>>
>> I'm looking for an elegant solution of the following tiny but common problem.
>>
>> I have a list of tuples (Unique_ID,Date) both of which are strings.
>> I want to delete the tuple (element) with a given Unique_ID, but
>> I don't known the corresponding Date.
>>
>> My straight forward solution is a bit lengthy, e.g.
>>
>> L=[("a","070501"),("b","080115"),("c","071231")]
>
> If the data is truly sorted by Unique_ID, then binary search may be
> feasible (though not actually recommended for such small list).
>
> import bisect
>
> i = bisect.bisect_left(L, ('b', '000000'))
> if i[0] == 'b':
> del L[i]
>

Thanks,
unfortunately, they are sorted on 'Date'
Helmut.


--
Helmut Jarausch

Lehrstuhl fuer Numerische Mathematik
RWTH - Aachen University
D 52056 Aachen, Germany

Helmut Jarausch

1/15/2008 3:09:00 PM

0

Again, many thanks to all who provide their solution.
I have timed these (though on my old P3(0.9GHz)) - see below
Helmut.

Helmut Jarausch wrote:
> Hi,
>
> I'm looking for an elegant solution of the following tiny but common
> problem.
>
> I have a list of tuples (Unique_ID,Date) both of which are strings.
> I want to delete the tuple (element) with a given Unique_ID, but
> I don't known the corresponding Date.
>

#!/usr/bin/python

import random
import timeit

Lorg=[]

def genList(L) :
for f in range(ord('A'),ord('z')+1) :
for s in range(ord('A'),ord('z')+1) :
L.append((chr(f)+chr(s),str(random.randrange(0,1000000))))

genList(Lorg)
Times= 1000
T0= timeit.Timer('L=list(Lorg)','from __main__ import Lorg').timeit(Times)
print T0

SetUp1=r'''from __main__ import Lorg
def del_by_key(L,key) :
d= dict(L)
del d[key]
L[:]= d.items()
'''

SetUp2=r'''from __main__ import Lorg
def del_by_key(L,key) :
d= dict(L)
t= (key,d[key])
L.remove(t)
'''

SetUp3=r'''from __main__ import Lorg
def del_by_key(L,key) :
index= [k for k,val in L]
pos = index.index(key)
del L[pos]
'''

SetUp4=r'''from __main__ import Lorg
def del_by_key(L,key) :
for pos, (k,d) in enumerate(L):
if k == key :
del L[pos]
break
'''

SetUp5=r'''from __main__ import Lorg
def del_by_key(L,key) :
L[:]= [(k,d) for (k,d) in L if k !=key]
'''

SetUp6=r'''from __main__ import Lorg
class Tester(object) :
def __init__(self,key) :
self.key= key
def __eq__(self,other) :
return other[0] == self.key

def del_by_key(L,key) :
del L[L.index(Tester(key))]
'''

print '*** ready ***'


T= timeit.Timer("L=list(Lorg);del_by_key(L,'Zz')",SetUp1).timeit(Times)
print "Method 1 :",T-T0

T= timeit.Timer("L=list(Lorg);del_by_key(L,'Zz')",SetUp2).timeit(Times)
print "Method 2 :",T-T0

T= timeit.Timer("L=list(Lorg);del_by_key(L,'Zz')",SetUp3).timeit(Times)
print "Method 3 :",T-T0

T= timeit.Timer("L=list(Lorg);del_by_key(L,'Zz')",SetUp4).timeit(Times)
print "Method 4 :",T-T0

T= timeit.Timer("L=list(Lorg);del_by_key(L,'Zz')",SetUp5).timeit(Times)
print "Method 5 :",T-T0

T= timeit.Timer("L=list(Lorg);del_by_key(L,'Zz')",SetUp6).timeit(Times)
print "Method 6 :",T-T0

# Results on an old P3 (0.9 GHz)
# *** ready ***
# Method 1 : 10.9850928783
# Method 2 : 5.96455168724
# Method 3 : 3.97821164131
# Method 4 : 1.66151881218
# Method 5 : 8.90886187553
# Method 6 : 6.2503888607


The clear winner is

def del_by_key(L,key) :
for pos, (k,d) in enumerate(L):
if k == key :
del L[pos]
break


--
Helmut Jarausch

Lehrstuhl fuer Numerische Mathematik
RWTH - Aachen University
D 52056 Aachen, Germany

Bearophile

1/15/2008 4:49:00 PM

0

Helmut Jarausch:
> The clear winner is
>
> def del_by_key(L,key) :
> for pos, (k,d) in enumerate(L):
> if k == key :
> del L[pos]
> break

If you use Psyco this is faster:

def del_by_key(L,key):
pos = 0
for pair in L:
if pair[0] == key :
del L[pos]
break
pos += 1

Bye,
bearophile