Asp Forum - pairs from a list

Alan G Isaac

1/22/2008 3:21:00 AM

I want to generate sequential pairs from a list.
Here is a way::

from itertools import izip, islice
for x12 in izip(islice(x,0,None,2),islice(x,1,None,2)):
    print x12

(Of course the print statement is just illustrative.)
What is the fastest way? (Ignore the import time.)

Thanks,
Alan Isaac

31 Answers

Paul Rubin

1/22/2008 3:48:00 AM

Alan Isaac <aisaac@american.edu> writes:
> (Of course the print statement is just illustrative.)
> What is the fastest way? (Ignore the import time.)

You have to try a bunch of different ways and time them. One
idea (untested):

def pairs(seq):
    while True:
        yield (seq.next(), seq.next())
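
For readers on Python 3, here is a minimal sketch of the same idea. It assumes Python 3.7 or later, where `seq.next()` is spelled `next(seq)` and a `StopIteration` that escapes a generator body is turned into a `RuntimeError` (PEP 479), so exhaustion has to be caught explicitly. The name `pairs_py3` is made up for this illustration and is not part of the original post.

```python
def pairs_py3(seq):
    """Yield consecutive non-overlapping pairs; a trailing odd item is dropped."""
    it = iter(seq)  # accept any iterable, not only an iterator
    while True:
        try:
            first = next(it)
            second = next(it)
        except StopIteration:
            return  # end the generator cleanly instead of leaking StopIteration
        yield (first, second)
```

Calling `list(pairs_py3([1, 2, 3, 4, 5]))` gives `[(1, 2), (3, 4)]`; the unpaired `5` is silently discarded, as in the Python 2 version.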

George Sakkis

1/22/2008 4:54:00 AM

On Jan 21, 10:20 pm, Alan Isaac <ais...@american.edu> wrote:
> I want to generate sequential pairs from a list.
> Here is a way::
>
> from itertools import izip, islice
> for x12 in izip(islice(x,0,None,2),islice(x,1,None,2)):
>     print x12
>
> (Of course the print statement is just illustrative.)
> What is the fastest way? (Ignore the import time.)

Look up the timeit module and test yourself the various alternatives;
that's the most reliable way to tell for sure.
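
As a rough illustration of that advice, the sketch below times two candidates with `timeit`. It assumes Python 3 (the built-in `zip` stands in for `izip`), and the helper names `pairs_islice`, `pairs_generator` and `time_candidates` are hypothetical, chosen only for this example. The numbers it prints depend entirely on the machine and interpreter.

```python
import timeit
from itertools import islice


def pairs_islice(x):
    # The original poster's approach, with zip in place of Python 2's izip.
    return zip(islice(x, 0, None, 2), islice(x, 1, None, 2))


def pairs_generator(x):
    # An explicit generator that pulls two items per step.
    it = iter(x)
    while True:
        try:
            yield next(it), next(it)
        except StopIteration:
            return


def time_candidates(data, number=1000, repeat=3):
    """Return {function name: best time in seconds} for each candidate."""
    results = {}
    for func in (pairs_islice, pairs_generator):
        # list(...) forces the lazy result to be fully consumed.
        timer = timeit.Timer(lambda: list(func(data)))
        results[func.__name__] = min(timer.repeat(repeat=repeat, number=number))
    return results


if __name__ == "__main__":
    for name, seconds in time_candidates(list(range(1000))).items():
        print("%s: %.4f s" % (name, seconds))
```

Taking the minimum over several repeats is the usual way to reduce noise from other processes.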

George

Paddy

1/22/2008 5:15:00 AM

On Jan 22, 3:20 am, Alan Isaac <ais...@american.edu> wrote:
> I want to generate sequential pairs from a list.
<<snip>>
> What is the fastest way? (Ignore the import time.)
1) How fast is the method you have?
2) How much faster does it need to be for your application?
3) Are there any other bottlenecks in your application?
4) Is this the routine whose smallest % speed-up would give the
largest overall speed up of your application?

- Paddy.

George Sakkis

1/22/2008 5:34:00 AM

On Jan 22, 12:15 am, Paddy <paddy3...@googlemail.com> wrote:
> On Jan 22, 3:20 am, Alan Isaac <ais...@american.edu> wrote:
> > I want to generate sequential pairs from a list.
> <<snip>>
> > What is the fastest way? (Ignore the import time.)
>
> 1) How fast is the method you have?
> 2) How much faster does it need to be for your application?
> 3) Are there any other bottlenecks in your application?
> 4) Is this the routine whose smallest % speed-up would give the
> largest overall speed up of your application?

I believe the "what is the fastest way" question for such small
well-defined tasks is worth asking on its own, regardless of whether it
makes a difference in the application (or even if there is no
application to begin with). Just because cpu cycles are cheap these
days is not a good reason to be sloppy. Moreover, often the fastest
pure Python version happens to be among the most elegant and concise,
unlike other languages where optimization usually implies obfuscation.

George

Steven D'Aprano

1/22/2008 7:09:00 AM

On Mon, 21 Jan 2008 21:34:28 -0800, George Sakkis wrote:

> I believe the "what is the fastest way" question for such small
> well-defined tasks is worth asking on its own, regardless of whether it makes
> a difference in the application (or even if there is no application to
> begin with). Just because cpu cycles are cheap these days is not a good
> reason to be sloppy. Moreover, often the fastest pure Python version
> happens to be among the most elegant and concise, unlike other languages
> where optimization usually implies obfuscation.

I wonder why it is that people automatically assume that "optimization"
means optimize the time taken, and not the developer effort to write it
in the first place, the effort required to maintain it over time, or the
memory used at runtime, let alone some combination of all four factors.

Memory is cheap, but applications are hungry.

CPUs are fast, and for most applications the difference between 3ms and
30ms is undetectable by the user. Why do we care so little about saving
memory and so much about ever-decreasing time savings?

--
Steven

Arnaud Delobelle

1/22/2008 7:11:00 AM

On Jan 22, 3:20 am, Alan Isaac <ais...@american.edu> wrote:
> I want to generate sequential pairs from a list.
> Here is a way::
>
> from itertools import izip, islice
> for x12 in izip(islice(x,0,None,2),islice(x,1,None,2)):
>     print x12
>
> (Of course the print statement is just illustrative.)
> What is the fastest way? (Ignore the import time.)
>
> Thanks,
> Alan Isaac

Don't know the fastest, but here's a very concise way:

from itertools import izip

def ipairs(seq):
    it = iter(seq)
    return izip(it, it)

>>> list(ipairs(xrange(10)))
[(0, 1), (2, 3), (4, 5), (6, 7), (8, 9)]
>>> list(ipairs('hello'))
[('h', 'e'), ('l', 'l')]

--
Arnaud

Alan G Isaac

1/22/2008 1:19:00 PM

I suppose my question should have been,
is there an obviously faster way?
Anyway, of the four ways below, the
first is substantially fastest. Is
there an obvious reason why?

Thanks,
Alan Isaac

PS My understanding is that the behavior
of the last is implementation dependent
and not guaranteed.

from itertools import izip, islice

def pairs1(x):
    for x12 in izip(islice(x,0,None,2),islice(x,1,None,2)):
        yield x12

def pairs2(x):
    xiter = iter(x)
    while True:
        yield xiter.next(), xiter.next()

def pairs3(x):
    for i in range( len(x)//2 ):
        yield x[2*i], x[2*i+1],

def pairs4(x):
    xiter = iter(x)
    for x12 in izip(xiter,xiter):
        yield x12

Arnaud Delobelle

1/22/2008 2:19:00 PM

On Jan 22, 1:19 pm, Alan Isaac <ais...@american.edu> wrote:
[...]
> PS My understanding is that the behavior
> of the last is implementation dependent
> and not guaranteed.
[...]
> def pairs4(x):
>     xiter = iter(x)
>     for x12 in izip(xiter,xiter):
>         yield x12

According to the docs [1], izip is defined to be equivalent to:

def izip(*iterables):
    iterables = map(iter, iterables)
    while iterables:
        result = [it.next() for it in iterables]
        yield tuple(result)

This guarantees that it.next() will be performed from left to right,
so there is no risk that e.g. pairs4([1, 2, 3, 4]) returns [(2, 1),
(4, 3)].
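
The left-to-right behaviour is easy to check directly. The sketch below assumes Python 3, where the built-in `zip` is lazy like `izip`; the Python 3 documentation for `zip` likewise states that left-to-right evaluation order of the iterables is guaranteed.

```python
def pairs4(x):
    xiter = iter(x)
    # The same iterator object fills both tuple slots, so each output tuple
    # takes two consecutive items: first slot first, then second slot.
    return zip(xiter, xiter)


result = list(pairs4([1, 2, 3, 4]))
print(result)  # [(1, 2), (3, 4)]

# With an odd number of items, the last one is consumed and then dropped.
odd_result = list(pairs4([1, 2, 3]))
print(odd_result)  # [(1, 2)]
```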

Is there anything else that I am overlooking?

[1] http://docs.python.org/lib/itertools-func...

--
Arnaud

Bearophile

1/22/2008 3:26:00 PM

Alan Isaac>What is the fastest way? (Ignore the import time.)<

Maybe someday someone will realize that this kind of thing belongs in
the Python standard library...

If you need a lazy generator without padding, one that splits starting
from the beginning, then this is the fastest for me when n is close to 2:

def xpartition(seq, n=2):
    return izip( *(iter(seq),)*n )

If you need the fastest greedy version without padding, then there are
two answers, one for Psyco and one for Python without it... :-)
If you need padding or to start from the end then there are more
answers...
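
For the padding case, one possible answer is sketched below. It assumes Python 3, where `itertools.zip_longest` is available (Python 2.6 and later call it `izip_longest`); the name `xpartition_padded` and its `fillvalue` parameter are hypothetical, not something from this thread.

```python
from itertools import zip_longest


def xpartition_padded(seq, n=2, fillvalue=None):
    """Lazily split seq into n-tuples, padding the last one with fillvalue."""
    # n references to ONE iterator, the same trick as in xpartition above;
    # zip_longest keeps going until that iterator is exhausted and pads the rest.
    args = [iter(seq)] * n
    return zip_longest(*args, fillvalue=fillvalue)
```

For example, `list(xpartition_padded('hello'))` gives `[('h', 'e'), ('l', 'l'), ('o', None)]`, where the unpadded version would drop the final `'o'`.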

Bye,
bearophile

Arnaud Delobelle

1/22/2008 4:09:00 PM

On Jan 22, 1:19 pm, Alan Isaac <ais...@american.edu> wrote:
> I suppose my question should have been,
> is there an obviously faster way?
> Anyway, of the four ways below, the
> first is substantially fastest. Is
> there an obvious reason why?

Can you post your results?

I get different ones (pairs1 and pairs4 rewritten slightly to avoid
unnecessary indirection).

====== pairs.py ===========
from itertools import *

def pairs1(x):
    return izip(islice(x,0,None,2),islice(x,1,None,2))

def pairs2(x):
    xiter = iter(x)
    while True:
        yield xiter.next(), xiter.next()

def pairs3(x):
    for i in range( len(x)//2 ):
        yield x[2*i], x[2*i+1],

def pairs4(x):
    xiter = iter(x)
    return izip(xiter,xiter)

def compare():
    import timeit
    for i in '1234':
        t = timeit.Timer('list(pairs.pairs%s(l))' % i,
                         'import pairs; l=range(1000)')
        print 'pairs%s: %s' % (i, t.timeit(10000))

if __name__ == '__main__':
    compare()
=====================

marigold:python arno$ python pairs.py
pairs1: 0.789824962616
pairs2: 4.08462786674
pairs3: 2.90438890457
pairs4: 0.536775827408

pairs4 wins.

--
Arnaud

comp.lang.python
