Duncan Booth
1/9/2008 8:59:00 AM
Fredrik Lundh <fredrik@pythonware.com> wrote:
> Giampaolo Rodola' wrote:
>
>> To flush a list it is better doing "del mylist[:]" or "mylist = []"?
>> Is there a preferred way? If yes, why?
>
> The latter creates a new list object, the former modifies an existing
> list in place.
>
> The latter is shorter, reads better, and is probably a bit faster in
> most cases.
>
> The former should be used when it's important to clear a specific list
> object (e.g. if there are multiple references to the list).
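A minimal sketch of the distinction described in the quoted text, in case it helps. The variable names (`shared`, `alias`, `rebound`, `other`) are made up for the illustration and do not come from the thread.

```python
# Case 1: "del mylist[:]" empties the existing list object in place,
# so every name that refers to that object sees the change.
shared = [1, 2, 3]
alias = shared            # second reference to the same list object
del shared[:]
in_place_result = (shared, alias, shared is alias)

# Case 2: "mylist = []" only rebinds the name to a brand new list;
# any other reference still points at the old, untouched list.
rebound = [1, 2, 3]
other = rebound           # second reference to the original list
rebound = []
rebind_result = (rebound, other, rebound is other)

print(in_place_result)
print(rebind_result)
```

So if anything else holds a reference to the list (a function argument, an attribute, a container entry), only the slice deletion empties the list that the other code sees.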
I tried to measure this with timeit, and it looks like the 'del' is
actually quite a bit faster (which I find surprising).
C:\Python25\Lib>timeit.py -s "lista=range(10000)" "mylist=list(lista)"
10000 loops, best of 3: 81.1 usec per loop
C:\Python25\Lib>timeit.py -s "lista=range(10000)" "mylist=list(lista)" "del mylist[:]"
10000 loops, best of 3: 61.7 usec per loop
C:\Python25\Lib>timeit.py -s "lista=range(10000)" "mylist=list(lista)" "mylist=[]"
10000 loops, best of 3: 80.9 usec per loop
In the first test the local variable 'mylist' is simply allowed to go
out of scope, so the list is destroyed as its reference count drops to
0.
In the third case again the list is destroyed when no longer referenced,
but an empty list is also created and destroyed. Evidently the empty
list takes virtually no time to process compared with the long list.
The second case clears the list before destroying it, and appears to be
significantly faster.
Increasing the list length by a factor of 10 makes it clear that
not only is #2 always fastest, but #3 always comes in second. Only when
the lists are quite short (e.g. 10 elements) does #1 win (and even at 10
elements #2 beats #3).
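For anyone who wants to repeat the comparison at several list lengths, here is a rough sketch that drives the timeit module from a script instead of the command line. It rests on some assumptions: it targets Python 3, where range() is lazy, so the setup uses list(range(n)); the helper name `time_cases` and the loop counts are my own choices; and results will vary with interpreter version and machine, so no particular ordering is promised.

```python
import timeit


def time_cases(n, number=200, repeat=3):
    """Return best-of-`repeat` seconds per loop for the three statements.

    `time_cases` is a hypothetical helper for this sketch, not part of timeit.
    """
    setup = "lista = list(range(%d))" % n
    cases = {
        "#1 copy only": "mylist = list(lista)",
        "#2 del mylist[:]": "mylist = list(lista); del mylist[:]",
        "#3 mylist = []": "mylist = list(lista); mylist = []",
    }
    results = {}
    for label, stmt in cases.items():
        # timeit.repeat returns one total time per repetition; keep the best.
        best = min(timeit.repeat(stmt, setup=setup, number=number, repeat=repeat))
        results[label] = best / number
    return results


if __name__ == "__main__":
    for n in (10, 10000, 100000):
        print("list length %d" % n)
        for label, secs in time_cases(n).items():
            print("  %-18s %10.2f usec per loop" % (label, secs * 1e6))
```

Taking the minimum over the repetitions mirrors the "best of 3" figure that timeit.py prints.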
Unless I've missed something, it looks like there may be an avoidable
bottleneck in the list code: whatever the slice delete is doing should
also be done by the deletion code (at least if the list is longer than
some minimum length).
The most obvious thing I can see is that list_dealloc:
    if (op->ob_item != NULL) {
        /* Do it backwards, for Christian Tismer.
           There's a simple test case where somehow this reduces
           thrashing when a *very* large list is created and
           immediately deleted. */
        i = Py_Size(op);
        while (--i >= 0) {
            Py_XDECREF(op->ob_item[i]);
        }
        PyMem_FREE(op->ob_item);
    }
would be better written as a copy of (or even call to) list_clear which
picks up op->ob_item once instead of every time through the loop.