comp.lang.python

file seek is slow

jcb

3/9/2010 11:57:00 PM

I ran a comparison that timed 1e6 file seeks.
The tests were run with Python 2.6.4 and Microsoft Visual C 6.0 on
Windows XP with an Intel 3GHz single processor with hyperthreading.

Results:
C : 0.812 seconds
Python: 1.458 seconds.
difference = 0.646 seconds.

If the file.seek call is removed, the Python loop takes 2 ms, so the loop
overhead is minimal.
Without psyco the loop overhead is 4.6 ms and Python takes 1.866 seconds.

Any ideas what is causing the slowdown relative to the 'C' version?

In general I have been trying to take a video application written in C++
and make it work in Python.
There seem to be delays in the handoff to Windows system code that
make the Python version just a touch slower in some areas, but these
slowdowns are critically affecting the work. File seek is not a deal
breaker here; it is just the latest thing I have noticed and the
simplest to demonstrate.


Python version:
import time

def main():
    # write temp file
    SIZE = 1000
    f1 = file('video.txt', 'wb')
    f1.write('+' * SIZE)
    f1.close()

    f1 = file('video.txt', 'rb')
    t0 = time.clock()
    for i in xrange(1000000):
        f1.seek(0)
    delta = time.clock() - t0
    print "%.3f" % delta
    f1.close()

if __name__ == '__main__':
    import psyco
    psyco.full()
    main()

// 'C' version
#include <stdio.h>
#include <string.h>  /* for memset */
#include <time.h>
#define SIZE 1000

static void main(int argc, char *argv[])
{
    FILE *f1;
    int i;
    int t0;
    float delta;
    char buffer[SIZE];

    // write temp file
    memset(buffer, (int)'+', SIZE);
    f1 = fopen("video.txt", "wb");
    fwrite(buffer, SIZE, 1, f1);
    fclose(f1);

    f1 = fopen("video.txt", "rb");
    t0 = clock();
    for (i=0; i < 1000000; i++)
    {
        fseek(f1, 0, SEEK_SET);
    }
    delta = (float)(clock() - t0) / CLOCKS_PER_SEC;
    printf("%.3f\n", delta);
    fclose(f1);
}
12 Answers

Paul McGuire

3/10/2010 12:12:00 AM

This is a pretty tight loop:

for i in xrange(1000000):
    f1.seek(0)

But there is still a lot going on, some of which you can lift out of
the loop. The easiest I can think of is the lookup of the 'seek'
attribute on the f1 object. Try this:

f1_seek = f1.seek
for i in xrange(1000000):
    f1_seek(0)

How does that help your timing?

-- Paul
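
One way to check this suggestion in isolation is to time both loops with the standard timeit module. The sketch below is written for Python 3 rather than the Python 2.6 used in the thread, and the helper name time_seek_variants and the temporary file are assumptions made for illustration. It only reports what the local machine measures and does not presume either variant is faster.

```python
import os
import tempfile
import timeit


def time_seek_variants(n=100000):
    """Return (attribute_lookup_seconds, hoisted_seconds) for n seeks each."""
    # Write a small scratch file, like the 1000-byte file in the original test.
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"+" * 1000)
        with open(path, "rb") as f1:
            # Variant 1: look up the 'seek' attribute on every iteration.
            def lookup_each_time():
                for _ in range(n):
                    f1.seek(0)

            # Variant 2: hoist the bound method out of the loop.
            def hoisted():
                f1_seek = f1.seek
                for _ in range(n):
                    f1_seek(0)

            t_lookup = timeit.timeit(lookup_each_time, number=1)
            t_hoisted = timeit.timeit(hoisted, number=1)
    finally:
        os.remove(path)
    return t_lookup, t_hoisted


if __name__ == "__main__":
    t_lookup, t_hoisted = time_seek_variants()
    print("f1.seek(0) in loop: %.3f s" % t_lookup)
    print("f1_seek(0) hoisted: %.3f s" % t_hoisted)
```

Running each variant several times and taking the minimum would give steadier numbers than a single pass.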

Tim Roberts

3/10/2010 7:09:00 AM

Metalone <jcb@iteris.com> wrote:
>
>static void main(int argc, char *argv[])

As a side note, do you realize that this definition is invalid, in two
ways? "main" cannot be declared "static". The whole reason we use the
special name "main" is so that the startup code in the C run-time can link
to it. If "main" is static, it won't be exposed in the object file, and
the linker couldn't find it. It happens to work here because your C
compiler knows about "main" and discards the "static", but that's not a
good practice.

Further, it's not valid to have "main" return "void". The standards
require that it be declared as returning "int". Again, "void" happens to
work in VC++, but there are architectures where it does not.
--
Tim Roberts, timr@probo.com
Providenza & Boekelheide, Inc.

jcb

3/10/2010 8:04:00 PM

f1_seek = f1.seek did not change the performance at all.
As it turns out each call is only
646 nanoseconds slower than 'C'.
However, that is still 80% of the time to perform a file seek,
which I would think is a relatively slow operation compared to just
making a system call.
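
The per-call and percentage figures follow directly from the timings in the original post. A short worked calculation, using only those two reported numbers (the variable names are illustrative):

```python
# Timings reported in the original post, in seconds for 1,000,000 seeks.
C_SECONDS = 0.812
PYTHON_SECONDS = 1.458
CALLS = 1000000

extra_seconds = PYTHON_SECONDS - C_SECONDS         # total extra time
extra_ns_per_call = extra_seconds / CALLS * 1e9    # extra time per call, in ns
extra_percent = extra_seconds / C_SECONDS * 100.0  # relative to the C loop

print("extra per call: %.0f ns" % extra_ns_per_call)
print("slower than C:  %.0f%%" % extra_percent)
```

This reproduces the 646 ns per call and the roughly 80% figure quoted above.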

jcb

3/10/2010 8:04:00 PM

Thanks, Tim.
Good to know.

Neil Hodgson

3/10/2010 11:01:00 PM

Metalone:

> As it turns out each call is only
> 646 nanoseconds slower than 'C'.
> However, that is still 80% of the time to perform a file seek,
> which I would think is a relatively slow operation compared to just
> making a system call.

A seek may not be doing much beyond setting a current offset value.
It is likely that fseek(f1, 0, SEEK_SET) isn't even doing a system call.

An implementation of fseek will often return relatively quickly when
the position is within the current buffer -- from line 192 in
http://www.google.com/codesearch/p?hl=en#XAzRy8oK4zA/libc/stdio/fseek.c&q=fseek&sa=N&cd=1...

Neil
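
This point can be probed from Python by timing a buffered file object's seek against os.lseek, which is a thin wrapper around the lseek system call. The following is a minimal sketch for Python 3, with the helper name compare_seek_paths and the temporary file chosen for illustration. Whether the buffered seek actually avoids a system call depends on the implementation, so the code only measures and assumes no outcome.

```python
import os
import tempfile
import time


def compare_seek_paths(n=100000):
    """Return (buffered_seconds, lseek_seconds) for n seeks to offset 0."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"+" * 1000)

        # Buffered file object: the seek may be answered from the buffer.
        with open(path, "rb") as f1:
            f1.read(1)  # fill the buffer so offset 0 lies inside it
            t0 = time.perf_counter()
            for _ in range(n):
                f1.seek(0)
            buffered = time.perf_counter() - t0

        # Raw descriptor: every os.lseek call goes to the operating system.
        raw_fd = os.open(path, os.O_RDONLY)
        try:
            t0 = time.perf_counter()
            for _ in range(n):
                os.lseek(raw_fd, 0, os.SEEK_SET)
            raw = time.perf_counter() - t0
        finally:
            os.close(raw_fd)
    finally:
        os.remove(path)
    return buffered, raw


if __name__ == "__main__":
    buffered, raw = compare_seek_paths()
    print("buffered f1.seek(0): %.3f s" % buffered)
    print("os.lseek(fd, 0, 0):  %.3f s" % raw)
```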

sjdevnull@yahoo.com

3/10/2010 11:38:00 PM

On Mar 10, 6:01 pm, Neil Hodgson <nyamatongwe+thun...@gmail.com>
wrote:
> Metalone:
>
> > As it turns out each call is only
> > 646 nanoseconds slower than 'C'.
> > However, that is still 80% of the time to perform a file seek,
> > which I would think is a relatively slow operation compared to just
> > making a system call.
>
>    A seek may not be doing much beyond setting a current offset value.
> It is likely that fseek(f1, 0, SEEK_SET) isn't even doing a system call.

Exactly. If I replace both calls to fseek with gettimeofday (aka
time.time() on my platform in python) I get fairly close results:
$ ./testseek
4.120
$ python2.5 testseek.py
4.170
$ ./testseek
4.080
$ python2.5 testseek.py
4.130


FWIW, my results with fseek aren't as bad as those of the OP. This is
python2.5 on a 2.6.9 Linux OS, with psyco:
$ ./testseek
0.560
$ python2.5 testseek.py
0.750
$ ./testseek
0.570
$ python2.5 testseek.py
0.760

jcb

3/11/2010 7:02:00 PM

I am assuming that Python delegates the f.seek call to the seek call
in the MS C runtime library msvcrt.dll.
Does anybody know a nice link to the Python source like was posted
above for the BSD 'C' library?

Ok, I ran some more tests.
C, seek: 0.812 seconds (test from original post)
Python, f.seek: 1.458 seconds (test from original post)

C, time(&tm): 0.671 seconds
Python, time.time(): 0.513 seconds
Python, ctypes.msvcrt.time(ctypes.byref(tm)): 0.971 seconds
(factored the overhead to be outside the loop, so really this was
func_ptr(ptr))

Perhaps I am just comparing apples to oranges.
I never tested the overhead of ctypes like this before.
Most of my problem timings involve calls through ctypes.
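
The ctypes comparison can be sketched as follows. This is a Python 3 version that assumes a POSIX system whose C library exports time(); the thread's measurements were taken on Windows through msvcrt, so the numbers will not be comparable. The helper name compare_time_calls and the c_long return type are assumptions for illustration.

```python
import ctypes
import ctypes.util
import time


def compare_time_calls(n=100000):
    """Return (builtin_seconds, ctypes_seconds) for n calls each."""
    # Assumption: POSIX. If find_library returns None, CDLL(None) exposes
    # the symbols already loaded into the process, which include libc's.
    libc = ctypes.CDLL(ctypes.util.find_library("c"))
    c_time = libc.time                    # function pointer looked up once
    c_time.argtypes = [ctypes.c_void_p]
    c_time.restype = ctypes.c_long        # assumes time_t fits in a C long

    py_time = time.time
    t0 = time.perf_counter()
    for _ in range(n):
        py_time()
    builtin = time.perf_counter() - t0

    t0 = time.perf_counter()
    for _ in range(n):
        c_time(None)  # NULL argument: only the return value is used
    via_ctypes = time.perf_counter() - t0
    return builtin, via_ctypes


if __name__ == "__main__":
    builtin, via_ctypes = compare_time_calls()
    print("time.time():        %.3f s" % builtin)
    print("libc time() ctypes: %.3f s" % via_ctypes)
```

Both loops call a function that does very little work, so the difference between them is mostly call overhead rather than the cost of reading the clock.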

jcb

3/11/2010 10:57:00 PM

I just tried the seek test with Cython.
Cython fseek() : 1.059 seconds. 30% slower than 'C'
Python f.seek : 1.458 seconds. 80% slower than 'C'.

It is amazing to me that Cython generates a 'C' file that is 1478
lines.


#Cython code

import time

cdef int SEEK_SET = 0

cdef extern from "stdio.h":
    void* fopen(char* filename, char* mode)
    int fseek(void*, long, int)

def main():
    cdef void* f1 = fopen('video.txt', 'rb')
    cdef int i=1000000
    t0 = time.clock()
    while i > 0:
        fseek(f1, 0, SEEK_SET)
        i -= 1
    delta = time.clock() - t0
    print "%.3f" % delta

if __name__ == '__main__':
    main()

Steve Holden

3/12/2010 1:44:00 AM

Metalone wrote:
> I just tried the seek test with Cython.
> Cython fseek() : 1.059 seconds. 30% slower than 'C'
> Python f.seek : 1.458 seconds. 80% slower than 'C'.
>
> It is amazing to me that Cython generates a 'C' file that is 1478
> lines.
>
And what response are you seeking to your amazement?

regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
See PyCon Talks from Atlanta 2010 http://pyco...
Holden Web LLC http://www.hold...
UPCOMING EVENTS: http://holdenweb.event...

Stefan Behnel

3/12/2010 8:56:00 AM

Metalone, 11.03.2010 23:57:
> I just tried the seek test with Cython.
> Cython fseek() : 1.059 seconds. 30% slower than 'C'
> Python f.seek : 1.458 seconds. 80% slower than 'C'.
>
> It is amazing to me that Cython generates a 'C' file that is 1478
> lines.

Well, it generated an optimised Python interface for your module and made
it compilable in CPython 2.3 through 3.2. It doesn't look like your C
module features that. ;)


> #Cython code
>
> import time
>
> cdef int SEEK_SET = 0
>
> cdef extern from "stdio.h":
>     void* fopen(char* filename, char* mode)
>     int fseek(void*, long, int)

Cython ships with a stdio.pxd that you can cimport. It looks like it
doesn't currently define fseek(), but it defines at least fopen() and FILE.
Patches are always welcome.


> def main():
>     cdef void* f1 = fopen('video.txt', 'rb')
>     cdef int i=1000000
>     t0 = time.clock()
>     while i > 0:
>         fseek(f1, 0, SEEK_SET)
>         i -= 1
>     delta = time.clock() - t0

Note that the call to time.clock() takes some time, too, so it's not
surprising that this is slower than hand-written C code. Did you test how
it scales?

Also, did you look at the generated C code or the annotated Cython code
(cython -a)? Did you make sure both were compiled with the same CFLAGS?

Also, any reason you're not using a for-in-xrange loop? It shouldn't make a
difference in speed, it's just more common. You even used a for loop in
your C code.

Finally, I'm not sure why you think that these 30% matter at all. In your
original post, you even state that seek-time isn't the "deal breaker", so
maybe you should concentrate on the real issues?

Stefan
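
The scaling question can be checked by running the same loop at several lengths and comparing the time per call. If a fixed cost such as the timer calls mattered, the per-call figure would drop as the loop gets longer; if it stays roughly flat, the cost is in the seek itself. A minimal Python 3 sketch, with the helper name per_call_seek_ns and the loop lengths chosen for illustration:

```python
import os
import tempfile
import time


def per_call_seek_ns(counts=(1000, 10000, 100000)):
    """Return {n: nanoseconds per seek} for each loop length n."""
    fd, path = tempfile.mkstemp()
    results = {}
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"+" * 1000)
        with open(path, "rb") as f1:
            for n in counts:
                t0 = time.perf_counter()
                for _ in range(n):
                    f1.seek(0)
                elapsed = time.perf_counter() - t0
                results[n] = elapsed / n * 1e9
    finally:
        os.remove(path)
    return results


if __name__ == "__main__":
    for n, ns in sorted(per_call_seek_ns().items()):
        print("%8d seeks: %.0f ns per call" % (n, ns))
```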