
comp.lang.c++

ifstream speed

Frank Neuhaus

11/16/2008 10:43:00 AM

Hi,

I have a large file that I am opening with an std::ifstream. This file
contains a number of objects. My classes know how to deserialize from
a std::istream, so right now I am just passing this std::ifstream to
my class constructors and they read themselves from the stream. Those
classes read their members right from the stream (i.e. they don't read
a number of bytes into a buffer and then extract their data from there
or anything). Unfortunately I have the impression that my current
approach is somewhat slow. I believe that the ifstream is not
buffering effectively. I would expect it to read a big chunk into
memory, and then have my deserialization basically work right inside
memory. Could this be? What else could be the cause of the slowdown?
How could I make this faster?
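The pattern described above (each class reading its own members straight from the stream, with no intermediate buffer) might look something like this sketch; the `Record` type and `load` helper are hypothetical, not from the original post:

```cpp
#include <cstdint>
#include <fstream>
#include <istream>
#include <vector>

// Hypothetical record type that deserializes itself from a std::istream,
// one small read() call per member -- many tiny reads per object.
struct Record {
    std::uint32_t id;
    double value;

    explicit Record(std::istream& in) {
        in.read(reinterpret_cast<char*>(&id), sizeof id);
        in.read(reinterpret_cast<char*>(&value), sizeof value);
    }
};

// Reads records until end of file, passing the same ifstream to every
// constructor, as in the approach described above.
std::vector<Record> load(const char* path) {
    std::ifstream in(path, std::ios::binary);
    std::vector<Record> out;
    while (in.peek() != std::ifstream::traits_type::eof())
        out.emplace_back(in);
    return out;
}
```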

Thank you
4 Answers

Juha Nieminen

11/16/2008 12:29:00 PM


Frank Neuhaus wrote:
> What else could be the cause of the slowdown?
> How could I make this faster?

Although I have not measured in years, and this may have changed with
more modern compilers, at least some years ago the C++ streams were
significantly slower than the C streams in most (if not all) compilers.
I'm not exactly sure why this is so.

Two completely different projects I have been involved in saw a very
significant increase in reading and writing speed when the usage of C++
streams was changed to C streams.

I recommend that you write a small test program on your system which
does the same thing with lots of input (or output) data, first using C++
streams and then C streams, and measure whether there is a significant
difference in speed. If there is, then your actual program might be
worth refactoring.
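A minimal timing harness along those lines could compare raw sequential reads through `std::ifstream::read` against `std::fread`; the file name, buffer size, and function names below are arbitrary choices for illustration:

```cpp
#include <chrono>
#include <cstdio>
#include <fstream>
#include <vector>

using Clock = std::chrono::steady_clock;

// Reads the whole file in buf-sized pieces through a C++ stream and
// returns the elapsed time in seconds.
double time_ifstream(const char* path, std::vector<char>& buf) {
    auto t0 = Clock::now();
    std::ifstream in(path, std::ios::binary);
    while (in.read(buf.data(), static_cast<std::streamsize>(buf.size())))
        ;  // keep reading full buffers until a short read / EOF
    return std::chrono::duration<double>(Clock::now() - t0).count();
}

// Same workload through the C stdio layer.
double time_fread(const char* path, std::vector<char>& buf) {
    auto t0 = Clock::now();
    if (std::FILE* f = std::fopen(path, "rb")) {
        while (std::fread(buf.data(), 1, buf.size(), f) == buf.size())
            ;
        std::fclose(f);
    }
    return std::chrono::duration<double>(Clock::now() - t0).count();
}
```

Running each variant several times over a file much larger than the OS cache gives a more honest picture than a single pass.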

Marcel Müller

11/16/2008 4:18:00 PM


Frank Neuhaus wrote:
> I have a large file that I am opening with an std::ifstream. This file
> contains a number of objects. My classes know how to deserialize from
> a std::istream, so right now, I am just passing this std::ifstream to
> my class constructors and they read themselves from the stream
[...]
> I would expect it to read a big chunk into
> memory, and then have my deserialization basically work right inside
> memory. Could this be? What else could be the cause of the slowdown?
> How could I make this faster?

Well, the iostream classes...

If you are talking about a really large amount of data, the big chunks
must be on the order of a few megabytes to hide the awful access times of
common directly accessible storage devices. Maybe your file system cache
does the job for you, maybe not. At least the standard buffers of the I/O
libraries are not that large.

Have you checked whether the deserialization mainly consumes CPU time or
mainly waits on I/O?
Furthermore, are you using portable I/O? That usually means working byte
by byte with many shift operations.

Some operating systems have a way of specifying sequential access to a
stream. This can significantly improve the cache efficiency and the
throughput. Unfortunately neither C nor C++ has a standard way to set
such flags.
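On POSIX systems one such OS-level hint is posix_fadvise with POSIX_FADV_SEQUENTIAL, which tells the kernel to expect sequential reads (and typically enlarges readahead); this is outside standard C and C++, so the sketch below is POSIX-only:

```cpp
#include <fcntl.h>
#include <unistd.h>

// Opens a file read-only and advises the kernel that it will be read
// sequentially. The hint is best-effort; a failed advise is not an error.
int open_sequential(const char* path) {
    int fd = open(path, O_RDONLY);
    if (fd >= 0)
        posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);
    return fd;
}
```

On Windows the analogous knob is the FILE_FLAG_SEQUENTIAL_SCAN flag to CreateFile.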

As a start you might tweak the buffer of the underlying filebuf
(via its pubsetbuf method).
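Concretely, pubsetbuf is the public entry point to the protected setbuf, and on most implementations the buffer must be installed before open() for it to take effect. A sketch (the helper name and 4 MiB size are arbitrary):

```cpp
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Reads a whole file through an ifstream whose filebuf has been given an
// enlarged buffer. pubsetbuf must be called before open() on most
// implementations, or the default (small) buffer is used instead.
std::string read_all_buffered(const char* path) {
    std::vector<char> buf(4u << 20);  // 4 MiB; size chosen arbitrarily
    std::ifstream in;
    in.rdbuf()->pubsetbuf(buf.data(),
                          static_cast<std::streamsize>(buf.size()));
    in.open(path, std::ios::binary);
    return std::string((std::istreambuf_iterator<char>(in)),
                       std::istreambuf_iterator<char>());
}
```

Note that the standard only permits implementations to honor pubsetbuf; it does not require them to, which is one reason results vary between compilers.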


Marcel

Hendrik Schober

11/18/2008 9:26:00 AM


Juha Nieminen wrote:
> Frank Neuhaus wrote:
>> What else could be the cause of the slowdown?
>> How could I make this faster?
>
> Although I have not measured in years, and this may have changed with
> more modern compilers, at least some years ago the C++ streams were
> significantly slower than the C streams in most (if not all) compilers.
> I'm not exactly sure why this is so.

Usually because C++ streams are implemented on top of C streams.
A decade or so ago Dietmar Kühl had an implementation that wasn't,
but he hasn't maintained it in years. It was supposed to be fast.

> [...]

Schobi

Frank Neuhaus

11/18/2008 11:32:00 AM



"Marcel Müller" <news.5.maazl@spamgourmet.org> schrieb im Newsbeitrag
news:492047d2$0$31331$9b4e6d93@newsspool4.arcor-online.net...
> Frank Neuhaus wrote:
>> I have a large file that I am opening with an std::ifstream. This file
>> contains a number of objects. My classes know how to deserialize from
>> a std::istream, so right now, I am just passing this std::ifstream to
>> my class constructors and they read themselves from the stream
> [...]
>> I would expect it to read a big chunk into
>> memory, and then have my deserialization basically work right inside
>> memory. Could this be? What else could be the cause of the slowdown?
>> How could I make this faster?
>
> Well, the iostream classes...
>
> If you are talking about a really large amount of data, the big chunks
> must be on the order of a few megabytes to hide the awful access times of
> common directly accessible storage devices. Maybe your file system cache
> does the job for you, maybe not. At least the standard buffers of the I/O
> libraries are not that large.
>
> Have you checked whether the deserialization mainly consumes CPU time or
> mainly waits on I/O?
> Furthermore, are you using portable I/O? That usually means working byte
> by byte with many shift operations.

Hm, it's a bit hard to benchmark in my app, but I strongly believe it was I/O.

> Some operating systems have a way of specifying sequential access to a
> stream. This can significantly improve the cache efficiency and the
> throughput. Unfortunately neither C nor C++ has a standard way to set such
> flags.
>
> As a start you might tweak the buffer of the underlying filebuf
> (via its pubsetbuf method).

I tried that without success (I changed it to a 5 MB buffer size but the
performance didn't improve).
I just replaced the stream with a custom class that asynchronously reads
chunks of 10 MB using an additional I/O thread (all with fopen/fread/...).
Now the performance is OK. Note to self: don't use iostream again for large
amounts of data...
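One possible shape for such a reader is sketched below: a background thread fills fixed-size chunks with std::fread while the consumer drains a queue. The class name and chunk size are illustrative, the queue is unbounded (a real version would add backpressure), and C++11 threads stand in for whatever threading API was actually used:

```cpp
#include <condition_variable>
#include <cstddef>
#include <cstdio>
#include <deque>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical asynchronous chunk reader: a worker thread reads the file
// with fopen/fread into chunk-sized buffers; next() hands them out in order.
class AsyncChunkReader {
public:
    explicit AsyncChunkReader(const char* path, std::size_t chunk = 10u << 20)
        : chunk_(chunk), worker_([this, path] { run(path); }) {}

    ~AsyncChunkReader() { worker_.join(); }

    // Returns the next chunk, or an empty vector once the file is exhausted.
    std::vector<char> next() {
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [this] { return !queue_.empty() || done_; });
        if (queue_.empty()) return {};
        std::vector<char> c = std::move(queue_.front());
        queue_.pop_front();
        return c;
    }

private:
    void run(const char* path) {
        if (std::FILE* f = std::fopen(path, "rb")) {
            for (;;) {
                std::vector<char> c(chunk_);
                std::size_t n = std::fread(c.data(), 1, c.size(), f);
                if (n == 0) break;
                c.resize(n);  // last chunk may be short
                std::lock_guard<std::mutex> lk(m_);
                queue_.push_back(std::move(c));
                cv_.notify_one();
            }
            std::fclose(f);
        }
        std::lock_guard<std::mutex> lk(m_);
        done_ = true;
        cv_.notify_all();
    }

    std::size_t chunk_;
    std::mutex m_;
    std::condition_variable cv_;
    std::deque<std::vector<char>> queue_;
    bool done_ = false;
    std::thread worker_;  // declared last so the members above exist first
};
```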

Thanks
Frank