
comp.lang.ruby

Ruby, memory and speed

Guillaume Marcais

8/17/2006 1:54:00 PM

I have a script that aggregates data from multiple files, stores it all
in a hash, and then emits a summary on standard output. The input files
(text files) are fairly big, like 4 of about 50 MB and 4 of about 350 MB.
The hash will grow to about 500,000 keys. The memory footprint of the
ruby process as reported by top is above 2 GB.

When the script starts, it processes the files at a speed of 10 KB/s or
so. Not lightning fast, but it will get the job done. As time goes on,
the speed drops to 100 bytes/s or less, while still taking 100%
CPU time. Unbearable. The machine it is running on is pretty good:
4x AMD Opteron 64-bit, 32 GB memory, local SCSI RAIDed drives.

Does the performance of Ruby collapse past a certain memory usage? Like
the GC kicks in all the time.

Any clue on how to speed this up? Any help appreciated.

Guillaume.


The code is as follows:

delta and snps are IOs. reads is a hash. max is an integer (4 in my case).
It expects a line starting with a '>' on delta. Then it reads some
information from delta (and discards the rest) and some more information
from snps (if present). All this is then recorded in the reads hash.
Each entry in the hash is an array holding the 4 best matches found so
far.

def delta_reorder(delta, snps, reads, max = nil)
  l = delta.gets or return
  snps_a = nil
  loop do
    l =~ /^>(\S+)\s+(\S+)/ or break
    contig_name, read_name = $1, $2
    read = (reads[read_name] ||= [])
    loop do
      l = delta.gets or break
      l[0] == ?> and break
      cs, ce, rs, re, er = l.scan(/\d+/)
      er_i = er.to_i
      cs && ce && rs && re && er or break
      l = delta.gets while l && l != "0\n"
      if snps
        snps_a = []
        er_i.times { l << snps.gets or break; snps_a << l.split[-1] }
      end
      score = (re.to_i - rs.to_i).abs - 6 * er_i
      if max
        # i = read.bsearch_upper_boundary { |x| score <=> x[1] }
        # read.insert(i, [contig_name, score, cs, ce, rs, re, er, snps_a])
        # read.slice!(max..-1) if read.size > max
        if read.size >= max
          min = read.min { |x, y| x[1] <=> y[1] }
          if score > min[1]
            min.replace([contig_name, score, cs, ce, rs, re, er, snps_a])
          end
        else
          read << [contig_name, score, cs, ce, rs, re, er, snps_a]
        end
      else
        if !read[0] || score > read[0][1]
          read[0] = [contig_name, score, cs, ce, rs, re, er, snps_a]
        end
      end
    end
  end
end


Example of data (comments after # are mine, not present in file):

# read_name (hash key) is gnl|ti|379331986
>gi|56411835|ref|NC_004353.2| gnl|ti|379331986 1281640 769
246697 246940 722 479 22 22 0 # Keep this info. Collect 22 lines from snps IO
0 # Skip
440272 440723 156 617 41 41 0 # Keep this info. Collect 41 lines from snps IO
147 # Skip 'til 0
-22
-206
-1
-1
-1
-1
-1
-1
-1
-1
-1
0
441263 441492 384 152 17 17 0 # Keep. Collect lines from snps.
-44 # Skip 'til 0
-1
-1
-1
37
0
>gi|56411835|ref|NC_004353.2| gnl|ti|379331989 1281640 745 # and so forth...
453805 453934 130 1 8 8 0
0


13 Answers

Morton Goldberg

8/17/2006 2:35:00 PM


On Aug 17, 2006, at 9:54 AM, Guillaume Marcais wrote:
> Does the performance of Ruby collapse past a certain memory usage?
> Like the GC kicks in all the time.

I think you got it.

Regards, Morton


Robert Klemme

8/17/2006 2:42:00 PM


On 17.08.2006 15:54, Guillaume Marcais wrote:
> I have a script that aggregates data from multiple file, store it all in
> a hash, and then emit a summary on standard input. The input files (text
> files) are fairly big, like 4 of about 50Mb and 4 of about 350Mb. The
> hash will grow to about 500 000 keys. The memory footprint of the ruby
> process as reported by top is above 2 Gigs.
>
> When the script start, it processes the files at a speed of 10K/s or so.
> Not lightening fast, but will get the job done. As time goes on, the
> speed drops down to 100 bytes/s or less, while still taking 100% CPU
> time. Unbearable. The machine it is running on is pretty good: 4xAMD
> Opteron 64bit, 32G memory, local scsi raided drive.
>
> Does the performance of Ruby collapse past a certain memory usage? Like
> the GC kicks in all the time.
>
> Any clue on how to speed this up? Any help appreciated.
>
> Guillaume.
>
>
> The code is as followed:
>
> delta and snps are IOs. reads is a hash. max is an integer (4 in my case).
> It expects a line starting with a '>' on delta. Then it reads some
> information on delta (and discard the rest) and some more information on
> snps (if present). All this is then recorded in the reads hash file.
> Each entry entry in the hash are arrays with the 4 best match found so far.
>
> def delta_reorder(delta, snps, reads, max = nil)
> l = delta.gets or return
> snps_a = nil
> loop do
> l =~ /^>(\S+)\s+(\S+)/ or break
> contig_name, read_name = $1, $2

Small optimization, which will help only if delta_reorder is called often:

read = (reads[read_name.freeze] ||= [])

Background: a Hash will dup a non-frozen String key to avoid nasty effects
if the original changes.
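
A minimal sketch of the effect (not from your script): the Hash dups an
unfrozen String key when it stores it, so freezing read_name up front saves
one String copy each time a new read is inserted.

h1 = {}
h2 = {}
key = "gnl|ti|379331986"

h1[key] = true
p h1.keys.first.equal?(key)   # => false: the unfrozen key was dup'ed and frozen

h2[key.freeze] = true
p h2.keys.first.equal?(key)   # => true: an already frozen key is stored as-is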

<snip/>

To make life easier for people who want to play with this, you could
provide a complete test set (original script + data files).

I don't fully understand your processing, but maybe there's an option to
improve this algorithm-wise.

Kind regards

robert

Bill Kelly

8/17/2006 5:01:00 PM


From: "Francis Cianfrocca" <garbagecat10@gmail.com>
>
> This is a perfect example of what I've noticed many times: Ruby's
> performance is perfectly fast and acceptable until the working set gets a
> certain (not terribly large) size, then it falls off a cliff. GC perhaps has
> something to do with it, but I suspect that's only a small part of the
> problem.

Can anyone speculate as to what other subsystems or facets of Ruby's
architecture, besides GC, might possibly cause this?


Regards,

Bill



Guillaume Marcais

8/17/2006 5:31:00 PM



On Aug 17, 2006, at 10:45, Robert Klemme wrote:

> On 17.08.2006 15:54, Guillaume Marcais wrote:
>> I have a script that aggregates data from multiple file, store it all
>> in a hash, and then emit a summary on standard input. The input files
>> (text files) are fairly big, like 4 of about 50Mb and 4 of about
>> 350Mb. The hash will grow to about 500 000 keys. The memory footprint
>> of the ruby process as reported by top is above 2 Gigs.
>> When the script start, it processes the files at a speed of 10K/s or
>> so. Not lightening fast, but will get the job done. As time goes on,
>> the speed drops down to 100 bytes/s or less, while still taking 100%
>> CPU time. Unbearable. The machine it is running on is pretty good:
>> 4xAMD Opteron 64bit, 32G memory, local scsi raided drive.
>> Does the performance of Ruby collapse past a certain memory usage?
>> Like the GC kicks in all the time.
>> Any clue on how to speed this up? Any help appreciated.
>> Guillaume.
>> The code is as followed:
>> delta and snps are IOs. reads is a hash. max is an integer (4 in my
>> case).
>> It expects a line starting with a '>' on delta. Then it reads some
>> information on delta (and discard the rest) and some more information
>> on snps (if present). All this is then recorded in the reads hash
>> file.
>> Each entry entry in the hash are arrays with the 4 best match found
>> so far.
>> def delta_reorder(delta, snps, reads, max = nil)
>> l = delta.gets or return
>> snps_a = nil
>> loop do
>> l =~ /^>(\S+)\s+(\S+)/ or break
>> contig_name, read_name = $1, $2
>
> Small optimization, which will help only if delta_reorder is called
> ofen:
>
> read = (reads[read_name.freeze] ||= [])
>
> Background: a Hash will dup a non frozen string to avoid nasty effects
> if the original changes.
>
> <snip/>
>
> To make people's lives who want to play with this easier you could
> provide a complete test set (original script + data files).

Will do, when I get to my office.

Guillaume.

>
> I don't fully understand your processing but maybe there's an option
> to improve this algorithm wise.
>
> Kind regards
>
> robert
>
>


Francis Cianfrocca

8/17/2006 5:40:00 PM


On 8/17/06, Bill Kelly <billk@cts.com> wrote:
> From: "Francis Cianfrocca" <garbagecat10@gmail.com>
> >
> > This is a perfect example of what I've noticed many times: Ruby's
> > performance is perfectly fast and acceptable until the working set gets a
> > certain (not terribly large) size, then it falls off a cliff. GC perhaps has
> > something to do with it, but I suspect that's only a small part of the
> > problem.
>
> Can anyone speculate as to what other subsystems or facets of Ruby's
> architecture, besides GC, might possibly cause this?
>

Total, rank speculation here, but I don't suspect GC because I
consistently see this kind of problem even when a lot of objects are
being created and not many are becoming garbage. My wild
unsubstantiated guess is that Ruby creates data structures in memory
that are not designed to maximize processor- and L2-cache
utilization. Perhaps data objects overflow cache lines, or they are
heavily pointer-driven (requiring more memory-bus cycles than is
optimal), or they exhibit poor locality of reference.

Guillaume Marcais

8/17/2006 5:49:00 PM


On Aug 17, 2006, at 11:16, Francis Cianfrocca wrote:

> On 8/17/06, Guillaume Marcais <guslist@free.fr> wrote:
>>
>> I have a script that aggregates data from multiple file, store it all
>> in a hash, and then emit a summary on standard input. The input files
>> (text files) are fairly big, like 4 of about 50Mb and 4 of about
>> 350Mb.
>> The hash will grow to about 500 000 keys. The memory footprint of the
>> ruby process as reported by top is above 2 Gigs.
>
>
>
> This is a perfect example of what I've noticed many times: Ruby's
> performance is perfectly fast and acceptable until the working set
> gets a
> certain (not terribly large) size, then it falls off a cliff. GC
> perhaps has
> something to do with it, but I suspect that's only a small part of the
> problem.
>

Is there any tuning of the GC so it kicks in less frequently when the
memory consumption is large? Or could it be that the Hash algorithm
chokes when the number of keys gets large (like > 100_000)?

I guess I could re-implement it in C or C++, but it saddens me because it
is a quick script for (probably) a one-time task, and doing string
processing is so much easier in Ruby. I'll try Perl first.

> Before I spend a lot of time understanding your algorithm: is there a
> specific reason why you need to keep the entire set in memory? You say
> you're generating summary output at the end of the run, but can you
> accumulate your summary in an object of fixed size?

No, I can't (at least I don't see how). I have n reads split into ns
files and m contigs split into ms files. Then all ns read files are
matched against all ms contig files, which leads to ns * ms match (or
delta) files. And each read may have one or many matches with any of the
contigs.

I want to extract for each read the 4 "best" matches. (The score is
defined as the length of the match minus 6 times the number of errors
in that match; we are talking about DNA alignment, so a match can have
a small number of mismatches, due to sequencing errors, mutations,
etc.) The snps here are just added information, not crucial, but they
probably add to the problem because they increase the size of the data
set. They give the exact positions of all the single mismatches inside a
match. The number of lines (one mismatch per line) is given by the
sub-header in the delta file.
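
To make that concrete, the first kept sub-header from the sample data
above works out like this:

# "246697 246940 722 479 22 22 0" -> rs = 722, re = 479, 22 errors
rs, re, er = 722, 479, 22
score = (re - rs).abs - 6 * er
p score   # => 111 (match length 243 minus the 6 * 22 error penalty)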

So for every set of reads, I parse the corresponding ms delta files and
extract the best 4 matches for each read. The routine shown before
parses only one file. The same hash is used for all ms files.
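
For illustration, a simplified driver along those lines (sketch only,
with made-up file names; one snps file per delta file is assumed):

reads = {}
Dir.glob("*.delta").sort.each do |delta_file|
  snps_file = delta_file.sub(/\.delta\z/, ".snps")
  File.open(delta_file) do |delta|
    File.open(snps_file) do |snps|
      # delta_reorder as shown earlier; keep the 4 best matches per read
      delta_reorder(delta, snps, reads, 4)
    end
  end
end
# reads now maps read_name => up to 4 [contig_name, score, ...] entries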

>
> Or does your summary depend on some kind of transformation on the
> entire set
> (like a numeric sort)? If so, there are things you can do to improve
> the
> algorithm so you're not keeping a large working set in memory.
>

Maybe. I haven't seen it yet. And I went first with the dumb algorithm,
as it is an ad-hoc script, not part of some great piece of software.

Thanks,
Guillaume.

Mauricio Fernández

8/17/2006 7:13:00 PM


On Thu, Aug 17, 2006 at 10:54:04PM +0900, Guillaume Marcais wrote:
[...]
> When the script start, it processes the files at a speed of 10K/s or
> so. Not lightening fast, but will get the job done. As time goes on,
> the speed drops down to 100 bytes/s or less, while still taking 100%
> CPU time. Unbearable. The machine it is running on is pretty good:
> 4xAMD Opteron 64bit, 32G memory, local scsi raided drive.
>
> Does the performance of Ruby collapse past a certain memory usage? Like
> the GC kicks in all the time.
>
> Any clue on how to speed this up? Any help appreciated.
[...]

Have you tried to increment GC_MALLOC_LIMIT in gc.c?

--
Mauricio Fernandez - http://eige... - singular Ruby

David Vallner

8/17/2006 8:16:00 PM


On Thu, 17 Aug 2006 19:49:03 +0200, Guillaume Marcais <guslist@free.fr>
wrote:
> Is there any tuning of GCC so it kicks in less frequently when the
> memory consumption is large? Or could it be that the Hash algorithm
> chokes when the number of keys gets large (like > 100_000)?

If I remember a past thread correctly, the Ruby GC is set to keep memory
usage below 8 MB by default. This is determined by a #define in the
interpreter source code, and I don't think it's really an adaptive
algorithm. If GC is the performance problem, you could try changing that
constant and rebuilding the interpreter.
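
To check whether GC really is the problem before rebuilding, you could time
a forced full collection every so often and watch whether it grows along
with the hash, something like this (untested sketch):

require 'benchmark'

# Call this e.g. every 10_000 records; if the time per full collection
# keeps climbing with reads.size, GC is a likely culprit.
def report_gc_cost(reads)
  t = Benchmark.realtime { GC.start }
  $stderr.puts "full GC pass: #{'%.2f' % t}s with #{reads.size} keys"
end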

As for the Hash problem, profiling could probably tell you that. Of
course, profiling with such a huge dataset when you already know the code
gets anomalously slow might take really, really long.
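
For what it's worth, on a cut-down input the stdlib profiler is probably
enough. A sketch, assuming sample.delta and sample.snps are small excerpts
and your delta_reorder is already loaded:

require 'profile'   # prints a per-method profile to STDERR at exit

reads = {}
File.open("sample.delta") do |delta|
  File.open("sample.snps") do |snps|
    delta_reorder(delta, snps, reads, 4)
  end
end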

David Vallner

guillaume.marcais

8/18/2006 12:49:00 AM



Mauricio Fernandez wrote:
> On Thu, Aug 17, 2006 at 10:54:04PM +0900, Guillaume Marcais wrote:
> [...]
> > When the script start, it processes the files at a speed of 10K/s or
> > so. Not lightening fast, but will get the job done. As time goes on,
> > the speed drops down to 100 bytes/s or less, while still taking 100%
> > CPU time. Unbearable. The machine it is running on is pretty good:
> > 4xAMD Opteron 64bit, 32G memory, local scsi raided drive.
> >
> > Does the performance of Ruby collapse past a certain memory usage? Like
> > the GC kicks in all the time.
> >
> > Any clue on how to speed this up? Any help appreciated.
> [...]
>
> Have you tried to increment GC_MALLOC_LIMIT in gc.c?

No. Any hint on what value I can raise it to? Any upper limit that I
should not cross?

Guillaume.
>
> --
> Mauricio Fernandez - http://eige... - singular Ruby
