comp.lang.ruby

simple garbage problem

cchayden.nyt

12/31/2008 4:29:00 PM

When I run the program:

STDIN.each_line { |line| line.split("\t") }

with a big input file, the process size grows rapidly, reaching 1G in
less than 1 minute.
If I let it go, it continues to grow at this rate until the whole
system fails.
This is ruby 1.8.6 on fedora 9.

Is this a known problem?
Is there some way to work around it?
4 Answers

Andrew Timberlake

1/2/2009 8:30:00 AM

[Note: parts of this message were removed to make it a legal post.]

On Fri, Jan 2, 2009 at 10:10 AM, <cchayden.nyt@gmail.com> wrote:

> When I run the program:
>
> STDIN.each_line { |line| line.split("\t") }
>
> with a big input file, the process size grows rapidly, reaching 1G in
> less than 1 minute.
> If I let it go, it continues to grow at this rate until the whole
> system fails.
> This is ruby 1.8.6 on fedora 9.
>
> Is this a known problem?
> Is there some way to work around it?
>
>
I haven't done much input processing in ruby but I've used this with no
problems on large files:

while line = STDIN.gets
...
end

Just check that the line breaks in your file are the same ones that
gets/each_line split on.
Also make sure the file actually has line breaks (sometimes you just
have to check).
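
One way to do that check is to sample the start of the input and count which line-terminator styles appear. This is only a minimal sketch: `terminator_counts` and the 64 KB sample size are hypothetical choices made for illustration, and a StringIO stands in for the real file.

```ruby
require "stringio"

# Count the line-terminator styles found in the first chunk of a stream.
# sample_size is an arbitrary value picked for this illustration.
def terminator_counts(io, sample_size = 64 * 1024)
  sample = io.read(sample_size) || ""
  crlf = sample.scan("\r\n").size
  lf   = sample.count("\n") - crlf   # "\n" not preceded by "\r"
  cr   = sample.count("\r") - crlf   # "\r" not followed by "\n"
  { :crlf => crlf, :lf => lf, :cr => cr }
end

# Stand-in data with one terminator of each style.
counts = terminator_counts(StringIO.new("a\tb\r\nc\td\ne\tf\r"))
p counts
```

If both the `:lf` and `:crlf` counts are zero for a large sample, each_line with its default separator ("\n") will treat the whole input as a single line.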

Andrew
http://ramblingso...

Roger Pack

1/2/2009 10:27:00 PM

> Is this a known problem?
> Is there some way to work around it?

Which version of ruby are you using? Very old ones have some leaks.
Besides that you may be able to apply some recent patches to GC [1] and
see if they help.
-=r
[1] http://www.ruby-...to...

Robert Klemme

1/4/2009 7:29:00 PM

On 31.12.2008 17:28, cchayden.nyt@gmail.com wrote:
> When I run the program:
>
> STDIN.each_line { |line| line.split("\t") }

Is it really only this line? How do you feed the big file to stdin?
Does the big file actually _have_ lines? If it does not, you would
likely see exactly that behavior, because each_line has to keep reading
until it finds a line terminator.
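
The effect can be seen on a small scale without a real file. The sketch below is an illustration only: it uses StringIO as a stand-in for stdin, and the data sizes are arbitrary. With no "\n" in the input, each_line yields the entire input as one string; with terminators present, each yielded string stays small.

```ruby
require "stringio"

# About 1 MB of tab-separated data with no "\n" anywhere.
no_newlines = StringIO.new("x\t" * 500_000)
sizes = []
no_newlines.each_line { |line| sizes << line.size }
puts "without terminators: #{sizes.size} line(s), largest #{sizes.max} bytes"

# The same amount of data, but with a "\n" after every second field.
with_newlines = StringIO.new("x\ty\n" * 250_000)
longest = 0
with_newlines.each_line { |line| longest = line.size if line.size > longest }
puts "with terminators: largest line #{longest} bytes"
```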

> with a big input file, the process size grows rapidly, reaching 1G in
> less than 1 minute.
> If I let it go, it continues to grow at this rate until the whole
> system fails.

What does this mean? Does the kernel panic? Or does the Ruby process
terminate with an error?

> This is ruby 1.8.6 on fedora 9.

You have not accidentally switched off GC completely, have you?
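
A quick way to check from inside the script is to use the return value of `GC.enable`, which is true only if the collector had been disabled before the call. This is a small sketch and the variable name is made up for illustration; note that it also turns GC back on as a side effect.

```ruby
# GC.enable returns true if GC had been disabled, false otherwise.
gc_was_disabled = GC.enable

if gc_was_disabled
  puts "GC had been switched off (it is enabled again now)"
else
  puts "GC was already enabled"
end
```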

> Is this a known problem?
> Is there some way to work around it?

1.8.6 is not really current any more. I would upgrade if possible, or at
least get the latest version of the package. I am afraid we have too
little information to give more specific advice.

When I try this with my cygwin version, memory consumption stays at
roughly 3MB:

robert@fussel ~
$ perl -e 'foreach $i (1..10000000) {print $i, "--\t--\t--\t--\n";}' | ruby -e 'STDIN.each_line { |line| line.split("\t") }'

robert@fussel ~
$ ruby -v
ruby 1.8.7 (2008-08-11 patchlevel 72) [i386-cygwin]

robert@fussel ~
$

But if I read a file that does not have lines, the behavior is as you
describe: memory goes up and up.

ruby -e 'STDIN.each_line { |line| line.split("\t") }' </dev/zero
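
If the input really has no line terminators, one possible workaround is to read it in fixed-size blocks instead of lines, so that no single string grows without bound. The following is a sketch under assumptions, not a drop-in replacement for the original one-liner: `count_tabs` and `CHUNK_SIZE` are hypothetical names, a StringIO stands in for stdin, and counting tabs is just a placeholder for whatever per-field work is needed (fields that straddle a block boundary would need extra handling).

```ruby
require "stringio"

CHUNK_SIZE = 64 * 1024  # arbitrary block size chosen for illustration

# Read the stream in bounded blocks; io.read(n) returns nil at end of input.
def count_tabs(io, chunk_size = CHUNK_SIZE)
  tabs = 0
  while (chunk = io.read(chunk_size))
    tabs += chunk.count("\t")
  end
  tabs
end

# Stand-in input: tab-separated data with no line terminators at all.
total = count_tabs(StringIO.new("a\tb\tc" * 100_000), 1024)
puts "tabs seen: #{total}"
```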

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end

Charles Oliver Nutter

1/6/2009 9:47:00 AM

cchayden.nyt@gmail.com wrote:
> When I run the program:
>
> STDIN.each_line { |line| line.split("\t") }
>
> with a big input file, the process size grows rapidly, reaching 1G in
> less than 1 minute.
> If I let it go, it continues to grow at this rate until the whole
> system fails.
> This is ruby 1.8.6 on fedora 9.
>
> Is this a known problem?
> Is there some way to work around it?

Might be worth trying on JRuby. We've had no such reports, and the
garbage collector is rock-solid.

- Charlie