[lnkForumImage]
TotalShareware - Download Free Software

Confronta i prezzi di migliaia di prodotti.
Asp Forum
 Home | Login | Register | Search 


 

Forums >

comp.lang.ruby

Memory Question

Zach Dennis

2/26/2006 9:51:00 AM

I am doing some large queries with Mysql and the memory that gets allocated never seems to go back
to the system. In this test I am querying 47,000 records.

My setup for my test is ubuntu breezy, rails 1.0, mysql 5.0.18 and a compiled mysql 2.7 ruby driver
running in production mode. I have wrote a helper to my program which monitors object count, memory
utilization and a threshold of 10% to determine how many objects are sticking around and how many
are being garbage collected.

On first run...

Loaded suite test/unit/table1_test
Started
String count 73685
Building query 21Mb
String count 73745

Received query results 89Mb
String count 901034
Threshold breaker String (827236) started w/ 73743 ended w/ 900979
Threshold breaker Table1 47000 started w/ 0 ended w/ 47000

Starting GC 89Mb
String count 25041
Done with GC 82Mb
Threshold breaker String -48704 started w/ 73743 ended w/ 25039

The first block of text shows that when I started my test ruby was utilizing 21Mb of memory and over
70,000 strings were in existance. After the query results were constructed there was over 900,000
strings in existance and my ruby process had grown to 89Mb. The threshold is shown as a gain in
47,000 Table1 objects and 827,236 Strings since the the first object count was captured (which
occured right before the original String count).

After garbage collecting the Stringcount is down 48,704 from when it was first captured, and there
is no threshold break for Table1 because when the program first started there were 0 in existance.
Thus meaning that 0 Table1 objects are in existance. However the memory size never seems to leave
82Mb. It never seems to leave that size.

If I query again, memory goes up to 133Mb, after all Strings and ActiveRecord models have been
garbage collected, memory goes down ever so slightly again. And this continues as long as I keep
querying.

The test is broken out into three methods:
- test_build_mem_usage (count objects, perform query, store results in local variable )
- test_gc (count objects, GC.start)
- test_z (done, recount objects since GC is done)

I guess my biggest unknown at the moment is...as I do large queries is ruby just hanging onto that
space? Why would it keep growing for the next time I did 47,000 item query, if it already had unused
space available from my last query?

At the bottom of this post is the actual test schema and test code I was using.

Zach

---- start schema ----
create table table1 (
id int unsigned not null auto_increment,
description varchar(255),
store_name varchar(255),
address1 varchar(40),
address2 varchar(40),
city varchar(40),
state varchar(15),
zip_code varchar(5),
primary key( id )
)TYPE=MyISAM;


---- start test code ----
# hook into the Rails environment
require File.dirname(__FILE__) + '/../test_helper'

require 'table1'

class Object
def count_objects
objects = Hash.new{ |h,k| h[k]=0 }
ObjectSpace.each_object{ |obj| objects[obj.class] += 1 }
objects
end

def print_threshold_breakers hsh1, hsh2, threshold
# threshold is in percentages
threshold = 1.0 + 1.0 / threshold

hsh2.each_key do |key|
max_num = hsh1[key] * threshold
min_num = hsh1[key] / threshold

if hsh2[key] > max_num or hsh2[key] < min_num
putsf "Threshold breaker #{key.to_s}", "(#{hsh2[key]-hsh1[key]}) started w/ #{hsh1[key]}
ended w/ #{hsh2[key]}"
end
end
end

def count_objects_for clazz
c = 0
ObjectSpace.each_object{ |o| c+=1 if o.is_a? clazz }
c
end

def mem_usage
# get the top two lines from top
line_arr = `ps -p #{Process.pid} -F`.split( /\n/ )

# split the line array into columns of headers and data
arr1, arr2 = line_arr.map{ |line| line.split( /\s+/ ) }

# force the same number of elements in arr2 as there are in arr1 by joining any leftover
elements
column_arr = [ arr1 ]
column_arr << arr2[0 .. arr1.size-2] + arr2[arr1.size-1 .. arr2.size-1].join( ' ' ).to_a

# get column/data key pair array
keypair_arr = column_arr.transpose

# create hash
hsh = {}
keypair_arr.each{ |e| hsh[e[0]] = e[1] }

# grab results from RSS, which are stored in Kb
(hsh[ 'RSS' ].to_i / 1024.0).round.to_s << "Mb"
end

def putsf label, *args
printf( "%-40.40s %-40s\n", label.to_s, args.join( ' ' ) )
end

def print_class_count clazz
putsf "#{clazz.name} count", count_objects_for( clazz )
end


end


class TableTest < Test::Unit::TestCase

def test_build_mem_usage
print_class_count String
putsf 'Building mem usage', mem_usage
h1 = @@h1 = count_objects
print_class_count String

records = Table1.find :all, :limit=>47000
@@oid = records.object_id
h2 = count_objects

putsf 'Done building mem usage', mem_usage
print_class_count String

print_threshold_breakers h1, h2, 10
sleep 2
puts
end

def test_starting_gc
putsf 'Starting GC', mem_usage

h1 = count_objects
GC.start

h2 = @@h2 = count_objects

print_class_count String
putsf 'Done with GC', mem_usage
print_threshold_breakers h1, h2, 10
puts
end

def test_z
test_build_mem_usage
test_starting_gc

test_build_mem_usage
test_starting_gc

test_build_mem_usage
test_starting_gc

print_class_count String
print_class_count Table1
putsf 'Done', mem_usage
print_threshold_breakers @@h1, @@h2, 10
puts

ObjectSpace.each_object{ |obj| puts "FOUND THE RECORD ARRAY " if obj.object_id == @@oid }
# ObjectSpace.each_object { |obj| puts obj if obj.is_a? String }
end


end



27 Answers

Eric Hodel

2/26/2006 12:01:00 PM

0

On Feb 26, 2006, at 1:51 AM, zdennis wrote:

> I am doing some large queries with Mysql and the memory that gets
> allocated never seems to go back to the system. In this test I am
> querying 47,000 records.
>
> My setup for my test is ubuntu breezy, rails 1.0, mysql 5.0.18 and
> a compiled mysql 2.7 ruby driver running in production mode. I have
> wrote a helper to my program which monitors object count, memory
> utilization and a threshold of 10% to determine how many objects
> are sticking around and how many are being garbage collected.
>
> On first run...
>
> Loaded suite test/unit/table1_test
> Started
> String count 73685
> Building query 21Mb
> String count 73745
>
> Received query results 89Mb
> String count 901034
> Threshold breaker String (827236) started w/
> 73743 ended w/ 900979
> Threshold breaker Table1 47000 started w/ 0 ended
> w/ 47000
>
> Starting GC 89Mb
> String count 25041
> Done with GC 82Mb
> Threshold breaker String -48704 started w/ 73743
> ended w/ 25039
>
> The first block of text shows that when I started my test ruby was
> utilizing 21Mb of memory and over 70,000 strings were in existance.
> After the query results were constructed there was over 900,000
> strings in existance and my ruby process had grown to 89Mb. The
> threshold is shown as a gain in 47,000 Table1 objects and 827,236
> Strings since the the first object count was captured (which
> occured right before the original String count).
>
> After garbage collecting the Stringcount is down 48,704 from when
> it was first captured, and there is no threshold break for Table1
> because when the program first started there were 0 in existance.
> Thus meaning that 0 Table1 objects are in existance. However the
> memory size never seems to leave 82Mb. It never seems to leave that
> size.
>
> If I query again, memory goes up to 133Mb, after all Strings and
> ActiveRecord models have been garbage collected, memory goes down
> ever so slightly again. And this continues as long as I keep querying.

Ruby does not feature a compacting garbage collector so you may
simply have a lot of nearly-empty pages of memory that cannot be
returned to the operating system.

If you still suspect a leak I suggest you first see if you can
achieve the same results by simply generating objects without any DB
and also with just the mysql driver, no Rails.

--
Eric Hodel - drbrain@segment7.net - http://blog.se...
This implementation is HODEL-HASH-9600 compliant

http://trackmap.rob...




minkoo.seo@gmail.com

2/26/2006 5:32:00 PM

0

Eric Hodel wrote:
> On Feb 26, 2006, at 1:51 AM, zdennis wrote:
>
> Ruby does not feature a compacting garbage collector so you may
> simply have a lot of nearly-empty pages of memory that cannot be
> returned to the operating system.
>

What? Is this true? It's quite disappointing. Is there any master plan
in the ruby language for supporting compacting GC?

- Minkoo Seo

Zach Dennis

2/26/2006 10:47:00 PM

0

Eric Hodel wrote:
> Ruby does not feature a compacting garbage collector so you may simply
> have a lot of nearly-empty pages of memory that cannot be returned to
> the operating system.
>
> If you still suspect a leak I suggest you first see if you can achieve
> the same results by simply generating objects without any DB and also
> with just the mysql driver, no Rails.
>

Well I did some digging, and I'm left with more questions. The following script does NOT collect the
800,000 strings created (this doesn't use Mysql, Rails or anything else, just ruby core). I ran this
in accordance with a couple other scripts which utilize 150mb+ of memory, to see if the os would
somehow force ruby to do better GC'ing, it didn't and swap was used. On my system this script
consistently used 42Mb of memory, and the string count never went below 800,000. Why would the
strings not get gc'd when they are out of scope ?

----------test1.rb----------------
def count_objects_for clazz
c = 0
ObjectSpace.each_object{ |o| c+=1 if o.is_a? clazz }
c
end

class A
def run
arr = []
800000.times { arr << "d" * 7 }

puts "check now"
sleep 10
end
end

puts count_objects_for( String )
A.new.run
puts count_objects_for( String )
GC.start
puts count_objects_for( String )

Thread.new do
loop do
puts count_objects_for( String )
GC.start
sleep 5
end
end.join
-------end test1.rb-------------


Send test, same script as above but slight modification to call Array#clear. Now this actually
collects the Strings. I am specifically removing reference in the below code by clearing the Array
arr. But in the above code I don't reference arr or it's value anywhere outside of the method it is
in, so after that call occurs shouldn't it be fair game for the garbage collector?

-------- test2.rb-----------
def count_objects_for clazz
c = 0
ObjectSpace.each_object{ |o| c+=1 if o.is_a? clazz }
c
end

class A
def run
arr = []
800000.times { arr << "d" * 7 }
puts "check now"
sleep 10
arr.clear
end
end

puts count_objects_for( String )
A.new.run
puts count_objects_for( String )
GC.start
puts count_objects_for( String )

Thread.new do
loop do
puts count_objects_for( String )
GC.start
sleep 5
end
end.join
-------end test1.rb-------------

I modified this second script to run with 8 million string also. When the Strings are GC'd the ruby
processes memory usage goes down, but not as much as I'd think. With 8 million strings I am getting
330+Mb of memory in use, but when I GC them, memory only seems to go 191Mb, where I would think it
would fall back down in the single/double digit's, perhaps 15M or 20M. Any reason why ruby doesn't
let go? Is this a problem? Right now I have ran this with ruby 1.8.3 (2005-06-23) [i486-linux], and
ruby 1.8.4 (2005-12-24) [powerpc-linux] with similar results.

Zach









greg.kujawa

2/27/2006 1:47:00 AM

0

Minkoo Seo wrote:


> Eric Hodel wrote:
> > On Feb 26, 2006, at 1:51 AM, zdennis wrote:
>
> > Ruby does not feature a compacting garbage collector so you may
> > simply have a lot of nearly-empty pages of memory that cannot be
> > returned to the operating system.
>
>
>
> What? Is this true? It's quite disappointing. Is there any master plan
> in the ruby language for supporting compacting GC?
>
> - Minkoo Seo

I have personally seen similar behavior. I have a Win32 Ruby CTI
listener process (based on the library of my http://tsapi.rub...
project) that I leave on my desktop 24x7 that serves like a caller ID
box for incoming phone calls to my extension. The looping script I have
sleeps 1 second each iteration to help keep the CPU from maxing out.
All I am doing is polling an event queue on the CTI server every
second. Not sure about what GC'ing would be going on behind the scenes
since the objects stay in memory due to the looping construct. But
after about a day or so the memory utilization starts to get to about
24 MB. Then after a few more days it tops 40 MB. If I don't restart the
script I see problems with other NTVDM processes running alongside it.
Perhaps I need to explicitly open/close the sockets each pass?

H.Yamamoto

2/27/2006 4:56:00 AM

0

Hello.

Ruby's GC is conservative, so there is no guarantee object is freed even if
object is not reachable from GC's root.

But anyway, probably I found the problem on ELTS_SHARED.

/////////////////////////////////

def pause
GC.start
$stdout.puts "measure memory and hit any key..."
$stdout.flush
$stdin.getc
end

pause

a = Array.new(10000){ "." * 1000 } # huge memory

pause

a.map!{|s| s[-100..-1]} # memory stays large

pause

a.map!{|s| s[-3..-1]} # reduces memory

pause


/////////////////////////////////

This is because rb_str_substr (string.c) 's

else if (len > sizeof(struct RString)/2 &&
beg + len == RSTRING(str)->len && !FL_TEST(str, STR_ASSOC)) {
str2 = rb_str_new3(rb_str_new4(str));
RSTRING(str2)->ptr += RSTRING(str2)->len - len;
RSTRING(str2)->len = len;
}

is executed at

a.map!{|s| s[-100..-1]} # memory stays large

rb_str_new3 generates ELS_SHARED RString which holds original RString.

When original string becomes unreachable, it should be garbage collected.
But ELTS_SHARED substring references it (RString#aux->shared), so not collected
until substring itself becomes unreachable.

I haven't confirmed this is really cause of your problem, but there is possibility
this hidden huge string eats memory. (maybe same thing happens on Array)



Zach Dennis

2/27/2006 8:37:00 PM

0

H. Yamamoto,

I posted my post and your reply to ruby-core. Hopefully this can get clarified or as you suspect,
the problem you potentially found can be fixed, Thank you for your reply,

Zach

H.Yamamoto wrote:
> Hello.
>
> Ruby's GC is conservative, so there is no guarantee object is freed even if
> object is not reachable from GC's root.
>
> But anyway, probably I found the problem on ELTS_SHARED.
>
> /////////////////////////////////
>
> def pause
> GC.start
> $stdout.puts "measure memory and hit any key..."
> $stdout.flush
> $stdin.getc
> end
>
> pause
>
> a = Array.new(10000){ "." * 1000 } # huge memory
>
> pause
>
> a.map!{|s| s[-100..-1]} # memory stays large
>
> pause
>
> a.map!{|s| s[-3..-1]} # reduces memory
>
> pause
>
>
> /////////////////////////////////
>
> This is because rb_str_substr (string.c) 's
>
> else if (len > sizeof(struct RString)/2 &&
> beg + len == RSTRING(str)->len && !FL_TEST(str, STR_ASSOC)) {
> str2 = rb_str_new3(rb_str_new4(str));
> RSTRING(str2)->ptr += RSTRING(str2)->len - len;
> RSTRING(str2)->len = len;
> }
>
> is executed at
>
> a.map!{|s| s[-100..-1]} # memory stays large
>
> rb_str_new3 generates ELS_SHARED RString which holds original RString.
>
> When original string becomes unreachable, it should be garbage collected.
> But ELTS_SHARED substring references it (RString#aux->shared), so not collected
> until substring itself becomes unreachable.
>
> I haven't confirmed this is really cause of your problem, but there is possibility
> this hidden huge string eats memory. (maybe same thing happens on Array)
>
>



James Hammerton

5/12/2010 10:20:00 PM

0

don diddi wrote:
>
>
> "Maria" <down@heel.org.uk> wrote in message
> news:Equdnez22d_g4nfWnZ2dnUVZ8g9i4p2d@bt.com...
>> Unemployment up
>> Long term unemployment up
>> Unemployed under 25's up
>> Economically inactive up
>> Number in employment down
>> Number of job vacancies down
>> Number claming JSA down
>> ?????
>
> Get used to it, the Tories are in power

LOL! They haven't done anything yet and you're already blaming them for
the mess Labour left behind...

James

--
James Hammerton,
http://jhammerton.word...
http://www.magnacartaplus...

don diddi

5/12/2010 11:53:00 PM

0



"James Hammerton" <jah.usenet@yahoo.co.uk> wrote in message
news:850nruFnqdU3@mid.individual.net...
> don diddi wrote:
>>
>>
>> "Maria" <down@heel.org.uk> wrote in message
>> news:Equdnez22d_g4nfWnZ2dnUVZ8g9i4p2d@bt.com...
>>> Unemployment up
>>> Long term unemployment up
>>> Unemployed under 25's up
>>> Economically inactive up
>>> Number in employment down
>>> Number of job vacancies down
>>> Number claming JSA down
>>> ?????
>>
>> Get used to it, the Tories are in power
>
> LOL! They haven't done anything yet and you're already blaming them for
> the mess Labour left behind...

Try reading it.

I didn't blame them for anything that HAS happened.
I'm just stating that it will be the normal thing when their cuts start.

Ophelia

5/13/2010 9:07:00 AM

0



"James Hammerton" <jah.usenet@yahoo.co.uk> wrote in message
news:850nruFnqdU3@mid.individual.net...
> don diddi wrote:
>>
>>
>> "Maria" <down@heel.org.uk> wrote in message
>> news:Equdnez22d_g4nfWnZ2dnUVZ8g9i4p2d@bt.com...
>>> Unemployment up
>>> Long term unemployment up
>>> Unemployed under 25's up
>>> Economically inactive up
>>> Number in employment down
>>> Number of job vacancies down
>>> Number claming JSA down
>>> ?????
>>
>> Get used to it, the Tories are in power
>
> LOL! They haven't done anything yet and you're already blaming them for
> the mess Labour left behind...

No matter how ludicrous, they can't help themselves.
--
--
https://www.shop.helpforhero...

Harry Merrick

5/13/2010 9:30:00 AM

0

don diddi wrote:
> "Maria" <down@heel.org.uk> wrote in message
> news:Equdnez22d_g4nfWnZ2dnUVZ8g9i4p2d@bt.com...
>> Unemployment up
>> Long term unemployment up
>> Unemployed under 25's up
>> Economically inactive up
>> Number in employment down
>> Number of job vacancies down
>> Number claming JSA down
>> ?????
>
> Get used to it, the Tories are in power

ROTFLMAO!! Yes, indeed! NOT before it was time! I jusy hope that it is not
too late! At least we now have the best brains in the country governing us,
unlike that squalid, depressing, socialist lot.

--
Harry Merrick.