comp.lang.ruby

Minimizing memory allocations

Ilmari Heikkinen

1/22/2006 2:13:00 PM

So there I was this morning, staring at an ObjectSpace counter telling
me that I'm allocating 1500 Arrays and 10000 Floats per frame. Which
pretty much ground my framerate to ground by requiring a 0.2s GC run
every other frame. So I decided to get down and rid my code of as many
allocations as possible.

The first thing I discovered was a bit of code looking like this (not
to mention that each was actually getting called several times per
frame due to a bug):

  @mtime = ([@mtime] + ([@stroke||nil,@fill||nil].compact+children).map{|c| c.mtime}).max

Quite unreadable, and it was responsible for a large part of the Array
allocations too. A quick change whittled Array allocation count for
that method to 0, with the price of making it less idiomatic:

  children.each{|c|
    cm = c.mtime
    @mtime = cm if cm > @mtime
  }
  if @stroke
    sm = @stroke.mtime
    @mtime = sm if sm > @mtime
  end
  if @fill
    fm = @fill.mtime
    @mtime = fm if fm > @mtime
  end

Now the Array allocations dropped down to hundreds, a much more
reasonable number, but still way too much compared to what was
happening in the frame. The only thing that should've changed was one
number. So the extra 500 Arrays were a bit of a mystery.

Some investigation revealed places where I was using
Array#each_with_index. Very nice, very idiomatic, very allocating a
new Array on each iteration. So replace by the following and watch the
alloc counts fall:

  i = 0
  arr.each{|e|
    do_stuff_with e
    i += 1
  }

By doing that in a couple of strategic places and some other
optimizations, the Array allocation count fell to 150. Of which 90
were allocated in the object Z-sorting method, which'd require a C
implementation to get its allocation count to 0. The Array allocation
fight was heading towards diminishing returns, and my current scene
didn't need to use Z-sorting, so I turned my attention to the Floats.

By now, the Float count had also dropped a great deal, but it was
still a hefty 3000 Floats per frame. With each float weighing 16
bytes, that was nearly 3MB per second when running at 60fps.

Searching for the method that was allocating all those Floats, I ran into
something weird. #transform was allocating 6-32 Floats per call. And
it's one of the functions that get called for every scene object, in
every frame. Also, it's written in C.

That left me stymied. Surely there must be some mistake, I thought,
the C function didn't seem to be allocating _any_ Ruby objects. But
little did I know.

The C function called the NUM2DBL-macro in several places to turn Ruby
numbers into doubles. Reading the source for NUM2DBL told that it
calls the rb_num2dbl C function. Which takes a Ruby number and returns
a C double. Reading the source to rb_num2dbl revealed this:

  double
  rb_num2dbl(val)
      VALUE val;
  {
      switch (TYPE(val)) {
        case T_FLOAT:
          return RFLOAT(val)->value;

        case T_STRING:
          rb_raise(rb_eTypeError, "no implicit conversion to float from string");
          break;

        case T_NIL:
          rb_raise(rb_eTypeError, "no implicit conversion to float from nil");
          break;

        default:
          break;
      }

      return RFLOAT(rb_Float(val))->value;
  }

rb_Float gets called on all Fixnums and Bignums, which there happened
to be quite a deal of in my scene state arrays.

Checking out rb_Float gave the explanation for the Float allocations:

  switch (TYPE(val)) {
    case T_FIXNUM:
      return rb_float_new((double)FIX2LONG(val));

    case T_BIGNUM:
      return rb_float_new(rb_big2dbl(val));

In order to turn a Fixnum into a double, it's allocating a new Float!
With that figured out, I took and rewrote rb_num2dbl as rb_num_to_dbl,
this time handling Fixnums and Bignums as special cases as well:

  double rb_num_to_dbl( VALUE val )
  {
    switch (TYPE(val)) {
      case T_FLOAT:
        return RFLOAT(val)->value;

      case T_FIXNUM:
        return (double)FIX2LONG(val);

      case T_BIGNUM:
        return rb_big2dbl(val);

      case T_STRING:
        rb_raise(rb_eTypeError, "no implicit conversion to float from string");
        break;

      case T_NIL:
        rb_raise(rb_eTypeError, "no implicit conversion to float from nil");
        break;

      default:
        break;
    }

    return RFLOAT(rb_Float(val))->value;
  }

The result? Float allocations fell to 700 per frame from the original
3000. And now I'm getting a GC run "only" every 36 frames. Not perfect
by any means, but a decent start.

Have stories of your own? Tips for memory management? Ways to track
allocations? Post them, please.

Cheers,
Ilmari
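
On the question of tracking allocations: the ObjectSpace counter used in the post is not shown, so the following is only a rough sketch of one way to compare the two styles of mtime code. It assumes Ruby 2.2 or later, where GC.stat(:total_allocated_objects) is available. The helper name allocations, the Node struct and the PASSES constant are made up for illustration, and the exact counts will differ between Ruby versions.

```ruby
# Count objects allocated while the block runs (hypothetical helper).
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

Node = Struct.new(:mtime)            # stand-in for a scene object
children = Array.new(100) { |i| Node.new(i) }
PASSES = 1000

# One-liner style: builds temporary Arrays on every pass.
idiomatic = allocations do
  PASSES.times { ([0] + children.map { |c| c.mtime }).max }
end

# Explicit loop: only compares numbers, no temporary Arrays.
looped = allocations do
  PASSES.times do
    mtime = 0
    children.each do |c|
      cm = c.mtime
      mtime = cm if cm > mtime
    end
  end
end

puts "one-liner: #{idiomatic} objects, loop: #{looped} objects"
```

Wrapping a whole frame in the same helper gives a per-frame allocation count, which is enough to tell whether a change moved the number in the right direction.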
18 Answers

Eero Saynatkari

1/22/2006 3:35:00 PM

Ilmari Heikkinen wrote:
> So there I was this morning, staring at an ObjectSpace counter telling
> me that I'm allocating 1500 Arrays and 10000 Floats per frame. Which
> pretty much ground my framerate to ground by requiring a 0.2s GC run
> every other frame. So I decided to get down and rid my code of as many
> allocations as possible.
>
> < snip due to ruby-forum restrictions />
>
> In order to turn a Fixnum into a double, it's allocating a new Float!
> With that figured out, I took and rewrote rb_num2dbl as rb_num_to_dbl,
> this time handling Fixnums and Bignums as special cases as well:
>
> < snip />
>
> The result? Float allocations fell to 700 per frame from the original
> 3000. And now I'm getting a GC run "only" every 36 frames. Not perfect
> by any means, but a decent start.

Nice! I wonder if this would be eligible for core patching?

> Have stories of your own? Tips for memory management? Ways to track
> allocations? Post them, please.

Nope, but I enjoyed reading this one, thanks!

> Cheers,
> Ilmari


E

--
Posted via http://www.ruby-....


Ilmari Heikkinen

1/22/2006 3:54:00 PM

On 1/22/06, Ilmari Heikkinen <ilmari.heikkinen@gmail.com> wrote:
> pretty much ground my framerate to ground by requiring a 0.2s GC run

Argh, sorry, magnitude error. The correct GC run time is 0.02s. Not so
bad, but still a 60fps -> 20fps glitch.

Gavin Kistner

1/22/2006 5:13:00 PM

On Jan 22, 2006, at 8:54 AM, Ilmari Heikkinen wrote:

> On 1/22/06, Ilmari Heikkinen <ilmari.heikkinen@gmail.com> wrote:
>> pretty much ground my framerate to ground by requiring a 0.2s GC run
>
> Argh, sorry, magnitude error. The correct GC run time is 0.02s. Not so
> bad, but still a 60fps -> 20fps glitch.

In a completely separate world (Lua code running under a scene graph
written in C++; no Ruby anywhere) I recently hit a place where I
thought I needed to allocate ~200 Vector objects per frame.

(I was using a recursive function to calculate 3D bezier curves[1],
which needed to allocate and preserve 4 new Vector objects each call.)

It was causing noticeable lurching when the GC kicked in
occasionally. I found two interesting things:

1) Running Lua's GC manually *every frame* resulted in better memory
performance and faster overall framerate than running it every 2 or
10 or 50 or 100 frames. My only (lame) guess was that waiting longer
yielded a larger memory pool to wade through when looking for items
to GC. (?)

2) Because I really didn't need to preserve the 200 Vectors from
frame to frame (the final results of the recursive calculation were
copied into the position vectors for points on the line), I was able
to remove the per-frame memory allocations altogether by abstracting
the Vector allocation into a pooled-vector manager. Doing this gave
me far-better frame rates than I was getting with the GC-every-frame
approach.

This isn't applicable specifically to Ruby, but applicable generally:
when you can't remove memory allocations, see if you can at least re-
use them.

In an attempt to make this post Ruby-specific, I give you a pooled
object manager that I've just written, based on the Lua version I
wrote at work. You create a pool by specifying an object that is the
template/factory, and a method to call on that object (defaults to
'new'). Every time you ask for an object, it will hand you one from
the pool, or create a new instance. The #reset method makes all items
in the pool re-usable again (i.e. call at the start of a new frame).


class ObjectPool
  def initialize( template, method=:new, template_in_pool=false )
    @template = template
    @method = method
    @pool = [ ]
    if template_in_pool
      @pool << template
    end
    reset
  end

  # Make all items in the pool available again
  def reset
    @next_available = 0
  end

  # Remove references to all items not currently in use.
  # (slice! is used because assigning nil to a range only deletes the
  # elements in Ruby 1.8; later versions store a single nil instead.)
  def drain
    @pool.slice!( @next_available..-1 )
  end

  # Return a new item from the pool, creating a new one if needed
  def next
    unless item = @pool[ @next_available ]
      @pool << ( item = @template.send( @method ) )
    end
    @next_available = @next_available + 1
    item
  end

  def inspect
    "<ObjectPool of #{@pool.size} #{@template}>"
  end
end

class Vector
  attr_accessor :x, :y, :z
  def initialize( x=0, y=0, z=0 ) @x, @y, @z = x,y,z end
  def clone() self.class.new( @x, @y, @z ) end
  def inspect() "<Vector:0x#{object_id.to_s(16)} #@x,#@y,#@z>" end
end

##################################################################
# Showing how to create a pool of class instances
##################################################################
pool = ObjectPool.new( Vector )
p pool
#=> <ObjectPool of 0 Vector>

3.times{ |i|
  v = pool.next
  v.x = v.y = v.z = i
  p v
}
#=> <Vector:0x195518 0,0,0>
#=> <Vector:0x195392 1,1,1>
#=> <Vector:0x19520c 2,2,2>

p pool
#=> <ObjectPool of 3 Vector>

pool.reset
3.times{ p pool.next }
#=> <Vector:0x195518 0,0,0>
#=> <Vector:0x195392 1,1,1>
#=> <Vector:0x19520c 2,2,2>

p pool
#=> <ObjectPool of 3 Vector>


##################################################################
# Showing how to create a pool based off of a template object
##################################################################
v = Vector.new( 1, 2, 3 )
pool2 = ObjectPool.new( v, :clone, true )
p pool2
#=> <ObjectPool of 1 #<Vector:0x32ad64>>

3.times{ p pool2.next }
#=> <Vector:0x1956b2 1,2,3>
#=> <Vector:0x194672 1,2,3>
#=> <Vector:0x1944ec 1,2,3>

pool2.reset
3.times{ p pool2.next }
#=> <Vector:0x1956b2 1,2,3>
#=> <Vector:0x194672 1,2,3>
#=> <Vector:0x1944ec 1,2,3>

p pool2
#=> <ObjectPool of 3 #<Vector:0x32ad64>>
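
To see whether pooling actually cuts per-frame allocations, the two approaches can be compared with an allocation counter. This is only a sketch, assuming Ruby 2.2 or later for GC.stat(:total_allocated_objects). TinyPool is a hypothetical, trimmed-down stand-in for the ObjectPool above so that the example runs on its own; Vec, allocated, FRAMES and PER_FRAME are likewise invented names, and the counts will vary with the Ruby version.

```ruby
# Hypothetical stand-in for ObjectPool, reduced to reset and next.
class TinyPool
  def initialize(klass)
    @klass = klass
    @pool = []
    @next_available = 0
  end

  # Make every pooled item available again (call once per frame).
  def reset
    @next_available = 0
  end

  # Hand out the next pooled item, creating one only if the slot is empty.
  def next
    item = (@pool[@next_available] ||= @klass.new)
    @next_available += 1
    item
  end
end

Vec = Struct.new(:x, :y, :z)

# Count objects allocated while the block runs.
def allocated
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

FRAMES = 100
PER_FRAME = 200

# A fresh object for every vector in every frame.
fresh = allocated do
  FRAMES.times { PER_FRAME.times { Vec.new(0, 0, 0) } }
end

# Objects are created during the first frame, then re-used.
pool = TinyPool.new(Vec)
pooled = allocated do
  FRAMES.times do
    pool.reset
    PER_FRAME.times { pool.next }
  end
end

puts "fresh: #{fresh} objects, pooled: #{pooled} objects"
```

The pooled run should allocate roughly one frame's worth of objects in total, since every later frame only hands back what the first frame created.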



[1] http://www.antigrain.com/research/adaptive_bezier/index.ht...




Timothy Goddard

1/22/2006 10:18:00 PM

I don't know anything about Lua, but Ruby is unlikely to see any speed
benefits from garbage collecting every frame unless you're deep in
swap. Ruby uses a "mark and sweep" garbage collection scheme, which I
believe means that garbage collecting time is mostly proportional to
the number of current, referenced objects, not the amount of junk left
behind.

John Carter

1/23/2006 3:15:00 AM

Paul Brannan

1/26/2006 3:56:00 PM

On Wed, Jan 25, 2006 at 12:45:41AM +0900, Timothy Goddard wrote:
> I don't know anything about Lua, but Ruby is unlikely to see any speed
> benefits from garbage collecting every frame unless you're deep in
> swap. Ruby uses a "mark and sweep" garbage collection scheme, which I
> believe means that garbage collecting time is mostly proportional to
> the number of current, referenced objects, not the amount of junk left
> behind.

The marking time is proportional to the number of referenced objects,
and the sweeping time is proportional to the number of objects that get
swept. In general, the longer the time between sweeps, the more objects
need to be swept.

Invoking the GC every frame probably won't improve average frame rate,
but it may help decrease the variance of the time it takes to process
each frame (so you get more consistent performance).

Paul
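
One way to check the average-versus-variance trade-off described above is to time a batch of frames with and without a forced collection and compare the mean and the spread. This is a rough sketch only: frame_times, spread and the fake per-frame workload are invented for illustration, it assumes Ruby 2.4 or later (for Array#sum), and the numbers depend heavily on the interpreter version and heap size.

```ruby
# Time a number of fake "frames", optionally forcing a GC run in each one.
def frame_times(frames, gc_every_frame)
  GC.start                                   # start from a clean heap
  Array.new(frames) do
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    2_000.times { [1.5, 2.5, 3.5] }          # stand-in for per-frame garbage
    GC.start if gc_every_frame
    Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  end
end

# Return [mean, standard deviation] of a list of frame times.
def spread(times)
  mean = times.sum / times.size
  variance = times.sum { |t| (t - mean)**2 } / times.size
  [mean, Math.sqrt(variance)]
end

[false, true].each do |every_frame|
  mean, deviation = spread(frame_times(200, every_frame))
  label = every_frame ? "GC every frame" : "GC when needed"
  puts format("%-15s mean %.5fs, deviation %.5fs", label, mean, deviation)
end
```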




Dave Howell

1/26/2006 5:44:00 PM


On Jan 22, 2006, at 19:14, John Carter wrote:

> Trick Two...
>
> Memoization
>
> class Foo
> def initialize( thing)
> end
> end
>
> foo = Foo.new( thing)
>
> becomes...
>
> class Foo
> @@memo = Hash.new{|hash,key| hash[key] = Foo.new( key)}
>
> def self.create_foo( thing)
> @@memo[thing]
> end
>
> def initialize( thing)
> end
>
> end
>
>
> foo = Foo.create_foo( thing)

Er, um, huh?

All foo-links are tapping into a global-to-class hash called @@memo.
OK...
Foo.new has been rendered useless, apparently torturing anybody who
forgets by returning nil as a non-error?
Foo.create_foo(thing) does, well, I have no idea. What's "thing"
supposed to be? And why is create_foo using square brackets?

Sigh. Sometimes Ruby is *too* idiomatic.

What *is* "memo-ization?"



MenTaLguY

1/26/2006 6:17:00 PM

Quoting Dave Howell <groups@grandfenwick.net>:

> Sigh. Sometimes Ruby is *too* idiomatic.
>
> What *is* "memo-ization?"

It isn't actually a ruby-specific term:

http://en.wikipedia.org/wiki/M...

-mental


Dave Howell

1/26/2006 7:49:00 PM


On Jan 26, 2006, at 10:16, mental@rydia.net wrote:

> Quoting Dave Howell <groups@grandfenwick.net>:
>
>> Sigh. Sometimes Ruby is *too* idiomatic.
>>
>> What *is* "memo-ization?"
>
> It isn't actually a ruby-specific term:

I didn't think it was. That's why the comment and the question are in
separate paragraphs.

I might be able to guess what it is if I could read the Ruby code. But
I can't figure out what the code does.
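
For reference, here is a sketch of what the quoted snippet appears to be aiming for. It is an interpretation rather than the original author's code: create_foo is written as a class method so that Foo.create_foo(thing) can be called, the cache is kept in a class-level instance variable instead of @@memo, and the thing reader plus the sample keys are added purely for illustration. The square brackets are an ordinary Hash lookup; the block given to Hash.new runs only when the key is missing, builds the object once, and stores it for every later lookup.

```ruby
class Foo
  attr_reader :thing

  # Class-level cache: a missing key builds a Foo once and stores it.
  @memo = Hash.new { |hash, key| hash[key] = Foo.new(key) }

  # Class method, so it can be called as Foo.create_foo(thing).
  def self.create_foo(thing)
    @memo[thing]
  end

  def initialize(thing)
    @thing = thing
  end
end

a = Foo.create_foo(:red)
b = Foo.create_foo(:red)
c = Foo.create_foo(:blue)

p a.equal?(b)   #=> true  (same key, same object, built only once)
p a.equal?(c)   #=> false (different key, different object)
```

Foo.new still works and always returns a fresh object; only create_foo goes through the cache, which is what saves the repeated allocations.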



Alex Combas

1/28/2006 12:19:00 AM

Hello,
I would really like to learn more about memoization
but unfortunately the gem doesn't seem to install properly on
my system (Ubuntu Linux) and I was told that in order
to get downloaded gems to work I would probably have
to uninstall the apt versions of ruby and rebuild everything
from source.

I don't really want to do that, so I was wondering if
the memoization library is small enough that I could
just put a copy into my path and include it that way,
rather than as an installed gem.

Does that sound like a sane alternative, or is my only
option to rip the ruby debs out of my system and
start from scratch?


--
Alex Combas
http://noodlejunkie.blo...