comp.lang.ruby

Re: JRuby performance questions answered

Isaac Gouy

11/6/2007 11:08:00 PM

Quoting Charles Oliver Nutter <charles.nutter / sun.com>:

> znmeb / cesmail.net wrote:
>> Quoting Charles Oliver Nutter <charles.nutter / sun.com>:
>>
>>> Many people believed we'd never be faster than the C implementation,
>>> and many still think we're slower. Now that I've set that record
>>> straight, any questions?
>>
>> 1. How long will it be before Alioth has some *reasonable* numbers
>> for jRuby? As of yesterday, they still have you significantly
>> slower than MRI. So I need to take jRuby out of my slides for
>> RubyConf :) ... I
>
> The current published Alioth numbers are based on JRuby 1.0(ish), which
> was generally 2-3x slower than MRI. I'm hoping the numbers will be
> updated soon after the 1.1 releases...but it probably won't happen
> until 1.1 final comes out in December. If someone else wants to re-run
> them for us, it would make us very happy :)


An "Update Programming Language" "Feature Request" will usually get our
attention.


Coincidentally, I did grab 1.1b1 so the benchmarks game has new
measurements

http://shootout.alioth.debian.org/gp4sandbox/benchmark.php?test=all&...




28 Answers

Roger Pack

11/6/2007 11:39:00 PM



>
> Coincidentally, I did grab 1.1b1 so the benchmarks game has new
> measurements
>
> http://shootout.alioth.debian.org/gp4sandbox/benchmark.php?test=all&...

Wow it would appear that jruby is indeed faster, and indeed uses a lot
more memory :) (or maybe that's just startup overhead). thanks for a
good program!

Charles Oliver Nutter

11/7/2007 2:35:00 AM


Roger Pack wrote:
>> Coincidentally, I did grab 1.1b1 so the benchmarks game has new
>> measurements
>>
>> http://shootout.alioth.debian.org/gp4sandbox/benchmark.php?test=all&...
>
> Wow it would appear that jruby is indeed faster, and indeed uses a lot
> more memory :) (or maybe that's just startup overhead). thanks for a
> good program!

It's another well-known fact about running on the JVM that we have to
suck it up and accept there's an initial memory chunk eaten up by every
JVM process. If one excludes that initial cost, most measurements have
us using less memory than C Ruby...so for very large apps we end up
coming out ahead. But for small, short apps, the initial slow startup
and high memory usage is going to be a battle we fight for a long time.
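
For what it's worth, a rough, Linux-only way to compare like with like
is to have the script report its own resident size (a minimal sketch;
the /proc layout is Linux-specific, and under JRuby this measures the
whole JVM process):

# rss.rb - print this process's resident set size.
# Runs unchanged under MRI and JRuby, so the same script can be used
# for both; the JVM's baseline memory shows up in the JRuby number.
rss_kb = File.read("/proc/self/status")[/^VmRSS:\s*(\d+)\s*kB/, 1]
puts "resident set size: #{rss_kb} kB"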

- Charlie

Roger Pack

11/7/2007 3:32:00 AM


> It's another well-known fact about running on the JVM that we have to
> suck it up and accept there's an initial memory chunk eaten up by every
> JVM process. If one excludes that initial cost, most measurements have
> us using less memory than C Ruby...so for very large apps we end up
> coming out ahead. But for small, short apps, the initial slow startup
> and high memory usage is going to be a battle we fight for a long time.
>
> - Charlie

If you run multiple threads I assume there isn't an extra memory cost
for that--is that right?

Clifford Heath

11/7/2007 4:27:00 AM


Roger Pack wrote:
> If you run multiple threads I assume there isn't an extra memory cost
> for that--is that right?

Every thread is going to need its own stack, but that'll be small
compared to the startup overhead. I'm sure Charles will elaborate.

I'm guessing that JRuby still doesn't support continuations...? I
think that would require the "spaghetti stack" model, which would
remove most of the per-thread initial stack overhead.

Clifford Heath.

Charles Oliver Nutter

11/7/2007 7:27:00 AM


Roger Pack wrote:
>> It's another well-known fact about running on the JVM that we have to
>> suck it up and accept there's an initial memory chunk eaten up by every
>> JVM process. If one excludes that initial cost, most measurements have
>> us using less memory than C Ruby...so for very large apps we end up
>> coming out ahead. But for small, short apps, the initial slow startup
>> and high memory usage is going to be a battle we fight for a long time.
>>
>> - Charlie
>
> If you run multiple threads I assume there isn't an extra memory cost
> for that--is that right?

Yes, generally. They won't be as light as Ruby's green threads, but then
Ruby's threads can't actually run in parallel anyway.
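
An easy way to see that difference is to throw CPU-bound work at a few
threads and time it; a rough sketch (thread count and workload picked
arbitrarily):

# Under JRuby each Ruby thread is a native JVM thread, so on a
# multi-core machine these can run in parallel; MRI 1.8's green
# threads just time-slice on a single core.
require 'benchmark'

def spin
  x = 0
  2_000_000.times { x += 1 }
  x
end

puts Benchmark.measure {
  (1..4).map { Thread.new { spin } }.each { |t| t.join }
}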

- Charlie

Charles Oliver Nutter

11/7/2007 7:27:00 AM


Clifford Heath wrote:
> Roger Pack wrote:
>> If you run multiple threads I assume there isn't an extra memory cost
>> for that--is that right?
>
> Every thread is going to need its own stack, but that'll be small
> compared to the startup overhead. I'm sure Charles will elaborate.

Our threads will be a lot more expensive than Ruby's, but a lot cheaper
than a separate process in either world.

> I'm guessing that JRuby still doesn't support continuations...? I
> think that would require the "spaghetti stack" model, which would
> remove most of the per-thread initial stack overhead.

Our official stance is that JRuby won't support continuations until the
JVM does. We could emulate them by forcing a stackless implementation,
but it would be *drastically* slower than what we have now.
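
For anyone who hasn't played with them, this is the sort of thing
callcc allows on MRI 1.8; a toy sketch (on JRuby one would expect the
callcc call to raise instead, since continuations aren't supported):

# callcc captures "the rest of the program" as a first-class object.
cont  = nil
count = 0
callcc { |c| cont = c }   # capture the current continuation
count += 1
cont.call if count < 3    # jump back to just after the callcc
puts count                # => 3 on MRI 1.8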

- Charlie

Charles Oliver Nutter

11/7/2007 7:29:00 AM


Roger Pack wrote:
>> Coincidentally, I did grab 1.1b1 so the benchmarks game has new
>> measurements
>>
>> http://shootout.alioth.debian.org/gp4sandbox/benchmark.php?test=all&...
>
> Wow it would appear that jruby is indeed faster, and indeed uses a lot
> more memory :) (or maybe that's just startup overhead). thanks for a
> good program!

I just committed an addition to JRuby that allows you to spin up a
"server" JRuby instance (using "Nailgun") in the background and feed it
commands. See the startup difference using this:

normal:

~/NetBeansProjects/jruby $ time jruby -e "puts 'hello'"
hello

real 0m1.944s
user 0m1.511s
sys 0m0.138s

nailgun:

~/NetBeansProjects/jruby $ time jruby-ng -e "puts 'hello'"
hello

real 0m0.103s
user 0m0.006s
sys 0m0.009s

Here's a post from the JRuby list describing how to use this, for those
of you that are interested. Also, this allows you to avoid the startup
memory cost for every command you run since you can just issue commands
to that running server and it will re-use memory. After running a bunch
of commands on my system, that server process was still happily under
60M, and never went any higher.

...

I've got Nailgun working with JRuby just great now.

bin/jruby-ng-server
bin/jruby-ng

If you want to use the server, say if you're going to be running a lot
of command-line tools, just spin it up in the background somewhere.

jruby-ng-server > /dev/null 2> /dev/null &

And then use the jruby-ng command instead, or alias it to "jruby"

alias jruby=jruby-ng

You'll need to make the ng client command on your platform, by running
'make' under bin/nailgun, but then everything should function correctly.

jruby-ng -e "puts 'here'"

The idea is that users will have a new option to try. For JRuby, where
we have no global variables, no dependencies on static fields, and
already depend on our ability to spin up many JRuby instances in a
single JVM, this ends up working very well. It's building off features
we already provide, and giving users the benefit of a fast,
pre-initialized JVM without the startup hit.

I think we're probably going to ship with this for JRuby 1.1 now. It's
working really well. I've managed to resolve the CWD issue by defining
my own "nailMain" next to our existing "main", and ENV vars are being
passed along as well. The one big remaining complication I don't have an
answer for just yet is OS signals; they get registered only in the
server process, so signals from the client don't propagate through. It's
fixable of course, by having the client register and the server just
listen for client signal events, but that isn't supported in the current
NG. So there's some work to do.

All the NG stuff is in JRuby trunk right now. Give it a shot. I'm
interested in hearing opinions on it.

- Charlie

Roger Pack

11/9/2007 11:55:00 PM



> Wow it would appear that jruby is indeed faster, and indeed uses a lot
> more memory :) (or maybe that's just startup overhead). thanks for a
> good program!

I wonder if jruby uses reference counting for its ruby objects (or if it
even matters), and if not maybe someday it would :) I'm just in a pro
reference counting mood these days :)

-Roger


Rick DeNatale

11/10/2007 1:39:00 PM


On 11/9/07, Roger Pack <rogerpack2005@gmail.com> wrote:
>
> > Wow it would appear that jruby is indeed faster, and indeed uses a lot
> > more memory :) (or maybe that's just startup overhead). thanks for a
> > good program!
>
> I wonder if jruby uses reference counting for its ruby objects (or if it
> even matters), and if not maybe someday it would :) I'm just in a pro
> reference counting mood these days :)

I very much doubt it.

Roger, you REALLY need to read the literature on GC which has been
accumulating for the past 50 years.

Reference counting is pretty much an obsolete approach to GC. It was
probably the first approach taken for lisp back in the 1950s. Other
language implementations usually started with reference counting (e.g.
the first Smalltalk).

Its main advantage is that it's easy to understand. On the other hand,
it incurs a large overhead, since counts have to be incremented and
decremented on every assignment, and it can't reclaim cycles of dead
objects. In early Smalltalk programs, when reference counting was used,
you needed to explicitly nil out references to break such cycles.
There's also the question of how much space to spend storing the
reference count, i.e. how many bits to allocate. Most reference
counting implementations punt when the count overflows: they treat a
'full' count as infinite and never decrement it again, leading to more
uncollectable objects.
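
To make the cycle problem concrete, here's a toy Ruby sketch (purely
illustrative; MRI's actual mark-and-sweep collector does reclaim these):

# Two objects that keep each other alive.
class Node
  attr_accessor :other
end

a = Node.new
b = Node.new
a.other = b        # under reference counting, b's count becomes 2
b.other = a        # and a's count becomes 2
a = nil            # a's count drops to 1 (b.other still points at it)
b = nil            # b's count drops to 1 (a.other still points at it)
# A pure reference counter never frees either object. A tracing
# collector (mark-and-sweep, copying, generational) reclaims both,
# since neither is reachable from a root any more.
GC.start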

Mark-and-sweep, such as is used in the Ruby 1.8 implementation, quickly
replaced reference counting as the simplest GC considered for real
use.

More modern GCs tend to be copying collectors, which move live objects
to new heap blocks and leave the dead ones behind. Most also use
generational scavenging, which takes advantage of the observation that
most objects either die quite young or live a long time. That approach
was pioneered by David Ungar in the Berkeley implementation of
Smalltalk-80, and it's the kind of GC typically used in JVMs today.
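
The "die young" skew is easy to see even with Ruby 1.8's simple
collector; a rough sketch (MRI-oriented, since JRuby disables
ObjectSpace walking by default):

# ObjectSpace.each_object returns the number of objects it visited.
GC.start
long_lived = Array.new(1_000) { "survivor" }   # these stick around
before = ObjectSpace.each_object(String) { }

200_000.times { |i| "temporary #{i}" }         # short-lived garbage
GC.start
after = ObjectSpace.each_object(String) { }

puts "live strings before: #{before}, after collecting: #{after}"
# Most of the 200,000 temporaries are gone; the 1,000 survivors remain.
# A generational collector exploits exactly this skew by scanning young
# objects frequently and old ones rarely.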

Which particular GC approach is best for Ruby is a subject for some study.

Many of the usages of Ruby aren't quite like those of Java or
Smalltalk. I had dinner with a former colleague, who happens to be the
lead developer of the IBM J9 Java virtual machine, and he made the
observation that Java, and Smalltalk before it, have a long history of
having their VMs tuned for long-running processes. On the other hand,
many Ruby usages are get-in-and-get-out. For those use cases it's more
valuable to have rapid startup than "perfect" GC, in the sense of every
dead object being reclaimed promptly (not that any of the current GCs
guarantee that anyway).

So the best GC for Ruby might not be the same as would be used for a
JVM or Smalltalk VM, but I'm almost certain it would be a reference
counter.

--
Rick DeNatale

My blog on Ruby
http://talklikeaduck.denh...

M. Edward (Ed) Borasky

11/10/2007 8:35:00 PM


Rick DeNatale wrote:
> Reference counting is pretty much an obsolete approach to GC. It was
> probably the first approach taken for lisp back in the 1950s. Other
> language implementations usually started with reference counting (e.g.
> the first Smalltalk).
>
> Its main advantage is that it's easy to understand.

I don't think reference counting is any easier to understand than pure
mark-and-sweep or pure stop-and-copy. The main advantage of reference
counting in my opinion is that its restrictions force you to kick some
features out of your language design if you want to use it. :)


> Mark and sweep, such as is used in the Ruby 1.8 implementation quickly
> replaced reference counting as the simplest GC considered for real
> use.

My recollection is that mark-and-sweep was the original, and that
reference counting came later.

> More modern GCs tend to use copying GCs which move live objects to new
> heap blocks leaving the dead ones behind. And most use generational
> scavenging which takes advantage of the observation that most objects
> either die quite young, or live a long time. This approach was
> pioneered by David Ungar in the Berkeley implementation of
> Smalltalk-80. And this is the kind of GC typically used in JVMs
> today.

Bah ... I actually found a reference a couple of days ago on this
(http://portal.acm.org/citation.cf...). If you're not signed up
for the ACM library it will cost you money to read it. But essentially
"pure" mark-and-sweep was replaced by stop-and-copy, which compacts the
heap. Then generational mark-and-sweep came along and "rehabilitated"
mark-and-sweep. Note the publication date -- 1990. The abstract is free
-- it reads:

"Stop-and-copy garbage collection has been preferred to mark-and-sweep
collection in the last decade because its collection time is
proportional to the size of reachable data and not to the memory size.
This paper compares the CPU overhead and the memory requirements of the
two collection algorithms extended with generations, and finds that
mark-and-sweep collection requires at most a small amount of additional
CPU overhead (3-6%), but requires an average of 20% (and up to 40%) less
memory to achieve the same page fault rate. The comparison is based on
results obtained using trace-driven simulation with large Common Lisp
programs."

> Which particular GC approach is best for Ruby is subject to some study.

I think at least for Rails on Linux, someone (assuming funding) could
collect and analyze plenty of data. I'd actually be surprised if someone
*isn't* doing it, although I know *I'm* not. ;)

> Many of the usages of ruby aren't quite like those of Java, or
> Smalltalk. I had dinner with a former colleague, who happens to be
> the lead developer of the IBM J9 java virtual machine, and he made the
> observation that Java, and Smalltalk before it have a long history of
> having their VMs tuned for long running processes. On the other hand
> many Ruby usages are get in and get out. These use cases mean that
> it's more valuable to have rapid startup than perfect GC in the sense
> that all dead objects are reclaimed quickly, not that any of the
> current GCs guarantee the latter.

Well ... OK. If you want to distinguish between long running (server)
and rapid startup (client), that's fine. But look at the marketplace. We
have servers, we have laptop clients, we have desktop clients, we have
mobile clients, and we have bazillions of non-user-programmable
computers like DVD players, iPods, in-vehicle navigation systems, etc.

Now while the hard-core hackers like me wouldn't buy an iPod or a DVD
player, preferring instead to add hard drive space to a real computer,
Apple isn't exactly going broke making iPods and iPhones that are (for
the moment, anyhow) closed to "outsiders". And I'm guessing that, while
you *can* run Ruby on, say, an embedded ARM/Linux platform, most of the
software in those gizmos is written in C and heavily optimized.

I've got a couple of embedded toolkits, and I've actually built Ruby for
them, but when you only have 32 MB of RAM, you don't want to collect
garbage -- you don't even want to *generate* garbage! So I wouldn't
personally spend much time thinking about garbage collection for rapid
startup. If you want rapid startup, you're going to have as much binding
as possible done at compile time -- you aren't even going to compile a
Ruby script to an AST when you start a process up.

> So the best GC for Ruby might not be the same as would be used for a
> JVM or Smalltalk VM, but I'm almost certain it would be a reference
> counter.

Did you mean to say, "not be a reference counter"?