comp.lang.ruby

Ruby 1.9.x Concurrency

Steve Ross

6/9/2009 6:24:00 AM

Poking through the Apple press releases today, I sat up and took
notice when I saw that they were putting a fair amount of pretty
public emphasis on concurrency as the silver bullet for faster
computing when Snow Leopard comes out. If we stipulate that
concurrency is fundamentally a good solution of a certain class of
problems, here were the questions I immediately had:

- My understanding is that the 1.9 implementation of threads is to use
native threads. But the caveat is that the GIL is still in place. What
does this mean in practice as it applies to increasing throughput by
distributing load across processor cores?

- I'm trying to parse the fiber vs. thread distinction and it feels to
me like fibers are a leaner, meaner version of the 1.8.x green
threads, but that they will always run on the same core. Am I missing
something here?

Thanks,

Steve

19 Answers

glenn gillen

6/9/2009 7:54:00 AM


Steve,

I found this post quite useful in filling in the missing pieces for
me; it may be helpful if you have the same missing piece :)

http://www.igvita.com/2009/05/13/fibers-cooperative-schedulin...

Regards,

Glenn

James Gray

6/9/2009 2:17:00 PM


On Jun 9, 2009, at 1:23 AM, s.ross wrote:

> Poking through the Apple press releases today, I sat up and took
> notice when I saw that they were putting a fair amount of pretty
> public emphasis on concurrency as the silver bullet for faster
> computing when Snow Leopard comes out. If we stipulate that
> concurrency is fundamentally a good solution of a certain class of
> problems, here were the questions I immediately had:
>
> - My understanding is that the 1.9 implementation of threads is to
> use native threads. But the caveat is that the GIL is still in
> place. What does this mean in practice as it applies to increasing
> throughput by distributing load across processor cores?
>
> - I'm trying to parse the fiber vs. thread distinction and it feels
> to me like fibers are a leaner, meaner version of the 1.8.x green
> threads, but that they will always run on the same core. Am I
> missing something here?

What I think you're getting at here is, yes, threading still isn't the
way to get real concurrency on Ruby 1.9. If you really want to do two
things at once, you're going to need processes in Ruby.

James Edward Gray II

Eleanor McHugh

6/9/2009 3:07:00 PM


On 9 Jun 2009, at 15:16, James Gray wrote:
> On Jun 9, 2009, at 1:23 AM, s.ross wrote:
>> Poking through the Apple press releases today, I sat up and took
>> notice when I saw that they were putting a fair amount of pretty
>> public emphasis on concurrency as the silver bullet for faster
>> computing when Snow Leopard comes out. If we stipulate that
>> concurrency is fundamentally a good solution of a certain class of
>> problems, here were the questions I immediately had:
>>
>> - My understanding is that the 1.9 implementation of threads is to
>> use native threads. But the caveat is that the GIL is still in
>> place. What does this mean in practice as it applies to increasing
>> throughput by distributing load across processor cores?
>>
>> - I'm trying to parse the fiber vs. thread distinction and it feels
>> to me like fibers are a leaner, meaner version of the 1.8.x green
>> threads, but that they will always run on the same core. Am I
>> missing something here?
>
> What I think you're getting at here is, yes, threading still isn't
> the way to get real concurrency on Ruby 1.9. If you really want to
> do two things at once, you're going to need processes in Ruby.

Indeed. And on a Unix box (like Snow Leopard) there really isn't a
good excuse for ignoring them unless your concurrency needs are pretty
trivial. Use threads for simple things like downloading multiple web
pages concurrently, but for significant processing jobs chuck your
data down a pipe to a process and let the OS take care of proper
scheduling etc.
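
That approach might look something like this minimal sketch (the
worker logic and pipe layout are invented for illustration, not taken
from the Plumber's Guide): fork a child process, send it a job down
one pipe, and read the result back over another while the OS schedules
the child on whatever core is free.

```ruby
# Parent -> child pipe for the job, child -> parent pipe for the result.
job_r, job_w = IO.pipe
res_r, res_w = IO.pipe

pid = fork do
  job_w.close
  res_r.close
  data = job_r.read                                 # receive the job
  res_w.write(data.split.map(&:upcase).join(" "))   # "heavy" processing
  res_w.close
end

job_r.close
res_w.close
job_w.write("crunch these words")
job_w.close                # EOF tells the child the job is complete
result = res_r.read
Process.wait(pid)
puts result                # prints CRUNCH THESE WORDS
```

Because the child is a real OS process, a CPU-bound job here genuinely
runs on another core, which threads under MRI cannot give you.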

And for some quick coverage of this sort of thing grab the "Ruby
Plumber's Guide" presentation from the link in my sig, or wait for
James's RubyKaigi presentation which I suspect will be both more
coherent and more detailed :)


Ellie

Eleanor McHugh
Games With Brains
http://slides.games-with-...
----
raise ArgumentError unless @reality.responds_to? :reason


Charles O Nutter

6/9/2009 4:07:00 PM


On Tue, Jun 9, 2009 at 9:16 AM, James Gray <james@grayproductions.net> wrote:
> What I think you're getting at here is, yes, threading still isn't the way
> to get real concurrency on Ruby 1.9. If you really want to do two things at
> once, you're going to need processes in Ruby.

Or just use JRuby, and real concurrent/parallel threads will just work
out of the box :)

- Charlie
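
The point is that the same Thread code runs unchanged everywhere; the
difference is that under JRuby each Ruby thread is a real JVM thread,
so CPU-bound work like the sketch below can occupy several cores at
once, while under MRI 1.9 the GIL serializes it. (This example is
illustrative, not from the thread.)

```ruby
# Naive trial-division prime counter: deliberately CPU-bound work.
def count_primes(range)
  range.count do |n|
    n > 1 && (2..Math.sqrt(n)).none? { |d| (n % d).zero? }
  end
end

# Split the range across two threads. Under JRuby these run in
# parallel on separate cores; under MRI they merely interleave.
threads = [(2..10_000), (10_001..20_000)].map do |r|
  Thread.new { count_primes(r) }
end
total = threads.sum(&:value)   # Thread#value joins and returns the result
puts total
```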

Eleanor McHugh

6/9/2009 5:13:00 PM


On 9 Jun 2009, at 17:06, Charles Oliver Nutter wrote:
> On Tue, Jun 9, 2009 at 9:16 AM, James
> Gray<james@grayproductions.net> wrote:
>> What I think you're getting at here is, yes, threading still isn't
>> the way
>> to get real concurrency on Ruby 1.9. If you really want to do two
>> things at
>> once, you're going to need processes in Ruby.
>
> Or just use JRuby, and real concurrent/parallel threads will just work
> out of the box :)

And on Windows too. Damn smug JVM users ;p


Ellie

Eleanor McHugh
Games With Brains
http://slides.games-with-...
----
raise ArgumentError unless @reality.responds_to? :reason


James Gray

6/9/2009 5:38:00 PM


On Jun 9, 2009, at 10:07 AM, Eleanor McHugh wrote:

> And for some quick coverage of this sort of thing grab the "Ruby
> Plumber's Guide" presentation from the link in my sig, or wait for
> James's RubyKaigi presentation which I suspect will be both more
> coherent and more detailed :)

No pressure! :)

Seriously, I doubt I'll be as detailed. I'm going to show some basics
and a few techniques that have worked for me.

I'm giving the same talk as a preview to OK.rb this Thursday, so come
see us if you're near OKC.

James Edward Gray II

Tony Arcieri

6/9/2009 5:44:00 PM


On Tue, Jun 9, 2009 at 10:06 AM, Charles Oliver Nutter
<headius@headius.com>wrote:

> Or just use JRuby, and real concurrent/parallel threads will just work
> out of the box :)
>

Well, as best they can on Ruby. You guys have done some really great work
on that, but Ruby's approach to threading is rather poor.

--
Tony Arcieri
medioh.com

Steve Ross

6/9/2009 6:11:00 PM


Thanks for all the answers. I guess the main issue I'm trying to get
my head around is that since Moore's law doesn't quite seem to be
keeping up with the demand for processing power, multi-core and
multi-processor hardware solutions are predominant in the field now. The
question I'm trying to answer is not how great the threading solution
is but rather whether the threading solution solves the problem of
distributing workload without incurring more overhead than it's worth.

I implemented a simple MacRuby app that just fills a list from a
database. Doing the database query and populating the 20 or so items
that show at any given time still gave the app a kind of sluggish
feel. Once I separated this into a separate thread, removing the
block, the simple fact that the UI was alive made the app seem
"perkier."

More to the point, I created an app using MRI that relied on
downloading a boatload of information from a Web service. Single
threaded, this took about 20 minutes, where using multiple threads, it
was accomplished in 3-5 minutes. However, this one involves a good
deal of trickery so as not to step on buffers in the net/http
libraries (or something underlying).

So I come back to the question: As we find ourselves with resources
that scale across processing units, how best does Ruby solve the
problem and what role do Fibers play in that solution, if any?

Thanks again,

Steve


On Jun 9, 2009, at 10:44 AM, Tony Arcieri wrote:

> On Tue, Jun 9, 2009 at 10:06 AM, Charles Oliver Nutter
> <headius@headius.com>wrote:
>
>> Or just use JRuby, and real concurrent/parallel threads will just
>> work
>> out of the box :)
>>
>
> Well, as best they can on Ruby. You guys have done some really
> great work
> on that, but Ruby's approach to threading is rather poor.
>
> --
> Tony Arcieri
> medioh.com


Tony Arcieri

6/9/2009 6:35:00 PM


On Tue, Jun 9, 2009 at 12:11 PM, s.ross <cwdinfo@gmail.com> wrote:

> More to the point, I created an app using MRI that relied on downloading a
> boatload of information from a Web service. Single threaded, this took about
> 20 minutes, where using multiple threads, it was accomplished in 3-5
> minutes. However, this one involves a good deal of trickery so as not to
> step on buffers in the net/http libraries (or something underlying).
>
> So I come back to the question: As we find ourselves with resources that
> scale across processing units, how best does Ruby solve the problem and what
> role do Fibers play in that solution, if any?


Having written what's probably the fastest concurrent HTTP fetcher available
in Ruby, here's a bit on how it worked in practice:

We set the system up to allow N HTTP fetching "agents", each of which would
attach to a message queue and indicate their availability for accepting
jobs. Want it to go faster? Just make N bigger.

A command and control process would then pick an idle fetcher agent and
send it a batch of URLs to fetch.

It used a lightweight concurrency library I wrote called Revactor which is
based around Fibers. Each fetcher process used 64 Fibers which would pull
from the URL buffer in a round robin fashion. If you're curious how this
works, the core logic for this process is distributed as part of Revactor's
standard library:

http://github.com/tarcieri/revactor/blob/master/lib/revactor/http_...

We ran one of these fetcher processes per CPU core of the systems we were
running them on. They were rather CPU intensive as they did a lot of regex
processing on the fetched documents. That said, it didn't take much: we
were able to suck in 30 megabits of data at once using just four processes
running on a single quad core system.
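
A toy sketch of the round-robin idea using plain Ruby Fibers (this is
not Revactor's actual code; the job buffer and "fetch" stand-in are
invented): each fiber repeatedly pulls the next job from a shared
buffer and yields control after handling it, so a simple scheduler can
resume the others in turn.

```ruby
# Shared buffer of pending jobs; URLs are placeholders.
jobs = ("a".."f").map { |c| "http://example.com/#{c}" }
handled = Hash.new { |h, k| h[k] = [] }

fibers = (0...3).map do |id|
  Fiber.new do
    while (url = jobs.shift)    # pull from the shared buffer
      handled[id] << url        # recording stands in for real fetching
      Fiber.yield               # cooperatively hand control back
    end
  end
end

# Round-robin scheduler: resume each live fiber until all have finished.
fibers.cycle do |f|
  break if fibers.none?(&:alive?)
  f.resume if f.alive?
end

p handled   # each of the 3 fibers handled 2 of the 6 jobs
```

Because fibers only switch where they explicitly yield, there is no
preemption to guard against, which is much of their appeal for this
kind of I/O scheduling.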

--
Tony Arcieri
medioh.com

James Gray

6/9/2009 8:20:00 PM


On Jun 9, 2009, at 1:11 PM, s.ross wrote:

> The question I'm trying to answer is not how great the threading
> solution is but rather whether the threading solution solves the
> problem of distributing workload without incurring more overhead
> than it's worth.

While threading does solve that problem in some languages, it's not
really for that in Ruby. In Ruby, threading is for separating off
actions that will need to wait at some point (generally on an I/O
operation) so you can keep working on other things while they do.

> I implemented a simple MacRuby app that just fills a list from a
> database. Doing the database query and populating the 20 or so items
> that show at any given time still gave the app a kind of sluggish
> feel. Once I separated this into a separate thread, removing the
> block, the simple fact that the UI was alive made the app seem
> "perkier."

Sure. Your Thread paused waiting for the database I/O, but the rest
of the application kept moving. That's a good example of where Ruby's
threads help.

> More to the point, I created an app using MRI that relied on
> downloading a boatload of information from a Web service. Single
> threaded, this took about 20 minutes, where using multiple threads,
> it was accomplished in 3-5 minutes. However, this one involves a
> good deal of trickery so as not to step on buffers in the net/http
> libraries (or something underlying).

Again, as each Thread hit a waiting period, others had a chance to
run. When it was single threaded, you had to wait serially through
each pause.

The important thing to realize about all of the above is that they
didn't go faster because you were suddenly doing more than one thing
at a time. MRI doesn't do that. You just arranged to spend less time
waiting. That's nice, but it's not true concurrency.
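
A small demonstration of that point, with sleep standing in for
network I/O (the timings are illustrative): the threads don't compute
in parallel under MRI, but their waits overlap, so the wall time is
roughly one wait rather than the sum of all three.

```ruby
require "benchmark"

waits = [0.2, 0.2, 0.2]

# One thread: each wait happens after the previous one finishes.
serial = Benchmark.realtime do
  waits.each { |w| sleep(w) }
end

# One thread per wait: the GIL is released during sleep (as during
# real I/O), so the pauses overlap.
threaded = Benchmark.realtime do
  waits.map { |w| Thread.new { sleep(w) } }.each(&:join)
end

puts format("serial: %.2fs, threaded: %.2fs", serial, threaded)
```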

> So I come back to the question: As we find ourselves with resources
> that scale across processing units, how best does Ruby solve the
> problem and what role do Fibers play in that solution, if any?

When you really want to do two things at once with Ruby, you want more
processes. fork() is your friend. :)

James Edward Gray II
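
A hedged sketch of the fork() approach for CPU-bound work (the chunks
and the sum-of-squares job are made up for illustration): spin up one
child per chunk, let the OS run them on separate cores, and collect
each result back through a pipe.

```ruby
chunks = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

pipes = chunks.map do |chunk|
  r, w = IO.pipe
  fork do
    r.close
    # The CPU-bound work runs in the child, on its own core.
    w.write(Marshal.dump(chunk.sum { |n| n * n }))
    w.close
  end
  w.close       # parent keeps only the read end
  r
end

# Marshal lets each child hand back an arbitrary Ruby object.
results = pipes.map { |r| Marshal.load(r.read) }
Process.waitall
puts results.sum   # prints 285
```

Unlike threads, nothing here shares mutable state, so there are no
locks to reason about; the pipes are the only communication channel.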