comp.lang.ruby

DRb communication problem and crash

Laurent

10/12/2007 6:01:00 PM

Hi,

First, a little context on what I'm trying to do.
I have a Rails app that needs quite a bit of computation, and I want to run
the different queries in a number of separate processes. To do so, I'm
trying to implement the following system:

Rails --> DRb Query Dispatcher --> DRb Query Runner

Rails sends a job to the query dispatcher, which load-balances the jobs
over several query runners.
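
The thread does not show the poster's code, so the following is only a minimal sketch of what a round-robin dispatcher of this shape might look like. The class and method names (QueryRunner, QueryDispatcher, run, dispatch) are hypothetical, and the runners here are plain local objects rather than remote DRb proxies.

```ruby
# Hypothetical names throughout; the poster's real classes are not shown.
class QueryRunner
  attr_reader :name

  def initialize(name)
    @name = name
  end

  # Stand-in for the real, expensive computation.
  def run(query)
    "#{name} ran #{query}"
  end
end

class QueryDispatcher
  def initialize(runners)
    @runners = runners
    @next = 0
    # Concurrent clients can call dispatch from separate threads,
    # so the round-robin counter is guarded by a mutex.
    @lock = Mutex.new
  end

  # Pick the next runner in round-robin order, then run the query
  # outside the lock so slow queries do not block other callers.
  def dispatch(query)
    runner = @lock.synchronize do
      chosen = @runners[@next]
      @next = (@next + 1) % @runners.size
      chosen
    end
    runner.run(query)
  end
end

dispatcher = QueryDispatcher.new(
  [QueryRunner.new("runner-1"), QueryRunner.new("runner-2")]
)
results = %w[q1 q2 q3].map { |query| dispatcher.dispatch(query) }
results.each { |line| puts line }
```

In a DRb setup, the dispatcher would presumably be exposed with DRb.start_service(uri, dispatcher) and the runners would be DRbObject.new_with_uri(uri) proxies; the dispatch logic itself would stay the same.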

The whole system works and then suddenly hangs. When it hangs, I get the
following messages on the DRb Query Dispatcher:

message type 0x54 arrived from server while idle
message type 0x44 arrived from server while idle
message type 0x43 arrived from server while idle
message type 0x5a arrived from server while idle

After that I can see that some of the queries are still making progress,
until everything freezes completely.

Has anyone encountered similar problems? Or does anyone know where I could
find out what these messages mean?

Thanks a lot!
I'll be glad to give more information if needed.

PS: I tried to implement this with the Slave library but ran into
even more trouble, with logs that made no sense (it looked like
memory corruption somewhere).
--
Posted via http://www.ruby-....

8 Answers

Eric Hodel

10/12/2007 8:27:00 PM

On Oct 12, 2007, at 11:00 , Laurent Francioli wrote:
> So first of all a little context of what I'm trying to do.
> I have a Rails app that needs quite a bit of computation and I want to run
> the different queries in a number of different processes. To do so, I'm
> trying to implement the following system:
>
> Rails --> DRb Query Dispatcher --> DRb Query Runner
>
> Rails sends a job to the query dispatcher which load balances the jobs
> over several query runners.
>
> The whole system works and then suddenly hangs. When it hangs I get the
> following message on the DRb Query Dispatcher:
>
> message type 0x54 arrived from server while idle
> message type 0x44 arrived from server while idle
> message type 0x43 arrived from server while idle
> message type 0x5a arrived from server while idle

I don't see where this message could be coming from in DRb or Ruby.

> Then I can see that there is still some action for some of the queries
> until it freezes completely.
>
> Did anyone encounter similar problems? Or knows where I could find at
> least the meaning of these messages?

Grep your code for 'while idle'; that will help.

--
Poor workers blame their tools. Good workers build better tools. The
best workers get their tools to do the work for them. -- Syndicate Wars



Laurent

10/14/2007 9:27:00 PM

Hi,

Thanks for your quick answer! The message definitely doesn't come
from my code. If it really doesn't come from DRb or Ruby either, maybe it
is a system message?

Also, I've read the slides from your Seattle.rb presentation, and one of
them seems to say that ACLs shouldn't be used and could cause
deadlocks. Is that right? I'm asking because we use one in our system
to restrict accepted calls to localhost only.

Another thing I noticed is that my version that doesn't use the Slave lib
actually shows the same behavior (variable mix-ups, etc.). It
looks like the communication between the server and the clients is
having trouble. I also noticed that the problems occur more often as the
number of running servers increases.
I hope that helps a bit...

I'll keep you posted if I get new clues or, even better, a fix!

Thanks!
Laurent


Laurent

10/15/2007 4:50:00 PM

So I finally found the problem! The messages I reported actually came
from Postgres.

The problem was that I had an open connection to the DB at the moment of the
fork (both when using the Slave lib and with my own forking code). It seems that
the connection was passed on to the child processes and interfered with their
access to the DB. I'm not 100% sure why, since the child processes
were creating their own connections anyway.
But I'm sure that was the cause, since everything is completely stable now!
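
The hex codes in the logged messages fit this diagnosis. Read as ASCII they are T, D, C and Z, which are the type bytes of four backend messages in the PostgreSQL wire protocol. A small sketch that decodes them (the lookup table is deliberately limited to the four codes seen in this thread, and the helper name is made up for illustration):

```ruby
# Type bytes of the four PostgreSQL backend messages seen in the log.
# This table is intentionally incomplete.
PG_BACKEND_MESSAGES = {
  "T" => "RowDescription",
  "D" => "DataRow",
  "C" => "CommandComplete",
  "Z" => "ReadyForQuery"
}.freeze

# Hypothetical helper: turn a hex string such as "0x54" into the
# protocol's type character and its message name.
def decode_message_type(hex)
  char = Integer(hex, 16).chr
  [char, PG_BACKEND_MESSAGES.fetch(char, "unknown")]
end

decoded = %w[0x54 0x44 0x43 0x5a].map { |hex| decode_message_type(hex) }
decoded.each { |char, name| puts "#{char} #{name}" }
```

Together these four messages make up a complete query result, which suggests that the process logging them received the answer to a query it had not sent itself.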

Thanks a lot for your quick reply!
Laurent


Eric Hodel

10/15/2007 6:37:00 PM

On Oct 14, 2007, at 14:27 , Laurent Francioli wrote:
> Thanks for your quick answer! Well the message definitely doesn't come
> from my code. If it really doesn't come from DRb or Ruby, maybe it
> is a system message?
>
> Also, I've read your Seattle.rb presentation slides and on one of your
> slides you seem to say that ACL shouldn't be used and could cause
> deadlocks; is this right? I'm asking because we're using it in our system
> to restrict the accepted calls to the localhost only.

I don't recall, which URL?




Eric Hodel

10/15/2007 6:38:00 PM

On Oct 15, 2007, at 09:50 , Laurent Francioli wrote:

> So I finally found the problem! The message I reported actually came
> from Postgres.
>
> The problem was that I had a connection to the DB at the moment of the
> fork (both using the Slave lib and my own forking stuff). It seems that
> this was somehow passed onto the child processes and interfered with the
> child access to the DB. I'm not 100% sure why since the child processes
> were actually creating their own connections anyway.
> But I'm sure it came from there though since it is completely stable now!

If it still had the file descriptor open, it would be copied.
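
To illustrate the point about copied file descriptors, here is a minimal sketch that assumes a Unix-like system where fork is available. A UNIXSocket pair stands in for the database connection (no real Postgres is involved); the child never sends a request, yet it can read a reply addressed to the parent because it inherited the parent's end of the socket.

```ruby
require "socket"

# "server" plays the database; "conn" plays the client connection that
# was already open at the moment of the fork.
server, conn = UNIXSocket.pair

# Pipe used only to report back what the child managed to read.
reader, writer = IO.pipe

pid = fork do
  reader.close
  server.close
  # The child sent nothing, but the inherited descriptor lets it consume
  # data that was meant for the parent.
  writer.write(conn.readpartial(64))
  writer.close
  exit!(0)
end
writer.close

# The "database" answers a request the parent supposedly made.
server.write("reply meant for the parent")

stolen = reader.read
Process.wait(pid)
puts "child received: #{stolen.inspect}"
```

With a real database driver the effect is the same in both directions: two processes sharing one socket can each receive the other's replies. The usual remedies are to open the connection only after forking, or to close the inherited one in the child before connecting again.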




Laurent

10/16/2007 4:45:00 PM


> I don't recall, which URL?

OK, it's really old:
http://blog.segment7.net/articles/2006/04/22/drb-an-introduction-an...
As I said earlier, since I only had the slides and not the commentary
that went with them, I couldn't be sure :)

By the way, really nice presentation! I find it pretty difficult to get good
docs on DRb, and that's a great piece!

Thanks,
Laurent

Eric Hodel

10/16/2007 7:33:00 PM

On Oct 16, 2007, at 09:44 , Laurent Francioli wrote:
>> I don't recall, which URL?
>
> Ok, it's really
> old...http://blog.segment7.net/articles/2006/04/...
> introduction-and-overview
> and as I said earlier, since I only had the slides and not the comment
> on them I couldn't be sure :)
>
> Btw, really nice presentation! I find it pretty difficult to get good
> doc on DRb and that's a great piece!

Ah, even with an ACL it is still possible for people to do bad stuff
to your DRb processes. ACLs by themselves won't cause deadlocks, but
they can't prevent malice.
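
For reference, a localhost-only ACL of the kind discussed here might be set up as in the sketch below. It uses the ACL class from drb/acl; the two sample peer addresses are made up for the check, and the thread does not show the poster's actual configuration.

```ruby
require "drb"
require "drb/acl"

# Deny everyone, then allow only the loopback address.
acl = ACL.new(%w[deny all allow 127.0.0.1])

# Sample peers in the layout returned by Socket#peeraddr:
# [family, port, hostname, ip]. Both entries are illustrative.
local_peer  = ["AF_INET", 12345, "localhost", "127.0.0.1"]
remote_peer = ["AF_INET", 12345, "example.invalid", "192.0.2.10"]

local_allowed  = acl.allow_addr?(local_peer)
remote_allowed = acl.allow_addr?(remote_peer)
puts "local allowed:  #{local_allowed}"
puts "remote allowed: #{remote_allowed}"

# Make this the default ACL for DRb servers started afterwards.
DRb.install_acl(acl)
```

As the reply says, this only limits who can connect. Any local process that is allowed through can still call whatever the front object exposes.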




Laurent

10/17/2007 8:04:00 AM

Eric Hodel wrote:
> On Oct 16, 2007, at 09:44 , Laurent Francioli wrote:
>> doc on Drb and that's a great piece!
> Ah, even with an ACL it is still possible for people to do bad stuff
> to your DRb processes. ACLs by themselves won't cause deadlocks, but
> they can't prevent malice.

Ok, thanks for the explanation! :)