comp.lang.ruby

Deadlock in DRb

Lars Christensen

4/22/2008 11:16:00 AM

In a program with two DRb servers running (start_service called twice), I
get the following deadlock after a while of running with a client
connecting to both servers:

deadlock 0x284c748: sleep:J(0x2c84f7c) (main) - server.rb:54
deadlock 0x2c84f7c: sleep:F(4) - c:/lang/ruby/lib/ruby/1.8/drb/drb.rb:944
deadlock 0x2d01338: sleep:F(5) - c:/lang/ruby/lib/ruby/1.8/drb/drb.rb:566
deadlock 0x2c854cc: sleep:F(3) - c:/lang/ruby/lib/ruby/1.8/drb/drb.rb:944
deadlock 0x2cff81c: sleep:S - c:/lang/ruby/lib/ruby/1.8/drb/drb.rb:626
c:/lang/ruby/lib/ruby/1.8/drb/drb.rb:626: Thread(0x2cff81c): deadlock (fatal)

How can I debug this issue? I don't understand why it is a deadlock at
all, since drb.rb:944 is a call to Socket#accept, which does not
depend purely on other Ruby threads.

Any ideas?

Lars
5 Answers

Robert Klemme

4/22/2008 1:46:00 PM

2008/4/22, Lars Christensen <larsch@belunktum.dk>:
> In a program with two DRb servers running (start_service called twice), I

Why do you have two servers?

> get the following deadlock after a while of running with a client
> connecting to both servers:
>
> deadlock 0x284c748: sleep:J(0x2c84f7c) (main) - server.rb:54
> deadlock 0x2c84f7c: sleep:F(4) - c:/lang/ruby/lib/ruby/1.8/drb/drb.rb:944
> deadlock 0x2d01338: sleep:F(5) - c:/lang/ruby/lib/ruby/1.8/drb/drb.rb:566
> deadlock 0x2c854cc: sleep:F(3) - c:/lang/ruby/lib/ruby/1.8/drb/drb.rb:944
> deadlock 0x2cff81c: sleep:S - c:/lang/ruby/lib/ruby/1.8/drb/drb.rb:626
> c:/lang/ruby/lib/ruby/1.8/drb/drb.rb:626: Thread(0x2cff81c): deadlock (fatal)
>
> How can I debug this issue? I don't understand why it is a deadlock at
> all, since drb.rb:944 is a call to Socket#accept, which does not
> depend purely on other Ruby threads.
>
> Any ideas?

For a deadlock you need at least two resources that are locked in
different order. Maybe you have synchronized calls across the two
servers that deadlock.

You could use set_trace_func to trace program execution until the
deadlock and look at the execution flow.
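
As a rough illustration of the set_trace_func idea, here is a minimal sketch. The work method and the event filter are made up for the example; a real run would install the hook before starting the DRb service and would probably write to a file instead of an array.

```ruby
# Collect trace events in memory so they can be inspected after the fact.
events = []

tracer = proc do |event, file, line, id, _binding, classname|
  # Keep only method calls and returns to limit the volume of output.
  next unless %w[call return c-call c-return].include?(event)
  events << format("%-8s %s:%d %s#%s", event, file, line, classname, id)
end

# Hypothetical method standing in for the application code being traced.
def work(n)
  n * 2
end

set_trace_func(tracer)
result = work(21)
set_trace_func(nil) # passing nil switches tracing off again

puts events.last(10)
```

The last few entries before a hang are usually the interesting ones, which is why the sketch prints only the tail of the log.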

Kind regards

robert


--
use.inject do |as, often| as.you_can - without end

Lars Christensen

4/23/2008 12:46:00 PM

On Apr 22, 3:45 pm, Robert Klemme <shortcut...@googlemail.com> wrote:
> 2008/4/22, Lars Christensen <lar...@belunktum.dk>:
>
> > In a program with two DRb servers running (start_service called twice), I
>
> Why do you have two servers?

Well... legacy. I have converted my application to having only 1 DRb
service started, but the same problem occurs. I still get a deadlock
after the clients have been connecting for a while.
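
For readers unfamiliar with the setup, a single-service arrangement looks roughly like the sketch below. This is not my actual code: the Counter class, the loopback address and the use of port 0 are assumptions made for illustration, and the client lives in the same process only to keep the example self-contained.

```ruby
require "drb/drb"

# Hypothetical front object; the real application object is not shown in this thread.
class Counter
  def initialize
    @n = 0
    @lock = Mutex.new
  end

  # Guard the counter, since DRb serves each request on its own thread.
  def increment
    @lock.synchronize { @n += 1 }
  end
end

# Port 0 asks the OS for a free port; DRb.uri then reports the real address.
DRb.start_service("druby://127.0.0.1:0", Counter.new)
uri = DRb.uri

# A client would normally run in another process and connect with this URI.
remote = DRbObject.new_with_uri(uri)
3.times { remote.increment }
value = remote.increment
puts "server at #{uri}, counter = #{value}"

# A long-running server would block on DRb.thread.join here instead of stopping.
DRb.stop_service
```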

> For a deadlock you need at least two resources that are locked in
> different order.  Maybe you have synchronized calls across the two
> servers that deadlock.

My main thread is blocked by DRb.thread.join. All other threads are
inside the DRb library on either Socket#accept, #read or #write.
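
One way to confirm where each thread is parked is to dump every thread's status and top stack frame. The sketch below assumes a current Ruby (Thread#backtrace is not available on 1.8), and the worker blocked on a Queue is only a stand-in for a thread waiting in accept or read.

```ruby
queue = Queue.new

# Stand-in for a DRb thread blocked in Socket#accept or #read.
worker = Thread.new { queue.pop }

# Wait until the worker has actually gone to sleep on the empty queue.
sleep 0.01 until worker.status == "sleep"

# One line per live thread: identity, status, and the innermost frame if any.
report = Thread.list.map do |t|
  where = (t.backtrace || []).first || "(no backtrace)"
  "#{t.inspect} status=#{t.status} at #{where}"
end
puts report

# Release the worker so the example terminates cleanly.
queue << :done
worker.join
```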

How can there be a deadlock if a thread is waiting in a Socket#accept
call? As I understand it, Ruby's deadlock detection simply fires when
there is no thread to run.

> You could use set_trace_func to trace program execution until the
> deadlock and look at the execution flow.

I have tried this, but it doesn't show anything other than the
deadlock report from Ruby, i.e. that the threads are calling
Socket#accept, #read or #write and Thread#join.

Lars

Robert Klemme

4/23/2008 1:11:00 PM

2008/4/23, Lars Christensen <larsch@belunktum.dk>:
> On Apr 22, 3:45 pm, Robert Klemme <shortcut...@googlemail.com> wrote:
> > 2008/4/22, Lars Christensen <lar...@belunktum.dk>:
>
> > > In a program with two DRb servers running (start_service called twice), I
> >
> > Why do you have two servers?
>
> Well... legacy. I have converted my application to having only 1 DRb
> service started, but the same problem occurs. I still get a deadlock
> after the clients have been connecting for a while.

Too bad.

> > For a deadlock you need at least two resources that are locked in
> > different order. Maybe you have synchronized calls across the two
> > servers that deadlock.
>
> My main thread is blocked by DRb.thread.join. All other threads are
> inside the DRb library on either Socket#accept, #read or #write.

And, are there any locks held?

> How can there be a deadlock if a thread is waiting in a Socket#accept
> call? As I understand it, Ruby's deadlock detection simply fires when
> there is no thread to run.
>
> > You could use set_trace_func to trace program execution until the
> > deadlock and look at the execution flow.
>
> I have tried this, but it doesn't show anything other than the
> deadlock report from Ruby, i.e. that the threads are calling
> Socket#accept, #read or #write and Thread#join.

These issues are next to impossible to debug without access to code
and an understanding of what the app really does. I'm afraid, I can't
help you further right now.

Kind regards

robert

--
use.inject do |as, often| as.you_can - without end

Ezra Zygmuntowicz

4/23/2008 6:20:00 PM



What version and patch level of Ruby do you have? If you have Ruby
1.8.6 and the patch level is less than p111, then you have a faulty
Ruby interpreter with broken threading that can cause these deadlocks.
Make sure you are using Ruby 1.8.5, or at least Ruby 1.8.6p111.
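
A quick way to check this from inside the interpreter is sketched below. The helper name is made up for the example, and p111 is used as the threshold for 1.8.6 as stated above.

```ruby
# Hypothetical helper: true only for 1.8.6 builds older than patch level 111.
def affected_186?(version, patchlevel)
  version == "1.8.6" && patchlevel.is_a?(Integer) && patchlevel < 111
end

# RUBY_PATCHLEVEL may be missing on very old interpreters, hence the guard.
patch = defined?(RUBY_PATCHLEVEL) ? RUBY_PATCHLEVEL : nil

puts "ruby #{RUBY_VERSION} patchlevel #{patch.inspect}"
puts "this build falls in the affected 1.8.6 range" if affected_186?(RUBY_VERSION, patch)
```

From the shell, ruby -v reports the same version and patch level.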

Cheers-


- Ezra Zygmuntowicz
-- Founder & Software Architect
-- ezra@engineyard.com
-- EngineYard.com


Lars Christensen

4/24/2008 12:12:00 PM

On Apr 23, 8:19 pm, Ezra Zygmuntowicz <ezmob...@gmail.com> wrote:
> What version and patch level of Ruby do you have? If you have Ruby
> 1.8.6 and the patch level is less than p111, then you have a faulty
> Ruby interpreter with broken threading that can cause these deadlocks.
> Make sure you are using Ruby 1.8.5, or at least Ruby 1.8.6p111.

I had the same problem with 1.8.6p111. I finally tracked it down to a
bug in Process.create from the 'win32-process' gem. Some code added to
this function after version 0.5.5 would call CloseHandle on something
that was not a handle but a process or thread ID. When such an ID
happened to have the same value as a socket handle or similar, the
process would sometimes deadlock, sometimes simply close a listening
socket, fail in Socket#accept, or go into infinite loops.

http://rubyforge.org/tracker/index.php?func=detail&aid=19753&group_id=85&am....

I was able to work around it by setting :close_handles => false in the
call to Process.create.
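
Roughly, the workaround looks like the sketch below. The program name is a placeholder, and the snippet only calls into the gem when run on Windows; I am assuming the gem's usual :app_name key and the process_id field of the returned struct here, so check them against the version you have installed.

```ruby
# Options passed to win32-process's Process.create.
options = {
  :app_name      => "notepad.exe", # placeholder child program, not from my application
  :close_handles => false          # skip the automatic CloseHandle calls in Process.create
}

if Gem.win_platform?
  require "win32/process" # third-party gem, only usable on Windows
  info = Process.create(options)
  puts "started pid #{info.process_id}"
else
  puts "not on Windows, skipping Process.create"
end
```

With the handles left open, it is presumably up to the caller to release them once the child is no longer needed.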

Lars