Asp Forum - Subtle bug: Telnet / socket / thread?

Mark Probert

11/23/2004 1:44:00 AM

Hi.

I was wondering if anyone has advice on how to debug the following
problem. I am using Ruby 1.8.1, establishing telnet sessions whilst in
a thread. Most of the time, it is fine. Every now and then, the
socket.syswrite() in telnet seems to send characters bound for another
socket / thread. Or so it seems.

The trouble is that the issue appears to be timing related. If I turn on
the dump log in Telnet, then the error, and the extra characters on
input, go away (a short-term fix). So, I am not sure how to isolate
the issue.

The problem is reproducible in the sense that I can get the rubbish output
almost every time. Unfortunately, the combination of factors is pretty
complex and the test setup is not easily reproducible.

Any thoughts?

-mark.

7 Answers

Bill Kelly

11/23/2004 2:31:00 AM

Hi,

From: "Mark Probert" <probertm@nospam-acm.org>
>
> I was wondering if anyone has advice on how to debug the following
> problem. I am using Ruby 1.8.1, establishing telnet sessions whilst in
> a thread. Most of the time, it is fine. Every now and then, the
> socket.syswrite() in telnet seems to send characters bound for another
> socket / thread. Or so it seems.

Sorry I can't be of more help.... I just wanted to mention that
I have an application that is a telnet/VT100 server that has
worked reliably on 1.6.8, 1.8.0, 1.8.1, and 1.8.2 now.. However,
I have never used socket.syswrite()... Only #send and #recv...

(I have had an unexpected issue with select() saying "ok" and
then UDPSocket#recv hanging, but I've been able to work around that
in what seems to be a reliable way. (Hundreds of days uptime.))

So anyway - if it's not a drastic change to your code to try
using *only* #send and #recv, it might be worth a try, just to
see whether the problem disappears? . . . Keeping in mind that
send and recv only transmit "up to" the number of bytes you
request, so you'll need to go in a loop if you want to be sure
to transmit/recv the full amount...
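
A minimal sketch of the loop described above, assuming a current Ruby rather than 1.8.1. The helper name send_all is hypothetical (it is not part of the socket library), and the UNIXSocket pair only stands in for a real telnet connection:

```ruby
require 'socket'

# Hypothetical helper: call #send repeatedly until every byte has been
# handed to the socket, since one call may accept only part of the data.
def send_all(sock, data)
  remaining = data.dup
  until remaining.empty?
    sent = sock.send(remaining, 0)            # returns the number of bytes accepted
    remaining = remaining.byteslice(sent..-1) # keep only the unsent tail
  end
  data.bytesize
end

# Small demonstration over a local socket pair.
writer, reader = UNIXSocket.pair
payload = "show version\n"
bytes_sent = send_all(writer, payload)
received = reader.recv(64)
writer.close
reader.close
```

The receiving side needs the same kind of loop if it expects a fixed number of bytes, because #recv may also return fewer bytes than requested.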

Hope this helps,

Regards,

Bill

Mark Probert

11/23/2004 3:19:00 AM

Hi ..

"Bill Kelly" <billk@cts.com> wrote:
>
> From: "Mark Probert" <probertm@nospam-acm.org>
>>
>> I was wondering if anyone has advice on how to debug the following
>> problem. I am using Ruby 1.8.1, establishing telnet sessions whilst
>> in a thread. Most of the time, it is fine. Every now and then, the
>> socket.syswrite() in telnet seems to send characters bound for
>> another socket / thread. Or so it seems.
>
> Sorry I can't be of more help.... Only #send and #recv...
>
Thanks, Bill.

The @sock.syswrite() is the base call in all of Telnet, underlying the
print(), puts(), cmd() and so on. It is the primitive that is called to
send data to the host.

As an update, I managed to get the system to 'fail' with dump-log turned
on. The dump log records that the command is corrupted prior to it
being sent. The code flow looks like:

  puts "sending cmd -- #{c}"   # c is correct here
  @conn.write(c)               # @conn is a Telnet object --> calls Telnet.write()

  def write(string)
    length = string.length
    while 0 < length
      IO::select(nil, [@sock])
      @dumplog.log_dump('>', string[-length..-1]) if @options.has_key?("Dump_log")   # <--- string is bad here!
      length -= @sock.syswrite(string[-length..-1])
    end
  end

So, I can only assume that one of the other threads is writing to
this string. Is this possible? Or is there some other way that the
passed string can become corrupt?

Perplexed ...

-mark.

James Gray

11/23/2004 3:27:00 AM

On Nov 22, 2004, at 9:23 PM, Mark Probert wrote:

> So, I can only assume that one of the other threads is writing to
> this string. Is this possible? Or is there some other way that the
> passed string can become corrupt?

This sounds like the Threads are sharing this String resource. Did it
exist outside of the Threads when you created them? If so, did you
pass into the Thread with something like:

  Thread.new(outer_string) do |thread_local_string|
    # ...
  end

?
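
A small sketch of what that idiom does and does not protect against, assuming a current Ruby. All variable names here are made up for illustration. Passing a value through Thread.new shields the thread from the outer variable being rebound, but the block parameter still refers to the same String object, so in-place changes remain visible to everyone holding it:

```ruby
require 'thread'

gate = Queue.new   # holds both threads until the outer variable is rebound

cmd = "first"
by_arg     = Thread.new(cmd) { |local_cmd| gate.pop; local_cmd }
by_closure = Thread.new { gate.pop; cmd }

cmd = "second"     # rebind the outer variable, then release the threads
2.times { gate << :go }

arg_result     = by_arg.value       # the value captured when the thread was created
closure_result = by_closure.value   # the value of the variable when it was read

# The block parameter is not a copy: mutating it changes the shared object.
shared = "show version".dup
Thread.new(shared) { |s| s << " | more" }.join

# Passing a duplicate isolates the thread from the original String.
isolated = "show version".dup
Thread.new(isolated.dup) { |s| s << " | more" }.join
```

So the parameter form helps when the outer variable is reassigned, while a dup (or a frozen string) is what guards against another piece of code changing the contents.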

James Edward Gray II

Mark Probert

11/23/2004 3:42:00 AM

hi ..
James Edward Gray II <james@grayproductions.net> wrote:
>
>> So, I can only assume that one of the other threads is writing to
>> this string. Is this possible? Or is there some other way that the
>> passed string can become corrupt?
>
> This sounds like the Threads are sharing this String resource. Did it
> exist outside of the Threads when you created them? If so, did you
> pass into the Thread with something like:
>
I don't think so. The telnet object is completely wrapped in a class and
each instance of the class is unique to each thread (it is constructed
inside the thread). There are no class variables. And the variables
leading into the @conn.write() call are all local.

This is tricky 'cause the problem is intermittent.

Thanks,

-mark.

Sam Roberts

11/23/2004 3:44:00 AM

Quoting james@grayproductions.net, on Tue, Nov 23, 2004 at 12:27:25PM +0900:
> On Nov 22, 2004, at 9:23 PM, Mark Probert wrote:
>
> >So, I can only assume that one of the other threads is writing to
> >this string. Is this possible? Or is there some other way that the
> >passed string can become corrupt?

I had a bug recently with strings being corrupted, it was because
a string was shared, and was modified by another piece of code.

I don't know about your app, but perhaps you could freeze the strings
either before you pass them to telnet, or inside telnet, which would
allow you to detect the modifier, if that's what's happening.
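
A rough sketch of the freeze approach, assuming a current Ruby (where modifying a frozen String raises FrozenError; older versions raise a different exception class). The thread below only stands in for whatever unknown code is changing the string:

```ruby
# Freeze the command before handing it to anything else.
cmd = "show config\n".dup.freeze

culprit = Thread.new do
  begin
    cmd << "garbage"   # stand-in for the code that corrupts the string
    nil
  rescue StandardError => e
    e                  # hand the exception back so it can be inspected
  end
end

error = culprit.value

# The exception's backtrace points at the line that attempted the change.
puts "#{error.class}: #{error.message}"
puts error.backtrace.first
```

The string itself stays intact, and the first backtrace line identifies the code that tried to modify it.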

Cheers,
sam

> This sounds like the Threads are sharing this String resource. Did it
> exist outside of the Threads when you created them? If so, did you
> pass into the Thread with something like:
>
> Thread.new(outer_string) do |thread_local_string|
> # ...
> end
>
> ?
>
> James Edward Gray II
>
>

Bill Kelly

11/23/2004 5:34:00 AM

Hi,

From: "Mark Probert" <probertm@nospam-acm.org>
> "Bill Kelly" <billk@cts.com> wrote:
> >
> > Sorry I can't be of more help.... Only #send and #recv...
> >
> The @sock.syswrite() is the base call in all of Telnet, underlying the
> print(), puts(), cmd() and so on. It is the primitive that is called to
> send data to the host.

To the best of my (limited) knowledge, #send and #recv
are lower level than syswrite(). *Assuming* #send and #recv
translate in Ruby to the Berkeley socket system functions
of the same names. (I have not looked at ruby's implementation
of #syswrite and #send / #recv... so.. I could be mistaken. I'm
just judging by their name and corresponding behavior.)

> As an update, I managed to get the system to 'fail' with dump-log turned
> on. The dump log records that the command is corrupted prior to it
> being sent. The code flow looks like:
>
>   puts "sending cmd -- #{c}"   # c is correct here
>   @conn.write(c)               # @conn is a Telnet object --> calls Telnet.write()
>
>   def write(string)
>     length = string.length
>     while 0 < length
>       IO::select(nil, [@sock])
>       @dumplog.log_dump('>', string[-length..-1]) if @options.has_key?("Dump_log")   # <--- string is bad here!
>       length -= @sock.syswrite(string[-length..-1])
>     end
>   end
>
> So, I can only assume that one of the other threads is writing to
> this string. Is this possible? Or is there some other way that the
> passed string can become corrupt?

What if you try freezing "c" ? Maybe c.freeze before the
printout at the top verifying it's correct... Perhaps another
thread is unexpectedly modifying it?

HTH,

Regards,

Bill

James Gray

11/23/2004 2:37:00 PM

On Nov 22, 2004, at 11:33 PM, Bill Kelly wrote:

> To the best of my (limited) knowledge, #send and #recv
> are lower level than syswrite(). *Assuming* #send and #recv
> translate in Ruby to the Berkeley socket system functions
> of the same names. (I have not looked at ruby's implementation
> of #syswrite and #send / #recv... so.. I could be mistaken. I'm
> just judging by their name and corresponding behavior.)

My big confusion stems from both of these approaches. Isn't one of the
big advantages of using a thread design that you can use the higher
level IO calls without fear of blocking?

> What if you try freezing "c" ? Maybe c.freeze before the
> printout at the top verifying it's correct... Perhaps another
> thread is unexpectedly modifying it?

This is really the key to the solution, whether through freezing or
not. If the String is being modified externally, you have to isolate
the chunk of your code doing it. Ruby doesn't randomly modify the
contents of your variables, we hope.

James Edward Gray II

comp.lang.ruby
