signalling between threads using socket read/write calls causing a hang

avik.ghosh

8/10/2004 2:50:00 AM

Hello,

I need help in fixing a race condition involving pthread calls.

I have an application that creates two threads, say Th#1 and Th#2
(besides the thread manager ).

Th#2, which is created using pthread_create by Th#1 ( the main
thread), manages various sockets using 'select' - one of these being a
socket ( not a pipe ) whose peer is held by Th#1.

When Th#1 has data to be sent out, it fills a buffer and sends out a
message of a few bytes onto the signalling socket. It then does a
blocking wait on a pipe.

When Th#2 returns from 'select' indicating data on the signalling
socket, it flushes the data in the common buffer and then sends a
'done' signal on the pipe.

Basically, signals are sent in both ways, over a socket in one
direction and over a pipe in the other. ( The above application works
on Windows as well, with appropriate wrappers round the pthread calls
for thread creation and lock handling. This is why Th#2 reads the
request signal over a socket and not a pipe - it has to be selectable.
)
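For readers who want to picture the handshake described above, here is a minimal sketch in C. It is not the poster's code: the struct, the function names, the one-byte "R" and "D" tokens, and the use of socketpair() for the signalling socket are all assumptions made for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical names throughout; only the shape of the handshake matters. */
struct channel {
    int sig_sock[2];   /* socketpair: [0] written by Th#1, [1] selected on by Th#2 */
    int done_pipe[2];  /* pipe: [0] read by Th#1, [1] written by Th#2 */
    char shared[256];  /* the common buffer, filled by Th#1 */
    size_t shared_len;
    char sent[256];    /* Th#2's private copy */
    size_t sent_len;
};

/* Th#2: wait in select() for a request byte, copy the buffer, acknowledge. */
static void *event_loop(void *arg)
{
    struct channel *ch = arg;
    for (;;) {
        fd_set rfds;
        FD_ZERO(&rfds);
        FD_SET(ch->sig_sock[1], &rfds);
        if (select(ch->sig_sock[1] + 1, &rfds, NULL, NULL, NULL) <= 0)
            break;
        char token;
        if (read(ch->sig_sock[1], &token, 1) != 1)
            break;                                  /* EOF or error: stop */
        memcpy(ch->sent, ch->shared, ch->shared_len);
        ch->sent_len = ch->shared_len;
        if (write(ch->done_pipe[1], "D", 1) != 1)   /* the 'done' signal */
            break;
    }
    return NULL;
}

/* Th#1: fill the common buffer, send a request byte, block until 'done'. */
static int send_and_wait(struct channel *ch, const char *data, size_t len)
{
    assert(len <= sizeof ch->shared);
    memcpy(ch->shared, data, len);
    ch->shared_len = len;
    if (write(ch->sig_sock[0], "R", 1) != 1)
        return -1;
    char ack;
    return read(ch->done_pipe[0], &ack, 1) == 1 ? 0 : -1;
}

/* Set up both channels, start Th#2, and run `rounds` exchanges.
 * Returns the number of completed exchanges, or -1 on failure. */
static int run_exchanges(int rounds)
{
    struct channel ch;
    memset(&ch, 0, sizeof ch);
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, ch.sig_sock) != 0)
        return -1;
    if (pipe(ch.done_pipe) != 0)
        return -1;
    pthread_t th2;
    if (pthread_create(&th2, NULL, event_loop, &ch) != 0)
        return -1;
    int ok = 0;
    while (ok < rounds && send_and_wait(&ch, "hello", 5) == 0)
        ok++;
    close(ch.sig_sock[0]);          /* EOF makes Th#2 leave its loop */
    pthread_join(th2, NULL);
    if (ok > 0 && (ch.sent_len != 5 || memcmp(ch.sent, "hello", 5) != 0))
        ok = -1;
    close(ch.sig_sock[1]);
    close(ch.done_pipe[0]);
    close(ch.done_pipe[1]);
    return ok;
}
```

Calling run_exchanges(1000) performs a thousand request/acknowledge round trips. Because Th#1 always waits for 'done' before touching the common buffer again, the two threads never use the buffer at the same time in this sketch.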

What I observe is that when I have a large number of messages to send,
strace shows that Th#2 hangs in rt_sigsuspend(). Sometimes this
happens after several thousand iterations, sometimes after only a few
hundred. At this point, it seems that Th#1 has filled the buffer, sent
out the signal message on the socket, and is waiting on the 'done'
message on the pipe, but Th#2 is stuck in rt_sigsuspend().

I do not have any signal handlers in my application, besides ignoring
SIGPIPE.

I do use several pthread_mutex objects to lock various shared data
structures, but I don't think that they could be the problem.

The box runs RedHat 7.3 and the version of Linux is 2.4.20-18.7bigmem.
The glibc, libpthread are all as packaged by RedHat.

Is it wrong to use 'read'/'write' calls on pipes/sockets in this way ?

Am I supposed to take any precautions, like masking certain signals
etc ?

Thanks,

Avik.


Joe Seigh

8/10/2004 10:45:00 AM

Avik Ghosh wrote:
>
> Hello,
>
> I need help in fixing a race condition involving pthread calls.
....
> When Th#2 returns from 'select' indicating data on the signalling
> socket, it flushes the data in the common buffer and then sends a
> 'done' signal on the pipe.
>
...
> )
>
> What I observe is that when I have a large number of messages to send,
> strace shows that Th#2 hangs in rt_sigsuspend(). Sometimes this
> happens after several thousand iterations, sometimes after only a few
> hundred. At this point, it seems that Th#1 has filled the buffer, sent
> out the signal message on the socket, and is waiting on the 'done'
> message on the pipe, but Th#2 is stuck in rt_sigsuspend().
>

What do you mean flush the data in the common buffer?

Joe Seigh

avik.ghosh

8/10/2004 2:16:00 PM

Joe Seigh <jseigh_01@xemaps.com> wrote in message news:<4118A958.34EABF79@xemaps.com>...
> What do you mean flush the data in the common buffer?
>
> Joe Seigh

Hello Joe,

Thanks for looking into the problem I am facing.

Th#2 copies the data from the common buffer into another buffer which
only it operates. It then attempts to write out this data onto one of
the sockets that it manages. If only part of the data is sent, it
inserts the socket into the select fd_set for writing. It then signals
Th#1 by sending the 'done' message on the pipe, to indicate that it
has 'flushed the buffer', i.e., it has handled the data.

By 'buffer', I mean a simple struct which has a malloc()ed char *
pointer, and integers to indicate the current length and the malloc
size.

I will try to run the same application on Solaris today to see if I
face the same race condition.

Regards,

Avik.

Joe Seigh

8/10/2004 2:29:00 PM

Avik Ghosh wrote:
>
> Joe Seigh <jseigh_01@xemaps.com> wrote in message news:<4118A958.34EABF79@xemaps
> > What do you mean flush the data in the common buffer?
> >
>
> Th#2 copies the data from the common buffer into another buffer which
> only it operates. It then attempts to write out this data onto one of
> the sockets that it manages. If only part of the data is sent, it
> inserts the socket into the select fd_set for writing. It then signals
> Th#1 by sending the 'done' message on the pipe, to indicate that it
> has 'flushed the buffer', i.e., it has handled the data.
>
> By 'buffer', I mean a simple struct which has a malloc()ed char *
> pointer, and integers to indicate the current length and the malloc
> size.
>
> I will try to run the same application on Solaris today to see if I
> face the same race condition.
>

You're talking about messages, but read() and write() don't operate on
messages; they operate on a byte stream. There's nothing wrong with using
read or write as long as you realize there is no correspondence between the
sizes of the data you write and the sizes that you read, except that the sum
of what you read is always less than or equal to the sum of what you write.

Joe Seigh
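
To illustrate the byte-stream point, a reader that expects a fixed number of bytes has to loop until it has them all. The helper below is a minimal sketch; the name read_full is made up for this example and does not come from the thread.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read exactly `len` bytes unless EOF or an error
 * intervenes. Returns the number of bytes actually read (which is less
 * than `len` only at EOF), or -1 on error. */
static ssize_t read_full(int fd, void *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    char *p = buf;
    size_t got = 0;
    while (got < len) {
        ssize_t n = read(fd, p + got, len - got);
        if (n == 0)
            break;                  /* EOF: the writer closed its end */
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            return -1;
        }
        got += (size_t)n;           /* a short read is normal: keep going */
    }
    return (ssize_t)got;
}
```

Whether the writer produced the bytes with one write() or several makes no difference to this loop, which is the property being described above.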

avik.ghosh

8/10/2004 6:23:00 PM

Joe Seigh <jseigh_01@xemaps.com> wrote in message news:<4118DDE5.53F678E6@xemaps.com>...
> You're talking about messages, but read() and write() don't operate on
> messages; they operate on a byte stream. There's nothing wrong with using
> read or write as long as you realize there is no correspondence between the
> sizes of the data you write and the sizes that you read, except that the sum
> of what you read is always less than or equal to the sum of what you write.
>
> Joe Seigh

Hi,

Sorry, I should have been a bit more clear. A 'message' is just a
stream of bytes, as you mention. It is a message from the point of
view of the application layer on top, complete with header etc. The
Th#1 and Th#2 ( which are fast becoming old acquaintances ) that I
mention only know bytestreams.

In a nutshell, the design is this :

Th#2 has a number of sockets to read and write data from. One of these
is a socket whose peer is Th#1, and when a brief byte sequence is read
from this socket, Th#2 knows to copy data from a buffer ( into which
data has been copied by Th#1 prior to sending the byte sequence ) to
another buffer and to acknowledge receipt to Th#1 by writing another
byte sequence onto a pipe. This data is then written out onto a socket
as part of Th#2's standard processing loop.

I should mention that the Th#2 loop is part of a standard messaging
library that has been in production for years, and is quite stable.
Only standard read/write/select calls are used. It runs on Solaris,
Linux and Windows, so there is no special Unix magic involved.

In the application that I mention, I have encapsulated the main event
loop ( Th#2 ) in a thread. The communication between this event loop
thread and the main thread is using the socket/pipe combination that I
have described.

I compile using _REENTRANT for good measure, besides -Wall,
-Wmissing-prototypes and other switches.

Am I correct in assuming read() and write(), along with select() can
be safely used with pthreads ? Do I have to do something special, like
masking signals ? ( as I mention, I do not handle any signals, other
than ignoring SIGPIPE )

One thing I noticed about the strace output :

When Th#1 is sending a stream of messages to Th#2 ( signalling back
and forth as above )

I see several calls to kill(pid, RTMIN) ( where pid is the process id
of Th#2 ) interspersed with the write() and read() calls to the socket
and pipe respectively. This does not seem to cause any problem, as the
application continues correctly.

However, the application hangs the moment Th#2 calls rt_sigsuspend(),
immediately following a successful call to rt_sigprocmask(SIG_SETMASK,
NULL, [RTMIN], 8)

I feel I must be missing something obvious, like compiling without
using some special flags or something.

Thanks again for your interest.

Avik.

Joe Seigh

8/10/2004 7:39:00 PM

Avik Ghosh wrote:

> Am I correct in assuming read() and write(), along with select() can
> be safely used with pthreads ? Do I have to do something special, like
> masking signals ? ( as I mention, I do not handle any signals, other
> than ignoring SIGPIPE )

You're not assuming that because you did two writes to a socket that you
will get two reads from the socket? Messages can be combined.

Joe Seigh
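
One way to cope with combined writes, sketched here under the assumption that each request is a single byte, is to make the signalling socket non-blocking and count every byte that is pending instead of assuming one read per write. The function names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: switch a descriptor to non-blocking mode. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    return flags < 0 ? -1 : fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Hypothetical helper: consume everything currently pending on a
 * non-blocking descriptor and return how many one-byte tokens were
 * seen, or -1 on error. Two writes may arrive as one read of two
 * bytes, so the bytes are counted rather than the read() calls. */
static int drain_tokens(int fd)
{
    assert(fd >= 0);
    char buf[64];
    int tokens = 0;
    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0) {
            tokens += (int)n;
            continue;
        }
        if (n == 0)
            break;                              /* peer closed */
        if (errno == EINTR)
            continue;
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            break;                              /* nothing left for now */
        return -1;
    }
    return tokens;
}
```

After select() reports the socket readable, the event loop would call drain_tokens() once and service that many requests, however the bytes happened to be grouped on the way.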

avik.ghosh

8/11/2004 12:40:00 AM

Right, I have found the problem, and, thankfully, it is in my code.

What I had omitted to mention, is that the event processing thread (
Th#2 ) handles application level timers ( also done through select )
as well as sockets.

In my application, there is a timer which kicks in every now and then.
This timer obtains a lock, does some processing and releases it.

The race occurs when a large number of exchanges are taking place
between Th#1 and Th#2 as described above, and the timer mentioned
above expires just when Th#1 has signalled, but Th#2 has not yet been
woken up from select().

The select() call returns because Th#1 has signalled that data has to
be sent, but by this time the timer is also due. The timer code runs
first and deadlocks trying to acquire the lock, which is held by Th#1.
Th#1, meanwhile, is blocked waiting for the 'done' message on the pipe,
which Th#2 never gets to send.
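
The post does not say how the code was changed. One possible way to keep the event loop from blocking in a timer callback, shown here only as a sketch with made-up names, is to use pthread_mutex_trylock() and postpone the timer work when the lock is busy.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical lock and counter standing in for the application's
 * shared state; neither name comes from the original post. */
static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
static int timer_runs = 0;

/* Timer callback for the event loop thread. Returns true if the timer
 * work was done, false if the lock was busy and the caller should
 * re-arm the timer and go on to service its sockets. */
static bool on_timer(void)
{
    int rc = pthread_mutex_trylock(&state_lock);
    if (rc == EBUSY)
        return false;               /* never block the event loop here */
    assert(rc == 0);
    timer_runs++;                   /* the periodic processing goes here */
    pthread_mutex_unlock(&state_lock);
    return true;
}
```

With this shape, Th#2 can still read the request and write 'done' while Th#1 holds the lock. Another option, also only a suggestion, is for Th#1 to release the lock before it blocks on the pipe.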

Sorry about the false alarm - but I feel more confident about my
entire application after crawling all over it since yesterday !

Thanks,

Avik.

comp.programming.threads
