Asp Forum - Re: read and readline hanging

Olivier Lefevre

1/25/2008 4:31:00 PM

Hi Steve,

Thanks for the answer. Yes this is tricky. I have done it in Java
before, where you can, e.g., set up a thread to pump stuff out of
both stderr and stdout continuously but my python is too rudimentary
for that. There is a key difference anyway: in Java you can write

while (br.readLine() != null) { <pump> }

where br is the buffered reader in which you've wrapped the stdout
of the child process and it will not hang. But in python eventually
stdout.readline() hangs. This is a real nuisance: why can't it just
return None?

> 1. The subprocess has stopped producing output.

Indeed, if I do this interactively, I can tell after 3 lines that I've
gotten all there is to get right now and the fourth readline() call
hangs. But how can I find out *programmatically* that there is no more
input?

> If you are only reading its standard output, are you sure that the
> subprocess is flushing its buffers so you can recognize it's time to
> provide more input?

The subprocess is in a purely passive position: it is an interpreter: I
send it commands and I read the answers. The python side is in charge.

> 2. The subprocess has blocked because it has filled its stderr buffer
> and is waiting for something (i.e. your program) to read it.

No, that's not it or it would hang immediately. I can get a few lines
out of stdout and then I hang because I can't tell when it's time to stop
pumping. But you are right that if there was something on stderr I would
be in trouble.

Regards,

-- O.L.

11 Answers

Marc 'BlackJack' Rintsch

1/25/2008 8:32:00 PM

On Fri, 25 Jan 2008 17:31:16 +0100, Olivier Lefevre wrote:

> Thanks for the answer. Yes this is tricky. I have done it in Java
> before, where you can, e.g., set up a thread to pump stuff out of
> both stderr and stdout continuously but my python is too rudimentary
> for that.

The `trheading` module is modeled after Java's threading API.

> There is a key difference anyway: in Java you can write
>
> while (br.readLine() != null) { <pump> }
>
> where br is the buffered reader in which you've wrapped the stdout
> of the child process and it will not hang. But in python eventually
> stdout.readline() hangs. This is a real nuisance: why can't it just
> return None?

That would be quite annoying, because most of the time people want
blocking behavior.

>> 1. The subprocess has stopped producing output.
>
> Indeed, if I do this interactively, I can tell after 3 lines that I've
> gotten all there is to get right now and the fourth readline() call
> hangs. But how can I find out *programmatically* that there is no more
> input?

You can't.

>> If you are only reading its standard output, are you sure that the
>> subprocess is flushing its buffers so you can recognize it's time to
>> provide more input?
>
> The subprocess is in a purely passive position: it is an interpreter: I
> send it commands and I read the answers. The python side is in charge.

This doesn't answer if the interpreter doesn't flush its output buffer
after every line.

Ciao,
Marc 'BlackJack' Rintsch

Thomas Bellman

1/25/2008 10:33:00 PM

Olivier Lefevre <lefevrol@yahoo.com> wrote:

>> 1. The subprocess has stopped producing output.

> Indeed, if I do this interactively, I can tell after 3 lines that I've
> gotten all there is to get right now and the fourth readline() call
> hangs.

Can you really? How do you know if the program has finished or
if it is just taking a very long time to produce the next line in
its response? Unless there is some way to differentiate between
the last line and all the other lines of a response, you can't really
be sure.

> But how can I find out *programmatically* that there is no more
> input?

It is possible to check if there is something more to read at the
moment, but you can't check if the subprocess will produce more
to read in the future.

To check if there is something to read at this very moment, you
can use any of the following methods:

- select.select()
- the FIONREAD ioctl (the ioctl() function lives in the fcntl
  module, and the FIONREAD constant is in the termios module)
- set the underlying file descriptor in non-blocking mode:
      flags = fcntl.fcntl(fd, fcntl.F_GETFL)
      fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NDELAY)
  After that, reads on the pipe will raise an IOError exception
  with the errorcode EWOULDBLOCK.
- start a thread that does blocking reads from the pipe, and
  puts the chunks it reads on a queue for your main thread to
  grab.
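
A minimal sketch of the first option, assuming Python 3 on a POSIX
system (select() on pipes does not work on Windows). The helper name
read_available and the timeout value are made up for illustration. It
only answers "is there something to read right now?", not "is the
response complete?".

```python
import os
import select

def read_available(fd, timeout=0.1, chunk_size=4096):
    """Return the bytes that can be read from fd right now (possibly b'')."""
    chunks = []
    while True:
        # select() reports fd as readable if data is waiting or at EOF.
        readable, _, _ = select.select([fd], [], [], timeout)
        if not readable:
            break              # nothing to read at this moment
        data = os.read(fd, chunk_size)
        if not data:
            break              # EOF: the writer closed its end of the pipe
        chunks.append(data)
    return b"".join(chunks)
```

With a subprocess.Popen object this would be called as
read_available(child.stdout.fileno()). Reading through os.read() on the
raw descriptor avoids mixing with the file object's own buffering.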

For the last approach, you might be interested in my asyncproc
module, which does exactly that. You can download it from
<http://www.lysator.liu.se/~bellman/download/asyncp....
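
The thread-and-queue idea can also be sketched with the standard
library alone. This is not taken from asyncproc; it is an illustration
assuming Python 3 and a binary pipe (the Popen default), and the names
start_reader and drain are hypothetical.

```python
import queue
import subprocess
import sys
import threading

def start_reader(stream):
    """Pump lines from a blocking binary stream into a queue on a daemon thread."""
    lines = queue.Queue()

    def pump():
        # readline() returns b"" only at EOF, which ends the loop.
        for line in iter(stream.readline, b""):
            lines.put(line)
        lines.put(None)        # sentinel: the stream reached EOF

    threading.Thread(target=pump, daemon=True).start()
    return lines

def drain(lines, timeout=0.2):
    """Collect lines until the queue stays empty for `timeout` seconds or EOF."""
    collected = []
    while True:
        try:
            item = lines.get(timeout=timeout)
        except queue.Empty:
            break              # nothing new for a while; caller decides what next
        if item is None:
            break              # EOF sentinel from the pump thread
        collected.append(item)
    return collected
```

The main thread never blocks for longer than the timeout, but a quiet
queue still only means "nothing yet", not "the answer is finished".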

However, none of these approaches absolves you from the necessity
of knowing when one response ends. You still need to solve that
problem.

The proper way is to define a protocol between your program and
the subprocess, in which you can clearly tell when you have
reached the end of a response. Then you need to get the program
you are calling to adhere to that protocol, of course...

The SMTP protocol is a good example of how this can look. In
SMTP, each response to a command consists of a number of lines.
Each line has a three-digit response code, an "end of response"
flag, and a text message. The "end of response" flag is a space
(" ") for the last line in the response, and a dash ("-") for all
the other lines. The response to an EHLO command can look like
this:

250-sellafield Hello localhost [127.0.0.1], pleased to meet you
250-ENHANCEDSTATUSCODES
250-PIPELINING
250-EXPN
250-VERB
250-8BITMIME
250-SIZE
250-DSN
250-ETRN
250-DELIVERBY
250 HELP

Since there is a space instead of a dash after the "250" code in
the last line above, the SMTP client knows that there won't be
any more lines in response to its command.
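
The end-of-response rule can be sketched in a few lines of Python. This
handles only the line format described above (three-digit code, flag,
text); it is not a full SMTP client, and read_reply is a made-up name.

```python
def read_reply(lines):
    """Consume lines from an iterator until one marks the end of the reply.

    Each line is '<3-digit code><flag><text>': the flag is '-' on a
    continuation line and ' ' on the last line of the reply.
    """
    code, texts = None, []
    for line in lines:
        line = line.rstrip("\r\n")
        code, flag, text = line[:3], line[3:4], line[4:]
        texts.append(text)
        if flag != "-":        # a space (or a bare code) ends the reply
            break
    return code, texts
```

Because the loop stops at the flagged line, the reader never asks for a
line the server has not sent, so it cannot hang at the end of a reply.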

If you can't get the program you are calling to follow some
protocol like this, then you can only make guesses. Sometimes
you can make fairly good guesses, and sometimes it will be more
or less impossible...

--
Thomas Bellman,   Lysator Computer Club,   Linköping University,  Sweden
"Life IS pain, highness.  Anyone who tells   !  bellman @ lysator.liu.se
 differently is selling something."          !  Make Love -- Nicht Wahr!

Olivier Lefevre

1/27/2008 6:58:00 PM

> The `trheading` module is modeled after Java's threading API.

OK. Thanks for the hint. However, BufferedReader.readLine() does
not block in Java, so it is still difficult to transpose.

>> But how can I find out *programmatically* that there is no more
>> input?
>
> You can't.

How do people handle this, then? Reading from a process that will
block if you ask too much yet won't let you know how much there is
to read right now has to be some kind of FAQ.

> This doesn't answer if the interpreter doesn't flush its output buffer
> after every line.

I think it must otherwise you might get incomplete answers or no
answers at the interactive prompt and that never happens. It may
not flush its buffer after every line but it must flush them at
the end of an answer.

-- O.L.

Olivier Lefevre

1/27/2008 7:09:00 PM

>> Indeed, if I do this interactively, I can tell after 3 lines that I've
>> gotten all there is to get right now and the fourth readline() call
>> hangs.
>
> Can you really?

Yes interactively: at the command prompt, you can tell when it's over
because you know the command you just sent and whether it requires an
answer and of which kind. Also, even if there is no answer you get a
fresh prompt when the interpreter is done.

> Unless there is some way to differentiate between the last line
> and all the other lines of a response, you can't really be sure.

Yes, that has since occurred to me. I need to echo some magic string
after each command to know that I reached the end of the answer to
the previous command. In interactive mode the prompt fulfills that
role.
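
A sketch of the magic-string idea, assuming Python 3.7+. The child here
is a stand-in that simply echoes every input line and flushes, so
sending the marker as an extra line makes it come back after the
answer; with a real interpreter one would instead send whatever command
prints the marker. MARKER, start_echo_child and ask are hypothetical
names.

```python
import subprocess
import sys

MARKER = "__END_OF_ANSWER__"   # assumed never to occur in a real answer

# Stand-in for the real interpreter: echo each line back and flush.
CHILD = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    sys.stdout.write(line)\n"
    "    sys.stdout.flush()\n"
)

def start_echo_child():
    return subprocess.Popen(
        [sys.executable, "-u", "-c", CHILD],
        stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True,
    )

def ask(proc, command):
    """Send one command, then read lines until the marker comes back."""
    proc.stdin.write(command + "\n")
    proc.stdin.write(MARKER + "\n")    # echoed by the child after the answer
    proc.stdin.flush()
    answer = []
    for line in iter(proc.stdout.readline, ""):
        line = line.rstrip("\r\n")
        if line == MARKER:
            break                      # end of this answer; stop reading
        answer.append(line)
    return answer
```

readline() is only ever called while an unread marker is still on its
way, so it does not hang as long as the child flushes its output.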

> To check if there is something to read at this very moment, you
> can use any of the following methods:

Thanks for all the suggestions! That is just what I needed.

> - select.select()
> - the FIONREAD ioctl (the ioctl() function lives in the fcntl
>   module, and the FIONREAD constant is in the termios module)
> - set the underlying file descriptor in non-blocking mode:
>       flags = fcntl.fcntl(fd, fcntl.F_GETFL)
>       fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NDELAY)
>   After that, reads on the pipe will raise an IOError exception
>   with the errorcode EWOULDBLOCK.

That sounds like the simplest approach.
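
A sketch of the non-blocking approach, assuming Python 3 on a POSIX
system (the fcntl module is not available on Windows). In Python 3,
os.read() on an empty non-blocking pipe raises BlockingIOError, a
subclass of OSError. The helper names are made up for illustration.

```python
import fcntl
import os

def set_nonblocking(fd):
    """Put fd in non-blocking mode, using the fcntl calls quoted above."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)

def read_now(fd, chunk_size=4096):
    """Return available bytes, b'' at EOF, or None if a read would block."""
    try:
        return os.read(fd, chunk_size)
    except BlockingIOError:    # errno EAGAIN / EWOULDBLOCK
        return None
```

The three return values keep "no data yet" (None) distinct from "the
child closed its output" (b''), which a blocking readline() cannot do.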

> - start a thread that does blocking reads from the pipe, and
>   puts the chunks it reads on a queue for your main thread to
>   grab.

Yes but my python threading is worse than rudimentary. I will look
into the `trheading` module suggested by the other poster.

> For the last approach, you might be interested in my asyncproc
> module, which does exactly that. You can download it from
> <http://www.lysator.liu.se/~bellman/download/asyncp....

OK, I'll look into that, too.

Thanks again,

-- O.L.

Thomas Bellman

1/27/2008 8:21:00 PM

Olivier Lefevre <lefevrol@yahoo.com> wrote:

>> Can you really?

> Yes interactively: at the command prompt, you can tell when it's over
> because you know the command you just sent and whether it requires an
> answer and of which kind. Also, even if there is no answer you get a
> fresh prompt when the interpreter is done.

Then you just need to encode that knowledge into your program.

>> Unless there is some way to differentiate between the last line
>> and all the other lines of a response, you can't really be sure.

> Yes, that has since occurred to me. I need to echo some magic string
> after each command to know that I reached the end of the answer to
> the previous command. In interactive mode the prompt fulfills that
> role.

And hope that that "magic string" does not occur somewhere within
the response...

> Yes but my python threading is worse than rudimentary. I will look
> into the `trheading` module suggested by the other poster.

I think you would be better off looking into the correctly spelled
'threading' module rather than the misspelled 'trheading' module. :-)

--
Thomas Bellman,   Lysator Computer Club,   Linköping University,  Sweden
"God is real, but Jesus is an integer."      !  bellman @ lysator.liu.se
                                             !  Make Love -- Nicht Wahr!

kar1107@gmail.com

1/28/2008 2:41:00 AM

On Jan 27, 11:08 am, Olivier Lefevre <lefev...@yahoo.com> wrote:
> >> Indeed, if I do this interactively, I can tell after 3 lines that I've
> >> gotten all there is to get right now and the fourth readline() call
> >> hangs.
>
> > Can you really?
>
> Yes interactively: at the command prompt, you can tell when it's over
> because you know the command you just sent and whether it requires an
> answer and of which kind. Also, even if there is no answer you get a
> fresh prompt when the interpreter is done.

Consider the pexpect module. It solves exactly the problem you have. You
give it a regular expression for the prompt and it waits until it has
collected all the output. It basically simulates a human typing to an
interpreter.
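
The underlying idea can be sketched with the standard library alone:
accumulate output until a regular expression for the prompt matches.
This is not pexpect's API, only an illustration assuming Python 3; the
name read_until_prompt and the ">> " prompt are made up, and there is
no timeout, so it blocks for as long as the prompt does not arrive.

```python
import os
import re

def read_until_prompt(fd, prompt_re, chunk_size=1024):
    """Read from fd until prompt_re matches; return the bytes before the prompt."""
    buffer = b""
    while True:
        match = prompt_re.search(buffer)
        if match:
            return buffer[:match.start()]
        data = os.read(fd, chunk_size)     # blocks until some bytes arrive
        if not data:
            raise EOFError("output closed before a prompt was seen")
        buffer += data

# Example pattern: a prompt of ">> " at the very end of what has arrived.
PROMPT = re.compile(rb">> $")
```

Matching on raw bytes rather than lines matters here, because a prompt
is normally printed without a trailing newline and readline() would
wait for one.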

Karthik.


Marc 'BlackJack' Rintsch

1/28/2008 10:06:00 AM

On Sun, 27 Jan 2008 19:58:27 +0100, Olivier Lefevre wrote:

>>> But how can I find out *programmatically* that there is no more
>>> input?
>>
>> You can't.
>
> How do people handle this, then? Reading from a process that will
> block if you ask too much yet won't let you know how much there is
> to read right now has to be some kind of FAQ.

It's impossible to handle if the external process does not tell you
somehow if there's still data ahead or if it is finished. Then there's
only the closing of the file on the process' side that tells you the
definitive end of the data.

>> This doesn't answer if the interpreter doesn't flush its output buffer
>> after every line.
>
> I think it must otherwise you might get incomplete answers or no
> answers at the interactive prompt and that never happens. It may
> not flush its buffer after every line but it must flush them at
> the end of an answer.

The buffering behavior at the interactive prompt is very often different
from connections via pipes. If stdout of a process is connected to a
terminal, the standard C library chooses line buffering, but if it is
connected to a pipe or redirected to a file, it chooses block buffering
instead.

In such cases the `pexpect` module might be a solution.
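
A small sketch of what the child sees, assuming Python 3.7+. It only
shows that a child whose stdout is a pipe is not attached to a
terminal, which is the condition buffering decisions are based on; the
name child_sees_tty is made up.

```python
import subprocess
import sys

# The child reports whether its own stdout is a terminal.
CHILD = "import sys; print(sys.stdout.isatty())"

def child_sees_tty():
    """Run a child with stdout on a pipe and report its isatty() result."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.PIPE, text=True, check=True,
    )
    return result.stdout.strip() == "True"
```

When the child is itself a Python script, running it with "python -u"
or calling sys.stdout.flush() after each answer avoids the problem;
for other programs it depends on what the program offers.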

Ciao,
Marc 'BlackJack' Rintsch

Olivier Lefevre

1/28/2008 7:30:00 PM

> The buffering behavior at the interactive prompt is very often different
> from connections via pipes.

I hadn't thought of that. I will ask on the Octave list.

Thanks,

-- O.L.

Olivier Lefevre

1/28/2008 7:34:00 PM

>> Yes, that has since occurred to me. I need to echo some magic string
>> after each command to know that I reached the end of the answer to
>> the previous command. In interactive mode the prompt fulfills that
>> role.
>
> And hope that that "magic string" does not occur somewhere within
> the response...

In a numerical setting there are strings you can be pretty sure will
not occur, esp. alone on their own line. It might be harder if you
were dealing with arbitrary text but I'm not.

> I think you would be better off looking into the correctly spelled
> 'threading' module rather than the misspelled 'trheading' module. :-)

That was just a copy-and-paste from the original reply. It would
not have caused me to segfault not to find a module named 'trhreading':
I'm a human, not a 'puter ;-)

-- O.L.

Olivier Lefevre

1/28/2008 7:35:00 PM

pexpect looks promising, thanks.

-- O.L.

comp.lang.python
