Asp Forum - Re: polling for output from a subprocess module

Thomas Bellman

2/4/2008 10:50:00 AM

jakub.hrozek@gmail.com wrote:

> try:
>     test = Popen(test_path,
>                  stdout=PIPE,
>                  stderr=PIPE,
>                  close_fds=True,
>                  env=test_environ)

>     while test.poll() == None:
>         ready = select.select([test.stderr], [], [])

>         if test.stderr in ready[0]:
>             t_stderr_new = test.stderr.readlines()
>             if t_stderr_new != []:
>                 print "STDERR:", "\n".join(t_stderr_new)
>                 t_stderr.extend(t_stderr_new)
[...]
> The problem is that all the output from the subprocess seems to be
> coming at once. Do I need to take a different approach?

The readlines() method will read until it reaches end of file (or
an error occurs), not just what is available at the moment. You
can see that for yourself by running:

$ python -c 'import sys; print sys.stdin.readlines()'

The call to sys.stdin.readlines() will not return until you press
Ctrl-D (or, I think, Ctrl-Z if you are using MS-Windows).

However, the os.read() function will only read what is currently
available. Note, though, that os.read() does not do line-based
I/O, so depending on the timing you can get incomplete lines, or
multiple lines in one read.
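
A minimal sketch of that behaviour, using a plain os.pipe() instead of a real subprocess and Python 3 syntax (both are assumptions of this example; the code in this thread is Python 2). It only illustrates that os.read() returns whatever is in the pipe at the moment, whether that is half a line or several lines:

```python
import os

# Create a pipe: r is the read end, w is the write end.
r, w = os.pipe()

# Write an incomplete line and leave the write end open (no end of file).
os.write(w, b"partial line, no newline yet")

# os.read() returns what is available right now, up to 1024 bytes.
# It does not wait for end of file the way readlines() does.
chunk = os.read(r, 1024)
print(repr(chunk))

# Two lines written before the next read arrive together in one chunk.
os.write(w, b"first\n")
os.write(w, b"second\n")
chunk2 = os.read(r, 1024)
print(repr(chunk2))

os.close(w)
os.close(r)
```

Note that os.read() still blocks if nothing at all is available, which is why it is usually paired with select().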

--
Thomas Bellman, Lysator Computer Club, Linköping University, Sweden
"Adde parvum parvo magnus acervus erit" ! bellman @ lysator.liu.se
(From The Mythical Man-Month) ! Make Love -- Nicht Wahr!

5 Answers

Christian Heimes

2/4/2008 2:53:00 PM

Thomas Bellman wrote:
> The readlines() method will read until it reaches end of file (or
> an error occurs), not just what is available at the moment. You
> can see that for yourself by running:

Bad idea ;)

readlines() on a subprocess Popen instance will block when you PIPE more
than one stream and the buffer of the other stream is full.

You can find some insight at http://bugs.python.org.... I
discussed the matter with Guido a while ago.

Christian

jakub.hrozek

2/4/2008 8:51:00 PM

On Feb 4, 11:49, Thomas Bellman <bell...@lysator.liu.se> wrote:
> jakub.hro...@gmail.com wrote:
> > try:
> >     test = Popen(test_path,
> >                  stdout=PIPE,
> >                  stderr=PIPE,
> >                  close_fds=True,
> >                  env=test_environ)
> >     while test.poll() == None:
> >         ready = select.select([test.stderr], [], [])
> >         if test.stderr in ready[0]:
> >             t_stderr_new = test.stderr.readlines()
> >             if t_stderr_new != []:
> >                 print "STDERR:", "\n".join(t_stderr_new)
> >                 t_stderr.extend(t_stderr_new)
> [...]
> > The problem is that all the output from the subprocess seems to be
> > coming at once. Do I need to take a different approach?
>
> The readlines() method will read until it reaches end of file (or
> an error occurs), not just what is available at the moment. You
> can see that for yourself by running:
>
> $ python -c 'import sys; print sys.stdin.readlines()'
>
> The call to sys.stdin.readlines() will not return until you press
> Ctrl-D (or, I think, Ctrl-Z if you are using MS-Windows).
>
> However, the os.read() function will only read what is currently
> available. Note, though, that os.read() does not do line-based
> I/O, so depending on the timing you can get incomplete lines, or
> multiple lines in one read.
>

Right, I didn't realize that. I'll try the os.read() method. Reading
what's available (as opposed to whole lines) shouldn't be an issue in
this specific case. Thanks for the pointer!

Thomas Bellman

2/5/2008 6:14:00 PM

Christian Heimes <lists@cheimes.de> writes:

> Thomas Bellman wrote:
>> The readlines() method will read until it reaches end of file (or
>> an error occurs), not just what is available at the moment. You
>> can see that for yourself by running:

> Bad idea ;)

Why is it a bad idea to see how the readlines() method behaves?

> readlines() on a subprocess Popen instance will block when you PIPE more
> than one stream and the buffer of the other stream is full.

> You can find some insight at http://bugs.python.org.... I
> discussed the matter with Guido a while ago.

Umm... Yes, you are correct that the code in the original post
also has a deadlock problem. I missed that. But saying that it
is the readlines() method that is blocking is a bit misleading,
IMHO. Both processes will be blocking, in a deadly embrace.
It's a problem that has been known since the concept of
inter-process communication was invented, and isn't specific to
the readlines() method in Python.

But the OP *also* has the problem that I described in my reply.
Even if he only piped one of the output streams from his
subprocess, he would only receive its output when the subprocess
finished (if it ever does), not as it is produced.

(To those that don't understand why the OP's code risks a deadly
embrace: if a process (A) writes significant amounts of data to
both its standard output and standard error, but the process that
holds the other end of those streams (process B) only reads data
from one of those streams, process A will after a while fill the
operating system's buffers for the other stream. When that
happens, the OS will block process A from running until process B
reads data from that stream too, freeing up buffer space. If
process B never does that, then process A will never run again.

The OP must therefore do a select() on both the standard output
and standard error of his subprocess, and use os.read() to
retrieve the output from both streams to free up buffer space in
the pipes.)
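
The approach described above might be sketched roughly as follows. This is an illustration under stated assumptions, not the OP's actual code: it uses Python 3 syntax (the thread's code is Python 2), it relies on select() working on pipes, which holds on Unix-like systems but not on MS-Windows, and the helper name drain() and the child command are made up for the example.

```python
import os
import select
import subprocess
import sys


def drain(proc, chunk_size=4096):
    """Read stdout and stderr of proc as data arrives; return (out, err)."""
    out_fd = proc.stdout.fileno()
    err_fd = proc.stderr.fileno()
    chunks = {out_fd: [], err_fd: []}
    open_fds = [out_fd, err_fd]
    while open_fds:
        # Block until at least one pipe has data or has reached end of file.
        ready, _, _ = select.select(open_fds, [], [])
        for fd in ready:
            data = os.read(fd, chunk_size)
            if data:
                chunks[fd].append(data)
            else:
                # An empty read means the child closed this pipe.
                open_fds.remove(fd)
    proc.wait()
    proc.stdout.close()
    proc.stderr.close()
    return b"".join(chunks[out_fd]), b"".join(chunks[err_fd])


# Hypothetical child that writes more than a typical pipe buffer
# to each of its two output streams.
child_code = (
    "import sys; "
    "sys.stdout.write('o' * 200000); "
    "sys.stderr.write('e' * 200000)"
)
proc = subprocess.Popen([sys.executable, "-c", child_code],
                        stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out, err = drain(proc)
print(len(out), len(err))
```

Because both descriptors are passed to select() and both are drained, neither pipe buffer can stay full, so the child is never blocked indefinitely. The loop ends on end of file from both pipes rather than on poll(), so output written just before the child exits is not lost.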

--
Thomas Bellman, Lysator Computer Club, Linköping University, Sweden
"We don't understand the software, and ! bellman @ lysator.liu.se
sometimes we don't understand the hardware, !
but we can *see* the blinking lights!" ! Make Love -- Nicht Wahr!

Ivo

2/5/2008 8:55:00 PM

Thomas Bellman wrote:
> jakub.hrozek@gmail.com wrote:
>
>> try:
>>     test = Popen(test_path,
>>                  stdout=PIPE,
>>                  stderr=PIPE,
>>                  close_fds=True,
>>                  env=test_environ)
>
>>     while test.poll() == None:
>>         ready = select.select([test.stderr], [], [])
>
>>         if test.stderr in ready[0]:
>>             t_stderr_new = test.stderr.readlines()
>>             if t_stderr_new != []:
>>                 print "STDERR:", "\n".join(t_stderr_new)
>>                 t_stderr.extend(t_stderr_new)
> [...]
>> The problem is that all the output from the subprocess seems to be
>> coming at once. Do I need to take a different approach?
>
> The readlines() method will read until it reaches end of file (or
> an error occurs), not just what is available at the moment. You
> can see that for yourself by running:
>
> $ python -c 'import sys; print sys.stdin.readlines()'
>
> The call to sys.stdin.readlines() will not return until you press
> Ctrl-D (or, I think, Ctrl-Z if you are using MS-Windows).
>
> However, the os.read() function will only read what is currently
> available. Note, though, that os.read() does not do line-based
> I/O, so depending on the timing you can get incomplete lines, or
> multiple lines in one read.
>
>
Be careful to specify how much you want to read at a time;
otherwise it can be that you keep on reading.

Specify read(1024) or somesuch.

In case of my PPCEncoder I recompiled the mencoder subprocess to deliver
me lines that end with \n.

If anyone can tell me how to read a continuous stream, then I am really
interested.

cya

Thomas Bellman

2/6/2008 7:19:00 AM

Ivo <noreply@ivonet.nl> wrote:

> Thomas Bellman wrote:

>> However, the os.read() function will only read what is currently
>> available. Note, though, that os.read() does not do line-based
>> I/O, so depending on the timing you can get incomplete lines, or
>> multiple lines in one read.
>>
>>
> Be careful to specify how much you want to read at a time;
> otherwise it can be that you keep on reading.

> Specify read(1024) or somesuch.

Well, of course you need to specify how much you want to read.
Otherwise os.read() throws an exception:

>>> import sys, os
>>> os.read(sys.stdin.fileno())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: read() takes exactly 2 arguments (1 given)

> In case of my PPCEncoder I recompiled the mencoder subprocess to deliver
> me lines that end with \n.

> If anyone can tell me how to read a continuous stream, then I am really
> interested.

I have never had any problem when using the os.read() function,
as long as I understand the effects of output buffering in the
subprocess. The file.read() method is a quite different animal.
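
Since os.read() hands back arbitrary chunks rather than lines, one way to get whole lines out of a continuous stream is to keep the unfinished tail between reads. The following is only a sketch in Python 3 syntax; the class name LineAssembler is made up for this example and is not part of any library.

```python
class LineAssembler:
    """Turn arbitrary chunks of bytes into complete lines."""

    def __init__(self):
        self._pending = b""  # bytes received after the last newline

    def feed(self, chunk):
        """Add a chunk; return the complete lines it finished (no newline)."""
        data = self._pending + chunk
        # The last element is whatever follows the final newline
        # (possibly empty); keep it until more data arrives.
        *lines, self._pending = data.split(b"\n")
        return lines

    def flush(self):
        """Return the unfinished tail, for use at end of file."""
        rest, self._pending = self._pending, b""
        return rest


assembler = LineAssembler()
for piece in (b"hel", b"lo\nwor", b"ld\nfoo\n"):
    for line in assembler.feed(piece):
        print(line)
```

Each chunk returned by os.read() would be passed to feed(), and flush() called once the read returns an empty string at end of file. This only helps if the subprocess actually emits its output promptly; if the subprocess buffers its own output, nothing arrives until its buffer is flushed.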

(And then there's the problem of getting mplayer/mencoder to
output any *useful* information, but that is out of the scope of
this newsgroup. :-)

--
Thomas Bellman, Lysator Computer Club, Linköping University, Sweden
"God is real, but Jesus is an integer." ! bellman @ lysator.liu.se
! Make Love -- Nicht Wahr!

comp.lang.python
