I would like to direct a Python script's subprocess' stdout and stderr into the same file. What I don't know is how to make the lines from the two sources distinguishable. (For example, prefix the lines from stderr with an exclamation mark.)
In my particular case there is no need for live monitoring of the subprocess; the executing Python script can wait for the end of its execution.
6 Answers
#1
32
tsk = subprocess.Popen(args,stdout=subprocess.PIPE,stderr=subprocess.STDOUT)
subprocess.STDOUT is a special flag that tells subprocess to route all stderr output to stdout, thus combining your two streams.

By the way, select doesn't have poll() on Windows. subprocess only uses the file handle number, and doesn't call your file output object's write method.
To capture the output, do something like:
logfile = open(logfilename, 'w')
while tsk.poll() is None:
    line = tsk.stdout.readline()
    logfile.write(line)
logfile.write(tsk.stdout.read())  # drain anything left after the process exits
logfile.close()
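If you still need the two streams to be distinguishable in the file, as the question asks, here is a minimal sketch (assuming, per the question, that it is fine to wait for the child to finish) that captures the pipes separately and prefixes stderr lines with an exclamation mark. The child command is made up for illustration:

```python
import subprocess
import sys

# Hypothetical child that writes one line to each stream.
child = [sys.executable, '-c',
         "import sys; print('out line'); sys.stderr.write('err line\\n')"]

proc = subprocess.Popen(child, stdout=subprocess.PIPE,
                        stderr=subprocess.PIPE, universal_newlines=True)
out, err = proc.communicate()  # wait for the child to exit

with open('combined.log', 'w') as logfile:
    for line in out.splitlines():
        logfile.write(line + '\n')         # stdout lines verbatim
    for line in err.splitlines():
        logfile.write('! ' + line + '\n')  # stderr lines prefixed
```

Note that communicate() loses the relative ordering of the two streams, which matches the "no live monitoring" constraint; preserving order requires the polling approaches shown in other answers.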
#2
11
I found myself having to tackle this problem recently, and it took a while to get something I felt worked correctly in most cases, so here it is! (It also has the nice side effect of processing the output via a Python logger, which I've noticed is another common question here on *.)
Here is the code:
import sys
import logging
import subprocess
from threading import Thread

logging.basicConfig(stream=sys.stdout, level=logging.INFO)
logging.addLevelName(logging.INFO+2, 'STDERR')
logging.addLevelName(logging.INFO+1, 'STDOUT')
logger = logging.getLogger('root')

pobj = subprocess.Popen(['python', '-c', 'print 42;bargle'],
                        stdout=subprocess.PIPE, stderr=subprocess.PIPE)

def logstream(stream, loggercb):
    while True:
        out = stream.readline()
        if out:
            loggercb(out.rstrip())
        else:
            break

stdout_thread = Thread(target=logstream,
                       args=(pobj.stdout, lambda s: logger.log(logging.INFO+1, s)))
stderr_thread = Thread(target=logstream,
                       args=(pobj.stderr, lambda s: logger.log(logging.INFO+2, s)))
stdout_thread.start()
stderr_thread.start()
stdout_thread.join()  # wait for both streams to reach EOF
stderr_thread.join()
Here is the output:
STDOUT:root:42
STDERR:root:Traceback (most recent call last):
STDERR:root: File "<string>", line 1, in <module>
STDERR:root:NameError: name 'bargle' is not defined
You can replace the subprocess call with whatever you want; I just chose to run python with a command that I knew would print to both stdout and stderr. The key bit is reading stderr and stdout each in a separate thread. Otherwise you may be blocked reading one while there is data ready to be read on the other.
#3
9
If you want to interleave the output to get roughly the same order you would see running the process interactively, then you need to do what the shell does: poll stdout/stderr and write in the order they become ready.

Here's some code that does something along the lines of what you want - in this case sending stdout/stderr to a logger's info/error streams.
tsk = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

poll = select.poll()
poll.register(tsk.stdout, select.POLLIN | select.POLLHUP)
poll.register(tsk.stderr, select.POLLIN | select.POLLHUP)
pollc = 2

events = poll.poll()
while pollc > 0 and len(events) > 0:
    for event in events:
        (rfd, event) = event
        if event & select.POLLIN:
            if rfd == tsk.stdout.fileno():
                line = tsk.stdout.readline()
                if len(line) > 0:
                    logger.info(line[:-1])
            if rfd == tsk.stderr.fileno():
                line = tsk.stderr.readline()
                if len(line) > 0:
                    logger.error(line[:-1])
        if event & select.POLLHUP:
            poll.unregister(rfd)
            pollc = pollc - 1
    if pollc > 0:
        events = poll.poll()
tsk.wait()
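Note that select.poll() is Unix-only (and, as answer #1 points out, unavailable on Windows). On Python 3 the selectors module wraps the best available mechanism portably; here is a rough sketch of the same interleaving idea, with a made-up child command for illustration:

```python
import selectors
import subprocess
import sys

# Hypothetical child printing one line to each stream.
proc = subprocess.Popen(
    [sys.executable, '-c',
     "import sys; print('to stdout'); sys.stderr.write('to stderr\\n')"],
    stdout=subprocess.PIPE, stderr=subprocess.PIPE)

sel = selectors.DefaultSelector()
sel.register(proc.stdout, selectors.EVENT_READ, 'OUT')
sel.register(proc.stderr, selectors.EVENT_READ, 'ERR')

captured = []
while sel.get_map():  # loop until both streams hit EOF
    for key, _ in sel.select():
        line = key.fileobj.readline()
        if line:
            captured.append('%s: %s' % (key.data, line.decode().rstrip()))
        else:  # EOF on this stream; stop watching it
            sel.unregister(key.fileobj)
proc.wait()
```

The data stored at registration time ('OUT'/'ERR') stands in for the logger.info/logger.error callbacks of the original.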
#4
2
At the moment none of the other answers handle buffering on the child subprocess' side if the subprocess is not a Python script that accepts the -u flag. See "Q: Why not just use a pipe (popen())?" in the pexpect documentation.
To simulate the -u flag for some C stdio-based (FILE*) programs you could try stdbuf.
If you ignore this then your output won't be properly interleaved and might look like:
stderr
stderr
...large block of stdout including parts that are printed before stderr...
You could try it with the following client program; notice the difference with/without the -u flag (['stdbuf', '-o', 'L', 'child_program'] also fixes the output):
#!/usr/bin/env python
from __future__ import print_function
import random
import sys
import time
from datetime import datetime

def tprint(msg, file=sys.stdout):
    time.sleep(.1*random.random())
    print("%s %s" % (datetime.utcnow().strftime('%S.%f'), msg), file=file)

tprint("stdout1 before stderr")
tprint("stdout2 before stderr")
for x in range(5):
    tprint('stderr%d' % x, file=sys.stderr)
tprint("stdout3 after stderr")
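On the parent side, wrapping the child command in stdbuf (GNU coreutils, so Linux only) looks roughly like this; the child here is a stand-in for a real C stdio-based program, and '-oL'/'-eL' force line buffering of the child's stdout/stderr:

```python
import subprocess
import sys

# Stand-in child; replace with your real C stdio-based program.
child = [sys.executable, '-c', "print('hello from child')"]

# stdbuf uses an LD_PRELOAD shim to change the child's stdio buffering
# before it starts; it has no effect on programs that bypass stdio.
cmd = ['stdbuf', '-oL', '-eL'] + child
out = subprocess.check_output(cmd, universal_newlines=True)
```
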
On Linux you could use pty to get the same behavior as when the subprocess runs interactively; for example, here's a modified version of @T.Rojan's answer:
import logging, os, select, subprocess, sys, pty

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

master_fd, slave_fd = pty.openpty()
p = subprocess.Popen(args, stdout=slave_fd, stderr=subprocess.PIPE, close_fds=True)
with os.fdopen(master_fd) as stdout:
    poll = select.poll()
    poll.register(stdout, select.POLLIN)
    poll.register(p.stderr, select.POLLIN | select.POLLHUP)

    def cleanup(_done=[]):
        if _done:
            return
        _done.append(1)
        poll.unregister(p.stderr)
        p.stderr.close()
        poll.unregister(stdout)
        assert p.poll() is not None

    read_write = {stdout.fileno(): (stdout.readline, logger.info),
                  p.stderr.fileno(): (p.stderr.readline, logger.error)}
    while True:
        events = poll.poll(40)  # poll with a small timeout to avoid both
                                # blocking forever and a busy loop
        if not events and p.poll() is not None:
            # no IO events and the subprocess exited
            cleanup()
            break

        for fd, event in events:
            if event & select.POLLIN:  # there is something to read
                read, write = read_write[fd]
                line = read()
                if line:
                    write(line.rstrip())
            elif event & select.POLLHUP:  # free resources if stderr hung up
                cleanup()
            else:  # something unexpected happened
                assert 0
sys.exit(p.wait())  # return child's exit code
It assumes that stderr is always unbuffered/line-buffered and stdout is line-buffered in an interactive mode. Only full lines are read. The program might block if there are non-terminated lines in the output.
#5
1
I suggest you write your own handlers, something like this (not tested; I hope you catch the idea):
class my_buffer(object):
    def __init__(self, fileobject, prefix):
        self._fileobject = fileobject
        self.prefix = prefix

    def write(self, text):
        return self._fileobject.write('%s %s' % (self.prefix, text))

    # delegate other methods to fileobject if necessary

log_file = open('log.log', 'w')
my_out = my_buffer(log_file, 'OK:')
my_err = my_buffer(log_file, '!!!ERROR:')

p = subprocess.Popen(command, stdout=my_out, stderr=my_err, shell=True)
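Note the caveat from answer #1, though: Popen hands the child a real file descriptor, so a pure-Python object like my_buffer will not have its write() method called when passed as stdout=/stderr= directly. A sketch that keeps the prefix idea but reads from ordinary pipes instead (the child command is made up for illustration):

```python
import subprocess
import sys

class my_buffer(object):
    def __init__(self, fileobject, prefix):
        self._fileobject = fileobject
        self.prefix = prefix

    def write(self, text):
        return self._fileobject.write('%s %s' % (self.prefix, text))

# Hypothetical child writing one line to each stream.
proc = subprocess.Popen(
    [sys.executable, '-c',
     "import sys; print('fine'); sys.stderr.write('boom\\n')"],
    stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True)
out, err = proc.communicate()

with open('log.log', 'w') as log_file:
    my_out = my_buffer(log_file, 'OK:')
    my_err = my_buffer(log_file, '!!!ERROR:')
    for line in out.splitlines(True):   # keepends=True preserves newlines
        my_out.write(line)
    for line in err.splitlines(True):
        my_err.write(line)
```
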
#6
0
You may write the stdout/err to a file after the command execution. In the example below I use pickling, so I am sure I will be able to read it back without any particular parsing to differentiate between stdout/err, and at some point I could also dump the exit code and the command itself.
import subprocess
import cPickle

command = 'ls -altrh'
outfile = 'log.errout'

pipe = subprocess.Popen(command, stdout=subprocess.PIPE,
                        stderr=subprocess.PIPE, shell=True)
stdout, stderr = pipe.communicate()

f = open(outfile, 'w')
cPickle.dump({'out': stdout, 'err': stderr}, f)
f.close()
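On Python 3, cPickle became pickle and the captured streams come back as bytes; a sketch of writing the record (with the exit code and command added, as suggested) and reading it back:

```python
import pickle
import subprocess

command = 'ls -altrh'
outfile = 'log.errout'

pipe = subprocess.Popen(command, stdout=subprocess.PIPE,
                        stderr=subprocess.PIPE, shell=True)
stdout, stderr = pipe.communicate()

record = {'out': stdout, 'err': stderr,
          'exitcode': pipe.returncode, 'command': command}
with open(outfile, 'wb') as f:
    pickle.dump(record, f)

# Reading it back: stdout and stderr stay cleanly separated, no parsing needed.
with open(outfile, 'rb') as f:
    loaded = pickle.load(f)
```
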