Performance of subprocess.check_output vs subprocess.call

Question

According to the docs, both subprocess.call and subprocess.check_output are convenience wrappers around subprocess.Popen. One minor difference is that check_output raises a CalledProcessError if the subprocess returns a non-zero exit status. The greater difference is spelled out in the documentation for check_output (my emphasis):

The full function signature is largely the same as that of the Popen constructor, except that stdout is not permitted as it is used internally. All other supplied arguments are passed directly through to the Popen constructor.

So how is stdout “used internally”? Let’s compare call and check_output:

call

def call(*popenargs, **kwargs):
    return Popen(*popenargs, **kwargs).wait()

check_output

def check_output(*popenargs, **kwargs):
    if 'stdout' in kwargs:
        raise ValueError('stdout argument not allowed, it will be overridden.')
    process = Popen(stdout=PIPE, *popenargs, **kwargs)
    output, unused_err = process.communicate()
    retcode = process.poll()
    if retcode:
        cmd = kwargs.get("args")
        if cmd is None:
            cmd = popenargs[0]
        raise CalledProcessError(retcode, cmd, output=output)
    return output
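
To make the behavioural difference concrete, here is a minimal sketch, assuming Python 3 and using the running interpreter as a stand-in child process. The child command and the variable names are illustrative only; call's stdout is sent to os.devnull here purely to keep the demo quiet.

```python
import os
import subprocess
import sys

# Illustrative child: prints one line, then exits with status 3.
cmd = [sys.executable, "-c", "import sys; print('hello'); sys.exit(3)"]

# call: nothing is captured, only the exit status comes back.
with open(os.devnull, "w") as devnull:
    status = subprocess.call(cmd, stdout=devnull)

# check_output: stdout is captured through a pipe, and a non-zero
# exit status is turned into CalledProcessError.
try:
    subprocess.check_output(cmd)
    failed_status, captured = None, None
except subprocess.CalledProcessError as exc:
    failed_status = exc.returncode
    captured = exc.output

# check_output refuses a caller-supplied stdout, since it sets its own.
try:
    subprocess.check_output(cmd, stdout=subprocess.PIPE)
    stdout_error = None
except ValueError as exc:
    stdout_error = str(exc)

print(status, failed_status, captured, stdout_error)
```

The captured output is available on the exception even when the command fails, which is the one thing call cannot give you.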

communicate

Now we have to look at Popen.communicate as well. Doing so shows that, even for a single pipe, communicate does several things that take more time than just returning Popen().wait(), as call does.

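For one thing, check_output always passes stdout=PIPE, so communicate has to read the subprocess's output whether you set shell=True or not. call does not: it leaves stdout alone and lets the subprocess write straight to the parent's stdout. (shell=True is a separate matter; it is a security risk when the command contains untrusted input, as Python describes here.)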

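Secondly, in the case of check_output(cmd, shell=True) (just one pipe), whatever your subprocess sends to stdout has to be read by Python until the pipe reaches end-of-file, and only then does Popen go on to wait for the subprocess itself to terminate. When more than one pipe is in use, the reading moves into the _communicate method, which on Windows runs a reader thread per pipe that Popen must join (wait on) before it waits on the subprocess.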

Plus, more trivially, _communicate collects stdout as a list of chunks which must then be joined into a string.
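
Whatever mechanism does the reading, the capture path adds roughly the following steps on top of a bare wait(). This is a hand-written approximation for the single-pipe case, assuming Python 3; it is not the library's own code, and the child command is illustrative.

```python
import subprocess
import sys

# Illustrative child that writes one line to stdout.
cmd = [sys.executable, "-c", "print('hello')"]

# What call does, in essence: start the child and wait for it.
plain_retcode = subprocess.Popen(cmd, stdout=subprocess.DEVNULL).wait()

# What capturing adds: create a pipe, drain it to end-of-file,
# close it, and only then wait for the child to exit.
proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
output = proc.stdout.read()   # blocks until the child closes its stdout
proc.stdout.close()
retcode = proc.wait()

print(plain_retcode, retcode, output)
```

The read cannot finish before the child closes its end of the pipe, so the parent stays busy in Python for the whole lifetime of the child's output.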

In short, even with minimal arguments, check_output spends a lot more time in Python code than call does.
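
One way to check this on your own machine is a small timeit comparison. The sketch below assumes Python 3; the child command, the run count, and the helper names run_call and run_check_output are all illustrative. call's output is sent to os.devnull so the terminal is not flooded, which means neither variant prints anything. No particular numbers are claimed here: process start-up cost is included in both timings, so the gap depends on the platform and on how much the child writes.

```python
import os
import subprocess
import sys
import timeit

# Illustrative child that writes about 1 KB to stdout.
cmd = [sys.executable, "-c", "print('x' * 1000)"]
RUNS = 5  # kept small; each run starts a new interpreter


def run_call():
    # No capture: the child's output is discarded by the OS.
    with open(os.devnull, "w") as devnull:
        subprocess.call(cmd, stdout=devnull)


def run_check_output():
    # Capture: the child's output is read back into Python.
    subprocess.check_output(cmd)


call_seconds = timeit.timeit(run_call, number=RUNS)
check_output_seconds = timeit.timeit(run_check_output, number=RUNS)

print("call:         %.4f s for %d runs" % (call_seconds, RUNS))
print("check_output: %.4f s for %d runs" % (check_output_seconds, RUNS))
```

Raising the amount of output the child produces is the simplest way to make the cost of the capture path more visible.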
