subprocess

run

When scripting a task, it is common to need to run an external process, for instance a program that performs a particular data analysis. This external process will be a subprocess launched by our running Python process. The subprocess module includes the run function to run external processes.

Let’s imagine that we want to run the ls command (dir on Windows).

from subprocess import run

cmd = ['ls']
run(cmd)
comprehensions.html
comprehensions.qmd
comprehensions.quarto_ipynb
counter.qmd
counter.quarto_ipynb
enumerate.html
enumerate.qmd
enumerate.quarto_ipynb
everyday_python.html
everyday_python.qmd
lambda.html
lambda.qmd
lambda.quarto_ipynb
paths.html
paths.qmd
paths.quarto_ipynb
range.html
range.qmd
range.quarto_ipynb
shutil.html
shutil.qmd
shutil.quarto_ipynb
subprocess.qmd
subprocess.quarto_ipynb
typing.html
typing.qmd
typing.quarto_ipynb
zip.qmd
zip.quarto_ipynb
CompletedProcess(args=['ls'], returncode=0)

By default, run expects a list of strings, one per argument, not a single string with the whole command. For instance, imagine that our command includes a parameter.

from subprocess import run

cmd = ['ls', '-l']
run(cmd)
total 604
-rw-rw-r-- 1 jose jose 79268 ene 29 08:33 comprehensions.html
-rw-rw-r-- 1 jose jose  4361 nov  3 15:07 comprehensions.qmd
-rw-rw-r-- 1 jose jose  6545 ene 29 08:33 comprehensions.quarto_ipynb
-rw-rw-r-- 1 jose jose  3719 nov  3 15:37 counter.qmd
-rw-rw-r-- 1 jose jose  5739 ene 29 08:32 counter.quarto_ipynb
-rw-rw-r-- 1 jose jose 57828 ene 29 08:33 enumerate.html
-rw-rw-r-- 1 jose jose  1786 oct 30 14:43 enumerate.qmd
-rw-rw-r-- 1 jose jose  3137 ene 29 08:33 enumerate.quarto_ipynb
-rw-rw-r-- 1 jose jose 43157 ene 29 08:33 everyday_python.html
-rw-rw-r-- 1 jose jose   640 oct 30 09:43 everyday_python.qmd
-rw-rw-r-- 1 jose jose 55731 ene 29 08:33 lambda.html
-rw-rw-r-- 1 jose jose  2191 oct 31 11:41 lambda.qmd
-rw-rw-r-- 1 jose jose  3434 ene 29 08:33 lambda.quarto_ipynb
-rw-rw-r-- 1 jose jose 64511 ene 29 08:33 paths.html
-rw-rw-r-- 1 jose jose  4608 sep 15  2024 paths.qmd
-rw-rw-r-- 1 jose jose  6430 ene 29 08:33 paths.quarto_ipynb
-rw-rw-r-- 1 jose jose 68112 ene 29 08:33 range.html
-rw-rw-r-- 1 jose jose  1657 oct 31 10:02 range.qmd
-rw-rw-r-- 1 jose jose  3159 ene 29 08:33 range.quarto_ipynb
-rw-rw-r-- 1 jose jose 63259 ene 29 08:33 shutil.html
-rw-rw-r-- 1 jose jose  2885 nov  3 14:38 shutil.qmd
-rw-rw-r-- 1 jose jose  4383 ene 29 08:33 shutil.quarto_ipynb
-rw-rw-r-- 1 jose jose  3762 nov  3 15:48 subprocess.qmd
-rw-rw-r-- 1 jose jose  6913 ene 29 08:32 subprocess.quarto_ipynb
-rw-rw-r-- 1 jose jose 57211 ene 29 08:33 typing.html
-rw-rw-r-- 1 jose jose  2450 oct 30 10:54 typing.qmd
-rw-rw-r-- 1 jose jose  3721 ene 29 08:33 typing.quarto_ipynb
-rw-rw-r-- 1 jose jose  1356 oct 30 14:46 zip.qmd
-rw-rw-r-- 1 jose jose  2549 ene 29 08:32 zip.quarto_ipynb
CompletedProcess(args=['ls', '-l'], returncode=0)
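
If the command is only available as a single string, the shlex module from the standard library can split it into the list of strings that run expects. The following is a minimal sketch; the command line and the folder name in it are made up for illustration.

```python
import shlex

# A command written as a single string, as you would type it in a shell.
# The folder name is hypothetical.
command_line = 'ls -l "my folder"'

# shlex.split separates the arguments and respects the quotes,
# so the quoted folder name stays together as one argument.
cmd = shlex.split(command_line)
print(cmd)  # ['ls', '-l', 'my folder']
```

The resulting list could then be passed to run in the same way as the hand-written lists above.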

In any case, the run function launches the external process and waits for it to finish; only then does it return a CompletedProcess object.

from subprocess import run

cmd = ['ls', '-l']
process = run(cmd)
print(process.returncode)
total 604
-rw-rw-r-- 1 jose jose 79268 ene 29 08:33 comprehensions.html
-rw-rw-r-- 1 jose jose  4361 nov  3 15:07 comprehensions.qmd
-rw-rw-r-- 1 jose jose  6545 ene 29 08:33 comprehensions.quarto_ipynb
-rw-rw-r-- 1 jose jose  3719 nov  3 15:37 counter.qmd
-rw-rw-r-- 1 jose jose  5739 ene 29 08:32 counter.quarto_ipynb
-rw-rw-r-- 1 jose jose 57828 ene 29 08:33 enumerate.html
-rw-rw-r-- 1 jose jose  1786 oct 30 14:43 enumerate.qmd
-rw-rw-r-- 1 jose jose  3137 ene 29 08:33 enumerate.quarto_ipynb
-rw-rw-r-- 1 jose jose 43157 ene 29 08:33 everyday_python.html
-rw-rw-r-- 1 jose jose   640 oct 30 09:43 everyday_python.qmd
-rw-rw-r-- 1 jose jose 55731 ene 29 08:33 lambda.html
-rw-rw-r-- 1 jose jose  2191 oct 31 11:41 lambda.qmd
-rw-rw-r-- 1 jose jose  3434 ene 29 08:33 lambda.quarto_ipynb
-rw-rw-r-- 1 jose jose 64511 ene 29 08:33 paths.html
-rw-rw-r-- 1 jose jose  4608 sep 15  2024 paths.qmd
-rw-rw-r-- 1 jose jose  6430 ene 29 08:33 paths.quarto_ipynb
-rw-rw-r-- 1 jose jose 68112 ene 29 08:33 range.html
-rw-rw-r-- 1 jose jose  1657 oct 31 10:02 range.qmd
-rw-rw-r-- 1 jose jose  3159 ene 29 08:33 range.quarto_ipynb
-rw-rw-r-- 1 jose jose 63259 ene 29 08:33 shutil.html
-rw-rw-r-- 1 jose jose  2885 nov  3 14:38 shutil.qmd
-rw-rw-r-- 1 jose jose  4383 ene 29 08:33 shutil.quarto_ipynb
-rw-rw-r-- 1 jose jose  3762 nov  3 15:48 subprocess.qmd
-rw-rw-r-- 1 jose jose  6913 ene 29 08:32 subprocess.quarto_ipynb
-rw-rw-r-- 1 jose jose 57211 ene 29 08:33 typing.html
-rw-rw-r-- 1 jose jose  2450 oct 30 10:54 typing.qmd
-rw-rw-r-- 1 jose jose  3721 ene 29 08:33 typing.quarto_ipynb
-rw-rw-r-- 1 jose jose  1356 oct 30 14:46 zip.qmd
-rw-rw-r-- 1 jose jose  2549 ene 29 08:32 zip.quarto_ipynb
0
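
To see that run really blocks until the subprocess has finished, we can time the call. This is only a sketch: the slow command is a stand-in, a child Python interpreter (found through sys.executable) that sleeps for one second.

```python
import sys
import time
from subprocess import run

# Stand-in for a slow external program: a child Python that sleeps one second.
cmd = [sys.executable, '-c', 'import time; time.sleep(1)']

start = time.perf_counter()
process = run(cmd)  # does not return until the child has exited
elapsed = time.perf_counter() - start

print(f'run returned after {elapsed:.1f} seconds')
print(process.returncode)
```

The elapsed time should be at least about one second, because run only returns once the child process has exited.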

Launching without waiting

return code

Every process, once it has finished, returns a return code or exit status. This return code is an integer, and the convention is to return 0 when everything went fine and any other number when an error happened in the subprocess. You can access the exit code of the subprocess through the returncode attribute of the CompletedProcess object.
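
As a small illustration, the following sketch launches two child Python interpreters that exit with a status of our choosing and reads the result back from returncode. The use of sys.executable and the value 3 are arbitrary choices made for the example.

```python
import sys
from subprocess import run

# Child processes that exit with a status of our choosing (3 is arbitrary).
ok = run([sys.executable, '-c', 'import sys; sys.exit(0)'])
failed = run([sys.executable, '-c', 'import sys; sys.exit(3)'])

print(ok.returncode)      # 0
print(failed.returncode)  # 3
```

Note that run does not complain about the non-zero status; it is up to the calling code to look at returncode.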

If you want the run function to fail when the called process has any problem, you can pass check=True. In that case run raises a CalledProcessError exception whenever the return code is not 0.
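
A minimal sketch of the check argument follows. The failing command is a stand-in: a child Python interpreter that exits with status 2.

```python
import sys
from subprocess import CalledProcessError, run

# Stand-in for a failing program: a child Python that exits with status 2.
cmd = [sys.executable, '-c', 'import sys; sys.exit(2)']

failed_code = None
try:
    run(cmd, check=True)
except CalledProcessError as error:
    # The exception carries the return code of the failed process.
    failed_code = error.returncode
    print(f'The command failed with return code {failed_code}')
```

Without check=True the same call would return normally and the failure could go unnoticed.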

stdout and stderr

If you pass capture_output=True, the standard output and the standard error of the subprocess are stored in the stdout and stderr attributes of the CompletedProcess object.

from subprocess import run

cmd = ['ls', '/hello']
process = run(cmd, capture_output=True)
print(process.stdout)
print(process.stderr)
b''
b"ls: cannot access '/hello': No such file or directory\n"

Be aware that, by default, the captured output streams are bytes objects. If you want them to be strings, you have to provide an encoding.

from subprocess import run

cmd = ['ls', '/hello']
process = run(cmd, capture_output=True, encoding='utf-8')
print(process.stdout)
print(process.stderr)

ls: cannot access '/hello': No such file or directory
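
An alternative to giving an explicit encoding is the text=True argument of run, which decodes the streams using the default encoding of the system. The sketch below, again with a child Python interpreter as a stand-in command, captures the output as a string and splits it into lines.

```python
import sys
from subprocess import run

# Stand-in command: a child Python that prints two lines.
cmd = [sys.executable, '-c', 'print("first"); print("second")']

# text=True makes stdout and stderr strings instead of bytes.
process = run(cmd, capture_output=True, text=True)

lines = process.stdout.splitlines()
print(lines)  # ['first', 'second']
```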

Popen

The run function will wait for the subprocess to finish before returning. If you just want to launch the process, but not wait for it to finish, you can use the Popen class.

The use of Popen is very similar to the use of run. The main difference is that creating a Popen object returns immediately, without waiting for the subprocess to finish.

Once you have that object, you can check whether the process has already finished with the poll method, which returns None while the process is still running, or you can wait for the process to finish with the wait method.

from subprocess import Popen

cmd = ['ls']
process = Popen(cmd)
print(process.poll())
print(process.wait())
print(process.returncode)
None
comprehensions.html
comprehensions.qmd
comprehensions.quarto_ipynb
counter.qmd
counter.quarto_ipynb
enumerate.html
enumerate.qmd
enumerate.quarto_ipynb
everyday_python.html
everyday_python.qmd
lambda.html
lambda.qmd
lambda.quarto_ipynb
paths.html
paths.qmd
paths.quarto_ipynb
range.html
range.qmd
range.quarto_ipynb
shutil.html
shutil.qmd
shutil.quarto_ipynb
subprocess.qmd
subprocess.quarto_ipynb
typing.html
typing.qmd
typing.quarto_ipynb
zip.qmd
zip.quarto_ipynb
0
0
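
The following sketch shows one way to combine these pieces: the parent keeps polling while the subprocess runs and then collects its output with the communicate method. The command is a stand-in (a child Python interpreter that sleeps for half a second), and the polling interval is an arbitrary choice.

```python
import sys
import time
from subprocess import PIPE, Popen

# Stand-in command: a child Python that sleeps and then prints a message.
cmd = [sys.executable, '-c', 'import time; time.sleep(0.5); print("done")']

# Popen returns right away; stdout is captured through a pipe as text.
process = Popen(cmd, stdout=PIPE, text=True)

# poll() returns None while the subprocess is still running.
polls = 0
while process.poll() is None:
    polls += 1
    time.sleep(0.1)  # the parent is free to do other work here

# communicate() reads the remaining output and waits for the process.
stdout, _ = process.communicate()
print(stdout.strip())      # done
print(process.returncode)  # 0
```

Polling like this is only reasonable when the subprocess writes little output; for large outputs it is safer to call communicate directly, so the pipe does not fill up while the parent is waiting.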