Our computers run subprocesses all the time. In fact, just by reading this article, you’re running many processes, such as a network manager or the web browser itself. Almost every action we perform on a computer involves starting a process, and most of those are started by another process. That remains true even if we’re writing a simple “hello world” script in Python.
The concept of a subprocess may seem obscure even if you’ve been learning programming for a while. This article takes a deep look at the concept of the subprocess and at how to use the subprocess module from the Python standard library.
By the end of this tutorial, you will understand the concept of a subprocess, have learned the basics of the Python subprocess library, and have practiced your Python skills with useful examples.
Let’s get into it.
The concept of subprocess
Broadly speaking, a subprocess is a computer process created by another process. We can think of the processes on a computer as a tree, in which each parent process has child processes running beneath it. This can sound confusing at first, so let’s look at it visually.
There are several ways to visualize the processes running on our computer. For example, on UNIX-like systems (Linux and macOS) we have htop, an interactive process viewer.
Tree mode is the most useful view for inspecting running subprocesses. We can activate it with F5. If we look closely at the command column, we can see the structure of the processes running on our computer. It all starts with /sbin/init, the first process started at boot and the ancestor of every other process. From that point, we can see other processes begin, such as xfce4-screenshooter and xfce4-terminal (which leads to even more subprocesses).
On Windows, we have the well-known Task Manager, which is useful for killing crashed programs on our machine.
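The parent/child relationship can also be observed from Python itself. The following is a minimal sketch, not part of the original article: it starts a second Python interpreter as a child and asks it for the ID of its parent. It uses subprocess.run, which is explained later in this article, and it assumes Python 3.7 or newer.

```python
import os
import subprocess
import sys

# PID of the process running this script (the parent).
parent_pid = os.getpid()

# Start a child Python interpreter that prints the PID of its own parent.
child = subprocess.run(
    [sys.executable, "-c", "import os; print(os.getppid())"],
    capture_output=True,
    text=True,
)
reported_parent = int(child.stdout.strip())

print("PID of this script:", parent_pid)
print("Parent PID seen by the child:", reported_parent)
```

On a typical Linux or macOS setup, both numbers should be the same, which is exactly the parent-to-child link that htop draws in tree mode.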
Now the concept should be clear. Let’s see how we can work with subprocesses in Python.
Subprocesses in Python
A subprocess in Python is a task that a Python script delegates to the operating system (OS). The subprocess library allows us to execute and manage subprocesses directly from Python. That involves working with the standard input (stdin), the standard output (stdout), and return codes.
We don’t have to install it with pip, since it’s part of the Python standard library. Therefore, we can start using subprocesses in Python just by importing the module with import subprocess.
To check the Python version you currently have, run python --version in a terminal. If the version you get is 2.x, use the python3 command instead (for example, python3 --version).
Continuing with the topic, the main idea behind the subprocess library is to let us interact with the OS by executing any command we want, directly from Python. That means we can do whatever the OS allows us to do (as long as you don’t remove your root filesystem 😅). Let’s see how to use it by creating a simple script that lists the files of the current directory.
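Before writing that script, it can help to confirm from inside Python that the interpreter is recent enough. This is a small sketch rather than something from the original article; the variable names are only illustrative.

```python
import subprocess
import sys

# sys.version_info holds the version of the running interpreter.
major, minor = sys.version_info[:2]
print(f"Running Python {major}.{minor}")

# subprocess.run only exists on Python 3.5 and newer.
has_run = hasattr(subprocess, "run")
print("subprocess.run available:", has_run)
```

If the last line prints False, the interpreter is older than 3.5 and the examples in this article will not work as written.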
First subprocess application
First, let’s create a file named list_dir.py. This is the file where we’ll experiment with listing files.
Now let’s open that file, import the subprocess module, and call subprocess.run("ls"). The run function runs the command we pass as an argument. It was introduced in Python 3.5 as a friendly shortcut to subprocess.Popen. The subprocess.run function runs a command and waits for it to finish, in contrast to Popen, where we have the option to call communicate later.
As for the output, ls is a UNIX command that lists the files of the directory you’re in. Therefore, if you run this script, you’ll get a list of the files present in the current directory.
This may seem too simple, and you’re right. You’ll want to take full advantage of the power the shell offers, so let’s learn how to pass arguments with subprocess. For example, to also list hidden files (those that start with a dot) along with the metadata of each file, we can call subprocess.run("ls -la", shell=True).
Here we’re running the command as a string and using the shell argument. That means a shell is invoked at the start of our subprocess, and the command is interpreted directly by that shell. However, shell=True has many downsides, and the worst is the security risk: if any part of the string comes from untrusted input, it can lead to shell injection. You can read about this in the official documentation.
The best way to pass commands to the run function is to use a list where lst[0] is the command to call (ls in this case) and the remaining items are the arguments of that command. If we do so, the call becomes subprocess.run(["ls", "-la"]).
If we want to store the standard output of a subprocess in a variable, we can set the capture_output argument to True. To access the output of the process, we use the stdout attribute of the object that run returns. In this case, we want the output as a string instead of bytes, which we can get by setting the text argument to True. Both arguments were added in Python 3.7.
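Putting the pieces above together, a version of list_dir.py could look like the following sketch. It assumes a UNIX-like system where ls is available and Python 3.7 or newer for capture_output and text.

```python
import subprocess

# List form: the first item is the command, the rest are its arguments.
# capture_output=True collects stdout and stderr instead of printing them,
# and text=True decodes them from bytes to str.
result = subprocess.run(["ls", "-la"], capture_output=True, text=True)

# A return code of 0 means the command succeeded.
print("Return code:", result.returncode)

# The captured listing is now a regular Python string.
print(result.stdout)
```

Because the command is a list and no shell is involved, each item reaches ls as a single argument, no matter which characters it contains.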
Perfect, now that we know the basics of the subprocess library, it’s time to move on to some usage examples.
Usage Examples of subprocess in Python
In this section, we’re going to review some practical uses of the subprocess library. You can check all of them in this GitHub repository.
Program checker
One of the main uses of this library is performing simple OS operations, for instance a simple script that checks whether a program is installed. On Linux, we can do this with the which command.
Note: In UNIX, when a command succeeds its status code is 0. Any other value means something went wrong during the execution.
Since we’re not using the shell=True argument, we can take user input more safely. We can also check that the input is a valid program name with a regex pattern. In this case, we get the program names from the user and use a regular expression that ensures each name only includes letters and digits. We then check the existence of each program with a for loop.
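A minimal sketch of such a checker is shown below. The function names are hypothetical, and a fixed list stands in for the user input so the example runs without prompting; in a real script the names could come from input(). It assumes the which command is available.

```python
import re
import subprocess

# Accept only letters and digits, as described above.
NAME_PATTERN = re.compile(r"[A-Za-z0-9]+")


def is_valid_name(program):
    """Return True if the whole string consists of letters and digits."""
    return NAME_PATTERN.fullmatch(program) is not None


def is_installed(program):
    """Return True if `which` exits with status 0 for the program."""
    result = subprocess.run(["which", program], capture_output=True, text=True)
    return result.returncode == 0


def check_programs(programs):
    """Print the status of every program name in the given list."""
    for program in programs:
        if not is_valid_name(program):
            print(f"{program!r} is not a valid program name, skipping")
        elif is_installed(program):
            print(f"{program} is installed")
        else:
            print(f"{program} is not installed")


if __name__ == "__main__":
    # Sample names standing in for user input.
    check_programs(["ls", "python3", "rm -rf /", "nonexistentprogram123"])
```

Note that the letters-and-digits rule is deliberately strict, so names such as g++ or docker-compose would be rejected by this sketch.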
Simple Grep in Python
Your friend Tom has a list of patterns in a text file and another large file in which he wants to count the matches for each pattern. He would spend hours running the grep command for every pattern. Fortunately, you know how to solve this problem with Python, and you’ll help him accomplish this task in a few seconds.
In the script, we define two variables, which are the names of the files we want to work with. Then we open the file that contains all the patterns and iterate over them. Next, we call a subprocess that runs grep with the -c flag (which counts the matching lines) and decide what to report with a conditional. If you run this script (remember you can download the text files from the GitHub repo), you’ll see the count for each pattern.
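The approach can be sketched as follows. The file names and their contents are made up for illustration and written to a temporary directory so the example is self-contained; Tom’s real files would replace them. The sketch assumes grep is installed.

```python
import subprocess
import tempfile
from pathlib import Path


def count_matching_lines(pattern, target_file):
    """Return how many lines of target_file match pattern, using grep -c."""
    result = subprocess.run(
        # "--" ends option parsing, so a pattern starting with "-" is not
        # mistaken for a flag.
        ["grep", "-c", "--", pattern, str(target_file)],
        capture_output=True,
        text=True,
    )
    # grep exits with 0 on a match, 1 on no match and 2 or more on errors.
    if result.returncode > 1:
        raise RuntimeError(result.stderr.strip())
    return int(result.stdout.strip())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical sample files.
        patterns_file = Path(tmp) / "patterns.txt"
        target_file = Path(tmp) / "text.txt"
        patterns_file.write_text("fox\ndog\ncat\n")
        target_file.write_text("the quick brown fox\nthe lazy dog\nfox and dog\n")

        for pattern in patterns_file.read_text().splitlines():
            count = count_matching_lines(pattern, target_file)
            if count:
                print(f"'{pattern}' matches {count} line(s)")
            else:
                print(f"'{pattern}' was not found")
```

Keep in mind that grep -c counts matching lines, not individual occurrences, so a line containing a pattern twice is counted once.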
Set up a virtualenv with subprocess
One of the coolest things you can do with Python is process automation. This kind of script can save you hours per week. For example, we’re going to create a setup script that creates a virtual environment for us and looks for a requirements.txt file in the current directory to install all the dependencies.
In this case, we’re using multiple processes and parsing the data we need in our Python script. We’re also using the pathlib library, which lets us figure out whether the requirements.txt file exists. If you run the Python file, you’ll get some useful messages about what’s happening with the OS. Note that we see the output of the installation process because we’re not redirecting the standard output to a variable.
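One way to write such a setup script is sketched below. The function names and the .venv directory name are assumptions made for this example, and it relies on the standard venv module being usable on your system.

```python
import subprocess
import sys
from pathlib import Path


def venv_python(venv_dir):
    """Return the path of the interpreter inside the virtual environment."""
    if sys.platform == "win32":
        return Path(venv_dir) / "Scripts" / "python.exe"
    return Path(venv_dir) / "bin" / "python"


def setup_project(project_dir, venv_name=".venv"):
    """Create a virtual environment and install requirements.txt if present."""
    project_dir = Path(project_dir)
    venv_dir = project_dir / venv_name

    print(f"Creating virtual environment in {venv_dir}")
    # check=True raises CalledProcessError if the command fails.
    subprocess.run([sys.executable, "-m", "venv", str(venv_dir)], check=True)

    requirements = project_dir / "requirements.txt"
    if requirements.exists():
        print("Installing dependencies from requirements.txt")
        # stdout is not captured, so pip's progress is shown in the terminal.
        subprocess.run(
            [str(venv_python(venv_dir)), "-m", "pip", "install", "-r", str(requirements)],
            check=True,
        )
    else:
        print("No requirements.txt found, skipping installation")
    return venv_dir
```

Calling setup_project(Path.cwd()) from the project folder would then create the environment there. Running pip through the environment’s own interpreter avoids having to activate the environment first.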
Run another Programming Language
We can run programs written in other languages from Python and get their output. This is possible because subprocesses interact directly with the operating system. For instance, let’s create a hello world program in C++ and another in Java, saved as helloworld.cpp and helloworld.java. To run them, you’ll need to have C++ and Java compilers installed.
I know this seems like a lot of code compared to a simple Python one-liner, but it’s just for testing purposes. We’re going to create a Python script that runs all the C++ and Java files in a directory. To do this, we first get a list of files by extension, and glob lets us do that easily. After that, we can use subprocesses to execute each type of file. One little trick is to use the string method strip to clean up the output and keep only what we need.
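A possible shape for that script is sketched below. It assumes that g++, javac and java are on the PATH, that each Java file defines a class named after the file (for example, class helloworld in helloworld.java) with no package declaration, and that the system is UNIX-like. The function names are hypothetical.

```python
import glob
import os
import subprocess
import tempfile
from pathlib import Path


def run_cpp(source, build_dir):
    """Compile a C++ file with g++, run the binary and return its output."""
    binary = Path(build_dir) / Path(source).stem
    subprocess.run(["g++", str(source), "-o", str(binary)], check=True)
    result = subprocess.run([str(binary)], capture_output=True, text=True, check=True)
    # strip removes the trailing newline printed by the program.
    return result.stdout.strip()


def run_java(source, build_dir):
    """Compile a Java file with javac, run its class and return the output."""
    subprocess.run(["javac", "-d", str(build_dir), str(source)], check=True)
    result = subprocess.run(
        ["java", "-cp", str(build_dir), Path(source).stem],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def run_all(directory):
    """Run every .cpp and .java file in directory, keyed by file name."""
    outputs = {}
    # Compiled files go to a temporary directory that is removed afterwards.
    with tempfile.TemporaryDirectory() as build_dir:
        for source in sorted(glob.glob(os.path.join(str(directory), "*.cpp"))):
            outputs[os.path.basename(source)] = run_cpp(source, build_dir)
        for source in sorted(glob.glob(os.path.join(str(directory), "*.java"))):
            outputs[os.path.basename(source)] = run_java(source, build_dir)
    return outputs
```

Calling run_all(".") in a folder containing the two hello world files would return a dictionary that maps each file name to what the program printed.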
Open external programs
We can run other programs just by calling the location of their binaries through a subprocess. Let’s try it out by opening Brave, my preferred web browser. This will open a browser instance, or just another tab if the browser is already running.
As with any other program that accepts flags, we can pass them to produce the desired behavior.
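Both ideas can be sketched with a small helper. The helper name is hypothetical, and it uses subprocess.Popen instead of run so that the script does not wait for the program to close; that is a choice made for this example, not necessarily what the original script did.

```python
import shutil
import subprocess


def open_program(binary, *flags):
    """Start a program with optional flags; return None if it is not found."""
    # shutil.which looks up the full path of the binary on the PATH.
    path = shutil.which(binary)
    if path is None:
        print(f"{binary} was not found on the PATH")
        return None
    # Popen returns immediately, so the script keeps running while the
    # program stays open.
    return subprocess.Popen([path, *flags])
```

With this helper, open_program("brave") would start the browser and open_program("brave", "--incognito") would pass a flag to it. Both the binary name and the flag are assumptions here: on some systems the binary is called brave-browser, and the flags that are available depend on the program.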
To sum up
A subprocess is a computer process created by another process. We can check the processes our computer is running with tools like htop and the Task Manager. Python has its own library to work with subprocesses, and its run function gives us a simple interface to create and manage them. We can create many kinds of applications with them because we’re interacting directly with the OS. Finally, remember that the best way to learn is to create something you would like to use.