subprocess: From Python to Go
One of the things that makes the UNIX toolbox so powerful is the ability
to run a command and capture its output. If you now think, “why, of course
you can!”, realize that this seemingly trivial operation requires complex
features in the very core of the operating system.
On top of that, the application layer needs to offer this functionality,
preferably wrapped in a pleasant-to-use interface.
While Golang has a pretty decent API for running external commands,
I particularly love Python’s subprocess module; it’s incredibly
versatile, and the interface is easy to remember (especially after having
used it countless times).
Really, capturing command output in Go is as simple as:
output, err := exec.Command(cmd).Output()
And yet, I don’t want to code it this way. It’s too rigid, too set in
stone, for my taste. What I want is a wrapper function that offers some
flexibility in how to capture output, e.g. capture stdout while redirecting
stderr to /dev/null; much like how Python’s subprocess.run() works.
When you study the Python documentation, you’ll find that subprocess.run()
takes over a dozen arguments (I’m not exaggerating…!) to tweak its
behavior. Python is flexible in that functions may have default
arguments, which can be omitted when calling the function.
We don’t have that luxury in Go, so we won’t port each and every
feature in detail. That said, we will end up with a very capable Go
function that mimics the Python API closely enough.
package subprocess
func Run(argv []string, stdout, stderr DupFd) (Result, error)
The Run() function takes an argv slice as the command to run.
The stdout/stderr DupFd are special constants that define how we wish
to redirect stdout and stderr. The Result is a structure that contains
the captured output and exit code. Per Go idiom, an error is returned
when something goes wrong.
type DupFd int
const (
    _ DupFd = -iota
    DevNull
    Pipe
    Stdout
    Stderr
)
The DupFd constants are just integers. Giving them an explicit type makes
the code more robust. Note the use of negative iota; the idea is that you
can still pass a (type-converted) open file descriptor, redirecting output
to a file. (Not shown in this post, for brevity.)
type Result struct {
    Stdout   []byte
    Stderr   []byte
    ExitCode int
}
The returned Result contains captured output and the exit code.
You may prefer having strings rather than bytes. A handy way of converting
bytes to lines of text is:
// Note: strings.Lines keeps the trailing newline on each line.
func Lines(b []byte) []string {
    return slices.Collect(strings.Lines(string(b)))
}
Now let’s get to the meat of implementing Run().
It’s possible to write this function as one long stretch of code, but
for neatness (and re-usability) we break it down into three steps:
// pseudo code
func Run(argv, stdout, stderr) {
    proc := Popen(argv, stdout, stderr)
    output := proc.Communicate()
    exitcode := proc.Wait()
    return Result{output, exitcode}
}
By now it should be clear that the trivial task of capturing command output is actually quite involved.
The name “Popen” comes from the C library function popen(), which
creates a pipe connected to a newly spawned child process.
We do not call the C library function; rather, our version imitates
the Python subprocess.Popen() function.
func Popen(argv []string, stdout, stderr DupFd) (Process, error)
The returned Process structure holds some data on the started child process,
and the read ends of any pipes connected to the child’s stdout/stderr.
import (
    "io"
    "os/exec"
)

type Process struct {
    Cmd        *exec.Cmd
    StdoutPipe io.ReadCloser
    StderrPipe io.ReadCloser
}
With this in place we can implement Popen() as follows.
First we create (but do not yet start) the command:
cmd := exec.Command(argv[0], argv[1:]...)
Again, the arguments stdout, stderr DupFd are constants that determine
how we wish to redirect.
var pipefd io.ReadCloser
var err error
switch stdout {
case DevNull:
    // redirect to /dev/null
    f, err := os.OpenFile(os.DevNull, os.O_WRONLY, 0o600)
    if err != nil {
        return Process{}, err
    }
    cmd.Stdout = f
case Pipe:
    // capture output via pipe
    pipefd, err = cmd.StdoutPipe()
    if err != nil {
        return Process{}, err
    }
default:
    // do not redirect
    cmd.Stdout = os.Stdout
}
Next, start the process and return the Process structure:
proc := Process{
    Cmd:        cmd,
    StdoutPipe: pipefd,
    StderrPipe: pipefd_stderr, // from the (elided) stderr switch
}
// start the process!
if err := proc.Cmd.Start(); err != nil {
    return proc, err
}
return proc, nil
The production version of Popen() includes some things that I left out
here for brevity:
- a similar switch as above, but for redirecting stderr
- redirecting stderr to stdout, and vice versa
- redirecting stdin for input
- redirecting to an open file descriptor for writing to an on-disk file
Now that we have a running process with some pipes attached, not much
happens by itself. We need to read from the pipes to obtain the data.
That’s easy enough, but there is a caveat: the process may write lots of
data to both stdout and stderr. If we go about this naively, it’s quite
possible that internal buffers fill up and the whole thing deadlocks.
Moreover, if we throw input on stdin into the mix as well, it’s easy to
see we have to juggle multiple I/O streams concurrently.
If there is one thing Go is good for, it’s concurrent programming.
Mimicking the Python API again, the function that does the I/O is named
Communicate().
func (self *Process) Communicate() ([]byte, []byte, error)
Communicate reads from the pipes and returns the output as bytes. Not shown here: writing input data to the input pipe.
// captured output data
var b_stdout []byte = []byte{}
var b_stderr []byte = []byte{}
// separate error variables
var err_stdout, err_stderr error
// a waitgroup to synchronize concurrent goroutines
var wg sync.WaitGroup

// note that `self` is a Process struct
if self.StdoutPipe != nil {
    wg.Go(func() {
        b_stdout, err_stdout = io.ReadAll(self.StdoutPipe)
    })
}
// ... left out: same thing for stderr

// wait for all goroutines to finish
wg.Wait()
// ... error handling goes here

// and return the data
return b_stdout, b_stderr, nil
The funny thing about goroutines is that they do not necessarily run
multi-threaded. (Since these are blocking I/O calls, they will end up on
separate threads here.) It is absolutely possible to write this function
in a single-threaded fashion, using I/O multiplexing. Goroutines, however,
make it so much easier.
The final step is cleaning up the process and collecting the exit code.
func (self *Process) Wait() (int, error)
The implementation of Wait() is straightforward, but we need to
take care of the error handling. Go's Command.Wait() method returns
an error of type ExitError when the process exits with a non-zero
exit status. I don’t like that; in my book a process is free to return
any exit code, so I don’t treat it as a hard error.
err := self.Cmd.Wait()
if err != nil {
    // test whether error is an ExitError
    exiterr, ok := err.(*exec.ExitError)
    if ok {
        exitcode := exiterr.ExitCode()
        // non-zero exit is not an error
        return exitcode, nil
    }
    // some other error occurred
    return -1, err
}
// exit code 0
return 0, nil
Note the syntax of the Go type assertion. The error type is actually
a Go interface, implying that there may be many kinds of error.
The type assertion uses (runtime) introspection to verify that this
specific error is an ExitError.
Given all this, it’s easy to put together a fully functional
subprocess.Run() function. One last gotcha: the process must always be
waited on, otherwise we risk creating a zombie process. Either use Golang’s
defer keyword, or avoid returning too early on error.
This post explains the code from beginning to end. It’s unusual to get
everything right the first time around; writing code is an iterative
process where you refine and refactor until you arrive where you want to
be. There was no room in this post to describe those iterations in detail.
Also note the features unique to Go, such as goroutines, that make this
code tick.