Project Plumbing with Plumbum (Part I)

Bash scripting is hard, let’s go plumbing

Consider the following scenario:

Let’s say you’re working on a software project. Maybe it’s a web service, maybe a GUI app, whatever. Doesn’t matter. As usual, you discover there’s some tedious task that needs doing repeatedly, so you decide to automate it. Since it’s pretty much the easiest thing you can think of, you crank out a quick bash[1] script, which seems to handle things for the moment.

Later, you find some more similar tasks, so you crank out some more bash scripts. And then some more. Then you realize that you’re repeating yourself an awful lot, so you try factoring out some stuff, calling some scripts from other scripts. Eventually you realize that you have dozens of bash scripts calling each other in various combinations, scripts depending on other scripts three and four deep, parameter passing that makes your eyes melt from all the "$1"s, dogs and cats living together, mass hysteria!

But what else can you do? I mean, these scripts are basically just calling a bunch of command line tools, so we have to use a shell language to automate them, right?

Nope. Totally wrong.

Plumbum is a Python module originally written by Tomer Filiba which adopts the motto "Never write shell scripts again". You can use it to completely replace those pesky, unwieldy shell scripts with nice, clean, reusable Python.

Plumbum Basics: LocalMachine and LocalCommand

Let’s take a quick tour:

First, let’s import the local object:

from plumbum import local

Plumbum calls this a LocalMachine object, and it’s full of shelly goodness.

First of all we can use it to get LocalCommand objects, which are handy wrappers to shell commands:

>>> ls = local['ls']
>>> ls
LocalCommand(<LocalPath /bin/ls>)

We can use this LocalCommand object to run the command, either alone:

>>> ls()
u'cmd_exe_improvements.rst\nmaptest\nplumbum article.rst\nrst2wp.py\n'

or with additional arguments:

>>> ls('-a')
u'.\n..\ncmd_exe_improvements.rst\nmaptest\nplumbum article.rst\nrst2wp.py\n'

This gives us back a single string containing whatever the command sent to stdout, which we can then manipulate however we like.

We can also print the command itself, which will show us exactly what command will be run:

>>> print ls
/bin/ls

This is pretty neat, but plumbum also provides an alternative way to get these command objects using a little import magic, so we don’t need to repeat ourselves by typing "local" over and over:

>>> from plumbum.cmd import ls
>>> ls
LocalCommand(<LocalPath /bin/ls>)

This import magic looks for executables on the executable search path, so if you try to import a non-existent command, you’ll get a Python exception (which you can of course catch and handle like any other exception).

We can also bind arguments to the command without running it, so that we can pass around a command along with its arguments:

>>> ls['-a']
BoundCommand(LocalCommand(<LocalPath /bin/ls>), ('-a',))
>>> print ls['-a']
/bin/ls -a

This is handy because the various command objects in plumbum use Python’s operator overloading to let us easily combine commands with pipelines and redirection using a relatively familiar syntax:

>>> from plumbum.cmd import grep
>>> print ls['-a'] | grep['.py']
/bin/ls -a | /bin/grep .py
>>> (ls['-a'] | grep['.py'])()
u'pyscript.py\n'

LocalCommand objects can be used in several other ways, including explicitly running them in the foreground or background, getting exit status codes, and nesting commands (i.e. replicating bash’s backtick syntax).

Already Winning!

OK, so now we can run our command line tools from Python, but…so what? How does that help us, anyway? I mean, couldn’t we have just done all that in bash and done a little less typing? Well, yes we could have, but because we’ve translated our shell commands into Python expressions, we can do some pretty nifty things that would be much less pleasant to do in bash.

For example, we could bundle up a subset of our script into a function, and call it from someplace else, or call it multiple times with varying arguments. Obviously bash has functions as well, but Python functions are far more powerful and flexible than bash’s. Moreover, we can also write Python functions that operate on our commands, create libraries of commonly used commands, and do anything else that you can do with a Python object.

So even the brief amount of Plumbum’s functionality we’ve explored so far actually buys us a lot more than you might imagine.

But we’re just getting started…

Path and Environment Manipulation

In addition to generating command objects, the LocalMachine object also has a number of other handy features. It can tell you where the current Python interpreter is:

>>> local.python
LocalCommand(<LocalPath /usr/bin/python>)

search your path:

>>> local.which('ls')
<LocalPath /bin/ls>

look up (and set) environment variables:

>>> local.env['SHELL']
'/bin/bash'
>>> local.env['MY_ENV_VAR'] = 'blark'

and give you your current working directory:

>>> local.cwd
<LocalWorkdir /home/kevin>

Now a couple of these return LocalPath objects (LocalWorkdir is a subclass of LocalPath) and it’s worth taking a closer look at these, as they provide a pretty nice object-oriented interface for manipulating file paths.

Firstly, you can create paths and get some basic information about them:

>>> p = local.path('/tmp/slartibartfast')
>>> p.exists()
False
>>> p.mkdir()
>>> p.exists()
True
>>> p.isfile()
False
>>> p.isdir()
True

Plumbum’s path objects also overload the Python division operators to allow joining paths:

>>> p / 'notimportant'
<LocalPath /tmp/slartibartfast/notimportant>

and globbing for paths:

>>> local.path('/tmp') // '*fast'
[<LocalPath /tmp/slartibartfast>]

And of course there’s tons of filepath-related stuff, like stat-ing files, opening and closing files, and other functionality analogous to things you might find in the standard library’s os.path module.

The LocalWorkdir subclass also adds the ability to be used as a context manager, so you can easily switch into and out of a directory using Python’s with statement. It looks something like this:

>>> with local.cwd('/tmp'):
...     print ls()
...
tmpfile0001
tmpfile0002
tmpfile0003
.
.
.

Bonus: Works on Windows!

If, like me, you spend a lot of time bouncing back and forth between something Unix-y and MS Windows, you’ll be glad to know that pretty much everything in plumbum works just as well on Windows (though obviously you may not have the same shell commands on Windows). This makes it entirely possible to write project automation scripts that work in both environments with only a little more work. Which is awesome.

Stay Tuned

We’ve only scratched the surface of what plumbum can do, but this post is already running a bit long. Next time, I’ll talk some more about some utility functions that Plumbum provides, as well as how to use Plumbum to script command line tools across multiple hosts.

[1] I’m going to say “bash” a lot throughout this article, but whenever I do, feel free to replace it with a reference to your shell language of choice. It’s just that “bash” is a lot shorter to type than “your shell language of choice”, not to mention less encumbering to read. I mean, typing that every time would be like wasting a bunch of time writing footnotes, which I’m obviously not going to do.
