Plays Well with Others – Lessons in Reusable Tooling

I’ve used many languages for my scripting needs, but my favorite — believe it or not — is probably Bash. Bash may not have any types, and scripting with Bash may be fraught with pitfalls, but sometimes the problem at hand is solved most succinctly and elegantly with small focused programs that compose well.

Bash’s composition comes in the form of piping one program’s output into the next program. No program needs to know where its input comes from, just like Lego blocks don’t care what block they stack on. This is an incredibly useful approach. For instance, take a look at how easy it is to process the contents of the clipboard:

pbpaste | base64 | pbcopy

We just base64-encoded whatever was in the clipboard (at least, on Mac OS X; Linux has similar programs under different names, and on Windows you may need to write your own versions of pbpaste and pbcopy). You can use pbpaste and pbcopy and the rest of the Bash toolbox to do whatever processing of clipboard text you have in mind, and thanks to pipes, you can do it very concisely.
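
On Linux, one way to get the same two commands is to wrap a clipboard tool in shell functions. The sketch below assumes xclip is installed and an X11 session is running; neither is mentioned above, so treat it as one possible setup rather than the canonical one.

```shell
#!/usr/bin/env bash
# Assumption: xclip is installed and an X11 session is running.
# Define pbpaste/pbcopy only where the real commands are missing,
# so the same pipelines work on both macOS and Linux.
if ! command -v pbpaste >/dev/null 2>&1; then
  # Print the clipboard contents to standard output.
  pbpaste() { xclip -selection clipboard -o; }
fi
if ! command -v pbcopy >/dev/null 2>&1; then
  # Replace the clipboard contents with standard input.
  pbcopy() { xclip -selection clipboard -i; }
fi
```

With these in your shell startup file, pbpaste | base64 | pbcopy should behave the same way on either system.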

Getting data, updating it, and submitting it again is a common pattern throughout software development. In many cases, someone’s written a command line program that makes it easy. One of my favorite tools is jq, which gives the command line incredibly handy powers of JSON processing. Combined with curl and a database’s HTTP interface (if it has one, as CouchDB does), you can pull down data, modify it, and update the record all in one command.

curl -s http://localhost:5984/db/document_id \
  | jq 'del(.field)' \
  | curl -s http://localhost:5984/db/document_id -X PUT -H 'Content-Type: application/json' -d @-

If the processing you need to do is too sophisticated to pull off neatly with Bash’s toolbox, then by all means, write your own program to do it, but remember that your program will be more useful if you follow Bash’s example of composability. Your program doesn’t need to hit the database’s HTTP interface directly. It just has to read from standard input. And it doesn’t need to post the data back via HTTP either. It can just print it to standard output. Let the user decide where the data comes from and where it goes.
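
As a minimal sketch of that shape, here is a filter written as a Bash function. The name trim_trailing and the job it does (stripping trailing whitespace) are made up for illustration; the point is only that it touches nothing but standard input and standard output.

```shell
#!/usr/bin/env bash
# trim_trailing: hypothetical filter used only for illustration.
# It knows nothing about files, clipboards, or HTTP; it reads lines
# from standard input and writes the cleaned lines to standard output.
trim_trailing() {
  local line
  # The `|| [ -n "$line" ]` clause keeps a final line that has no newline.
  while IFS= read -r line || [ -n "$line" ]; do
    # ${line##*[![:space:]]} is the run of trailing whitespace;
    # the outer ${line%...} removes exactly that suffix.
    printf '%s\n' "${line%"${line##*[![:space:]]}"}"
  done
}
```

Because the function makes no assumptions about either end of the pipe, pbpaste | trim_trailing | pbcopy and trim_trailing < notes.txt > clean.txt both work without any changes to it (notes.txt and clean.txt are placeholder names).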

For instance, on one project, I frequently needed to modify a file and serve it up over HTTP for another computer to read. I could have scripted the whole process, from reading the file to making the modifications to serving it, but sed could read the file and make the changes, so all I needed was a program that served standard input over HTTP. My needs were so simple I didn’t even bother searching to see if such a program already existed. I just wrote it. I called the program inserve, and though the specific need is long gone, inserve is still part of my trusted toolbox.

inserve is useful, strange as it seems to say, because it does so little. The more a program does, the more specialized it is and the less often you’ll find a use for it. But generality is only useful in tandem with the ability to communicate with other programs. I don’t always need to read a file, change it, and serve it over HTTP. Sometimes I need to curl a URL and serve the contents over HTTP, or find a file with some specific text and serve it up, or serve the contents of my clipboard or my current Vim buffer. inserve’s agnosticism about its input means I don’t need new tools for these tasks.
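
To make the idea concrete, here is a rough sketch of the core of such a tool. It is not inserve itself. The function name http_response is hypothetical, it hard-codes a text/plain content type, and it only formats a response; something else has to do the listening.

```shell
#!/usr/bin/env bash
# http_response: hypothetical helper, not the author's inserve.
# Reads a body from standard input and writes a complete HTTP/1.1
# response (status line, headers, blank line, body) to standard output.
http_response() {
  local body length
  body=$(mktemp) || return 1
  # Buffer stdin so the byte count is known before the headers go out.
  cat > "$body"
  # Arithmetic expansion strips the padding some wc implementations add.
  length=$(( $(wc -c < "$body") ))
  printf 'HTTP/1.1 200 OK\r\n'
  printf 'Content-Type: text/plain\r\n'   # assumption: plain-text payloads
  printf 'Content-Length: %d\r\n' "$length"
  printf 'Connection: close\r\n\r\n'
  cat "$body"
  rm -f "$body"
}
```

Paired with netcat, this can answer a single request: sed 's/foo/bar/' file.txt | http_response | nc -l 8080. That invocation assumes a BSD-style netcat; traditional netcat wants nc -l -p 8080 instead. Swapping sed for curl, grep, or pbpaste changes what gets served without touching the function.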

Programs must be able to work together, to talk and listen to one another, whether over pipes, file descriptors, HTTP, or plain old TCP. A fancy Lego block that doesn’t stack may be a neat toy, but it’s not a Lego.