Emacs fulfills the UNIX Philosophy

Part 3: Seems like Functional Programming (FP)

This is part 3 of a 6-part series of articles defining what the UNIX philosophy is, what Emacs is, and discussing whether Emacs fulfills the UNIX philosophy.
0. Introduction
1. Emacs is an app platform
2. What is the UNIX philosophy
3. Seems like Functional Programming
4. Lisp does FP better than UNIX shell programming
5. The parallel histories of UNIX and Lisp
6. Response to common criticisms

In part 2 of this series, I tried to define more precisely what the UNIX Philosophy really is, and I quoted many of the original authors of UNIX and the UNIX Philosophy.

In this article, we get to the crux of my argument: that the UNIX Philosophy is really all about functional programming (FP), or really, that the UNIX Philosophy is an incomplete or misguided formulation of the principles of FP. If you really want to follow the principle of using programs as tools that do one thing and do it well, using an FP environment like Emacs, which provides a convenient interface for executing Lisp functions, is generally the better solution.

UNIX programming is trying to be FP

Doug McIlroy, at least in my book, deserves the credit for pipes. He thought like a mathematician and I think he had this connection right from the start. I think of the Unix command line as a prototypical functional language.

— Alfred Aho (co-author of the AWK language), stated in an interview, published in Masterminds of Programming: Conversations with the Creators of Major Programming Languages, 2009.

Since the UNIX Philosophy is so concerned with the design and use of programs, let's consider what a program really is. When people talk about programs in the context of the UNIX Philosophy, they usually mean some fundamental unit of executable code in the operating system that can perform some transformation on data, and that can be composed into shell pipelines. There is also an emphasis on making the tools easy to use in an interactive programming environment, so as to best take advantage of the interactive nature of the command line.

A program is a function

In a UNIX Programming Environment, the fundamental unit of code that is a program is conceptually equivalent to a function in an FP language. The definition of a program need not be restricted to code that runs in its own process; that would be a somewhat meaningless constraint to apply. The most important properties of these units of code (these functions) are that they be easy to reason about (do just one thing), be somehow composable, and be easy to use in an interactive environment, such as a REPL.

The term functional programming encompasses a large variety of concepts, and it is even more difficult to find consensus on the definition and principles of FP than on the definition of the UNIX Philosophy. But since this is a discussion about UNIX, Emacs, and Lisp, I'll borrow from the work of Paul Graham (of Y Combinator fame), from his 1995 book ANSI Common Lisp, in which he personally observes that:

... in Lisp the edit-compile-test cycle is so short that programming is real-time.

Bigger abstractions and an interactive environment can change the way organizations develop software. The phrase rapid prototyping describes a kind of programming that began with Lisp: in Lisp, you can often write a prototype in less time than it would take to write the spec for one.

...

When you program in a functional style, bugs can only have a local effect. When you use a very abstract language, some bugs (e.g. dangling pointers) are no longer possible, and what remain are easy to find, because your programs are so much shorter. And when you have an interactive environment, you can correct bugs instantly, instead of enduring a long cycle of editing, compiling, and testing.

— Paul Graham, ANSI Common Lisp, sections 1.2-1.3.

On its face, this may seem not to have much to do with the UNIX Philosophy. But I think if you consider some of the practical consequences that emerge from the properties of FP, you can begin to see the relationship to the UNIX Philosophy:

  1. Functions should be simple, sometimes described as orthogonal, or general. It should be easy to reason about a function's behavior. Functions should perform some minimal transformation on the input.

  2. Functions should be composable, through the use of higher order functions. It should be possible to craft more complex, more specific transformations on data by composing simpler functions.

  3. It should be easy to experiment with function composition in an interactive environment. This is not so much a principle of FP as a natural consequence of it, especially in Lisp. FP languages almost always provide a printable representation of the data going in and out of each function, a format readable by both humans and machines that is well suited to interactive programming. This is because the printing and parsing of these representations are themselves functions that transform data to strings, or strings to data. Lisp languages use S-expressions.

Of course, one could list several more distinguishing properties of FP than just the above three. Just to name a few: purity, referential transparency, pattern matching, polymorphism, and equational reasoning. But as long as we can agree that the three above points I mentioned are indeed useful and distinguishing properties of FP, it is easier to see the parallels between FP and the UNIX Philosophy.

As a more concrete example: in FP languages, UNIX-like pipelines are expressible with higher-order functions. Usually this is the function composition operator, which serves the same purpose as the shell pipe operator. In Haskell, the monadic bind operator (>>=) plays a comparable role for computations in a monad.

Function composition in a Bourne Shell

I do not intend to argue that the Bourne Shell family of languages are functional programming languages. Alfred Aho (quoted above) merely called the UNIX command line a prototypical functional language. In this section I just want to draw attention to some of the shell scripting techniques that mimic functional programming, and to how Bourne shell scripting makes function composition possible.

Pipes as function composition

Shell pipes are probably the most obvious form of function composition in the UNIX Programming Environment, because the output of one program becomes the input of another, which is similar to the mathematical definition of function composition. For example:

# Here, the output of 'cat' will become the input of 'wc'
cat *.txt | wc -l;
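
The same idea can be sketched with shell functions standing in for programs. In the sketch below, first_word, upcase, and compose are hypothetical helpers made up for illustration, not standard commands. Under that assumption, compose chains its arguments together with pipes, roughly the way a composition operator chains functions:

```shell
#!/bin/sh
# Two small filters: each reads standard input and writes standard output.
first_word() { cut -d ' ' -f 1; }
upcase()     { tr '[:lower:]' '[:upper:]'; }

# compose F G H ... behaves like the pipeline "F | G | H ...".
# (Hypothetical helper, written only to illustrate composition.)
compose() {
    if [ "$#" -eq 0 ]; then
        cat                      # no stages left: the identity filter
    else
        stage="$1"
        shift
        "$stage" | compose "$@"  # run the first stage, pipe into the rest
    fi
}

# Equivalent to: printf 'hello world\n' | first_word | upcase
printf 'hello world\n' | compose first_word upcase   # prints HELLO
```

With no arguments, compose falls back to cat, which plays the role of the identity function.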

Continuations

And there are yet other ways to compose programs besides pipes, for example Continuation Passing Style (CPS). This involves a partial function application passed as an argument to another function. An example of this in UNIX and Linux would be the find program, which can take an -exec argument, for example:

find . -type f -name '*.txt' \
    -exec grep -e 'gr[ea]y' -inHR '{}' +

Here, the find program searches for *.txt files and runs grep with the files it found as arguments. Grep then searches those files for occurrences of the word grey or gray. In this example (grep -e 'gr[ea]y' -inHR '{}' +) is a kind of continuation because grep may not be run at all if no text files are found, or it may run more than once if there are too many text files to fit on one command line. (Terminating the -exec argument with \; instead of + runs grep once for every file found.) It is up to the find program to decide when or if to call the grep function. The arguments to grep (-e 'gr[ea]y' -inHR) are partially applied, and the final arguments to grep (the files to search) are substituted by find in place of the string {}. So this is also an example of an anaphoric function call, in which the arguments to grep include a pre-determined free variable (in this case {}) that is bound when the continuation is evaluated.
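
To see that find really decides when, and how often, the command given to -exec is called, here is a small sketch that counts the invocations. The helper names calls_per_file and calls_batched are hypothetical, echo stands in for grep, and the sketch assumes the widely available (but non-POSIX) mktemp -d to create a scratch directory:

```shell
#!/bin/sh
# Count how many times find calls the "continuation" (here just echo).
# $1 is the directory to search, $2 is the file name pattern.
# Each call to echo prints exactly one line, so wc -l counts the calls.
calls_per_file() {
    # '\;' ends the command: echo is run once for every matching file.
    find "$1" -type f -name "$2" -exec echo called '{}' \; | wc -l
}
calls_batched() {
    # '+' ends the command: matching files are batched into as few
    # invocations of echo as possible.
    find "$1" -type f -name "$2" -exec echo called '{}' + | wc -l
}

demo_dir=$(mktemp -d)
: > "$demo_dir/a.txt"
: > "$demo_dir/b.txt"

calls_per_file "$demo_dir" '*.txt'   # two calls, one per file
calls_batched  "$demo_dir" '*.txt'   # one call that receives both files
calls_per_file "$demo_dir" '*.md'    # no matches, so echo never runs

rm -rf "$demo_dir"
```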

Strict and lazy evaluation

Command substitution, which is the $(...) built-in syntax, and its lazy counterpart, process substitution, which is the <(...) syntax, are roughly similar to the FP concepts of applicative-order evaluation and normal-order evaluation, respectively. Not all shell languages in the Bourne family provide both features: command substitution is part of the POSIX shell specification, but process substitution is an extension found in Bash (and in Ksh and Zsh). Here is a simple example of the command substitution feature in action:

grep '\<keyword\>' -nH $(find . -name '*.txt'; );

Here the command substitution $(find . -name '*.txt'; ) runs find to completion and passes its output as a list of arguments to the grep command. Every file in any subdirectory whose name matches the pattern *.txt is then searched by grep for the pattern '\<keyword\>'.
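
How the collected output turns into an argument list can be checked with a tiny sketch. The helper count_args is hypothetical, and printf stands in for find so that the example does not depend on the files in the current directory:

```shell
#!/bin/sh
# Hypothetical helper: report how many arguments were received.
count_args() { echo "$#"; }

# Unquoted: the collected output is split on whitespace into arguments.
count_args $(printf 'a.txt\nb.txt\nc.txt\n')     # prints 3

# Quoted: the whole output arrives as a single argument.
count_args "$(printf 'a.txt\nb.txt\nc.txt\n')"   # prints 1

# Caveat: a file name containing a space is split into two arguments.
count_args $(printf 'my notes.txt\n')            # prints 2
```

The last case is worth keeping in mind: the unquoted form only behaves like a list of file names when none of those names contain whitespace.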

The lazy form of this expression is <(...), which has one major difference: rather than passing the output of the bracketed command as a string, a temporary FIFO file is created and the path to this file is passed as an argument to the command. (On systems that provide /dev/fd, Bash passes a /dev/fd path connected to an anonymous pipe instead, which behaves the same way.) A FIFO is a kind of pipe, so when a command's output is redirected to a FIFO file, the reader can consume that output one line at a time. The command need not run to completion, so in this sense it is lazy.

paste <(find . -name '*.txt'; ) <(find . -name '*.html'; ) | \
    head -n 10;

The paste command does not read input lazily (one line at a time), but we could create such a command easily with awk, so let's assume paste reads input files lazily. Here, our paste command takes the output of 2 different find commands and merges them into 2 columns of output (perhaps associating .txt files with similarly named .html files). From the point of view of the paste command, it receives file paths to 2 temporary files, each file being a FIFO. It can read each of these files one line at a time, which in effect accepts one line of output from each find command at a time. Since output is piped to the head -n 10 command, only 10 lines of output are printed before the command pipeline terminates. If paste is indeed reading the FIFO files lazily, neither find command will run to completion: they will only find the first 10 files before the command pipeline terminates.
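
The laziness is easiest to see with a producer that can never run to completion. The following is a minimal sketch, assuming Bash and the standard yes and head utilities; the function name first_lines is hypothetical. The yes command prints the same line forever, yet the whole command still terminates, because head stops reading after the requested number of lines:

```shell
#!/usr/bin/env bash
# Requires Bash (or another shell that supports process substitution).
# 'yes' prints the same line forever, so it can never run to completion.
first_lines() {
    # head stops reading after $1 lines; the writer is terminated once
    # it tries to write to the pipe that head has closed.
    head -n "$1" <(yes 'line')
}

first_lines 3
# By contrast, a strict command substitution such as
#   echo $(yes 'line') | head -n 3
# would never finish, because the shell waits for all of the output of
# 'yes' before it even starts echo.
```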

Conclusions

In this article I tried to explain the similarities between FP and the UNIX philosophy. I argued that the UNIX philosophy is an attempt at formulating the principles of FP, but I also suggested that the UNIX philosophy really falls short of actual FP. I also gave several examples of how certain concepts in FP can be expressed in Bourne shell programming.

In the next article I will talk about how the shell language of UNIX is not a proper FP language. If we can agree that the UNIX Philosophy is really all about FP, and since Emacs is itself a Lisp implementation, then maybe we can agree that functions programmed in Emacs Lisp exemplify just what the UNIX philosophy is really all about. So Emacs really does fulfill the spirit of the UNIX Philosophy, perhaps even better than UNIX itself does.