Emacs fulfills the UNIX Philosophy
Part 3: Seems like Functional Programming (FP)
This is part 3 of a 6-part series of articles defining what the UNIX philosophy is, what Emacs is, and discussing whether Emacs fulfills the UNIX philosophy.
0. Introduction
1. Emacs is an app platform
2. What is the UNIX philosophy
3. Seems like Functional Programming
4. Lisp does FP better than UNIX shell programming
5. The parallel histories of UNIX and Lisp
6. Response to common criticisms
In part 2 of this series, I tried to define more precisely what the UNIX Philosophy really is, and I quoted many of the original authors of UNIX and of the UNIX Philosophy.
In this article, we get to the crux of my argument: that the UNIX Philosophy is really all about functional programming (FP), or rather, that the UNIX Philosophy is an incomplete or misguided formulation of the principles of FP. If you really want to follow the principle of using programs as tools that do one thing and do it well, then an FP environment like Emacs, which provides a convenient interface for executing Lisp functions, is generally the better solution.
UNIX programming is trying to be FP
Doug McIlroy, at least in my book, deserves the credit for pipes. He thought like a mathematician and I think he had this connection right from the start. I think of the Unix command line as a prototypical functional language.
— Alfred Aho (co-author of the AWK language), stated in an interview, published in Masterminds of Programming: Conversations with the Creators of Major Programming Languages, 2009.
Since the UNIX Philosophy is so concerned with the design and use of programs, let's consider what a program really is. When people talk about programs in the context of the UNIX Philosophy, they usually mean some fundamental unit of executable code in the operating system that can perform some transformation on data, and that can be composed into shell pipelines. There is also an emphasis on making the tools easy to use in an interactive programming environment, so as to best take advantage of the interactive nature of the command line.
A program is a function
In a UNIX Programming Environment, the fundamental unit of code that is a program is conceptually equivalent to a function in an FP language. The definition of a program need not be restricted to code that runs in its own process; that would be a somewhat meaningless constraint to apply. The most important properties of these units of code (these functions) are that they be easy to reason about (do just one thing), be somehow composable, and be easy to use in an interactive environment, such as a REPL.
The term functional programming encompasses a large variety of concepts, and it is even harder to find consensus on the definition and principles of FP than on the definition of the UNIX Philosophy. But since this is a discussion about UNIX, Emacs, and Lisp, I'll borrow from the work of Paul Graham (of Y Combinator fame), from his 1995 book ANSI Common Lisp, in which he observes that:
... in Lisp the edit-compile-test cycle is so short that programming is real-time.
Bigger abstractions and an interactive environment can change the way organizations develop software. The phrase rapid prototyping describes a kind of programming that began with Lisp: in Lisp, you can often write a prototype in less time than it would take to write the spec for one.
...
When you program in a functional style, bugs can only have a local effect. When you use a very abstract language, some bugs (e.g. dangling pointers) are no longer possible, and what remain are easy to find, because your programs are so much shorter. And when you have an interactive environment, you can correct bugs instantly, instead of enduring a long cycle of editing, compiling, and testing.
— Paul Graham,
ANSI Common Lisp, sections 1.2 - 1.3.
On its face, this may not seem to have much to do with the UNIX Philosophy. But I think if you consider some of the practical consequences that emerge from the properties of FP, you can begin to see the relationship to the UNIX Philosophy:
1. Functions should be simple, sometimes described as orthogonal or general. It should be easy to reason about a function's behavior. Functions should perform some minimal transformation on the input.
2. Functions should be composable, through the use of higher-order functions. It should be possible to craft more complex, more specific transformations on data by composing simpler functions.
3. It should be easy to experiment with function composition in an interactive environment. This is not so much a principle of FP as a natural consequence of it, especially in Lisp. FP languages almost always provide a printable representation of the data going in and out of each function, in a human- and machine-readable format well suited to interactive programming. This is because the printing and parsing of these representations are themselves functions, which transform data to strings or strings to data. Lisp languages use S-Expressions.
Of course, one could list several more distinguishing properties of FP than just the above three. To name a few: purity, referential transparency, pattern matching, polymorphism, and equational reasoning. But as long as we can agree that the three points above are indeed useful and distinguishing properties of FP, it is easier to see the parallels between FP and the UNIX Philosophy.
As a more concrete example: in FP languages, UNIX-like pipelines are expressible with higher-order functions. Usually this is the composition operator, which serves the same purpose as the shell pipe operator. In Haskell, the monadic bind operator (>>=) plays a similar role for monads, feeding the result of one computation into the function that produces the next.
Function composition in a Bourne Shell
I do not intend to argue that the Bourne Shell family of languages are functional programming languages. Alfred Aho (quoted above) merely called the UNIX command line a prototypical functional language. In this section I just want to draw attention to some of the shell scripting techniques that mimic functional programming, and to how Bourne shell scripting makes function composition possible.
Pipes as function composition
Shell pipes are probably the most obvious form of function composition in the UNIX Programming Environment, because the output of one program becomes the input of the next, which is similar to the mathematical definition of function composition. For example:
# Here, the output of 'cat' will become the input of 'wc'
cat *.txt | wc -l;
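To make the analogy with function composition a little more concrete, here is a minimal sketch in which shell functions play the role of functions and the pipe plays the role of the composition operator. The function names (non_blank, upcase, count_lines, count_non_blank, sample) and the sample input are my own inventions for illustration, not standard tools.

```shell
#!/usr/bin/env bash
# Hypothetical helper functions. Each one reads stdin and writes stdout,
# so each is a small transformation that does one thing.
non_blank()   { grep -v '^[[:space:]]*$'; }    # drop blank lines
upcase()      { tr '[:lower:]' '[:upper:]'; }  # convert to upper case
count_lines() { wc -l | tr -d ' '; }           # count lines, strip padding

# A new function defined purely by composing two of the functions above.
# In mathematical notation this is roughly: count_lines . non_blank
count_non_blank() { non_blank | count_lines; }

# Sample input (made up for this sketch): two words and one blank line.
sample() { printf 'grey\n\ngray\n'; }

sample | non_blank | upcase    # prints GREY and GRAY on separate lines
sample | count_non_blank       # prints 2
```

The composed function count_non_blank can itself be used in further pipelines, which is the property that makes pipes feel like a composition operator.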
Continuations
And there are yet other ways to compose programs besides pipes, for example Continuation Passing Style (CPS). This involves a partial function application passed as an argument to another function. An example of this in UNIX and Linux would be the find program, which can take an -exec argument, for example:
find . -type f -name '*.txt' \
    -exec grep -e 'gr[ea]y' -inHR '{}' +
Here, the find program will search for *.txt files, and it runs grep with the files it finds as arguments. Grep then searches those files for occurrences of the word grey or gray. In this example (grep -e 'gr[ea]y' -inHR '{}' +) is a kind of continuation because grep may not be run at all if no text files are found, or it may be run more than once if there are too many text files to fit on one command line. It is up to the find program to decide when or if to call the grep function. The arguments to grep (-e 'gr[ea]y' -inHR) are partially applied, and the final arguments to grep (the files to search) are substituted by find in place of the string {}. So this is also an example of an anaphoric function call, in which the arguments to grep include a pre-determined free variable (in this case {}) that is bound when the continuation is evaluated.
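The same pattern can be sketched with an ordinary shell function that accepts a command as its continuation. The helper name each_line is hypothetical (it is not a standard utility), and unlike find it simply appends the final argument rather than substituting it for a {} placeholder.

```shell
#!/usr/bin/env bash
# Hypothetical helper: call the continuation "$@" once for each line of
# stdin, appending that line as the final argument. The caller supplies
# the command and its leading arguments (a partial application), and
# this function decides when, and whether, the command is run.
each_line() {
    while IFS= read -r line; do
        "$@" "$line"
    done
}

# The continuation here is 'printf' with its format string already applied.
printf 'a.txt\nb.txt\n' | each_line printf 'found: %s\n'

# With no input, the continuation is never called and nothing is printed.
printf '' | each_line printf 'found: %s\n'
```

One caveat of this sketch: a continuation that reads stdin itself would consume the lines intended for the loop, so it only suits commands that take their input as arguments.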
Strict and lazy evaluation
Command substitution, which is the $(...) built-in syntax, and its lazy counterpart, process substitution, which is the <(...) syntax, are roughly similar to the FP concepts of applicative order evaluation and normal order evaluation (respectively). Not all shell languages in the Bourne family provide both features (process substitution in particular is not part of POSIX sh), but Bash provides both. Here is a simple example of the command substitution feature in action:
grep '\<keyword\>' -nH $(find . -name '*.txt'; );
Here the bracketed command $(find . -name '*.txt'; ) runs to completion first, and its output is passed as a list of arguments to the grep command. Every file in any subdirectory whose name matches the pattern *.txt is then searched by grep for the pattern '\<keyword\>'.
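The strictness of command substitution can be seen in a small sketch. The function names produce and count_args are made up for this illustration; the point is only that the inner command finishes before the outer one starts.

```shell
#!/usr/bin/env bash
# Hypothetical producer: emits three words, then reports on stderr
# that it has finished.
produce() {
    printf 'one\ntwo\nthree\n'
    echo 'producer finished' >&2
}

# Hypothetical consumer: reports how many arguments it was given.
count_args() { echo "consumer received $# arguments"; }

# The shell runs 'produce' to completion, splits its output into words,
# and only then calls 'count_args' with those words as its arguments.
count_args $(produce)
```

Because the captured output is split on whitespace, a file name containing spaces would arrive as several arguments. That is one reason to prefer find's -exec, or find -print0 piped to xargs -0 where those options are available, when file names are not under your control.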
The lazy form of this expression is <(...), which has one major difference: rather than passing the output of the bracketed command as a string, the shell creates a temporary FIFO (or, on systems that support it, a /dev/fd path that refers to a pipe) and passes the path to this file as an argument to the command. A FIFO is a kind of pipe, so when a command's output is redirected to a FIFO, the reader can consume that output one line at a time. The bracketed command need not run to completion, so in this sense it is lazy.
paste <(find . -name '*.txt'; ) <(find . -name '*.html'; ) | \
    head -n 10;
Whether paste reads its input lazily (one line at a time) depends on the implementation and its buffering, but we could easily create such a command with awk, so let's assume paste reads its input files lazily. Here our paste command will take the output of 2 different find commands and merge them into 2 columns of output (perhaps associating .txt files with similarly named .html files). From the point of view of the paste command, it receives file paths to 2 temporary files, each file being a FIFO. It can read each of these files one line at a time, which in effect accepts one line of output from each find command at a time. Since output is piped to the head -n 10 command, only 10 lines of output are printed before the command pipeline terminates. If paste is indeed reading the FIFO files lazily, neither find command needs to run to completion: apart from any output already buffered in the pipes, they find only about the first 10 files before the command pipeline terminates.
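The laziness is easiest to see with a producer that can never finish. In this sketch I substitute the standard yes command, which prints its argument forever, for the find commands above. This requires Bash (or another shell with process substitution).

```shell
#!/usr/bin/env bash
# 'yes' prints its argument forever, so it can never run to completion.
# A command substitution such as $(yes gray) would therefore never return.
# With process substitution, 'head' reads only the three lines it needs
# from the pipe and exits, and the producer is stopped when it next
# tries to write to the closed pipe.
head -n 3 <(yes 'gray')

# The same idea with two endless producers merged into two columns.
# Only two lines are ever printed before the pipeline terminates.
paste <(yes a) <(yes b) | head -n 2
```

How much extra output the producers generate before they are stopped depends on pipe and stdio buffering, so "lazy" here means "not run to completion" rather than "exactly one line at a time".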
Conclusions
In this article I tried to explain the similarities between FP and the UNIX philosophy. I argued that the UNIX philosophy is an attempt at formulating the principles of FP, but I also suggested that the UNIX philosophy really falls short of actual FP. I also gave several examples of how certain concepts in FP can be expressed in Bourne shell programming.
In the next article I will talk about how the shell language of UNIX is not a proper FP language. If we can agree that the UNIX Philosophy is really all about FP, and since Emacs is itself a Lisp implementation, then maybe we can agree that functions programmed in Emacs Lisp exemplify just what the UNIX philosophy is really all about. So Emacs really does fulfill the spirit of the UNIX Philosophy, perhaps even better than UNIX itself does.