2021-07-25

I've always considered the macros of Scheme to be less powerful than Common Lisp
because of the lack of unhygienic macros. However, recently I've come to enjoy
the hygienic macros Scheme provide in the form of syntax-rules. In this post, I
wish to share a little hack I made to define functions with typed arguments in
Scheme.

As one should with macro programming, let us first consider how the final result
will look like, before considering the implementation.

    (define-typed (add-two (x number?) (y number?))
      (+ x y))

As you can see, the final product will look very similar to the regular 'define'
syntax for defining procedures, but will group each argument together with a
type predicate. When add-two is called, each of the type predicates will be
called for their respective parameter.

In other words, the above example should expand to something like the following:

    (define (add-two x y)
      (cond ((not (number? x)) (error "add-two: Wrong argument type."))
            ((not (number? y)) (error "add-two: Wrong argument type."))
            (else (+ x y))))

Not quite the benefit of static typing, but it at least requires all arguments
to be of the right types before attempting to evaluate the body. I do realise
that the error message is not the most helpful thing in the world. Improving it
is left as an exercise to the reader.

The pattern is quite clear. We want to extract the predicate and perform it
against the corresponding variable. When all predicates are checked, we perform
the main body of the function (the else branch).

Let's consider the pattern we're matching on first. In other words, let's
generalise the square definition above. It looks like it consists of 4 parts.

 1. The invocation itself, i.e. the symbol 'define-typed'.
 2. Next comes a list, with the first element being a symbol which functions as
    an identifier for the function.
 3. The rest of the arguments in this list are pairs of argument name, and
    predicate for that argument. Note that we can have zero or more such pairs.
 4. Lastly, we have a body consisting of one or more forms to be evaluated.

Our pattern in syntax-rules looks like this:

    (define-typed (identifier (var predicate) ...)
      body body* ...)

Here's an overview of the different parts of this pattern:

 1. The first symbol has to be the name of the macro. I've picked
    'define-typed'.
 2. Next we have a list, with the first element being the name of the procedure
    we're defining.
 3. Following the identifier, there's a variable, var, and a predicate. The
    ellipses following the pair means that we want to match on zero or more of
    these pairs.
 4. Lastly, there's the body. Remember that I said the body consists of one or
    more forms? The pattern "body body* ..." means that there has to be one form
    called 'body' and zero or more forms grouped into the 'body*' name. It could
    be called anything, but I call it body* by convention.

As you can see, the description of the pattern is quite similar to the actual
implementation. Now, syntax-rules is based on groups of pairs consisting of a
pattern like the one described above for matching, and the form it should be
replaced by when the macro is called. Our next task is then to create this form.
We can go about this using the same method as we used for describing the
pattern.

So, what should the final form look like? Again, the pattern is quite clear.

 1. We create a standard define form for defining procedures, with the
    identifier we picked and the variable names as arguments to the procedure.
 2. Then we need a 'cond' form with a chain of predicates applied to their
    respective variables.
 3. Lastly, there has to be an else branch for evaluating the body of the
    function.

This is the form we end up with, following the description above:

    (define (identifier var ...)
       (cond ((not (predicate var))
              (error (string-append (symbol->string (quote identifier))
                                    ": Wrong argument type."))) ...
             (else body body* ...)))

Like I did above, I'll go through this definition step by step.

 1. We start by defining a procedure as normal, using 'identifier' as the
    identifier. The arguments to the procedure should be all the variables
    captured in 'var'.
 2. Next comes the cond block. It consists of zero or more (note the ellipses)
    checks. If any predicate returns false - (not (predicate var)) - we throw an
    error (I've used symbol->string to add the identifier to the error message,
    it's not terribly important for the purposes of demonstration). One thing to
    note here is that the ellipses work just as you'd expect them to. Since we
    have zero or more predicates and variables, we need to have zero or more
    checks in the cond block. This part of the magic of syntax-rules, it infers
    stuff like this incredibly well and will even create an error if you forget
    to put ellipses where it's required.
 3. Lastly, there is an else branch which will evaluate the body form(s). Again,
    the ellipses here are mandatory.

Here's the final macro:

    (define-syntax define-typed
      (syntax-rules ()
        ((define-typed (identifier (var predicate) ...)
           body body* ...)
         (define (identifier var ...)
           (cond ((not (predicate var))
                  (error (string-append (symbol->string (quote identifier))
                                        ": Wrong argument type."))) ...
                 (else body body* ...))))))

And... that's it! You now have a facility for creating functions with typed
arguments. One interesting thing about this macro is that the predicate does not
need to be a traditional primitive type predicate either.

Say you want to define a procedure for computing square roots. Unless your
function returns complex numbers, it doesn't really make sense for it to take a
negative number as an argument. Instead of checking for if the argument is
negative manually inside the body of the procedure, you can simply pass a
predicate which checks it with the macro.

    (define-typed (sqrt (x (λ (x) (not (negative? x)))))
      ...)

Another example would be division, where it doesn't make sense to divide by
zero. A similar approach works well here too:

    (define-typed (div (dividend (number?))
                       (divisor (λ (x) (not (zero? x)))))
      ...)

I hope this post has been a nice introduction to syntax-rules and shown some of
the elegance and power of it. It has certainly made me think twice before
turning to Common Lisp for my macro-defining needs.

built with sh(1) and cat(1)