Announcing: Scheme functional lenses

Home About Codeberg

I have published my library for functional lenses written in R7RS Scheme on CodeBerg. This article is a reprinting of the README.md file in the Git repository on CodeBerg.

Functional lenses for R7RS Scheme is inspired by Haskell. Functional lenses provide a way to combine getters and setters into a first-class value that which be easily composed together to inspect and update arbitrarily complex data structures.

In Scheme (and other Lisp-family languages), programmers typically make use of pattern matching to extract important information from data structures. However pattern matching can occasionally be a bit too cumbersome for selecting data that is somewhat regular in structure, such as objects structured according to the XML or JSON protocols. In other, more popular languages like Python or JavaScript, you would simply write an expression such as:

  // In JavaScript, you would write code like so:

  var result =
      input["people"].findBy("first_name", "Alice")["salary"];

  costs.monthly.chart.scale = {x: 1.25, y: 1.33};

What can be done with one line of code in JavaScript may take three or four lines of pattern matching code in Scheme, one line for each indirection through the data.

An alternative to pattern matching is to make use of a "Meta-Object Protocol" (MOP) implementation and map your data structures to objects, then access the fields of these objects using ordinary object method composition, just like how Python or JavaScript work. Various MOPs exist for Scheme, some may even provide easy ways to wrap hash tables and vectors into objects that can easily keys or indicies.

Yet another alternative is functional lenses (this library). It works by defining few primitive "unit lenses", and then more complex lenses can be constructed by composing together these unit lenses. The resulting code looks much more similar to something with which a JavaScript or Python programmer would be comfortable using. The above JavaScript example could hypothetically be rewritten to look more like the following:

  ;; With Scheme functional lenses, the above JavaScript example
  ;; *could* perhaps be written in this way:

  (define result
    (view input 'people (find-by 'first-name "Alice") 'salary))

  (lens-set! '((x . 1.25) (y . 1.33)) costs 'monthly 'chart 'scale))

There are also similarities in functionaltiy between lenses and SRFI-17 "Generalized set!", in which a set! is redefined as a macro which inspects its first argument and uses it to construct an updating procedure. Then there are well-defined mechanisms, such as defining "accessors" within a MOP framework, to parameterize the set! expression so that Scheme programmers may declare their own set! expressions for arbitrary data structures without needing to write their own macros.

However functional lenses, as the name implies, do not use macros, rather they are entirely functional. In fact, it is possible to declare purely functional lenses as well, in which data structures are not mutatated but replaced with slightly altered copies (more on this below).

Building and testing

This library has been used with the following scheme implementations:

Unit tests currently only pass on Guile, and also in the Gambit interpreter (the Gambit compiler crashes when using lenses). Stklos mostly passes all unit tests with a few not passing. MIT/GNU Scheme does not provide the SRFI-64 testing framework, but running each individual tests by hand seems to produce expected results.

It should not be too difficult to port to other R7RS Scheme implementation as well.

To run the test suite in Guile:

scheme@(guile-user)> (load "gypsum/lens-tests.scm")

Simple Example

(import
  (scheme base)
  (only (scheme write) display write newline)

  (gypsum lens) ;; <----- import declaration --------

  )

(cond-expand
  ;; Lenses uses R7RS vectors, but hash tables is implementation
  ;; specific. SRFIs 69 and 43 are the older vector and hash table
  ;; standard APIs, SRFIs 133 and 125 are the newer vector and
  ;; hash table standard APIs.

  ((or guile gambit)
    (import
      (only (srfi 69) alist->hash-table) ; old hash tables
      (only (srfi 43) vector) ; old vectors
    ))
  ;; Otherwise, we use SRFI-133 for vectors, SRFI-124 for hash tables
  (else
    (import
      (only (srfi 125) alist->hash-table) ; hash tables
      (only (srfi 133) vector) ; vectors
    ))
  )

(define (main)

  ;; ==== EXAMPLES ====
  ;; (The "hashtab" example data structure is defined below)
  
  ;; --- Use the `view` procedure to view elements ---

  (display (view hashtab "zero")) (newline)
    ;; displays 0

  (display (view hashtab "AB" "A")) (newline)
    ;; displays "go-left"

  (display (view hashtab "AB" "vec" 0 "C")) (newline)
    ;; displays "go-up"

  (display (view hashtab "AB" "vec" 2)) (newline)
    ;; displays #f

  ;; --- Use the "lens-set!" procedure to update elements ---
  ;;
  ;; The order of the arguments are:
  ;;
  ;;  1. the value to be inserted into the data structure
  ;;  2. the data structure to be updated
  ;;  3. zero or more lenses which can update the data structure
  ;;
  ;; In Python or JavaScript you would write:
  ;; 
  ;;     hashtab["AB"]["vec"][2] = [20,30,40]
  ;;
  ;; With lenses, you write:

  (lens-set! '(20 30 40)
    hashtab "AB" "vec" 2)

  ;; hashtab["two"] = 222
  (lens-set! 222
    hashtab "two")

  ;; print(hashtab["two"])
  (display (view hashtab "two")) (newline)
    ;; displays 222

  ;; print(hashtab["AB"]["vec"][2])
  (display (view hashtab "AB" "vec" 2)) (newline)
    ;; displays (20 30 40)

  ;; --- Use the "update!" procedure to update elements and return a value ---

  (let-values
      (((hashtab sum)
        (update! (lambda (items)
                   (values (cons 10 items) (apply + items)))
          hashtab "AB" "vec" 2))
      )
   (display sum) (newline)
     ;; displays 90

   (display (view hashtab "AB" "vec" 2)) (newline)
     ;; displays (10 20 30 40)
  ))

(define hashtab
  (alist->hash-table
   `(("zero" . 0)
     ("one"  . 1)
     ("two"  . 2)
     ("AB" .
      ,(alist->hash-table
        `(("A" . "go-left")
          ("B" . "go-right")
          ("vec" .
          ,(vector
            (alist->hash-table ;; at index 0
             `(("C" . "go-up")
               ("D" . "go-down")))
            (alist->hash-table ;; at index 1
             '(("E" . "turn-around")
               ("F" . "jump")))
            #f ;; at index 2
            ))))))))

(main)

Purely functional lenses

As mentioned earlier, it is possible to declare purely functional lenses as well, in which data structures are not mutatated but replaced with slightly altered copies (more on this below). This can sometimes make your code much easier to optimize for compilers that perform an SSA pass, because by using pure-functional lenses, more of your code is essentially already in SSA-form to begin with.

Here is an example of a purely function lens for car and cdr which does not use set-car! or set-cdr!, rather it replaces the cons cell with an updated copy:

(define =>car
  (unit-lens
   (lambda (cell)           ;; ---------- getter ----------
     (car cell)
    )
   (lambda (cell new-value) ;; ---------- setter ----------
     (cons new-value (cdr cell))
    )
   (lambda (update cell)    ;; ----- in-place update ------
     (let-values
         (((new-car result) (update (car cell))))
      (values (cons new-car (cdr cell)) result)
      ))
   '=>car ;; <- you can optionally label the lens for better error messages
   ))

Difference between update! and generalized set!

As stated earlier, the SRFI-17 "Generalized set!" also provides a simpler way to update data structures more similar to how languages like JavaScript or Python would do. But unlike generalized set!, the mechanism for updating is accomplished with procedures (functions) rather than through the use of a macro such as set!.

One other difference is that the update! function applied to lenses must return two values:

  1. the updated data structure, or a replacement data structure if you are operating on a pure/immutable data type,

  2. an arbitrary second value.

This allows us to return additional information about the data structure as it is being updated. For example, we could express a depth-first search (DFS) algorithm which returns the path to an element which succeeds when applied to a predicate, we can return this path even when updating the lens. And this DFS lens can still be composed with other lenses in calls to view, lens-set!, or update!.

(define (=>depth-first-search select?)
  ;; This lens finds the first node in a tree for which the `SELECT?`
  ;; predicate return #t. The full path to the node is returned.
  ;; Updates return the path to the element updated as the second
  ;; value.

  (let*((getter ;; ----------------------------------------------
         ;; Returns the first node succeeding on application of
         ;; `SELECT?` The path is applied to `select?` but is not
         ;; returned.
         (lambda (node)
           (call/cc
            (lambda (return)
              (let loop ((node node) (path '()))
                (cond
                 ((select? node path) (return node))
                 ((hash-table? node)
                  (hash-table-for-each
                   (lambda (key node) (loop node (cons key path)))
                   node)
                  #f)
                 ((vector? node)
                  (let ((len (vector-length node)))
                    (let next ((i 0))
                      (cond
                       ((< i len)
                        (loop node (cons i path))
                        (next (+ 1 i)))
                       (else #f)))))
                 (else #f)))))
           ))
        (updater ;; --------------------------------------------------
         ;; The updater function takes an updating procedure to alter
         ;; the value to which this lens is referring. Since this is a
         ;; depth-first search, the updating procedure is applied to
         ;; the first element selected by the `SELECT?` predicate. The
         ;; updating procedure must return 2 values: 1. the updated
         ;; node, and 2. some arbitrary other value. When many lenses
         ;; are composed in sequence for the `UPDATE!` procedure, the
         ;; last lens in the sequence will return this arbitrary
         ;; value.
         ;;
         ;; In this example, the arbitrary second value returned is
         ;; the second return value of `UPDATER` CONS'ed to the path
         ;; of selected node, or #f if nothing is selected.
         (lambda (updater node)
           (call/cc
            (lambda (return)
              (let loop ((node node) (path '()))
                (cond
                 ((select? node path)
                  ;; ********* THIS IS INTERESTING! *********
                  ;; This is what makes lenses different from
                  ;; generalized `SET!`:

                  (let-values (((node result) (updater node)))
                    (return node (cons result path)))

                  ;; Notice how we can define our lens to not only
                  ;; return the result of `UPDATER` but also return
                  ;; the "path". When this lens is used with
                  ;; `LENS-SET!` this information is discarded, but if
                  ;; you use `UPDATE!` the lens you define can choose
                  ;; to return additional information that you might
                  ;; find useful.
                  )
                 ((hash-table? node)
                  (hash-table-for-each
                   (lambda (key subnode)
                     (let-values
                         (((subnode result)
                           (loop node (cons key path))))
                       (cond
                        (result
                         (hash-table-set! node key subnode)
                         (return node result))
                        (else (values)))))
                   node))
                 ((vector? node)
                  (let ((len (vector-length node)))
                    (let next ((i 0))
                      (cond
                       ((< i len)
                        (loop (vector-ref node i) (cons i path))
                        (next (+ 1 i)))))))
                 (else (values node #f))))))
           ))

        (setter ;; --------------------------------------------------------
         ;; We use the `DEFAULT-UNIT-LENS-SETTER` procedure that is
         ;; provided by `(GYPSUM LENS)` to derive the setter automatically.
         ;; The default setter uses the updater and discards the second of
         ;; the returned VALUES.
         (default-unit-lens-setter updater))
        )
    ;; Use the `UNIT-LENS` procedure provided by `(GYPSUM LENS)` to
    ;; construct a lens. This constructs a data structure associating
    ;; a getter, setter, and updater procedure all together.
    (unit-lens getter setter updater `(=>depth-first-search ,select?))
    ))

(define =>DFS =>depth-first-search)

(define (select-by-key-symbol sym)
  (lambda (o path) (and (pair? path) (eq? sym (car path)))))

(define (select-by-key-string name)
  (lambda (o path) (and (pair? path) (string=? name (car path)))))

(define ht
  (alist->hash-table
   (list
    (cons 'zero 0) (cons 'one 1) (cons 'two 2)
    (cons 'elements
     (alist->hash-table
      (list
       (cons 'earth "solid")
       (cons 'wind  "gas")
       (cons 'fire  "plasma")
       (cons 'water "liquid")
       (cons 'periodic-table
        (vector
          #f "Hydrogen" "Helium" "Lithium" "Berylium" "Boron"
          "Carbon" "Nitrogen" "Oxygen" "Flourine" "Neon"))))
     ))))

(view ht (=>DFS (select-by-key-symbol 'periodic-table)) 1)
  ;; displays "Hydrogen"

(view ht (=>DFS (select-by-key-symbol 'elements)) 'water)
  ;; displays "liquid"

Record types

Unfortunately, there are not yet any macros that automatically derive lenses for record types, so you must define the record type lenses by hand.

By convention, lens symbols are always prefixed with =>. Non-canoncial lenses (discussed later) are suffixed with a asterisk (*), and mutating lenses are suffixed with a bang (!). Since record-unit-lenses are typically mutating, it is most correct to name them suffixed with !. However we will make our color lenses canonical later, so they will be prefixed with *! in this example:

(import
  (scheme base)
  (only (scheme write) display write newline)
  (gypsum lens)
  )

(define-record-type <color-type>
  (make<color> r g b)
  color-type?
  (r red   set!red)
  (g green set!green)
  (b blue  set!blue))

;; Here we define the lenses:

(define =>red*!   (record-unit-lens red   set!red))
(define =>green*! (record-unit-lens green set!green))
(define =>blue*!  (record-unit-lens blue  set!blue))

(define *yellow*
  (let ((c (make<color> 0 0 0)))
    (lens-set! 255 c =>red*!)
    (lens-set! 255 c =>green*!)
    c))

Using literals and constants as lenses

You may have noticed that the view, lens-set, and update operators can take integers to focus on vectors, and string and symbols focus on hash tables. However for record types like our <color-type> in the example in the previous section, we must declare a lens object using record-unit-lens. This is because the lens operators recognize primitive values like integers and strings and automatically constructs an actual lens object to use these values.

In this table, we can see how primitive values are automatically converted to lens objects:

type ex. value as a lens object
integer? 123 (=>vector-index*! 123)
string? "hello" (=>hash-key*! "hello")
symbol? 'hello (=>hash-key*! 'hello)

What if we want the symbols 'red, 'green, and 'blue to automatically refer to the respective fields of our <color-type> from the example in the previous section? To do this, we can use the declare-rule/index->lens API at the top level of our program, typically right after defining the record-unit-lenss. Once we have done this, the lens operators view, lens-set, and update can automatically construct a lens such as =>red*! or =>green*! from symbols.

To declare a new translation rule, pass two arguments:

  1. the type of index you want to use as a lens, must be 'symbol, 'string, or 'integer.

  2. a predicate that returns #T for records applied to it, for example our color-type? predicate.

  3. a lens constructor that constructs a lens using the symbol, string, or integer.

(define (symbol->color-lens sym)
  ;; This predicate converts a symbol to one of the color
  ;; component lenses we defined earlier.
  (cond
   ((or (eq? sym 'r) (eq? sym 'red)) =>red)
   ((or (eq? sym 'g) (eq? sym 'green)) =>green)
   ((or (eq? sym 'b) (eq? sym 'blue)) =>blue)
   (else (error "not a color field name, must be 'red, 'green, or 'blue" sym))
   ))

(declare-rule/index->lens
  'symbol            ;; 1. the type of value that can be used as an index
  color-type?        ;; 2. the predicate to identify the record
  symbol->color-lens ;; 3. the procedure to convert a symbol to a lens
  )

;; Now we can use ordinary symbols to access fields of a `<color-type>`.

(define (main)
  (lens-set 0 my-painting =>bg-color 'red)
    ;; In this example, we can compose a lens object `=>bg-color` with
    ;; a lens symbol `'red`.
  )

Remember that declare-rule/index->lens really should be evaluated at the top level of a Scheme program or library, and it is best to decalre these rules right after the define-record-type form and the record-unit-lens lenses.

Inspecting lens index rules with get-rule-declaration/index->lens

To inspect what kind of lens indexing rules have been declared, you can use the get-rule-declaration/index->lens procedure. This procedure returns the indexing rules that have been declared for a particular index type. It takes two arguments:

For example if you have a vector-like object "vec32" and want to see if it is possilbe to index this object with an integer-valued index, you can evaluate:

(get-rule-declaration/index->lens 'integer vec32)

...and if that returns non-null, then it is OK to index your "vec32" with the an integer value, e.g. 5 in the expression:

(view vec32 5)

Canonical lenses

The hash key lenses do not create new keys if they do not already exist. Also, if a structure such as a hash table becomes empty after an update, the empty hash table remains rather than being deleted and replaced with #f.

However it is possible to create "canonical" lenses that construct new data structures (where possible) if lens-set or update is called on a structure, and remove structures that have become empty after an update. By convention, non-canonicalized lenses are suffixed with *. Lenses can be both canonical and non-pure, i.e. also mutate data on update. By convention, lenses that are both canonical and mutating should be named with a symbol suffixed with *!

=>vector-index*! 
a non-canonical lens that mutates a vector.
=>hash-key*! 
a non-canonical lens that mutates a hash table.
=>hash-key! 
a canonical lens that mutates a hash table.
=>hash-key 
a canonical lens that returns a new table if updated.

In the following example, we canonicalize the <color-type> from the previous example such that non-existent colors construct a black color.

(define (new-color) (make<color> 0 0 0))

(define (black? c) (or (not c) (= 0 (red c) (green c) (blue c))))

(define =>red!   (=>canonical =>red*!   new-color black?))
(define =>green! (=>canonical =>green*! new-color black?))
(define =>blue!  (=>canonical =>blue*!  new-color black?))

;; --- Now, setting on `#f` applies value to a `new-color`  ---

(let*((cyan (lens-set! 255 #f   =>green!)) ;; can set on #f, no problem
      (cyan (lens-set! 255 cyan =>blue!)))
  (display cyan)(newline) ;; #<<color-type> r: 0 g: 255 b: 255>

  ;; Setting the color to black with our canonical lenses will return
  ;; #f in place of a color.
  
  (let*((cyan (lens-set! 0 cyan =>green!))
        (cyan (lens-set! 0 cyan =>blue!)))
    (display cyan)(newline)
      ;; displays #f
    )

The =>self lens

In functional programming, it is frequently useful to have an "identity" function. The =>self lens is the identity lens, it operates on the data structure itself. Here is an example of where it might be useful:

;; Suppose "dim" is a symbol that is defined as one of '1D, '2D, or '3D.

(define origin-point
  ;; The "origin-point" is a number if "dim" is 1D, but if
  ;; "dim" is 2D or 3D, the "origin-point" is a vector.
  (cond
   ((eq? dim '1D)  64.0)
   ((eq? dim '2D)  #(8.0  8.0))
   ((eq? dim '3D)  #(4.0  4.0  4.0))
   (else (error "invalid dimensionality" dim))
   )

(define =>x
  ;; We define the lens "=>x" to access the first cell of the vector
  ;; if "dim" is 2D or 3D, but if "dim" is 1D the lens is "=>self".
  (cond
   ((eq dim '1D)  =>self) ;; <------------------ using "=>self" here
   (else         (=>vector-index! 0))))

(display (view origin-point =>x))
  ;; displays 64.0 if dim is 1D, 8.0 if dim is 2D, or 4.0 if dim is 3D.

Using =>on-update with lens-set!

Suppose you want to define a lens that always sorts a vector every time an element is inserted. There are other situations where you may want to always perform some mutating action on an object whenever the object is updated. This is the purpose of the primitive =>on-update lens.

(lens-set! 100
  vec (=>on-update (=>vector-index! n) vector-sort))

Debugging

Sometimes you simple do not have access to useful debugging tools, your only recourse is to do "print debugging" where you write lots of (display ...) or (write ...) statements everywhere and analyze the log of your program after it has run. To help with this, the =>trace lens is provided.

You can compose the =>trace lens with any other lens to produce logging information whenever the lens is used to access or updates data.

(define example-vector
  (vector #f #f 10 20 30 40 50 60 70 80 90 100 110 #f #f))

(define vec-min-index 2)
(define vec-max-index 13)
(define vec-median-index
  (truncate-quotient (- vec-min-index vec-max-index) 2))

(define =>median (=>vector-index! vec-median-index))
  ;; --- Hmm... this lens seems to be broken. ---

;; Lets inspect the lens viewing operation with "=>trace"

(display (view example-vector (=>trace =>median)))

The log output of the above code would then look something like this:

(begin view on lens (lens (vector-index! -5)):
  #:record-in-focus (#(#f #f 10 20 30 40 50 60 70 80 90 100 110 #f #f)))
(error in view on lens (lens (vector-index! -5)):
  #:error-object ("in procedure vector-ref: Argument out-of-range" -5)
  #:record-in-focus (#(#f #f 10 20 30 40 50 60 70 80 90 100 110 #f #f)))

Error in procedure vector-ref: Argument out of range: -5

As you can see, this report:

  begin view on lens (lens (vector-index! -5))

... shows that our vector index lens is being constructed with a negative value.

Concluding remarks

I personally have found the use of lenses to be an incredibly elegant and useful solution to many problems in Scheme involving large and complex data structures, such as objects structured as XML or JSON. I have come to prefer using lenses to using pattern matching, or going through the trouble of using a Meta-Object Protocol. I am publishing it here in the hopes that other people will find it as useful as I have.

This README is the only documentation for this library, but the source code is well-commented. To learn more about the available lens library APIs and how to use them, please refer to the comments in the source code.

I hope to keep this library as portable as possible, and have made an effort to write all code using only R7RS and the SRFIs. I am also interested in porting this library to other Scheme implementations. If you are interested in porting this code, and you manage to figure out which cond-expand expressions you need to write to make this library compile or run on another Scheme, please feel free to submit a pull request to me, either on CodeBerg, or as a Git patch attached to an e-mail.