Announcing: Scheme functional lenses
I have published my library for functional lenses written in R7RS Scheme on CodeBerg. This article is a reprinting of the README.md file in the Git repository on CodeBerg.
Functional lenses for R7RS Scheme is inspired by Haskell. Functional lenses provide a way to combine getters and setters into a first-class value that can be easily composed to inspect and update arbitrarily complex data structures.
In Scheme (and other Lisp-family languages), programmers typically make use of pattern matching to extract important information from data structures. However, pattern matching can occasionally be a bit too cumbersome for selecting data that is somewhat regular in structure, such as objects structured according to the XML or JSON formats. In other, more popular languages like Python or JavaScript, you would simply write an expression such as:
// In JavaScript, you would write code like so:
var result =
input["people"].findBy("first_name", "Alice")["salary"];
costs.monthly.chart.scale = {x: 1.25, y: 1.33};
What can be done with one line of code in JavaScript may take three or four lines of pattern matching code in Scheme, one line for each indirection through the data.
An alternative to pattern matching is to use a "Meta-Object Protocol" (MOP) implementation and map your data structures to objects, then access the fields of these objects using ordinary object method composition, just as Python or JavaScript do. Various MOPs exist for Scheme, and some may even provide easy ways to wrap hash tables and vectors into objects that can easily be accessed by keys or indices.
Yet another alternative is functional lenses (this library). It works by defining a few primitive "unit lenses", from which more complex lenses can be constructed by composition. The resulting code looks much more like something a JavaScript or Python programmer would be comfortable with. The above JavaScript example could hypothetically be rewritten to look more like the following:
;; With Scheme functional lenses, the above JavaScript example
;; *could* perhaps be written in this way:
(define result
(view input 'people (find-by 'first-name "Alice") 'salary))
(lens-set! '((x . 1.25) (y . 1.33)) costs 'monthly 'chart 'scale)
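To make the idea of composing unit lenses more concrete, here is a minimal sketch in plain R7RS Scheme. It is not how this library is implemented, and it does not use the library at all: the names make-lens, lens-getter, lens-setter, and compose-lens are hypothetical and exist only for this illustration. A lens is modelled as a getter paired with a pure setter, and composition simply threads the outer lens through the inner one.

```scheme
(import (scheme base) (scheme write))

;; HYPOTHETICAL model of a lens: a pair (getter . setter).
;; The setter is pure: it returns an updated copy of its input.
(define (make-lens getter setter) (cons getter setter))
(define (lens-getter lens) (car lens))
(define (lens-setter lens) (cdr lens))

;; Compose two lenses: `outer` focuses on a part of the whole,
;; `inner` focuses on a part of that part.
(define (compose-lens outer inner)
  (make-lens
   (lambda (whole)
     ((lens-getter inner) ((lens-getter outer) whole)))
   (lambda (whole new-value)
     ((lens-setter outer)
      whole
      ((lens-setter inner) ((lens-getter outer) whole) new-value)))))

;; Two unit lenses over pairs.
(define first-lens
  (make-lens car (lambda (p v) (cons v (cdr p)))))
(define rest-lens
  (make-lens cdr (lambda (p v) (cons (car p) v))))

;; Focus on the car of the cdr, i.e. the second element of a list.
(define second-element (compose-lens rest-lens first-lens))

(define data (list 1 2 3))
(display ((lens-getter second-element) data)) (newline)    ; displays 2
(display ((lens-setter second-element) data 20)) (newline) ; displays (1 20 3)
(display data) (newline)                                   ; displays (1 2 3)
```

The last line shows that the original list is left untouched, because both unit setters in this sketch build new pairs instead of mutating the old ones.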
There are also similarities in functionality between lenses and SRFI-17 "Generalized set!", in which set! is redefined as a macro which inspects its first argument and uses it to construct an updating procedure. Then there are well-defined mechanisms, such as defining "accessors" within a MOP framework, to parameterize the set! expression so that Scheme programmers may declare their own set! expressions for arbitrary data structures without needing to write their own macros.
However, functional lenses, as the name implies, do not use macros; they are implemented entirely with procedures. In fact, it is possible to declare purely functional lenses as well, in which data structures are not mutated but replaced with slightly altered copies (more on this below).
Building and testing
This library has been used with the following Scheme implementations:
Unit tests currently pass only on Guile and in the Gambit interpreter (the Gambit compiler crashes when using lenses). STklos passes most unit tests, with a few failures. MIT/GNU Scheme does not provide the SRFI-64 testing framework, but running each individual test by hand seems to produce the expected results.
It should not be too difficult to port to other R7RS Scheme implementations as well.
To run the test suite in Guile:
scheme@(guile-user)> (load "gypsum/lens-tests.scm")
Simple Example
(import
(scheme base)
(only (scheme write) display write newline)
(gypsum lens) ;; <----- import declaration --------
)
(cond-expand
;; Lenses use R7RS vectors, but hash tables are implementation
;; specific. SRFIs 43 and 69 are the older vector and hash table
;; standard APIs, SRFIs 133 and 125 are the newer vector and
;; hash table standard APIs.
((or guile gambit)
(import
(only (srfi 69) alist->hash-table) ; old hash tables
(only (srfi 43) vector) ; old vectors
))
;; Otherwise, we use SRFI-133 for vectors, SRFI-125 for hash tables
(else
(import
(only (srfi 125) alist->hash-table) ; hash tables
(only (srfi 133) vector) ; vectors
))
)
(define (main)
;; ==== EXAMPLES ====
;; (The "hashtab" example data structure is defined below)
;; --- Use the `view` procedure to view elements ---
(display (view hashtab "zero")) (newline)
;; displays 0
(display (view hashtab "AB" "A")) (newline)
;; displays "go-left"
(display (view hashtab "AB" "vec" 0 "C")) (newline)
;; displays "go-up"
(display (view hashtab "AB" "vec" 2)) (newline)
;; displays #f
;; --- Use the "lens-set!" procedure to update elements ---
;;
;; The order of the arguments is:
;;
;; 1. the value to be inserted into the data structure
;; 2. the data structure to be updated
;; 3. zero or more lenses which can update the data structure
;;
;; In Python or JavaScript you would write:
;;
;; hashtab["AB"]["vec"][2] = [20,30,40]
;;
;; With lenses, you write:
(lens-set! '(20 30 40)
hashtab "AB" "vec" 2)
;; hashtab["two"] = 222
(lens-set! 222
hashtab "two")
;; print(hashtab["two"])
(display (view hashtab "two")) (newline)
;; displays 222
;; print(hashtab["AB"]["vec"][2])
(display (view hashtab "AB" "vec" 2)) (newline)
;; displays (20 30 40)
;; --- Use the "update!" procedure to update elements and return a value ---
(let-values
(((hashtab sum)
(update! (lambda (items)
(values (cons 10 items) (apply + items)))
hashtab "AB" "vec" 2))
)
(display sum) (newline)
;; displays 90
(display (view hashtab "AB" "vec" 2)) (newline)
;; displays (10 20 30 40)
))
(define hashtab
(alist->hash-table
`(("zero" . 0)
("one" . 1)
("two" . 2)
("AB" .
,(alist->hash-table
`(("A" . "go-left")
("B" . "go-right")
("vec" .
,(vector
(alist->hash-table ;; at index 0
`(("C" . "go-up")
("D" . "go-down")))
(alist->hash-table ;; at index 1
'(("E" . "turn-around")
("F" . "jump")))
#f ;; at index 2
))))))))
(main)
Purely functional lenses
As mentioned earlier, it is possible to declare purely functional lenses as well, in which data structures are not mutated but replaced with slightly altered copies. This can sometimes make your code much easier to optimize for compilers that perform an SSA pass, because by using pure-functional lenses, more of your code is essentially already in SSA-form to begin with.
Here is an example of a purely functional lens for car which does not use set-car!; rather, it replaces the cons cell with an updated copy (a lens for cdr can be written the same way without set-cdr!):
(define =>car
(unit-lens
(lambda (cell) ;; ---------- getter ----------
(car cell)
)
(lambda (cell new-value) ;; ---------- setter ----------
(cons new-value (cdr cell))
)
(lambda (update cell) ;; ---------- updater ---------
(let-values
(((new-car result) (update (car cell))))
(values (cons new-car (cdr cell)) result)
))
'=>car ;; <- you can optionally label the lens for better error messages
))
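The three procedures passed to unit-lens above are ordinary Scheme procedures, so their behaviour can be checked without the lens library. In the following sketch the bodies are copied from the =>car example, but the top-level names car-get, car-set, and car-update are hypothetical and are not part of (gypsum lens).

```scheme
(import (scheme base) (scheme write))

;; The getter, setter, and updater from `=>car`, given HYPOTHETICAL
;; top-level names so that they can be called directly.
(define (car-get cell) (car cell))
(define (car-set cell new-value) (cons new-value (cdr cell)))
(define (car-update update cell)
  (let-values (((new-car result) (update (car cell))))
    (values (cons new-car (cdr cell)) result)))

(define original (cons 1 2))
(define replaced (car-set original 100))

(display (car-get original)) (newline) ; displays 1
(display replaced) (newline)           ; displays (100 . 2)
(display original) (newline)           ; displays (1 . 2), not mutated

;; The updating procedure returns two values: the new car, and an
;; arbitrary second value (here, the old car).
(let-values (((updated old-car)
              (car-update (lambda (n) (values (* n 10) n)) original)))
  (display updated) (newline)   ; displays (10 . 2)
  (display old-car) (newline))  ; displays 1
```

Because car-set and car-update build a new pair with cons, the original cell is still (1 . 2) after both calls.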
Difference between update! and generalized set!
As stated earlier, the SRFI-17 "Generalized set!" also provides a simpler way to update data structures, more similar to how languages like JavaScript or Python would do it. But unlike generalized set!, updating with lenses is accomplished with procedures (functions) rather than through the use of a macro such as set!.
One other difference is that the update! function applied to lenses must return two values:
1. the updated data structure, or a replacement data structure if you are operating on a pure/immutable data type,
2. an arbitrary second value.
This allows us to return additional information about the data structure as it is being updated. For example, we could express a depth-first search (DFS) algorithm as a lens which finds the first element that satisfies a predicate, and which returns the path to that element even when the lens is used to update the data structure. And this DFS lens can still be composed with other lenses in calls to view, lens-set!, or update!.
(define (=>depth-first-search select?)
;; This lens finds the first node in a tree for which the `SELECT?`
;; predicate returns #t. Viewing returns the selected node itself.
;; Updates return the path to the element updated as part of the
;; second value.
(let*((getter ;; ----------------------------------------------
;; Returns the first node succeeding on application of
;; `SELECT?` The path is applied to `select?` but is not
;; returned.
(lambda (node)
(call/cc
(lambda (return)
(let loop ((node node) (path '()))
(cond
((select? node path) (return node))
((hash-table? node)
(hash-table-for-each
(lambda (key node) (loop node (cons key path)))
node)
#f)
((vector? node)
(let ((len (vector-length node)))
(let next ((i 0))
(cond
((< i len)
(loop (vector-ref node i) (cons i path))
(next (+ 1 i)))
(else #f)))))
(else #f)))))
))
(updater ;; --------------------------------------------------
;; The updater function takes an updating procedure to alter
;; the value to which this lens is referring. Since this is a
;; depth-first search, the updating procedure is applied to
;; the first element selected by the `SELECT?` predicate. The
;; updating procedure must return 2 values: 1. the updated
;; node, and 2. some arbitrary other value. When many lenses
;; are composed in sequence for the `UPDATE!` procedure, the
;; last lens in the sequence will return this arbitrary
;; value.
;;
;; In this example, the arbitrary second value returned is
;; the second return value of `UPDATER` CONS'ed to the path
;; of selected node, or #f if nothing is selected.
(lambda (updater root)
(call/cc
(lambda (return)
(let loop ((node root) (path '()))
(cond
((select? node path)
;; ********* THIS IS INTERESTING! *********
;; This is what makes lenses different from
;; generalized `SET!`:
(let-values (((node result) (updater node)))
(values node (cons result path)))
;; Notice how we can define our lens to not only
;; return the result of `UPDATER` but also return
;; the "path". When this lens is used with
;; `LENS-SET!` this information is discarded, but if
;; you use `UPDATE!` the lens you define can choose
;; to return additional information that you might
;; find useful.
)
((hash-table? node)
(hash-table-for-each
(lambda (key subnode)
(let-values
(((subnode result)
(loop subnode (cons key path))))
(when result
;; Store the updated subnode, then end the search
;; and return the whole structure.
(hash-table-set! node key subnode)
(return root result))))
node)
(values node #f))
((vector? node)
(let ((len (vector-length node)))
(let next ((i 0))
(cond
((< i len)
(let-values
(((subnode result)
(loop (vector-ref node i) (cons i path))))
(cond
(result
(vector-set! node i subnode)
(return root result))
(else (next (+ 1 i))))))
(else (values node #f))))))
(else (values node #f))))))
))
(setter ;; --------------------------------------------------------
;; We use the `DEFAULT-UNIT-LENS-SETTER` procedure that is
;; provided by `(GYPSUM LENS)` to derive the setter automatically.
;; The default setter uses the updater and discards the second of
;; the returned VALUES.
(default-unit-lens-setter updater))
)
;; Use the `UNIT-LENS` procedure provided by `(GYPSUM LENS)` to
;; construct a lens. This constructs a data structure associating
;; a getter, setter, and updater procedure all together.
(unit-lens getter setter updater `(=>depth-first-search ,select?))
))
(define =>DFS =>depth-first-search)
(define (select-by-key-symbol sym)
(lambda (o path) (and (pair? path) (eq? sym (car path)))))
(define (select-by-key-string name)
(lambda (o path) (and (pair? path) (string=? name (car path)))))
(define ht
(alist->hash-table
(list
(cons 'zero 0) (cons 'one 1) (cons 'two 2)
(cons 'elements
(alist->hash-table
(list
(cons 'earth "solid")
(cons 'wind "gas")
(cons 'fire "plasma")
(cons 'water "liquid")
(cons 'periodic-table
(vector
#f "Hydrogen" "Helium" "Lithium" "Beryllium" "Boron"
"Carbon" "Nitrogen" "Oxygen" "Fluorine" "Neon"))))
))))
(view ht (=>DFS (select-by-key-symbol 'periodic-table)) 1)
;; displays "Hydrogen"
(view ht (=>DFS (select-by-key-symbol 'elements)) 'water)
;; displays "liquid"
Record types
Unfortunately, there are not yet any macros that automatically derive lenses for record types, so you must define the record type lenses by hand.
By convention, lens symbols are always prefixed with =>. Non-canonical lenses (discussed later) are suffixed with an asterisk (*), and mutating lenses are suffixed with a bang (!). Since lenses constructed with record-unit-lens are typically mutating, it is most correct to name them with a ! suffix. However, we will define canonical versions of our color lenses later, so the non-canonical lenses in this example are suffixed with *!:
(import
(scheme base)
(only (scheme write) display write newline)
(gypsum lens)
)
(define-record-type <color-type>
(make<color> r g b)
color-type?
(r red set!red)
(g green set!green)
(b blue set!blue))
;; Here we define the lenses:
(define =>red*! (record-unit-lens red set!red))
(define =>green*! (record-unit-lens green set!green))
(define =>blue*! (record-unit-lens blue set!blue))
(define *yellow*
(let ((c (make<color> 0 0 0)))
(lens-set! 255 c =>red*!)
(lens-set! 255 c =>green*!)
c))
Using literals and constants as lenses
You may have noticed that the view, lens-set, and update operators can take integers to focus on vectors, and strings and symbols to focus on hash tables. However, for record types like our <color-type> in the example in the previous section, we must declare a lens object using record-unit-lens. This is because the lens operators recognize primitive values like integers and strings and automatically construct an actual lens object from these values.
In this table, we can see how primitive values are automatically converted to lens objects:
type | example value | as a lens object |
---|---|---|
integer? | 123 | (=>vector-index*! 123) |
string? | "hello" | (=>hash-key*! "hello") |
symbol? | 'hello | (=>hash-key*! 'hello) |
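As a rough illustration of this kind of dispatch on the type of an index, here is a small sketch in plain R7RS Scheme. It is only a model of the idea and not the library's code: index->getter and view* are hypothetical names, only viewing is covered (not setting or updating), and association lists stand in for hash tables so that the sketch needs nothing beyond (scheme base).

```scheme
(import (scheme base) (scheme write))

;; HYPOTHETICAL helper: turn an index value into a getter procedure.
;; Association lists stand in for hash tables in this sketch.
(define (index->getter index)
  (cond
   ((integer? index)
    (lambda (vec) (vector-ref vec index)))
   ((or (string? index) (symbol? index))
    (lambda (alist)
      (let ((entry (assoc index alist)))
        (and entry (cdr entry)))))
   ((procedure? index) index) ; already a getter, use it as-is
   (else (error "cannot use this value as an index" index))))

;; HYPOTHETICAL, simplified stand-in for `view`: apply each index in turn.
(define (view* data . indices)
  (let loop ((data data) (indices indices))
    (if (null? indices)
        data
        (loop ((index->getter (car indices)) data) (cdr indices)))))

(define sample
  (list (cons "AB" (vector 10 20 30))
        (cons 'name "example")))

(display (view* sample "AB" 2)) (newline)    ; displays 30
(display (view* sample 'name)) (newline)     ; displays example
(display (view* sample "missing")) (newline) ; displays #f
```

The real lens operators build full lens objects from these values, as shown in the table above, so that the same indices also work for setting and updating.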
What if we want the symbols 'red, 'green, and 'blue to automatically refer to the respective fields of our <color-type> from the example in the previous section? To do this, we can use the declare-rule/index->lens API at the top level of our program, typically right after defining the record-unit-lens lenses. Once we have done this, the lens operators view, lens-set, and update can automatically construct a lens such as =>red*! or =>green*! from symbols.
To declare a new translation rule, pass three arguments:
1. the type of index you want to use as a lens, which must be 'symbol, 'string, or 'integer.
2. a predicate that returns #t when applied to the records the rule is for, for example our color-type? predicate.
3. a lens constructor that constructs a lens using the symbol, string, or integer.
(define (symbol->color-lens sym)
;; This procedure converts a symbol to one of the color
;; component lenses we defined earlier.
(cond
((or (eq? sym 'r) (eq? sym 'red)) =>red*!)
((or (eq? sym 'g) (eq? sym 'green)) =>green*!)
((or (eq? sym 'b) (eq? sym 'blue)) =>blue*!)
(else (error "not a color field name, must be 'red, 'green, or 'blue" sym))
))
(declare-rule/index->lens
'symbol ;; 1. the type of value that can be used as an index
color-type? ;; 2. the predicate to identify the record
symbol->color-lens ;; 3. the procedure to convert a symbol to a lens
)
;; Now we can use ordinary symbols to access fields of a `<color-type>`.
(define (main)
(lens-set 0 my-painting =>bg-color 'red)
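;; In this example, we can compose a lens object `=>bg-color` with
;; a lens symbol `'red`. (`my-painting` and `=>bg-color` are not
;; defined in this article; they are assumed to exist elsewhere.)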
)
Remember that declare-rule/index->lens really should be evaluated at the top level of a Scheme program or library, and it is best to declare these rules right after the define-record-type form and the record-unit-lens lenses.
Inspecting lens index rules with get-rule-declaration/index->lens
To inspect what kind of lens indexing rules have been declared, you can use the get-rule-declaration/index->lens procedure. This procedure returns the indexing rules that have been declared for a particular index type. It takes two arguments:
1. INDEX-TYPE: which must be one of 'integer, 'string, or 'symbol.
2. RECORD: a record object that you want to inspect with an index.
For example, if you have a vector-like object "vec32" and want to see if it is possible to index this object with an integer-valued index, you can evaluate:
(get-rule-declaration/index->lens 'integer vec32)
...and if that returns non-null, then it is OK to index your "vec32" with an integer value, e.g. 5 in the expression:
(view vec32 5)
Canonical lenses
The hash key lenses do not create new keys if they do not already exist. Also, if a structure such as a hash table becomes empty after an update, the empty hash table remains rather than being deleted and replaced with #f.
However, it is possible to create "canonical" lenses that construct new data structures (where possible) if lens-set or update is called where no structure exists yet, and remove structures that have become empty after an update. By convention, non-canonicalized lenses are suffixed with *. Lenses can be both canonical and non-pure, i.e. also mutate data on update. By convention, lenses that are both non-canonical and mutating should be named with a symbol suffixed with *!:
=>vector-index*! - a non-canonical lens that mutates a vector.
=>hash-key*! - a non-canonical lens that mutates a hash table.
=>hash-key! - a canonical lens that mutates a hash table.
=>hash-key - a canonical lens that returns a new table if updated.
In the following example, we canonicalize the <color-type> from the previous example such that non-existent colors construct a black color.
(define (new-color) (make<color> 0 0 0))
(define (black? c) (or (not c) (= 0 (red c) (green c) (blue c))))
(define =>red! (=>canonical =>red*! new-color black?))
(define =>green! (=>canonical =>green*! new-color black?))
(define =>blue! (=>canonical =>blue*! new-color black?))
;; --- Now, setting on `#f` applies value to a `new-color` ---
(let*((cyan (lens-set! 255 #f =>green!)) ;; can set on #f, no problem
(cyan (lens-set! 255 cyan =>blue!)))
(display cyan)(newline) ;; #<<color-type> r: 0 g: 255 b: 255>
;; Setting the color to black with our canonical lenses will return
;; #f in place of a color.
(let*((cyan (lens-set! 0 cyan =>green!))
(cyan (lens-set! 0 cyan =>blue!)))
(display cyan)(newline)
;; displays #f
))
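The behaviour described in this section (construct a fresh structure when there is none, and collapse an "empty" structure back to #f) can be modelled in a few lines of plain R7RS Scheme. The following is only a sketch of the idea under my own simplifying assumptions, not the implementation of =>canonical: the helper canonical-setter is hypothetical, it wraps only a pure setter procedure and not a whole lens, and colors are modelled as 3-element vectors instead of records.

```scheme
(import (scheme base) (scheme write))

;; HYPOTHETICAL helper, not the library's `=>canonical`:
;; wrap a pure setter so that it (1) constructs a fresh structure when
;; given #f and (2) returns #f when the result is "empty".
(define (canonical-setter setter new-structure is-empty?)
  (lambda (structure new-value)
    (let ((updated (setter (or structure (new-structure)) new-value)))
      (if (is-empty? updated) #f updated))))

;; Colors are modelled as 3-element vectors #(r g b) for this sketch.
(define (new-color) (vector 0 0 0))
(define (black? c)
  (= 0 (vector-ref c 0) (vector-ref c 1) (vector-ref c 2)))

;; A pure setter for the green component: copy, then update the copy.
(define (set-green color value)
  (let ((copy (vector-copy color)))
    (vector-set! copy 1 value)
    copy))

(define set-green/canonical
  (canonical-setter set-green new-color black?))

(define green (set-green/canonical #f 255))
(display green) (newline)                          ; displays #(0 255 0)
(display (set-green/canonical green 0)) (newline)  ; displays #f
```

Setting on #f first builds a black color and then applies the new value, while setting the only non-zero component back to 0 produces a black color, which is replaced with #f.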
The =>self lens
In functional programming, it is frequently useful to have an "identity" function. The =>self lens is the identity lens; it operates on the data structure itself. Here is an example of where it might be useful:
;; Suppose "dim" is a symbol that is defined as one of '1D, '2D, or '3D.
(define origin-point
;; The "origin-point" is a number if "dim" is 1D, but if
;; "dim" is 2D or 3D, the "origin-point" is a vector.
(cond
((eq? dim '1D) 64.0)
((eq? dim '2D) #(8.0 8.0))
((eq? dim '3D) #(4.0 4.0 4.0))
(else (error "invalid dimensionality" dim))
))
(define =>x
;; We define the lens "=>x" to access the first cell of the vector
;; if "dim" is 2D or 3D, but if "dim" is 1D the lens is "=>self".
(cond
((eq? dim '1D) =>self) ;; <------------------ using "=>self" here
(else (=>vector-index! 0))))
(display (view origin-point =>x))
;; displays 64.0 if dim is 1D, 8.0 if dim is 2D, or 4.0 if dim is 3D.
Using =>on-update with lens-set!
Suppose you want to define a lens that always sorts a vector every time an element is inserted. There are other situations where you may want to always perform some mutating action on an object whenever the object is updated. This is the purpose of the primitive =>on-update lens.
(lens-set! 100
vec (=>on-update (=>vector-index! n) vector-sort))
Debugging
Sometimes you simply do not have access to useful debugging tools, and your only recourse is "print debugging", where you write lots of (display ...) or (write ...) statements everywhere and analyze the log of your program after it has run. To help with this, the =>trace lens is provided.
You can compose the =>trace lens with any other lens to produce logging information whenever the lens is used to access or update data.
(define example-vector
(vector #f #f 10 20 30 40 50 60 70 80 90 100 110 #f #f))
(define vec-min-index 2)
(define vec-max-index 13)
(define vec-median-index
(truncate-quotient (- vec-min-index vec-max-index) 2))
(define =>median (=>vector-index! vec-median-index))
;; --- Hmm... this lens seems to be broken. ---
;; Let's inspect the lens viewing operation with "=>trace"
(display (view example-vector (=>trace =>median)))
The log output of the above code would then look something like this:
(begin view on lens (lens (vector-index! -5)):
#:record-in-focus (#(#f #f 10 20 30 40 50 60 70 80 90 100 110 #f #f)))
(error in view on lens (lens (vector-index! -5)):
#:error-object ("in procedure vector-ref: Argument out-of-range" -5)
#:record-in-focus (#(#f #f 10 20 30 40 50 60 70 80 90 100 110 #f #f)))
Error in procedure vector-ref: Argument out of range: -5
As you can see, this report:
begin view on lens (lens (vector-index! -5))
... shows that our vector index lens is being constructed with a negative value.
Concluding remarks
I personally have found the use of lenses to be an incredibly elegant and useful solution to many problems in Scheme involving large and complex data structures, such as objects structured as XML or JSON. I have come to prefer using lenses to using pattern matching, or going through the trouble of using a Meta-Object Protocol. I am publishing it here in the hopes that other people will find it as useful as I have.
This README is the only documentation for this library, but the source code is well-commented. To learn more about the available lens library APIs and how to use them, please refer to the comments in the source code.
I hope to keep this library as portable as possible, and have made an effort to write all code using only R7RS and the SRFIs. I am also interested in porting this library to other Scheme implementations. If you are interested in porting this code, and you manage to figure out which cond-expand expressions you need to write to make this library compile or run on another Scheme, please feel free to submit a pull request to me, either on CodeBerg, or as a Git patch attached to an e-mail.