These are rough notes and ideas
It would be nice if terms could specify when they took a binding operator. Maybe with GADTs?
Start with just basic s-exprs
type 'a t =
| Foo of string
| Bar of int * 'a
[@@deriving abt]
Will produce an ABT including:
- constructors
- to string impl
Will just require manually putting in the sexp conversion functions. Should be straight forward, and will helps keep a tiny dependency footprint
https://ocaml.org/manual/gadts.html
The treatment of unification modulo ɑ-equivalence could be informed by [Andrew M. Pitts](https://www.cl.cam.ac.uk/~amp12/papers/nomlfo/nomlfo.pdf)’ exposition of [Nominal Logic](https://www.cl.cam.ac.uk/~amp12/papers/nomlfo/nomlfo.pdf): equivariance (invariance under name swapping) was the key to an elegant algorithm.
Perhaps this would also provide a cleaner way of doing variable substitution etc?
Perhaps by letting bound variables be pointers back to the whole binding scope in which they are bound? Then we can determine if two bound variables are equal: by checking if the binding structure in which they are nodes are isomorphic.?
We could imagine a unification that would let (3) below go through, if the unification was of a possible pointer structure relative to it’s position in the ABT, rather than to any particular pointer?
Nominal logic and nominal unification uses a distinction between atoms (which are the bindable things) and variables, “non-ground terms”. We get the same dichotomy because we track bound and free variables separately.
At minimum, we should investigate the connection between the pointer-based binding approach used here and the name permutation approach used in nominal terms.
E.g.,
let open Option.Let in
https://homepages.inf.ed.ac.uk/jcheney/publications/cheney05unif.pdf
(e.g., current occurs check)
Given terms a = λz.M
and b = λx.(λy.x)
, where M
is a free variable, unify
a b
should be Ok (λz.(λy.z), [M -> λy.z ])
, such that x
in the
body of the substitute for M
is bound to λx._
outermost in the unified term.
But what should happen with terms a = (λz.M) N
and b = (λx.(λy.x)) (λx.M)
?
We can give [M -> λy.z; N -> M]
, but then b
would be (λx.(λy.x)) (λx.λy.z)
which doesn’t seem consistent. We need to map bound variables through the
ɑ-equivalence of their binders.
These unifications succeed (where a ~ b
means a
and b
correspond via alpha
equivalence):
(1) \x.M = \y.y => \x.x [M -> y (y ~ x)]
(2) \x.M M = \y.(\w.y) (\w.y) => \x.(\w.x) (\w.x) [M -> \w.y (y ~ x)]
But the following fails:
(3) \w.(\x.M) (\z.M) = \w.(\y.y) N => FAIL [M -> y (y ~ x); N -> M]
This is because the substitution of M in (1) and (2) preserves the binding structure in the unified terms and obeys binding scope:
(1)
\x. x ~ \y. y
^ / ^ /
\_/ \_/
(2)
\x.(\w.x) (\z.x) ~ \y.(\w.y) (\z.y)
^ / / ^ / /
\__/______/ \__/______/
Where as the substitution in (3) would either (a) require a binding structure that violates proper scoping rules
(a) (3)
\w.(\x. x) (\z.x) !~ \w.(\y.y) (\z.y) => FAIL
^ / /
\_/___x__/
^
|
(escapes scope)
or it would require we bind `M` to two different bound variables (x
and z
):
(b) (3)
\w.(\x.M) (\z.M) = \w.(\y.y) N => \w.(\x.x) (\z.z) [M -> x; M -> z; N -> M]
To unify two bindings:
a = x.(M x)
b = y.(y M)
- On encountering corresponding bindings, enter them into a bidirectional map:
Bimap.add m ~a:(x.) ~b:(y.) (* {x. <-> y.} *)
- Map free variables to bound variables or ground terms in the substition:
Subst.add s M *y (* [M -> *y] *)
- Free variables mapped to bound variables through the reference corresponding to the
appropriate side of the equation:
a[M := Subst.get s M |> Var.to_binding |> Bimap.left m |> Var.of_binding => x] b[M := Subst.get s M |> Var.to_binding |> Bimap.right m |> Var.of_binding => y]