-
Notifications
You must be signed in to change notification settings - Fork 9
Introduction to Macros
Metaprogramming can be very useful in reducing or removing boilerplate code. However, some languages have less-than-satisfactory solutions while others have none at all. Template Haskell, for example, has several major problems (e.g. it is notoriously unstable between GHC releases, according to the 2017 State of Haskell), and PureScript doesn't have a macro-like system in the first place.
However, by following in Lisp's footsteps and adopting a homoiconic syntax (yay, parentheses!), Axel is equipped with a simple yet powerful metaprogramming system. Unlike in most languages, Axel source code is effectively a textual representation of the corresponding abstract syntax tree (AST) that the transpiler uses to represent the program internally. Macros, then, are just functions that take lists of Axel expressions and return lists of other Axel expressions. The Axel transpiler runs these macros at compile-time, allowing for code that creates code.
NOTE In this tutorial, -- =>
just means "evaluates to."
For an example of the power that macros provide us with, let's try to implement the LambdaCase
GHC language extension as a macro ourselves.
First, for a brief introduction to Axel's metaprogramming abilities. Let's use the following Axel expression as our example:
(+ 1 2)
This applies the +
function to two arguments, 1
and 2
. Internally, Axel represents this not as a function application per se, but rather as an s-expression (the Lisp name for a parenthesized list) containing the symbols +
, 1
, and 2
. The cool part about this, which Axel inherits from Lisp, is that by passing the expression to the quote
"special form," we can get that internal representation and manipulate it. For example:
(quote (+ 1 2)) -- => (SExpression [(Symbol "+") (LiteralInt 1) (LiteralInt 2)])
You can quote any expression you'd like:
(quote 1) -- => (LiteralInt 1)
(quote "test") -- => (LiteralString "test")
You can even quote another application of quote
:
(quote (quote 1)) -- (SExpression [(Symbol "quote") (LiteralInt 1)])
Once we have the quoted form of an expression, we can manipulate it like any other data structure, like so:
(case (quote (+ 1 2)))
((SExpression sexp) (SExpression (reverse sexp))))
-- => (SExpression [(LiteralInt 2) (LiteralInt 1) (Symbol "+")])
Let's define a function, backwards
, to reverse whatever Axel expression we pass in.
(= backwards (-> Expression Expression) -- It will take an Axel expression & return an Axel expression
((expression)
Now, an Axel Expression
can be one of several cases: LiteralChar Char
, LiteralInt Int
, LiteralString String
, SExpression [Expression]
, or Symbol String
. The only thing we can reverse is an SExpression
, so let's return the input unchanged for all other cases.
(case expression
((LiteralChar _) expression)
((LiteralInt _) expression)
((LiteralString _) expression)
((Symbol _) expression)
For the case that we're passed an SExpression
, however, let's extract the list of Expression
s inside it and reverse it. We'll have to wrap the new list in an SExpression
, since our function expects an Expression
to be returned.
((SExpression expressions) (SExpression (reverse expressions))))))
Now, we can run our backwards
function like so:
-- We have to quote `(2 1 +)` because otherwise it would be evaluated (which is not what we want).
-- Rather, we want the internal representation of `(+ 2 1)` to be passed to `backwards` (which requires a value of type `Expression`).
(backwards '(+ 2 1)) -- => (SExpression [(Symbol "+") (LiteralInt 1) (LiteralInt 2)])
When we evaluate backwards
, we get back a new Expression
. So, we now have the data representing our new code, but how can we actually execute it to get 3
?
Macros to the rescue! Currently, backwards
is just a plain old Axel function. But, Axel provides an alternative to normal function definition (=
) called defmacro
, and it lets us define a function that will actually be run at compile-time. Macros take a list of Expression
s and return a list of Expression
s. Macro calls in an Axel program are replaced by the results they return.
To see this in action, let's turn backwards
into a macro. defmacro
doesn't require a type signature to be passed, since it automatically uses the type [Expression] -> IO [Expression]
. You'll notice that backwards
is a partial function, since it wouldn't be able to handle being passed more than one argument. However, this isn't the worry it would usually be if this were a normal function, since errors will be thrown at compile-time and fixed before the program is actually run.
(defmacro backwards
((list expression)
Because backwards
will not have any side-effects, we will return
whatever our result will be (since we need to return a value of type IO [Expression]
rather than just [Expression]
).
(return
Also, we'll wrap the whole thing in a list, because (again) we're looking to return a list of Expression
s rather than just a single Expression
. From here, our definition is the same as before:
[(case expression
((LiteralChar _) expression)
((LiteralInt _) expression)
((LiteralString _) expression)
((SExpression expressions) (SExpression (reverse expressions)))
((Symbol _) expression))])))
Now, if we actually call backwards
in our code, the macro call will be "expanded" during compilation. Also, if you remember from when backwards
was a function, we had to explicitly quote our input for it not to be evaluated. However, when passing a form to a macro, it will be automatically quoted. This helps macros remain unobtrusive and makes them feel more like first-class syntactic constructs.
To see how our macro will actually be evaluated, let's walk through it step-by-step. If the program we provide to Axel looks like this:
(= main (IO Unit)
(() (backwards ("Hello, world!" putStrLn))))
backwards
will be executed as the program compiles, after which the program will look like this:
(= main (IO Unit)
(() (putStrLn "Hello, world!")))
And, when the program is executed at the end of this process, the output will be, as expected:
"Hello, world!"
Congratulations, you've created your very first macro!
Now, let's try to create a more complicated macro and implement GHC's LambdaCase
extension from scratch. We're going to write our macro using what we know now, and then we'll take a look at "quasiquoting" and see how it helps us make our macros more concise. Finally, we'll see it in action by using it to clean up our example.
For some context, take the following Haskell code:
\x -> case x of
True -> 1
False -> 2
Notice that x
is immediately passed to the case
expression and never used again. This happens commonly enough that the LambdaCase
extension was designed to help remove the duplication. It allows us to write the above lambda more concisely, like so:
\case
True -> 1
False -> 2
So, it's a pretty simple (yet useful) syntactic transformation. If we want Axel to have the same feature, it could look like the following:
-- Before
(\ (x) (case x
((True) 1)
((False) 2)
-- After
(\case
((True) 1)
((False) 2)
The rule to implement this feature is pretty simple: If you see (\case <case 1, case 2, ...>)
, replace that with (\ (<identifier>) (case <identifier> <case 1, case 2, ...>)
, where <identifier>
is some as-of-yet unused identifier.
Let's start our macro definition like usual:
(defmacro \case
((cases) undefined))
Now, we'll fill in the basic details, based on the discussion above. Since we want to return a list, we'll use return
to lift our result into IO
and then wrap our result with the SExpression
constructor.
(defmacro \case
((cases) (return [(SExpression TODO)])))
We want to fill in TODO
with \ (<some identifier> (case <some identifier> <cases>)
. Since we want to use \
and case
as symbols (instead of using them to refer to what they mean at runtime), we'll quote them with the '
operator.
(defmacro \case
((cases) (return [(SExpression ['\ (SExpression [TODO])
(SExpression ['case TODO TODO])])])))
Now, we need to somehow get an unused identifier that we can use as the variable in our lambda. Luckily, Axel provides the gensym
helper function, which is designed just for this! Because gensym
runs in the IO
monad (gensym
is of type IO Expression
because the Symbol
it creates must be unique), our code will as well.
(defmacro \case
((cases)
(do
<- identifier gensym
(return [(SExpression ['\ (SExpression [identifier])
(SExpression ['case identifier TODO])])]))))
And now, for the final TODO
: the cases themselves.
The cases
parameter of our macro is a list of Expression
s that we want to splice directly into our output (that is, we're looking for (case <identifier> <case 1> <case 2> <...>)
rather than (case <identifier> (<case 1> <case 2> <...>))
). So, we can finish our macro's definition like so:
(defmacro \case
((cases)
(do
<- identifier gensym
(return [(SExpression ['\ (SExpression [identifier])
(SExpression (: 'case (: identifier cases)))])]))))
And, there we have it!
To be honest, that was a bit tedious to write, and it's not the easiest to understand. Manually constructing our SExpression
s adds a lot of extra noise to our macro definition. Fortunately, though, there's a better way!
Meet quasiquote
, quote
's big sibling. It behaves just like quote
on normal expressions:
(quasiquote 1) -- => (LiteralInt 1)
(quasiquote (1 2 3)) -- => (SExpression [(LiteralInt 1) (LiteralInt 2) (LiteralInt 3)])
However, quasiquote
has some sidekicks that give it superpowers. If you wrap quasiquote
's argument with unquote
, it will undo the quasiquotation for just that section. For example:
(quasiquote 1) -- => (LiteralInt 1)
(quasiquote (unquote 1)) -- => 1
(quasiquote (1 (LiteralInt 2) 3)) -- => (SExpression [(LiteralInt 1) (SExpression [(Symbol "LiteralInt") (LiteralInt 2)]) (LiteralInt 3)])
(quasiquote (1 (unquote (LiteralInt 2)) 3)) -- => (SExpression [(LiteralInt 1) (LiteralInt 2) (LiteralInt 3)])
There's also unquote-splicing
, which not only undoes quasiquotation but also "splices" the result directly into its enclosing s-expression. For example:
(quasiquote (1 (unquote (SExpression [(LiteralInt 2) (LiteralInt 3)])))) -- => (SExpression [(LiteralInt 1) (SExpression [(LiteralInt 2) (LiteralInt 3)])])
(quasiquote (1 (unquote-splicing (SExpression [(LiteralInt 2) (LiteralInt 3)]))) -- => (SExpression [(LiteralInt 1) (LiteralInt 2) (LiteralInt 3)])
In effect, quasiquote
acts as a sort of templating system for creating Axel expressions, where you selectively choose what parts should be evaluated at compile time.
One last note: quasiquoting, like quoting, is common enough that it gets special syntax. While quote
becomes '
, quasiquote
becomes \`` (e.g.
(quasiquote (1 2 3))is usually written
`(1 2 3)).
unquotecan be written with a comma (e.g.
(unquote 1)is usually written
,1), and
unquote-splicingcan be written with a comma and then an at-sign (e.g.
(unquote-splicing (1 2 3))is usually written
,@(1 2 3)`).
Now, we have all the building blocks we need to clean up our macro definition from earlier. Here's the original definition, for reference:
(defmacro \case
((cases)
(do
<- identifier gensym
(return [(SExpression ['\ (SExpression [identifier])
(SExpression (: 'case (: identifier cases)))])]))))
And here's a version which uses quasiquote
, unquote
, and unquote-splicing
:
(defmacro \case
((cases)
(do
<- identifier gensym
(return `(\ (,identifier)
(case ,identifier
,@cases)))))
Notice how the second representation closely mirrors the final output (much more so than our first try does). Also, try to figure out how we derived the second representation from the first one. While it might be difficult at first, getting the hang of quoting and quasiquoting just takes practice.
TODO Finish tutorial