Macros
Macros
A macro is a special function that is executed at compile time.
Normally, the input for a macro is an abstract syntax tree (AST) of the
code that is passed to it. The macro can then do transformations on it
and return the transformed AST. This can be used to add custom language
features and implement domain-specific languages
{.interpreted-text
role="idx"}.
Macro invocation is a case where semantic analysis does not entirely proceed top to bottom and left to right. Instead, semantic analysis happens at least twice:
- Semantic analysis recognizes and resolves the macro invocation.
- The compiler executes the macro body (which may invoke other procs).
- It replaces the AST of the macro invocation with the AST returned by the macro.
- It repeats semantic analysis of that region of the code.
- If the AST returned by the macro contains other macro invocations, this process iterates.
While macros enable advanced compile-time code transformations, they cannot change Nim\'s syntax.
Debug Example
The following example implements a powerful debug
command that accepts
a variable number of arguments:
# to work with Nim syntax trees, we need an API that is defined in the
# `macros` module:
import std/macros
macro debug(args: varargs[untyped]): untyped =
# `args` is a collection of `NimNode` values that each contain the
# AST for an argument of the macro. A macro always has to
# return a `NimNode`. A node of kind `nnkStmtList` is suitable for
# this use case.
result = nnkStmtList.newTree()
# iterate over any argument that is passed to this macro:
for n in args:
# add a call to the statement list that writes the expression;
# `toStrLit` converts an AST to its string representation:
result.add newCall("write", newIdentNode("stdout"), newLit(n.repr))
# add a call to the statement list that writes ": "
result.add newCall("write", newIdentNode("stdout"), newLit(": "))
# add a call to the statement list that writes the expressions value:
result.add newCall("writeLine", newIdentNode("stdout"), n)
var
a: array[0..10, int]
x = "some string"
a[0] = 42
a[1] = 45
debug(a[0], a[1], x)
The macro call expands to:
write(stdout, "a[0]")
write(stdout, ": ")
writeLine(stdout, a[0])
write(stdout, "a[1]")
write(stdout, ": ")
writeLine(stdout, a[1])
write(stdout, "x")
write(stdout, ": ")
writeLine(stdout, x)
Arguments that are passed to a varargs
parameter are wrapped in an
array constructor expression. This is why debug
iterates over all of
args
\'s children.
BindSym
The above debug
macro relies on the fact that write
, writeLine
and
stdout
are declared in the system module and are thus visible in the
instantiating context. There is a way to use bound identifiers (aka
symbols
) instead of using unbound
identifiers. The bindSym
builtin can be used for that:
import std/macros
macro debug(n: varargs[typed]): untyped =
result = newNimNode(nnkStmtList, n)
for x in n:
# we can bind symbols in scope via 'bindSym':
add(result, newCall(bindSym"write", bindSym"stdout", toStrLit(x)))
add(result, newCall(bindSym"write", bindSym"stdout", newStrLitNode(": ")))
add(result, newCall(bindSym"writeLine", bindSym"stdout", x))
var
a: array[0..10, int]
x = "some string"
a[0] = 42
a[1] = 45
debug(a[0], a[1], x)
The macro call expands to:
write(stdout, "a[0]")
write(stdout, ": ")
writeLine(stdout, a[0])
write(stdout, "a[1]")
write(stdout, ": ")
writeLine(stdout, a[1])
write(stdout, "x")
write(stdout, ": ")
writeLine(stdout, x)
However, the symbols write
, writeLine
and stdout
are already bound
and are not looked up again. As the example shows, bindSym
does work
with overloaded symbols implicitly.
Case-Of Macro
In Nim, it is possible to have a macro with the syntax of a case-of expression just with the difference that all of-branches are passed to and processed by the macro implementation. It is then up the macro implementation to transform the of-branches into a valid Nim statement. The following example should show how this feature could be used for a lexical analyzer.
import std/macros
macro case_token(args: varargs[untyped]): untyped =
echo args.treeRepr
# creates a lexical analyzer from regular expressions
# ... (implementation is an exercise for the reader ;-)
discard
case_token: # this colon tells the parser it is a macro statement
of r"[A-Za-z_]+[A-Za-z_0-9]*":
return tkIdentifier
of r"0-9+":
return tkInteger
of r"[\+\-\*\?]+":
return tkOperator
else:
return tkUnknown
Style note: For code readability, it is best to use the least powerful programming construct that still suffices. So the \"check list\" is:
(1) Use an ordinary proc/iterator, if possible. (2) Else: Use a generic proc/iterator, if possible. (3) Else: Use a template, if possible. (4) Else: Use a macro.
For loop macro
A macro that takes as its only input parameter an expression of the
special type system.ForLoopStmt
can rewrite the entirety of a for
loop:
import std/macros
macro example(loop: ForLoopStmt) =
result = newTree(nnkForStmt) # Create a new For loop.
result.add loop[^3] # This is "item".
result.add loop[^2][^1] # This is "[1, 2, 3]".
result.add newCall(bindSym"echo", loop[0])
for item in example([1, 2, 3]): discard
Expands to:
Another example:
import std/macros
macro enumerate(x: ForLoopStmt): untyped =
expectKind x, nnkForStmt
# check if the starting count is specified:
var countStart = if x[^2].len == 2: newLit(0) else: x[^2][1]
result = newStmtList()
# we strip off the first for loop variable and use it as an integer counter:
result.add newVarStmt(x[0], countStart)
var body = x[^1]
if body.kind != nnkStmtList:
body = newTree(nnkStmtList, body)
body.add newCall(bindSym"inc", x[0])
var newFor = newTree(nnkForStmt)
for i in 1..x.len-3:
newFor.add x[i]
# transform enumerate(X) to 'X'
newFor.add x[^2][^1]
newFor.add body
result.add newFor
# now wrap the whole macro in a block to create a new scope
result = quote do:
block: `result`
for a, b in enumerate(items([1, 2, 3])):
echo a, " ", b
# without wrapping the macro in a block, we'd need to choose different
# names for `a` and `b` here to avoid redefinition errors
for a, b in enumerate(10, [1, 2, 3, 5]):
echo a, " ", b