On this page:
3.1 What is a syntax transformer?
3.2 What’s the input?
3.3 Actually transforming the input
3.4 Compile time vs. run time
3.5 begin-for-syntax

3 Transform!

  YOU ARE INSIDE A ROOM.

  THERE ARE KEYS ON THE GROUND.

  THERE IS A SHINY BRASS LAMP NEARBY.

  

  IF YOU GO THE WRONG WAY, YOU WILL BECOME

  HOPELESSLY LOST AND CONFUSED.

  

  > pick up the keys

  

  YOU HAVE A SYNTAX TRANSFORMER

3.1 What is a syntax transformer?

A syntax transformer is not one of the トランスフォーマ transformers.

Instead, it is simply a function. The function takes syntax and returns syntax. It transforms syntax.

Here’s a transformer function that ignores its input syntax, and always outputs syntax for a string literal:

These examples assume #lang racket. If you want to try them using #lang racket/base, you’ll need to (require (for-syntax racket/base)).

> (define-syntax foo
    (lambda (stx)
      (syntax "I am foo")))

Using it:

> (foo)

"I am foo"

When we use define-syntax, we’re making a transformer binding. This tells the Racket compiler, "Whenever you encounter a chunk of syntax starting with foo, please give it to my transformer function, and replace it with the syntax I give back to you." So Racket will give anything that looks like (foo ...) to our function, and we can return new syntax to use instead. Much like a search-and-replace.

Maybe you know that the usual way to define a function in Racket:

(define (f x) ...)

is shorthand for:

(define f (lambda (x) ...))

That shorthand lets you avoid typing lambda and some parentheses.

Well there is a similar shorthand for define-syntax:

> (define-syntax (also-foo stx)
    (syntax "I am also foo"))
> (also-foo)

"I am also foo"

What we want to remember is that this is simply shorthand. We are still defining a transformer function, which takes syntax and returns syntax. Everything we do with macros, will be built on top of this basic idea. It’s not magic.

Speaking of shorthand, there is also a shorthand for syntax, which is #':

#' is short for syntax much like ' is short for quote.

> (define-syntax (quoted-foo stx)
    #'"I am also foo, using #' instead of syntax")
> (quoted-foo)

"I am also foo, using #' instead of syntax"

We’ll use the #' shorthand from now on.

Of course, we can emit syntax that is more interesting than a string literal. How about returning (displayln "hi")?

> (define-syntax (say-hi stx)
    #'(displayln "hi"))
> (say-hi)

hi

When Racket expands our program, it sees the occurrence of (say-hi), and sees it has a transformer function for that. It calls our function with the old syntax, and we return the new syntax, which is used to evaluate and run our program.

3.2 What’s the input?

Our examples so far have ignored the input syntax and output some fixed syntax. But typically we will want to transform the input syntax into something else.

Let’s start by looking closely at what the input actually is:

> (define-syntax (show-me stx)
    (print stx)
    #'(void))
> (show-me '(+ 1 2))

#<syntax:eval:10:0 (show-me (quote (+ 1 2)))>

The (print stx) shows what our transformer is given: a syntax object.

A syntax object consists of several things. The first part is the S-expression representing the code, such as '(+ 1 2).

Racket syntax is also decorated with some interesting information such as the source file, line number, and column. Finally, it has information about lexical scoping (which you don’t need to worry about now, but will turn out to be important later.)

There are a variety of functions available to access a syntax object. Let’s define a piece of syntax:

> (define stx #'(if x (list "true") #f))
> stx

#<syntax:eval:11:0 (if x (list "true") #f)>

Now let’s use functions that access the syntax object. The source information functions are:

(syntax-source stx) is returning 'eval, only because of how I’m generating this documentation, using an evaluator to run code snippets in Scribble. Normally this would be something like "my-file.rkt".

> (syntax-source stx)

'eval

> (syntax-line stx)

11

> (syntax-column stx)

0

More interesting is the syntax "stuff" itself. syntax->datum converts it completely into an S-expression:

> (syntax->datum stx)

'(if x (list "true") #f)

Whereas syntax-e only goes "one level down". It may return a list that has syntax objects:

> (syntax-e stx)

'(#<syntax:eval:11:0 if> #<syntax:eval:11:0 x> #<syntax:eval:11:0 (list "true")> #<syntax:eval:11:0 #f>)

Each of those syntax objects could be converted by syntax-e, and so on recursively—which is what syntax->datum does.

In most cases, syntax->list gives the same result as syntax-e:

> (syntax->list stx)

'(#<syntax:eval:11:0 if> #<syntax:eval:11:0 x> #<syntax:eval:11:0 (list "true")> #<syntax:eval:11:0 #f>)

(When would syntax-e and syntax->list differ? Let’s not get side-tracked now.)

When we want to transform syntax, we’ll generally take the pieces we were given, maybe rearrange their order, perhaps change some of the pieces, and often introduce brand-new pieces.

3.3 Actually transforming the input

Let’s write a transformer function that reverses the syntax it was given:

The values at the end of the example allows the result to evaluate nicely. Try (reverse-me "backwards" "am" "i") to see why it’s handy.

> (define-syntax (reverse-me stx)
    (datum->syntax stx (reverse (cdr (syntax->datum stx)))))
> (reverse-me "backwards" "am" "i" values)

"i"

"am"

"backwards"

Understand Yoda, we can. Great, but how does this work?

First we take the input syntax, and give it to syntax->datum. This converts the syntax into a plain old list:

> (syntax->datum #'(reverse-me "backwards" "am" "i" values))

'(reverse-me "backwards" "am" "i" values)

Using cdr slices off the first item of the list, reverse-me, leaving the remainder: ("backwards" "am" "i" values). Passing that to reverse changes it to (values "i" "am" "backwards"):

> (reverse (cdr '(reverse-me "backwards" "am" "i" values)))

'(values "i" "am" "backwards")

Finally we use datum->syntax to convert this back to syntax:

> (datum->syntax #f '(values "i" "am" "backwards"))

#<syntax (values "i" "am" "backwards")>

That’s what our transformer function gives back to the Racket compiler, and that syntax is evaluated:

> (values "i" "am" "backwards")

"i"

"am"

"backwards"

The first argument of datum->syntax contains the lexical context information that we want to associate with the syntax outputted by the transformer. If the first argument is set to #f then no lexical context will be associated.

3.4 Compile time vs. run time

(define-syntax (foo stx)
  (make-pipe) ;Ce n'est pas le temps d'exécution
  #'(void))

Normal Racket code runs at ... run time. Duh.

Instead of "compile time vs. run time", you may hear it described as "syntax phase vs. runtime phase". Same difference.

But a syntax transformer is called by Racket as part of the process of parsing, expanding, and compiling our program. In other words, our syntax transformer function is evaluated at compile time.

This aspect of macros lets you do things that simply aren’t possible in normal code. One of the classic examples is something like the Racket form, if:

(if <condition> <true-expression> <false-expression>)

If we implemented if as a function, all of the arguments would be evaluated before being provided to the function.

> (define (our-if condition true-expr false-expr)
    (cond [condition true-expr]
          [else false-expr]))
> (our-if #t
          "true"
          "false")

"true"

That seems to work. However, how about this:

> (define (display-and-return x)
    (displayln x)
    x)
> (our-if #t
          (display-and-return "true")
          (display-and-return "false"))

true

false

"true"

One answer is that functional programming is good, and side-effects are bad. But avoiding side-effects isn’t always practical.

Oops. Because the expressions have a side-effect, it’s obvious that they are both evaluated. And that could be a problem—what if the side-effect includes deleting a file on disk? You wouldn’t want (if user-wants-file-deleted? (delete-file) (void)) to delete a file even when user-wants-file-deleted? is #f.

So this simply can’t work as a plain function. However a syntax transformer can rearrange the syntax – rewrite the code – at compile time. The pieces of syntax are moved around, but they aren’t actually evaluated until run time.

Here is one way to do this:

> (define-syntax (our-if-v2 stx)
    (define xs (syntax->list stx))
    (datum->syntax stx `(cond [,(cadr xs) ,(caddr xs)]
                              [else ,(cadddr xs)])))
> (our-if-v2 #t
             (display-and-return "true")
             (display-and-return "false"))

true

"true"

> (our-if-v2 #f
             (display-and-return "true")
             (display-and-return "false"))

false

"false"

That gave the right answer. But how? Let’s pull out the transformer function itself, and see what it did. We start with an example of some input syntax:

> (define stx #'(our-if-v2 #t "true" "false"))
> (displayln stx)

#<syntax:eval:32:0 (our-if-v2 #t "true" "false")>

1. We take the original syntax, and use syntax->list to change it into a list of syntax objects:

> (define xs (syntax->list stx))
> (displayln xs)

(#<syntax:eval:32:0 our-if-v2> #<syntax:eval:32:0 #t> #<syntax:eval:32:0 "true"> #<syntax:eval:32:0 "false">)

2. To change this into a Racket cond form, we need to take the three interesting pieces—the condition, true-expression, and false-expression—from the list using cadr, caddr, and cadddr and arrange them into a cond form:

`(cond [,(cadr xs) ,(caddr xs)]
       [else ,(cadddr xs)])

3. Finally, we change that into syntax using datum->syntax:

> (datum->syntax stx `(cond [,(cadr xs) ,(caddr xs)]
                            [else ,(cadddr xs)]))

#<syntax (cond (#t "true") (else "false"))>

So that works, but using cadddr etc. to destructure a list is painful and error-prone. Maybe you know Racket’s match? Using that would let us do pattern-matching.

Notice that we don’t care about the first item in the syntax list. We didn’t take (car xs) in our-if-v2, and we didn’t use name when we used pattern-matching. In general, a syntax transformer won’t care about that, because it is the name of the transformer binding. In other words, a macro usually doesn’t care about its own name.

Instead of:

> (define-syntax (our-if-v2 stx)
    (define xs (syntax->list stx))
    (datum->syntax stx `(cond [,(cadr xs) ,(caddr xs)]
                              [else ,(cadddr xs)])))

We can write:

> (define-syntax (our-if-using-match stx)
    (match (syntax->list stx)
      [(list name condition true-expr false-expr)
       (datum->syntax stx `(cond [,condition ,true-expr]
                                 [else ,false-expr]))]))

Great. Now let’s try using it:

> (our-if-using-match #t "true" "false")

match: undefined;

 cannot reference an identifier before its definition

  in module: 'program

Oops. It’s complaining that match isn’t defined.

Our transformer function is working at compile time, not run time. And at compile time, only racket/base is required for you automatically—not the full racket.

Anything beyond racket/base, we have to require ourselves—and require it for compile time using the for-syntax form of require.

In this case, instead of using plain (require racket/match), we want (require (for-syntax racket/match))the for-syntax part meaning, "for compile time".

So let’s try that:

> (require (for-syntax racket/match))
> (define-syntax (our-if-using-match-v2 stx)
    (match (syntax->list stx)
      [(list _ condition true-expr false-expr)
       (datum->syntax stx `(cond [,condition ,true-expr]
                                 [else ,false-expr]))]))
> (our-if-using-match-v2 #t "true" "false")

"true"

Joy.

3.5 begin-for-syntax

We used for-syntax to require the racket/match module because we needed to use match at compile time.

What if we wanted to define our own helper function to be used by a macro? One way to do that is put it in another module, and require it using for-syntax, just like we did with the racket/match module.

If instead we want to put the helper in the same module, we can’t simply define it and use it—the definition would exist at run time, but we need it at compile time. The answer is to put the definition of the helper function(s) inside begin-for-syntax:

(begin-for-syntax
 (define (my-helper-function ....)
   ....))
(define-syntax (macro-using-my-helper-function stx)
  (my-helper-function ....)
  ....)

In the simple case, we can also use define-for-syntax, which composes begin-for-syntax and define:

(define-for-syntax (my-helper-function ....)
  ....)
(define-syntax (macro-using-my-helper-function stx)
  (my-helper-function ....)
  ....)

To review: