4 Pattern matching: syntax-case and syntax-rules

Most useful syntax transformers work by taking some input syntax, and rearranging the pieces into something else. As we saw, this is possible but tedious using list accessors such as cadddr. It’s more convenient and less error-prone to use match to do pattern-matching.

Historically, syntax-case and syntax-rules pattern matching came first. match was added to Racket later.

It turns out that pattern-matching was one of the first improvements to be added to the Racket macro system. It’s called syntax-case, and has a shorthand for simple situations called define-syntax-rule.

Recall our previous example:

(require (for-syntax racket/match))
(define-syntax (our-if-using-match-v2 stx)
  (match (syntax->list stx)
    [(list _ condition true-expr false-expr)
     (datum->syntax stx `(cond [,condition ,true-expr]
                               [else ,false-expr]))]))

Here’s what it looks like using syntax-case:

> (define-syntax (our-if-using-syntax-case stx)
    (syntax-case stx ()
      [(_ condition true-expr false-expr)
       #'(cond [condition true-expr]
               [else false-expr])]))
> (our-if-using-syntax-case #t "true" "false")

"true"

Pretty similar, huh? The pattern matching part looks almost exactly the same. The way we specify the new syntax is simpler. We don’t need to do quasi-quoting and unquoting. We don’t need to use datum->syntax. Instead, we supply a "template", which uses variables from the pattern.

There is a shorthand for simple pattern-matching cases, which expands into syntax-case. It’s called define-syntax-rule:

> (define-syntax-rule (our-if-using-syntax-rule condition true-expr false-expr)
    (cond [condition true-expr]
          [else false-expr]))
> (our-if-using-syntax-rule #t "true" "false")

"true"

Here’s the thing about define-syntax-rule. Because it’s so simple, define-syntax-rule is often the first thing people are taught about macros. But it’s almost deceptively simple. It looks so much like defining a normal run time function—yet it’s not. It’s working at compile time, not run time. Worse, the moment you want to do more than define-syntax-rule can handle, you can fall off a cliff into what feels like complicated and confusing territory. Hopefully, because we started with a basic syntax transformer, and worked up from that, we won’t have that problem. We can appreciate define-syntax-rule as a convenient shorthand, but not be scared of, or confused about, that for which it’s shorthand.
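To make the "compile time, not run time" point concrete, here is a small sketch comparing a plain function with the same thing written using define-syntax-rule. The names events, note!, if-as-function and if-as-macro are made up for this illustration. The function evaluates both branch arguments before it is called. The macro rewrites its use into a cond, so only the chosen branch ever runs.

```racket
#lang racket

;; Records the order in which expressions are actually evaluated.
;; (events and note! are hypothetical helpers for this sketch.)
(define events '())
(define (note! tag)
  (set! events (cons tag events))
  tag)

;; An ordinary function: all arguments are evaluated before the call.
(define (if-as-function condition true-expr false-expr)
  (cond [condition true-expr]
        [else false-expr]))

;; A macro: each use is rewritten into a cond at compile time,
;; so only the selected branch is evaluated at run time.
(define-syntax-rule (if-as-macro condition true-expr false-expr)
  (cond [condition true-expr]
        [else false-expr]))

(if-as-function #t (note! 'f-true) (note! 'f-false))
(if-as-macro #t (note! 'm-true) (note! 'm-false))

(reverse events) ; => '(f-true f-false m-true)
```

The false branch shows up in the log for the function but not for the macro, even though the two definitions look almost identical.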

Most of the materials I found for learning macros, including the Racket Guide, do a very good job explaining how patterns and templates work. So I won’t regurgitate that here.

Sometimes, we need to go a step beyond the pattern and template. Let’s look at some examples, how we can get confused, and how to get it working.

4.1 Pattern variable vs. template—fight!

Let’s say we want to define a function with a hyphenated name, a-b, but we supply the a and b parts separately. The Racket struct macro does something like this: (struct foo (field1 field2)) automatically defines a number of functions whose names are variations on the name foo, such as foo-field1, foo-field2, foo?, and so on.

So let’s pretend we’re doing something like that. We want to transform the syntax (hyphen-define a b (args) body) to the syntax (define (a-b args) body).

A wrong first attempt is:

> (define-syntax (hyphen-define/wrong1 stx)
    (syntax-case stx ()
      [(_ a b (args ...) body0 body ...)
       (let ([name (string->symbol (format "~a-~a" a b))])
         #'(define (name args ...)
             body0 body ...))]))

eval:47:0: a: pattern variable cannot be used outside of a template

  in: a

Huh. We have no idea what this error message means. Well, let’s try to work it out. The "template" the error message refers to is the #'(define (name args ...) body0 body ...) portion. The let isn’t part of that template. It sounds like we can’t use a (or b) in the let part.

In fact, syntax-case can have as many templates as you want. The obvious, required template is the final expression supplying the output syntax. But you can use syntax (a.k.a. #') on a pattern variable. This makes another template, albeit a small, "fun size" template. Let’s try that:

> (define-syntax (hyphen-define/wrong1.1 stx)
    (syntax-case stx ()
      [(_ a b (args ...) body0 body ...)
       (let ([name (string->symbol (format "~a-~a" #'a #'b))])
         #'(define (name args ...)
             body0 body ...))]))

No more errors—good! Let’s try to use it:

> (hyphen-define/wrong1.1 foo bar () #t)
> (foo-bar)

foo-bar: undefined;

 cannot reference an identifier before its definition

  in module: 'program

Apparently our macro is defining a function with some name other than foo-bar. Huh.

This is where the Macro Stepper in DrRacket is invaluable.

Even if you prefer mostly to use Emacs, this is a situation where it’s definitely worth temporarily using DrRacket for its Macro Stepper.

The Macro Stepper says that the use of our macro:

(hyphen-define/wrong1.1 foo bar () #t)

expanded to:

(define (name) #t)

Well that explains it. Instead, we wanted to expand to:

(define (foo-bar) #t)

Our template is using the symbol name but we wanted its value, such as foo-bar in this use of our macro.

Is there anything we already know that behaves like this—where using a variable in the template yields its value? Yes: Pattern variables. Our pattern doesn’t include name because we don’t expect it in the original syntax—indeed the whole point of this macro is to create it. So name can’t be in the main pattern. Fine—let’s make an additional pattern. We can do that using an additional, nested syntax-case:

> (define-syntax (hyphen-define/wrong1.2 stx)
    (syntax-case stx ()
      [(_ a b (args ...) body0 body ...)
       (syntax-case (datum->syntax #'a
                                   (string->symbol (format "~a-~a" #'a #'b)))
                    ()
         [name #'(define (name args ...)
                   body0 body ...)])]))

Looks weird? Let’s take a deep breath. Normally our transformer function is given syntax by Racket, and we pass that syntax to syntax-case. But we can also create some syntax of our own, on the fly, and pass that to syntax-case. That’s all we’re doing here. The whole (datum->syntax ...) expression is syntax that we’re creating on the fly. We can give that to syntax-case, and match it using a pattern variable named name. Voila, we have a new pattern variable. We can use it in a template, and its value will go in the template.

We might have one more—just one, I promise!—small problem left. Let’s try to use our new version:

> (hyphen-define/wrong1.2 foo bar () #t)
> (foo-bar)

foo-bar: undefined;

 cannot reference an identifier before its definition

  in module: 'program

Hmm. foo-bar is still not defined. Back to the Macro Stepper. It says now we’re expanding to:

(define (|#<syntax:11:24 foo>-#<syntax:11:28 bar>|) #t)

Oh right: #'a and #'b are syntax objects. Therefore

(string->symbol (format "~a-~a" #'a #'b))

is the printed form of both syntax objects, joined by a hyphen:

|#<syntax:11:24 foo>-#<syntax:11:28 bar>|

Instead we want the datum in the syntax objects, such as the symbols foo and bar. Which we get using syntax->datum:

> (define-syntax (hyphen-define/ok1 stx)
    (syntax-case stx ()
      [(_ a b (args ...) body0 body ...)
       (syntax-case (datum->syntax #'a
                                   (string->symbol (format "~a-~a"
                                                           (syntax->datum #'a)
                                                           (syntax->datum #'b))))
                    ()
         [name #'(define (name args ...)
                   body0 body ...)])]))
> (hyphen-define/ok1 foo bar () #t)
> (foo-bar)

#t

And now it works!
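Since #' also works in ordinary run-time code, we can poke at syntax objects outside of any macro to see what syntax->datum gives us. This is just an exploratory sketch. The variable names a, b and name-string are chosen for illustration.

```racket
#lang racket

;; Two syntax objects, standing in for the pattern variables a and b.
(define a #'foo)
(define b #'bar)

(syntax? a)               ; => #t
(syntax->datum a)         ; => 'foo
(syntax->datum #'(+ 1 2)) ; => '(+ 1 2)

;; Formatting the datums, rather than the syntax objects themselves,
;; gives the hyphenated name we were after.
(define name-string
  (format "~a-~a" (syntax->datum a) (syntax->datum b)))

name-string                  ; => "foo-bar"
(string->symbol name-string) ; => 'foo-bar
```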

Next, some shortcuts.

4.1.1 with-syntax

Instead of an additional, nested syntax-case, we could use with-syntax. (Another name for with-syntax could be "with new pattern variable".) This rearranges the syntax-case to look more like a let form: first the name, then the value. It’s also more convenient if we need to define more than one pattern variable.

> (define-syntax (hyphen-define/ok2 stx)
    (syntax-case stx ()
      [(_ a b (args ...) body0 body ...)
       (with-syntax ([name (datum->syntax #'a
                                          (string->symbol (format "~a-~a"
                                                                  (syntax->datum #'a)
                                                                  (syntax->datum #'b))))])
         #'(define (name args ...)
             body0 body ...))]))
> (hyphen-define/ok2 foo bar () #t)
> (foo-bar)

#t

Again, with-syntax is simply syntax-case rearranged:

(syntax-case <syntax> () [<pattern> <body>])
(with-syntax ([<pattern> <syntax>]) <body>)

Whether you use an additional syntax-case or use with-syntax, either way you are simply defining additional pattern variables. Don’t let the terminology and structure make it seem mysterious.
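As a sketch of the "more than one pattern variable" case, here is a made-up macro, define-get/set, that defines a variable plus a getter and a setter for it. The macro and the names it generates are hypothetical, not part of Racket. The point is only that one with-syntax can introduce several new pattern variables at once.

```racket
#lang racket

;; Hypothetical macro: (define-get/set x init) defines x, get-x and set-x!.
(define-syntax (define-get/set stx)
  (syntax-case stx ()
    [(_ id init)
     ;; Two new pattern variables in a single with-syntax.
     (with-syntax ([getter (datum->syntax
                            #'id
                            (string->symbol
                             (format "get-~a" (syntax->datum #'id))))]
                   [setter (datum->syntax
                            #'id
                            (string->symbol
                             (format "set-~a!" (syntax->datum #'id))))])
       #'(begin
           (define id init)
           (define (getter) id)
           (define (setter v) (set! id v))))]))

(define-get/set counter 0)
(get-counter)    ; => 0
(set-counter! 5)
(get-counter)    ; => 5
```

Doing the same with nested syntax-case forms would need one level of nesting per new pattern variable.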

4.1.2 with-syntax*

We know that let doesn’t let us use a binding in a subsequent one:

> (let ([a 0]
        [b a])
    b)

a: undefined;

 cannot reference an identifier before its definition

  in module: 'program

Instead we can nest lets:

> (let ([a 0])
    (let ([b a])
      b))

0

Or use a shorthand for nesting, let*:

> (let* ([a 0]
         [b a])
    b)

0

Similarly, instead of writing nested with-syntaxs, we can use with-syntax*:

> (require (for-syntax racket/syntax))
> (define-syntax (foo stx)
    (syntax-case stx ()
      [(_ a)
        (with-syntax* ([b #'a]
                       [c #'b])
          #'c)]))

One gotcha is that with-syntax* isn’t provided by racket/base. We must (require (for-syntax racket/syntax)). Otherwise we may get a rather bewildering error message:

...: ellipses not allowed as an expression in: ....

4.1.3 format-id

There is a utility function in racket/syntax called format-id that lets us format identifier names more succinctly than what we did above:

> (require (for-syntax racket/syntax))
> (define-syntax (hyphen-define/ok3 stx)
    (syntax-case stx ()
      [(_ a b (args ...) body0 body ...)
       (with-syntax ([name (format-id #'a "~a-~a" #'a #'b)])
         #'(define (name args ...)
             body0 body ...))]))
> (hyphen-define/ok3 bar baz () #t)
> (bar-baz)

#t

Using format-id is convenient as it handles the tedium of converting from syntax to symbol datum to string ... and all the way back.

The first argument of format-id, lctx, is the lexical context of the identifier that will be created. You almost never want to supply stx, the overall chunk of syntax that the macro transforms. Instead you want to supply some more-specific bit of syntax, such as an identifier that the user has provided to the macro. In this example, we’re using #'a. The resulting identifier will have the same scope as that which the user provided. This is more likely to behave as the user expects, especially when our macro is composed with other macros.
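Here is a sketch that combines format-id with with-syntax* from the previous section. The macro name define-constant/pred is invented for this example. The second binding formats a new identifier from the first one, which is exactly the situation where plain with-syntax would not be enough.

```racket
#lang racket
(require (for-syntax racket/syntax))

;; Hypothetical macro: (define-constant/pred a b val) defines a-b as val,
;; plus a predicate a-b? that checks for that value.
(define-syntax (define-constant/pred stx)
  (syntax-case stx ()
    [(_ a b val)
     (with-syntax* ([name (format-id #'a "~a-~a" #'a #'b)]
                    ;; Refers to the pattern variable bound just above,
                    ;; hence with-syntax* rather than with-syntax.
                    [pred (format-id #'a "~a?" #'name)])
       #'(begin
           (define name val)
           (define (pred v) (equal? v name))))]))

(define-constant/pred foo bar 42)
foo-bar         ; => 42
(foo-bar? 42)   ; => #t
(foo-bar? "no") ; => #f
```

Both calls to format-id use #'a for the lexical context, so the generated names are visible where the user wrote foo.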

4.1.4 Another example

Finally, here’s a variation that accepts an arbitrary number of name parts to be joined with hyphens:

> (require (for-syntax racket/string racket/syntax))
> (define-syntax (hyphen-define* stx)
    (syntax-case stx ()
      [(_ (names ...) (args ...) body0 body ...)
       (let ([name-stxs (syntax->list #'(names ...))])
         (with-syntax ([name (datum->syntax (car name-stxs)
                                            (string->symbol
                                             (string-join (for/list ([name-stx name-stxs])
                                                            (symbol->string
                                                             (syntax-e name-stx)))
                                                          "-")))])
           #'(define (name args ...)
               body0 body ...)))]))
> (hyphen-define* (foo bar baz) (v) (* 2 v))
> (foo-bar-baz 50)

100

Just as when we used format-id, when using datum->syntax we’re being careful with the first, lctx argument. We want the identifier we create to use the lexical context of an identifier provided to the macro by the user. In this case, the user’s identifiers are in the (names ...) pattern variable. We use syntax->list to convert this from one syntax object into a list of syntax objects. We use the first element for the lexical context. Then of course we use all the elements to form the hyphenated identifier.
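The individual steps can be tried at run time, outside of a macro. In this sketch (the variable name name-stxs simply mirrors the macro above), syntax->list unwraps only the outer list and leaves each element as a syntax object, which is why the first element can still serve as lexical context. syntax-e then gives the symbol inside each identifier.

```racket
#lang racket

;; One syntax object containing three identifiers...
(define name-stxs (syntax->list #'(foo bar baz)))

;; ...becomes a list of three syntax objects.
(length name-stxs)             ; => 3
(andmap identifier? name-stxs) ; => #t

;; syntax-e unwraps each identifier to its symbol.
(map syntax-e name-stxs)       ; => '(foo bar baz)

;; Joining the symbol names gives the hyphenated name as a string.
(string-join (for/list ([name-stx name-stxs])
               (symbol->string (syntax-e name-stx)))
             "-")              ; => "foo-bar-baz"
```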

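To review: We can’t use a pattern variable outside of a template, but we can use syntax (a.k.a. #') on a pattern variable to make a small template. To compute new pieces for the final template, we create new pattern variables with with-syntax or a nested syntax-case. We usually need syntax->datum to get at the value inside a syntax object. And format-id is convenient for formatting identifier names.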

4.2 Making our own struct

Let’s apply what we just learned to a more-realistic example. We’ll pretend that Racket doesn’t already have a struct capability. Fortunately, we can write a macro to provide our own system for defining and using structures. To keep things simple, our structure will be immutable (read-only) and it won’t support inheritance.

Given a structure declaration like:

(our-struct name (field1 field2 ...))

We need to define some procedures: a constructor, a predicate, and an accessor for each field.

> (require (for-syntax racket/syntax))
> (define-syntax (our-struct stx)
    (syntax-case stx ()
      [(_ id (fields ...))
       (with-syntax ([pred-id (format-id #'id "~a?" #'id)])
         #`(begin
             ; Define a constructor.
             (define (id fields ...)
               (apply vector (cons 'id  (list fields ...))))
             ; Define a predicate.
             (define (pred-id v)
               (and (vector? v)
                    (eq? (vector-ref v 0) 'id)))
             ; Define an accessor for each field.
             #,@(for/list ([x (syntax->list #'(fields ...))]
                           [n (in-naturals 1)])
                  (with-syntax ([acc-id (format-id #'id "~a-~a" #'id x)]
                                [ix n])
                    #`(define (acc-id v)
                        (unless (pred-id v)
                          (error 'acc-id "~a is not a ~a struct" v 'id))
                        (vector-ref v ix))))))]))
; Test it out
> (require rackunit)
> (our-struct foo (a b))
> (define s (foo 1 2))
> (check-true (foo? s))
> (check-false (foo? 1))
> (check-equal? (foo-a s) 1)
> (check-equal? (foo-b s) 2)
> (check-exn exn:fail?
             (lambda () (foo-a "furble")))
; The tests passed.
; Next, what if someone tries to declare:
> (our-struct "blah" ("blah" "blah"))

format-id: contract violation

  expected: (or/c string? symbol? identifier? keyword? char? number?)

  given: #<syntax:eval:83:0 "blah">

The error message is not very helpful. It’s coming from format-id, which is a private implementation detail of our macro.

You may know that a syntax-case clause can take an optional "guard" or "fender" expression. Instead of

[pattern template]

It can be:

[pattern guard template]

Let’s add a guard expression to our clause:

> (require (for-syntax racket/syntax))
> (define-syntax (our-struct stx)
    (syntax-case stx ()
      [(_ id (fields ...))
       ; Guard or "fender" expression:
       (for-each (lambda (x)
                   (unless (identifier? x)
                     (raise-syntax-error #f "not an identifier" stx x)))
                 (cons #'id (syntax->list #'(fields ...))))
       (with-syntax ([pred-id (format-id #'id "~a?" #'id)])
         #`(begin
             ; Define a constructor.
             (define (id fields ...)
               (apply vector (cons 'id  (list fields ...))))
             ; Define a predicate.
             (define (pred-id v)
               (and (vector? v)
                    (eq? (vector-ref v 0) 'id)))
             ; Define an accessor for each field.
             #,@(for/list ([x (syntax->list #'(fields ...))]
                           [n (in-naturals 1)])
                  (with-syntax ([acc-id (format-id #'id "~a-~a" #'id x)]
                                [ix n])
                    #`(define (acc-id v)
                        (unless (pred-id v)
                          (error 'acc-id "~a is not a ~a struct" v 'id))
                        (vector-ref v ix))))))]))
; Now the same misuse gives a better error message:
> (our-struct "blah" ("blah" "blah"))

eval:86:0: our-struct: not an identifier

  at: "blah"

  in: (our-struct "blah" ("blah" "blah"))

Later, we’ll see how syntax-parse makes it even easier to check usage and provide helpful messages about mistakes.
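The guard in our-struct either raises an error or returns a true value. A guard can also just return #f, in which case the clause is skipped and matching continues with the next clause. Here is a minimal sketch of that behavior. The macro name describe is made up for the example.

```racket
#lang racket

;; Hypothetical macro with two clauses sharing the same pattern.
(define-syntax (describe stx)
  (syntax-case stx ()
    [(_ x)
     ;; Guard: this clause is used only when x is an identifier.
     (identifier? #'x)
     #'(quote identifier)]
    [(_ x)
     ;; Reached when the guard above returns #f.
     #'(quote something-else)]))

(describe foo)    ; => 'identifier
(describe 42)     ; => 'something-else
(describe "blah") ; => 'something-else
```

Note that foo is never evaluated here. The macro only inspects the syntax it was given.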

4.3 Using dot notation for nested hash lookups

The previous two examples used a macro to define functions whose names were made by joining identifiers provided to the macro. This example does the opposite: The identifier given to the macro is split into pieces.

If you write programs for web services you deal with JSON, which is represented in Racket by a jsexpr?. JSON often has dictionaries that contain other dictionaries. In a jsexpr? these are represented by nested hasheq tables:

; Nested hasheq's typical of a jsexpr:
> (define js (hasheq 'a (hasheq 'b (hasheq 'c "value"))))

In JavaScript you can use dot notation:

foo = js.a.b.c;

In Racket it’s not so convenient:

(hash-ref (hash-ref (hash-ref js 'a) 'b) 'c)

We can write a helper function to make this a bit cleaner:

; This helper function:
> (define/contract (hash-refs h ks [def #f])
    ((hash? (listof any/c)) (any/c) . ->* . any)
    (with-handlers ([exn:fail? (const (cond [(procedure? def) (def)]
                                            [else def]))])
      (for/fold ([h h])
        ([k (in-list ks)])
        (hash-ref h k))))
; Lets us say:
> (hash-refs js '(a b c))

"value"

That’s better. Can we go even further and use a dot notation somewhat like JavaScript?

; This macro:
> (require (for-syntax racket/syntax))
> (define-syntax (hash.refs stx)
    (syntax-case stx ()
      ; If the optional default is missing, use #f.
      [(_ chain)
       #'(hash.refs chain #f)]
      [(_ chain default)
       (let* ([chain-str (symbol->string (syntax->datum #'chain))]
              [ids (for/list ([str (in-list (regexp-split #rx"\\." chain-str))])
                     (format-id #'chain "~a" str))])
         (with-syntax ([hash-table (car ids)]
                       [keys       (cdr ids)])
           #'(hash-refs hash-table 'keys default)))]))
; Gives us "sugar" to say this:
> (hash.refs js.a.b.c)

"value"

; Try finding a key that doesn't exist:
> (hash.refs js.blah)

#f

; Try finding a key that doesn't exist, specifying the default:
> (hash.refs js.blah 'did-not-exist)

'did-not-exist

It works!
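The heart of the macro is the splitting step, and regexp-split can be tried on plain strings at run time to see what the macro has to work with. In this sketch the variable name pieces is chosen for illustration. Notice that an identifier with no dot gives a single piece, and a trailing dot gives an empty final piece.

```racket
#lang racket

;; Split a dotted name into its parts.
(define pieces (regexp-split #rx"\\." "js.a.b.c"))
pieces                            ; => '("js" "a" "b" "c")

;; The first piece names the hash table; the rest are the keys.
(car pieces)                      ; => "js"
(map string->symbol (cdr pieces)) ; => '(a b c)

;; Edge cases worth knowing about:
(regexp-split #rx"\\." "js")      ; => '("js")
(regexp-split #rx"\\." "js.")     ; => '("js" "")
```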

We’ve started to appreciate that our macros should give helpful messages when used in error. Let’s try to do that here.

> (require (for-syntax racket/syntax))
> (define-syntax (hash.refs stx)
    (syntax-case stx ()
      ; Check for no args at all
      [(_)
       (raise-syntax-error #f "Expected hash.key0[.key1 ...] [default]" stx)]
      ; If the optional default is missing, use #f.
      [(_ chain)
       #'(hash.refs chain #f)]
      [(_ chain default)
       (unless (identifier? #'chain)
         (raise-syntax-error #f "Expected hash.key0[.key1 ...] [default]" stx #'chain))
       (let* ([chain-str (symbol->string (syntax->datum #'chain))]
              [ids (for/list ([str (in-list (regexp-split #rx"\\." chain-str))])
                     (format-id #'chain "~a" str))])
         ; Check that we have at least hash.key
         (unless (and (>= (length ids) 2)
                      (not (eq? (syntax-e (cadr ids)) '||)))
           (raise-syntax-error #f "Expected hash.key" stx #'chain))
         (with-syntax ([hash-table (car ids)]
                       [keys       (cdr ids)])
           #'(hash-refs hash-table 'keys default)))]))
; See if we catch each of the misuses
> (hash.refs)

eval:97:0: hash.refs: Expected hash.key0[.key1 ...] [default]

  in: (hash.refs)

> (hash.refs 0)

eval:98:0: hash.refs: Expected hash.key0[.key1 ...] [default]

  at: 0

  in: (hash.refs 0 #f)

> (hash.refs js)

eval:99:0: hash.refs: Expected hash.key

  at: js

  in: (hash.refs js #f)

> (hash.refs js.)

eval:100:0: hash.refs: Expected hash.key

  at: js.

  in: (hash.refs js. #f)

Not too bad. Of course, the version with error-checking is quite a bit longer. Error-checking code generally tends to obscure the logic, and does here. Fortunately we’ll soon see how syntax-parse can help mitigate that, in much the same way as contracts in normal Racket or types in Typed Racket.

Maybe we’re not convinced that writing (hash.refs js.a.b.c) is really clearer than (hash-refs js '(a b c)). Maybe we won’t actually use this approach. But the Racket macro system makes it a possible choice.