For Development HEAD DRAFT

5.2 Hygienic macros

Macro bindings

The following forms bind name to a macro transformer created by transformer-spec. The binding introduced by these forms shadows any binding of name established in an outer scope.

A toplevel binding shadows any binding of name imported or inherited from other modules (see Modules). (Note: This toplevel shadowing behavior is Gauche’s extension; in R7RS, you shouldn’t redefine imported bindings, so portable code should avoid it.)

The effect is undefined if you bind the same name more than once in the same scope.

The transformer-spec can be a syntax-rules form, an er-macro-transformer form, a make-id-transformer form, or another macro keyword or syntactic keyword. We’ll explain them later.

Special Form: define-syntax name transformer-spec

[R7RS base] If this form appears in toplevel, it binds toplevel name to a macro transformer defined by transformer-spec.

If this form appears in the declaration part of the body of lambda, let, or other similar forms (internal define-syntax), it binds name locally within that body. Conceptually, internal define-syntaxes on the same level are treated like letrec-syntax. However, the mere appearance of define-syntax does not create another scope; for example, you can interleave internal define and internal define-syntax within the same scope. It is important, though, that a local macro defined by internal define-syntax must not be needed to expand a macro use that appears before its definition.
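
A minimal sketch of an internal define-syntax interleaved with internal defines. The names sum-of-squares and square are hypothetical, chosen only for illustration:

```scheme
(define (sum-of-squares x y)
  ;; Local macro; visible only within this body.
  (define-syntax square
    (syntax-rules ()
      [(_ e) (let ([v e]) (* v v))]))
  ;; Internal defines that use the local macro after its definition.
  (define a (square x))
  (define b (square y))
  (+ a b))

(sum-of-squares 3 4)  ; ⇒ 25
```

Note that square is used only after its definition, in line with the restriction stated above.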

Special Form: let-syntax ((name transformer-spec) …) body
Special Form: letrec-syntax ((name transformer-spec) …) body

[R7RS base] Defines local macros. Each name is bound to a macro transformer as specified by the corresponding transformer-spec, then body is expanded. With let-syntax, transformer-spec is evaluated with the scope surrounding let-syntax, while with letrec-syntax the bindings of names are included in the scope where transformer-spec is evaluated. Thus letrec-syntax allows mutually recursive macros.
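
The difference between the two forms can be sketched as follows. The macro names (my-or, greet, call-greet) are hypothetical. In the first form the transformer refers to itself, which requires letrec-syntax; in the second, the template of call-greet is resolved in the scope surrounding let-syntax, so it should pick up the toplevel greet rather than the sibling binding:

```scheme
;; letrec-syntax: the transformer of my-or may refer to my-or itself.
(letrec-syntax ([my-or (syntax-rules ()
                         [(_) #f]
                         [(_ e) e]
                         [(_ e1 e2 ...)
                          (let ([t e1]) (if t t (my-or e2 ...)))])])
  (my-or #f #f 3))   ; ⇒ 3

;; let-syntax: transformer specs see only the surrounding scope.
(define-syntax greet
  (syntax-rules () [(_) 'outer]))

(let-syntax ([greet      (syntax-rules () [(_) 'inner])]
             [call-greet (syntax-rules () [(_) (greet)])])
  (call-greet))      ; ⇒ outer
```

With letrec-syntax in place of let-syntax in the second form, call-greet would see the sibling greet instead.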

Transformer specs

The transformer-spec is a special expression that evaluates to a macro transformer. It is evaluated in a different phase than the other expressions, since macro transformers must be executed during compiling. So there are some restrictions.

At this moment, only one of the following expressions is allowed:

  1. A syntax-rules form. This is called a “high-level” macro, for it relies entirely on pattern matching, which is basically a declarative language different from Scheme, and thus keeps the complications of phasing and hygiene completely under the hood. Some kinds of macros are easier to write in syntax-rules. See Syntax-rules macro transformer, for further description.
  2. An er-macro-transformer form. This creates an explicit-renaming (ER) macro, where you can use arbitrary Scheme code to transform the program, with renaming required to keep hygiene. A legacy Lisp macro can also be written as an ER macro if you don’t use renaming. See Explicit-renaming macro transformer, for the details.
  3. A make-id-transformer form. This creates an identifier macro. Unlike an ordinary macro, an identifier macro expands without being at the head of a list; it looks like a variable in the source. See Identifier transformer, for the details.
  4. A macro or syntax keyword. This is Gauche’s extension, and can be used to define an alias of an existing macro or syntax keyword.
    (define-syntax si if)
    (define écrivez write)
    
    (si (< 2 3) (écrivez "oui"))
    

5.2.1 Syntax-rules macro transformer

Special Form: syntax-rules (literal …) clause clause2 …
Special Form: syntax-rules ellipsis (literal …) clause clause2 …

[R7RS base] This form creates a macro transformer by pattern matching.

Each clause has the following form:

(pattern template)

A pattern denotes a pattern to be matched to the macro call. It is an S-expression that matches if the macro call has the same structure, except that symbols in pattern can match a whole subtree of the input; the matched symbol is called a pattern variable, and can be referenced in the template.

For example, if a pattern is (_ "foo" (a b)), it can match the macro call (x "foo" (1 2)), or (x "foo" (1 (2 3))), but does not match (x "bar" (1 2)), (x "foo" (1)) or (x "foo" (1 2) 3). You can also match repeating structure or literal symbols; we’ll discuss it fully later.
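
The matches and mismatches listed above can be checked with a small sketch. The macro name foo-pair and its catch-all second clause are hypothetical, added only so that a failed match yields a value instead of an error:

```scheme
(define-syntax foo-pair
  (syntax-rules ()
    [(_ "foo" (a b)) '(matched a b)]   ; the pattern discussed above
    [(_ . other)     'no-match]))      ; hypothetical fallback clause

(foo-pair "foo" (1 2))       ; ⇒ (matched 1 2)
(foo-pair "foo" (1 (2 3)))   ; ⇒ (matched 1 (2 3))
(foo-pair "bar" (1 2))       ; ⇒ no-match
(foo-pair "foo" (1))         ; ⇒ no-match
(foo-pair "foo" (1 2) 3)     ; ⇒ no-match
```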

Clauses are examined in order to see if the macro call form matches their patterns. If a matching pattern is found, the corresponding template replaces the macro call form. A pattern variable in the template is replaced with the subtree of the input that is bound to the pattern variable.

Here’s a definition of the when macro from Why hygienic?, using syntax-rules:

(define-syntax when
  (syntax-rules ()
    [(_ test body ...) (if test (begin body ...))]))

The pattern is (_ test body ...), and the template is (if test (begin body ...)). The ellipsis ... is a symbol; we’re not omitting code here. It denotes that the previous pattern (body) may repeat zero or more times.

So, if the when macro is called as (when (zero? x) (print "huh?") (print "we got zero!")), the macro expander first checks if the input matches the pattern.

  • The test in pattern matches the input (zero? x).
  • The body in pattern matches the input (print "huh?") and (print "we got zero!").

The matching of body is a bit tricky. You may think of the pattern variable body as an array variable whose elements hold the individual matches; you can use them in similarly repeating substructures in the template. Now that the input has fully matched the pattern, let’s look at the template.

  • In the template, if and begin are not pattern variables, since they do not appear in the pattern. So they are inserted as identifiers, that is, hygienic symbols effectively renamed to make sure they refer to the global if and begin and are unaffected by the macro use environment.
  • The test in the template is a pattern variable, so it is replaced with the matched value, (zero? x).
  • The body is also a pattern variable. The important point is that it is also followed by an ellipsis. So we repeat body as many times as the number of matched values. The first value, (print "huh?"), and the second value, (print "we got zero!"), are expanded here.
  • Hence, we get (if (zero? x) (begin (print "huh?") (print "we got zero!"))) as the result of expansion. (With the note that if and begin refer to the identifiers visible from the macro definition environment.)
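
The point about if and begin referring to the macro definition environment can be sketched as follows. The macro is named my-when here (a hypothetical name) to avoid redefining the built-in when:

```scheme
(define-syntax my-when
  (syntax-rules ()
    [(_ test body ...) (if test (begin body ...))]))

(my-when (zero? 0) 'first 'second)   ; ⇒ second

;; Shadowing if and begin at the macro use site does not affect the
;; expansion, since the template's if and begin are hygienic identifiers.
(let ([if list] [begin vector])
  (my-when #t 'still-ok))            ; ⇒ still-ok
```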

The expansion of ellipses is quite powerful. In the template, the ellipses don’t need to follow the sequence-valued pattern variable immediately; the variable can be in a substructure, as long as the substructure itself is followed by an ellipsis. See the following example:

(define-syntax show
  (syntax-rules ()
    [(_ expr ...)
     (begin
       (begin (write 'expr) (display "=") (write expr) (newline))
       ...)]))

If you call this macro as follows:

(show (+ 1 2) (/ 3 4))

It is expanded to the following form, modulo hygiene:

(begin
  (begin (write '(+ 1 2)) (display "=") (write (+ 1 2)) (newline))
  (begin (write '(/ 3 4)) (display "=") (write (/ 3 4)) (newline)))

So you’ll get this output.

(+ 1 2)=3
(/ 3 4)=3/4

You can also match with a repetition of substructures in the pattern. The following example is a simplified let that expands to lambda:

(define-syntax my-let
  (syntax-rules ()
    [(_ ((var init) ...) body ...)
     ((lambda (var ...) body ...) init ...)]))

If you call it as (my-let ((a expr1) (b expr2)) foo), then var is matched to a and b, while init is matched to expr1 and expr2, respectively. They can be used separately in the template.
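
As a quick check of how var and init are distributed in the template, here is the same my-let with a concrete call; the argument values are arbitrary:

```scheme
(define-syntax my-let
  (syntax-rules ()
    [(_ ((var init) ...) body ...)
     ((lambda (var ...) body ...) init ...)]))

;; Expands to ((lambda (a b) (+ a b)) 1 2), modulo hygiene.
(my-let ((a 1) (b 2)) (+ a b))   ; ⇒ 3

;; Zero bindings are fine too, since ... allows zero repetitions.
(my-let () 'empty)               ; ⇒ empty
```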

Let the “level” of a pattern variable be the number of nested ellipses that designate repetition of the pattern variable. A subtemplate can be followed by as many ellipses as the maximum level of the pattern variables in the subtemplate. In the following example, the level of pattern variable a is 1 (it is repeated by the last ellipsis in the pattern), while the level of b is 2 (repeated by the last two ellipses), and the level of c is 3 (repeated by all the ellipses).

(define-syntax ellipsis-test
  (syntax-rules ()
    [(_ (a (b c ...) ...) ...)
     '((a ...)
       (((a b) ...) ...)
       ((((a b c) ...) ...) ...))]))

In this case, the subtemplate a must be repeated by one level of ellipsis, (a b) must be repeated by two, and (a b c) must be repeated by three.

(ellipsis-test (1 (2 3 4) (5 6)) (7 (8 9 10 11)))
 ⇒ ((1 7)
    (((1 2) (1 5)) ((7 8)))
    ((((1 2 3) (1 2 4)) ((1 5 6))) (((7 8 9) (7 8 10) (7 8 11)))))

In the template, more than one ellipsis can directly follow a subtemplate, which splices the leaves into the surrounding list:

(define-syntax my-append
  (syntax-rules ()
    [(_ (a ...) ...)
     '(a ... ...)]))

(my-append (1 2 3) (4) (5 6))
  ⇒ (1 2 3 4 5 6)

(define-syntax my-append2
  (syntax-rules ()
    [(_ ((a ...) ...) ...)
     '(a ... ... ...)]))

(my-append2 ((1 2) (3 4)) ((5) (6 7 8)))
  ⇒ (1 2 3 4 5 6 7 8)

Note: Allowing multiple ellipses to directly follow a subtemplate, and allowing a pattern variable in a subtemplate to be enclosed within more levels of ellipsis nesting than the variable’s level, are extensions to R7RS, defined in SRFI-149. In the above examples, ellipsis-test, my-append and my-append2 are outside of R7RS.
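
The second syntax-rules form listed at the top of this section takes an extra ellipsis identifier before the literals. As we understand it from R7RS, that identifier then plays the role of ... within the clauses. A minimal sketch, in which ooo and my-list are hypothetical names:

```scheme
(define-syntax my-list
  (syntax-rules ooo ()            ; ooo is used in place of ...
    [(_ x ooo) (list x ooo)]))

(my-list 1 2 3)   ; ⇒ (1 2 3)
(my-list)         ; ⇒ ()
```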

Identifiers in a pattern are treated as pattern variables. But sometimes you want to match a specific identifier in the input. For example, the built-in cond and case detect the identifier else as a special identifier. You can list such identifiers in literal … for that. See the following example.

(define-syntax if+
  (syntax-rules (then else)
    [(_ test then expr1 else expr2) (if test expr1 expr2)]))

The identifiers listed as the literals don’t become pattern variables, but literally match the input. If the input doesn’t have the same identifier in that position, the match fails.

(if+ (even? x) then (/ x 2) else (/ (+ x 1) 2))
 expands into (if (even? x) (/ x 2) (/ (+ x 1) 2))

(if+ (even? x) foo (/ x 2) bar (/ (+ x 1) 2))
 ⇒ ERROR: malformed if+

We’ve been saying identifiers instead of symbols. Roughly speaking, an identifier is a symbol together with its surrounding syntactic environment, so that it keeps its identity under the renaming done by hygienic macros.

The following example fails, because the else passed to the if+ macro is the one locally bound by let, which is different from the global else when if+ was defined, hence they don’t match.

(let ((else #f))
  (if+ (even? x) then (/ x 2) else (/ (+ x 1) 2)))
  ⇒ ERROR: malformed if+

5.2.2 Explicit-renaming macro transformer

Special Form: er-macro-transformer procedure-expr

Creates a macro transformer from the given procedure-expr. The created macro transformer has to be bound to the syntactic keyword by define-syntax, let-syntax or letrec-syntax. Other use of macro transformers is undefined.

The procedure-expr must evaluate to a procedure that takes three arguments; form, rename and id=?.

The form argument receives the S-expression of the macro call. The procedure-expr must return an S-expression as the result of macro expansion. This part is pretty much like the traditional Lisp macro. In fact, if you ignore rename and id=?, the semantics is the same as the traditional (unhygienic) macro. See the following example (note the use of match; it is a good tool to decompose macro input):

(use util.match)

;; Unhygienic 'when-not' macro
(define-syntax when-not
  (er-macro-transformer
    (^[form rename id=?]
      (match form
        [(_ test expr1 expr ...)
         `(if (not ,test) (begin ,expr1 ,@expr))]
        [_ (error "malformed when-not:" form)]))))

(macroexpand '(when-not (foo) (print "a") 'boo))
  ⇒ (if (not (foo)) (begin (print "a") 'boo))

This is ok as long as you know you don’t need hygiene, e.g. when you only use this macro locally in your code and know that no macro call site contains name conflicts. However, if you provide your when-not macro for general use, you have to protect it against namespace pollution around the macro use. For example, you want to make sure your macro works even if it is used as follows:

(let ((not values))
  (when-not #t (print "This shouldn't be printed")))

The rename argument passed to procedure-expr is a procedure that takes a symbol (or, to be precise, a symbol or an identifier) and effectively renames it to a unique identifier that keeps identity within the macro definition environment and won’t be affected in the macro use environment.

As a rule of thumb, you have to pass all new identifiers you insert into macro output to the rename procedure to keep hygiene. In our when-not macro, we insert if, not and begin into the macro output, so our hygienic macro would look like this:

(define-syntax when-not
  (er-macro-transformer
    (^[form rename id=?]
      (match form
        [(_ test expr1 expr ...)
         `(,(rename 'if) (,(rename 'not) ,test)
            (,(rename 'begin) ,expr1 ,@expr))]
        [_ (error "malformed when-not:" form)]))))

This is cumbersome and makes the macro hard to read, so Gauche provides an auxiliary macro quasirename, which works like quasiquote but renames the identifiers in the form. See the entry of quasirename below for the details. You can write the hygienic when-not as follows:

(define-syntax when-not
  (er-macro-transformer
    (^[form rename id=?]
      (match form
        [(_ test expr1 expr ...)
         (quasirename rename
           `(if (not ,test) (begin ,expr1 ,@expr)))]
        [_ (error "malformed when-not:" form)]))))
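
Renaming also matters for variables that the macro binds itself. The following sketch uses a hypothetical macro my-or2 (not part of Gauche) whose expansion introduces a temporary named tmp; because quasirename renames it, it should not capture a tmp that appears at the macro use site:

```scheme
(use util.match)

;; Hypothetical two-argument 'or', written as an ER macro.
(define-syntax my-or2
  (er-macro-transformer
    (^[form rename id=?]
      (match form
        [(_ a b)
         ;; let, tmp and if are all renamed; only a and b come from the input.
         (quasirename rename
           `(let ((tmp ,a)) (if tmp tmp ,b)))]
        [_ (error "malformed my-or2:" form)]))))

(my-or2 #f 2)                     ; ⇒ 2
(let ((tmp 5)) (my-or2 #f tmp))   ; ⇒ 5, the user's tmp is not captured
```

Had tmp been inserted without renaming, the second call would refer to the macro's own tmp (which is #f) and return #f.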

You can intentionally break hygiene by inserting a symbol without renaming. The following code implements an anaphoric when, meaning that the result of the test expression is available in expr1 expr … under the name it. Since the binding of the identifier it does not exist at the macro use site but is injected there by the macro expander, it is unhygienic.

(define-syntax awhen
  (er-macro-transformer
    (^[form rename id=?]
      (match form
        [(_ test expr1 expr ...)
         `(,(rename 'let1) it ,test     ; 'it' is not renamed
             (,(rename 'begin) ,expr1 ,@expr))]))))

If you use quasirename, you can write ,'it to prevent it from being renamed:

(define-syntax awhen
  (er-macro-transformer
    (^[form rename id=?]
      (match form
        [(_ test expr1 expr ...)
         (quasirename rename
           `(let1 ,'it ,test
              (begin ,expr1 ,@expr)))]))))

Here’s an example:

(awhen (find odd? '(0 2 8 7 4))
  (print "Found odd number:" it))
 ⇒ prints Found odd number:7

Finally, the id=? argument to the procedure-expr is a procedure that takes two arguments, and returns #t iff both are identifiers and either both are referring to the same binding or both are free. It can be used to compare literal syntactic keyword (e.g. else in cond and case forms) hygienically.

The following if=> macro behaves like if, except that it accepts (if=> test => procedure) syntax, in which procedure is called with the value of test if it is not false (similar to (cond [test => procedure]) syntax). The symbol => must match hygienically, that is, it must refer to the same binding as in the macro definition.

(define-syntax if=>
  (er-macro-transformer
    (^[form rename id=?]
      (match form
        [(_ test a b)
         (if (id=? (rename '=>) a)
           (quasirename rename
             `(let ((t ,test))
                (if t (,b t))))
           (quasirename rename
             `(if ,test ,a ,b)))]))))

The call (rename '=>) returns an identifier that captures the binding of => in the macro definition, and using id=? with the thing passed to the macro argument checks if both refer to the same binding.

(if=> 3 => list)  ⇒ (3)
(if=> #f => list) ⇒ #<undef>

;; If the second argument isn't =>, if=> behaves like ordinary if:
(if=> #t 1 2)     ⇒ 1

;; The binding of => in macro use environment differs from
;; the macro definition environment, so this if=> behaves like
;; ordinary if, instead of recognizing literal =>.
(let ((=> 'oof)) (if=> 3 => list)) ⇒ oof

Macro: quasirename renamer quasiquoted-form

It works like quasiquote, except that the symbols and identifiers that appear in the “literal” portion of quasiquoted-form (i.e. outside of unquote and unquote-splicing) are replaced by the result of applying renamer to them.

The quasiquoted-form argument must be a quasiquoted form. The outermost quasiquote ` is consumed by quasirename and won’t appear in the output. The reason we require it is to make nested quasiquotes/quasirenames work.

For example, a form:

(quasirename r `(a ,b c "d"))

would be equivalent to writing:

(list (r 'a) b (r 'c) "d")

This is not specifically tied to macros; the renamer can be any procedure that takes one symbol or identifier argument:

(quasirename (^[x] (symbol-append 'x: x)) `(+ a ,(+ 1 2) 5))
  ⇒ (x:+ x:a 3 5)

However, it comes in pretty handy for constructing the result form in ER macros. Compare the following two:

(use util.match)

;; using quasirename
(define-syntax swap
  (er-macro-transformer
    (^[f r c]
      (match f
        [(_ a b) (quasirename r
                   `(let ((tmp ,a))
                      (set! ,a ,b)
                      (set! ,b tmp)))]))))

;; not using quasirename
(define-syntax swap
  (er-macro-transformer
    (^[f r c]
      (match f
        [(_ a b) `(,(r 'let) ((,(r 'tmp) ,a))
                     (,(r 'set!) ,a ,b)
                     (,(r 'set!) ,b ,(r 'tmp)))]))))

Note: In Gauche 0.9.7 and before, quasirename didn’t take a quasiquoted form as the second argument; you wrote (quasirename r form) instead of (quasirename r `form).

For backward compatibility, the form without quasiquote is still supported by default for a while.

If you already have a quasirename form that does intend to produce a quasiquoted form, you have to rewrite it with double quasiquote: (quasirename r ``form).

To help the transition, the handling of quasiquote in quasirename can be customized with the environment variable GAUCHE_QUASIRENAME_MODE. It can have one of the following values:

legacy

Quasirename behaves the same way as 0.9.7 and before; use this to run code for 0.9.7 without any change.

compatible

Quasirename behaves as described in this entry; if form lacks a quasiquote, it silently assumes one. Existing code should work, except the rare case when you intend to return a quasiquoted form.

warn

Quasirename behaves as described in this entry, but warns if form lacks a quasiquote.

strict

Quasirename raises an error if form lacks a quasiquote. This will be the default behavior in the future.


5.2.3 Identifier transformer

Special Form: make-id-transformer transformer-spec

Creates an identifier macro transformer from transformer-spec. The transformer-spec is the same as what can appear in define-syntax etc.

A normal macro expands from a form (M arg …) where M is an identifier bound to the macro. An identifier macro, on the other hand, expands solely from the macro-bound identifier, or from a form (set! M expr). In other words, an identifier macro is used in the context of a variable, rather than a function call.

Consider the following code, where state-manager is a stateful closure. The identifier macro the-state hides the closure and makes it look like a variable:

(define state-manager
  (let ([state #f])
    (case-lambda
      [() state]
      [(val) (set! state val)])))

(define-syntax the-state
  (make-id-transformer
    (syntax-rules (set!)
      [(set! _ expr) (state-manager expr)]
      [_ (state-manager)])))

(state-manager 'off)

the-state ⇒ off

(set! the-state 'on)

the-state ⇒ on

(state-manager) ⇒ on

(Note that the single _ pattern in the second clause of syntax-rules above matches anything, so it should come after the set! clause. It is Gauche’s extension.)

Identifier macros may enable some cool tricks, but they can easily confuse readers. We generally discourage the use of identifier macros except in rare cases where it is absolutely necessary. In most cases, you can use ordinary macros by just adding parentheses.

If you want some portability, you may try identifier-syntax in util.identifier-syntax (see util.identifier-syntax - R6RS identifier syntax). It is in R6RS and may be supported by other implementations. It is built on top of make-id-transformer.


