Gauche:MacroProblem

The current version of Gauche has a couple of deficiencies related to hygienic macros. This page describes their causes and workarounds.

In the following description, "traditional macro" refers to a Lisp-style non-hygienic macro.

Mixing traditional macros and hygienic macros

This problem doesn't appear very often, but when it does, it bites you hard. It requires every traditional macro to be written with extra care, which is a bad thing.

Cause

When a hygienic macro is defined, all the symbols in the expansion part of the macro definition are replaced with identifiers. An identifier captures the environment in which it was created (the macro definition environment), so that it won't interfere with the macro use environment. Normally identifiers are transparent to users, but you might have seen them in error messages.

When a hygienic macro expands into a form that calls a traditional macro, the form the traditional macro receives may contain identifiers instead of symbols. If the traditional macro tests whether an item in the form is a symbol, it may fail to recognize identifiers.

Example 1

Suppose you want to write a macro test-let, which works like let, except that when the symbol test appears as the first element of a binding form, it evaluates the accompanying expression and immediately returns #f if that expression returns #f.

test-let can be defined as a traditional macro:

(define-macro (test-let binds . body)
  (cond ((null? binds)
         `(let () ,@body))
        ((eq? (caar binds) 'test)
         (let ((var (gensym)))
           `(let ((,var ,(cadar binds)))
              (if ,var
                  (test-let ,(cdr binds) ,@body)
                  #f))))
        (else
         `(let (,(car binds)) (test-let ,(cdr binds) ,@body)))))

And it works fine so far.

(define (foo n)
  (test-let ((test (positive? n))
             (b (sqrt n)))
    b))

gosh> (foo 4)
2.0                   ;; OK.
gosh> (foo -4)
#f                    ;; rejected by the test.

However, if you try to call the macro within the definition of a hygienic macro, xtest-let, the problem appears.

(define-syntax xtest-let
  (syntax-rules ()
    ((_ n)
     (test-let ((test (positive? n))
                (b (sqrt n)))
       b))))

gosh> (xtest-let 4)
2.0                    ;; OK.
gosh> (xtest-let -4)
0.0+2.0i               ;; should be rejected, but...

That is because the symbol test in the definition of xtest-let has been replaced by an identifier by the time it is passed to the macro transformer of test-let, so (eq? (caar binds) 'test) doesn't succeed. Consequently (test (positive? n)) is treated as an ordinary binding form.

If you want to compare the given item with a symbol, you have to strip off the syntactic wrapping the identifier carries. One easy way is to use unwrap-syntax, which recursively traverses the given form and replaces identifiers with symbols.

   (eq? (unwrap-syntax (caar binds)) 'test)
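
As a sketch, here is the test-let definition from Example 1 with only that comparison changed; everything else is copied from the original definition. It assumes unwrap-syntax is available in the running gosh, as described above.

```scheme
;; test-let from Example 1, with the keyword comparison made
;; identifier-aware.  Only the second cond clause differs.
(define-macro (test-let binds . body)
  (cond ((null? binds)
         `(let () ,@body))
        ;; Strip the syntactic wrapping before comparing, so that both the
        ;; plain symbol test and an identifier for test are recognized.
        ((eq? (unwrap-syntax (caar binds)) 'test)
         (let ((var (gensym)))
           `(let ((,var ,(cadar binds)))
              (if ,var
                  (test-let ,(cdr binds) ,@body)
                  #f))))
        (else
         ;; Ordinary binding: pass it through untouched, so an identifier
         ;; in the variable position is preserved as is.
         `(let (,(car binds)) (test-let ,(cdr binds) ,@body)))))
```

Since only (caar binds) is unwrapped, the traversal touches a single item, and the binding forms themselves are inserted into the output unchanged.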

If the form passed to unwrap-syntax can be a large structure, however, calling unwrap-syntax can be expensive. In such a case, you can explicitly test whether the form is an identifier, although it gets clumsy.

   (let* ((var (caar binds))
          (sym (cond ((symbol? var) var)
                     ((identifier? var) (identifier->symbol var))
                     (else #f))))
     (and sym (eq? sym 'test)))

OK, this is bad enough. But the story doesn't end here....

Example 2

Suppose you want to write a macro that works like let, but allows a single symbol to appear in the binding list instead of (var expr), as in Common Lisp's let. Such a symbol is bound to #f. The macro, llet, can be written as a traditional macro like this:

(define-macro (llet binds . body)
  `(let ,(map (lambda (bind)
                (cond ((symbol? bind) `(,bind #f))
                      (else bind)))
              binds)
     ,@body))

It works as intended:

gosh> (llet ((a 3)
              b)
        (list a b))
(3 #f)

Now, we use the llet macro in the definition of a hygienic macro, xlet.

(define-syntax xlet
  (syntax-rules ()
    ((_ n)
     (llet ((a n) b)
       (list a b)))))

Then you get a mysterious error when you try to use the macro xlet, since the symbol? test in the llet definition doesn't recognize the identifier.

gosh> (xlet 5)
*** ERROR: syntax error (invalid binding form): (let ((#<id 0x81f7de0 user::a> 5) #<id 0x81f7d80 user::b>) (#<id 0x81f7d30 user::list> #<id 0x81f7de0 user::a> #<id 0x81f7d80 user::b>))

So, should we use unwrap-syntax? Yes, we can, but with one caveat: if you want to create a binding for a given identifier, you have to preserve the identifier. In other words, the following doesn't work:

(define-macro (llet binds . body)
  `(let ,(map (lambda (bind)
                (let ((b (unwrap-syntax bind)))
                  (cond ((symbol? b) `(,b #f))
                        (else b))))
              binds)
     ,@body))

gosh> (xlet 5)
*** ERROR: unbound variable: a

But the following does:

(define-macro (llet binds . body)
  `(let ,(map (lambda (bind)
                (cond ((or (symbol? bind) (identifier? bind))
                       `(,bind #f))  ;; insert identifier as is
                      (else bind)))
              binds)
     ,@body))

gosh> (xlet 5)
(5 #f)

Solution

It seems that this problem makes traditional macros overly complicated. Is there any way to address it, instead of asking all traditional macros to use identifier? or unwrap-syntax?

One possibility is to replace identifiers in the output of a hygienic macro with renamed symbols. Renaming preserves hygiene, and you still get a plain S-expression to work with in a traditional macro.

The reason Gauche keeps identifiers around even after macro expansion is efficiency. Since the compiler can recognize identifiers directly, we can avoid copying the macro output just to eliminate identifiers. If the hygienic macro is called recursively (which is often the case), we save extra copying as well, since the hygienic macro transformer knows how to deal with identifiers.

A compromise is to keep identifiers around, and when a form is passed to a traditional macro, scan the form and replace identifiers with renamed symbols. This still has a cost, but not as much as always renaming symbols.

Macro generating macro

This is a bug that I haven't had time to fix. If a hygienic macro expands into a definition of another hygienic macro, the expanded form doesn't work.

I think the cause is the identifier/symbol problem described above.
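
For illustration only, a macro-generating macro has roughly the following shape. The names define-const-macro and answer are hypothetical and not taken from Gauche. This page doesn't say which specific forms fail, so the sketch shows only the kind of definition meant, not a confirmed reproduction of the bug.

```scheme
;; Hypothetical example: a hygienic macro whose expansion is itself
;; a define-syntax form with a syntax-rules transformer.
(define-syntax define-const-macro
  (syntax-rules ()
    ((_ name value)
     ;; The inner definition is built from the outer template, so its
     ;; symbols have been replaced with identifiers when it is processed.
     (define-syntax name
       (syntax-rules ()
         ((_) value))))))

;; Intended use: define a macro named answer, so that (answer)
;; expands to 42.
(define-const-macro answer 42)
```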
