Definitions (Gauche Users’ Reference)

4.10 Definitions

Special Form: define variable expression

Special Form: define (variable . formals) body …

Special Form: define variable

[R7RS+ base] This form has different meanings depending on whether it appears at the toplevel (with no local bindings) or inside a local scope.

At the toplevel, it creates a global binding for the symbol variable in the current module. The first form binds variable to the value of expression.

(define x (+ 1 2))
x ⇒ 3
(define y (lambda (a) (* a 2)))
(y 8) ⇒ 16

If variable is already bound in the same module, the subsequent definitions work just like assignments.

(define x 3)
(define (value-of-x) x)

(value-of-x) ⇒ 3

(define x 4)

(value-of-x) ⇒ 4

If variable is not bound in the current module but has an imported binding, things get interesting but complicated. See Into the Scheme-Verse, for the details.

The second form is syntactic sugar for defining a procedure. It is equivalent to the following form.

(define (name . args) body …)
  ≡ (define name (lambda args body …))
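As a small illustration of the equivalence above, the following sketch defines the same procedure both ways and also shows a definition with only a rest parameter. The names scale-all, scale-all* and count-args are hypothetical, chosen only for this example.

```scheme
;; A fixed parameter followed by a rest parameter.
(define (scale-all factor . nums)
  (map (lambda (n) (* factor n)) nums))

;; The same procedure written with an explicit lambda.
(define scale-all*
  (lambda (factor . nums)
    (map (lambda (n) (* factor n)) nums)))

;; Only a rest parameter: every argument is collected into a list.
(define (count-args . args)
  (length args))

(scale-all 2 1 2 3)    ; ⇒ (2 4 6)
(scale-all* 2 1 2 3)   ; ⇒ (2 4 6)
(count-args 'a 'b)     ; ⇒ 2
```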

The third form is a shorthand for (define variable (undefined)). It was introduced in R6RS (but is not a part of R7RS). You can use this form to indicate that the initial value doesn’t matter.
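A minimal sketch of the third form follows, assuming a variable that is always assigned before it is read. The names current-total, reset-total! and add-to-total! are hypothetical. Since this form is not part of R7RS, the code is not portable to other implementations.

```scheme
;; Third form: create the binding now, assign a meaningful value later.
(define current-total)

(define (reset-total!)
  (set! current-total 0))

(define (add-to-total! n)
  (set! current-total (+ current-total n))
  current-total)

(reset-total!)
(add-to-total! 5)   ; ⇒ 5
(add-to-total! 7)   ; ⇒ 12
```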

If the form appears inside a local scope (internal define), it introduces a local binding of the variable.

Internal defines can appear at the beginning of the body of lambda or other forms that introduce local bindings. They are equivalent to a letrec* form, as shown below.

(lambda (a b)
  (define (cube x) (* x x x))
  (define (square x) (* x x))
  (+ (cube a) (square b)))

 ≡

(lambda (a b)
  (letrec* ([cube (lambda (x) (* x x x))]
            [square (lambda (x) (* x x))])
    (+ (cube a) (square b))))

Since internal defines are essentially a letrec* form, you can write mutually recursive local functions, and you can use preceding bindings introduced in the same scope to calculate the value to be defined. However, you can’t use a binding that is introduced after an internal define form to calculate its value; if you do so, Gauche may not report an error immediately, but you may get strange errors later on.

(lambda (a)
  (define x (* a 2))
  (define y (+ x 1))  ; ok to use x to calculate y
  (* a y))

(lambda (a)
  ;; You can refer to even? in odd?, since the value of even?
  ;; isn't used at the time odd? is defined; it is only used
  ;; when odd? is called.
  ;; These definitions assume x is a nonnegative integer.
  (define (odd? x) (if (= x 0) #f (even? (- x 1))))
  (define (even? x) (if (= x 0) #t (odd? (- x 1))))
  (odd? a))

(lambda (a)
  ;; This is not ok, for defining y needs to use the value
  ;; of x.  However, you may not get an error immediately.
  (define y (+ x 1))
  (define x (* a 2))
  (* a y))
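To make the ordering rule concrete, here is a sketch of the first (valid) case written directly as a letrec* form. The wrapper name scaled is hypothetical. Because letrec* initializes its bindings from top to bottom, y can use the value of x, but the reverse order would read x before it has a value.

```scheme
(define (scaled a)
  (letrec* ((x (* a 2))    ; x is initialized first
            (y (+ x 1)))   ; so its value is available here
    (* a y)))

(scaled 3)   ; x is 6, y is 7 ⇒ 21
```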

Inside the body of binding constructs, internal defines must appear before any expression of the same level. The following code isn’t allowed, for an expression (print a) precedes the define form.

(lambda (a)
  (print a)
  (define (cube x) (* x x x))  ; error!
  (cube a))

It is also invalid for the body of a binding construct to contain only internal defines and no expressions, although Gauche doesn’t report an error.

Note that begin (see Grouping) doesn’t introduce a new scope. Defines in a begin act as if the begin and its surrounding parentheses were not there. Thus these two forms are equivalent.

(let ((x 0))
  (begin
    (define (foo y) (+ x y)))
  (foo 3))
 ≡
(let ((x 0))
  (define (foo y) (+ x y))
  (foo 3))
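The same rule applies at the toplevel. In the following sketch, both definitions inside begin become ordinary toplevel bindings, just as if they had been written without the begin. The names base-price and with-tax are hypothetical.

```scheme
(begin
  (define base-price 100)
  (define (with-tax price)
    (+ price (quotient price 10))))

;; Both names are visible after the begin form.
(with-tax base-price)   ; ⇒ 110
```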

Macro: define-values (var …) expr

Macro: define-values (var var1 … . var2) expr

Macro: define-values var expr

[R7RS base][SRFI-244] Expr is evaluated, and each value of the result is bound to the corresponding var. In the first form, it is an error unless expr yields the same number of values as there are vars.

(define-values (lo hi) (min&max 3 -1 15 2))

lo ⇒ -1
hi ⇒ 15

In the second form, expr may yield as many values as var var1 … or more; the excess values are made into a list and bound to var2.

(define-values (a b . c) (values 1 2 3 4))

a ⇒ 1
b ⇒ 2
c ⇒ (3 4)

In the last form, all the values yielded by expr are gathered into a list and bound to var.

(define-values qr (quotient&remainder 23 5))

qr ⇒ (4 3)

You can use define-values wherever define is allowed; that is, you can mix define-values in internal defines.

(define (foo . args)
  (define-values (lo hi) (apply min&max args))
  (define len (length args))
  (list len lo hi))

(foo 1 4 9 3 0 7)
 ⇒ (6 0 9)
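The examples above rely on the Gauche procedures min&max and quotient&remainder. As a sketch that uses only R7RS base procedures, the following covers the first two forms with floor/ and exact-integer-sqrt. All the variable names are hypothetical.

```scheme
;; floor/ returns two values: the quotient and the remainder.
(define-values (q r) (floor/ 23 5))

;; exact-integer-sqrt returns the integer root and what is left over.
(define-values (root leftover) (exact-integer-sqrt 17))

;; Dotted form: excess values are collected into a list.
(define-values (head . tail) (values 'a 'b 'c))

(list q r)             ; ⇒ (4 3)
(list root leftover)   ; ⇒ (4 1)
(list head tail)       ; ⇒ (a (b c))
```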

Special Form: define-constant variable expression

Special Form: define-constant (variable . formals) body …

This form is only effective at the toplevel.

Like toplevel define, it creates a toplevel definition of variable with the value of expression, but additionally tells the compiler that (1) the binding won’t change, and (2) the value of expression won’t change from the one computed at compile time. So the compiler can replace references to variable with the compile-time value of expression.

An error is signaled when you use set! to change the value of variable. It is allowed to redefine variable, but a warning is printed.

The difference from define-inline below is that the value of expression is computed at compile time and treated as a literal. Suppose you define x as follows:

(define-constant x (vector 1 2 3))

Then, the code (list x) is compiled to the same code as (list '#(1 2 3)).

This distinction is especially important when you do AOT (ahead of time) compilation.

There’s no “internal define-constant”, since the compiler can figure out whether a local binding is mutated, and optimize code accordingly, without the help of declarations.
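A minimal usage sketch, with the hypothetical names seconds-per-day and days->seconds: the constant is computed once, and the compiler is free to substitute its value wherever it is referenced.

```scheme
;; The value is computed at compile time and never changes.
(define-constant seconds-per-day (* 24 60 60))

;; A reference to the constant can be replaced by the literal 86400.
(define (days->seconds days)
  (* days seconds-per-day))

(days->seconds 2)   ; ⇒ 172800
```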

Special Form: define-inline variable expression

Special Form: define-inline (variable . formals) body …

The second form is a shorthand for (define-inline variable (lambda formals body …)).

If this appears in the position of internal defines, it is the same as internal defines.

If it appears in the toplevel, it defines an inlinable binding. An inlinable binding promises the compiler that the binding won’t change, but unlike constant bindings introduced by define-constant, the actual value of expression may be computed at runtime. Hence the compiler cannot simply replace the references of variable with the compile-time value of expression.

However, if the compiler can determine that the value of expression is to be a procedure, it may inline the procedure where it is invoked.

In the example below, the body of dot3 is inlined where dot3 is called. Furthermore, since the second argument of dot3 is a constant vector, you can see that the vector-ref calls on it are computed at compile time (e.g. CONST -1.0).

gosh> (define-inline (dot3 a b)
        (+ (* (vector-ref a 0) (vector-ref b 0))
           (* (vector-ref a 1) (vector-ref b 1))
           (* (vector-ref a 2) (vector-ref b 2))))
dot3
gosh> (disasm (^[] (dot3 x '#(-1.0 -2.0 -3.0))))
CLOSURE #<closure (#f)>
=== main_code (name=#f, code=0x28524e0, size=26, const=4 stack=6):
signatureInfo: ((#f))
     0 GREF-PUSH #<identifier user#x.20d38e0>; x
     2 LOCAL-ENV(1)             ; (dot3 x (quote #(-1.0 -2.0 -3.0)))
     3 LREF0                    ; a
     4 VEC-REFI(0)              ; (vector-ref a 0)
     5 PUSH
     6 CONST -1.0
     8 NUMMUL2                  ; (* (vector-ref a 0) (vector-ref b 0))
     9 PUSH
    10 LREF0                    ; a
    11 VEC-REFI(1)              ; (vector-ref a 1)
    12 PUSH
    13 CONST -2.0
    15 NUMMUL2                  ; (* (vector-ref a 1) (vector-ref b 1))
    16 NUMADD2                  ; (+ (* (vector-ref a 0) (vector-ref b 0))
    17 PUSH
    18 LREF0                    ; a
    19 VEC-REFI(2)              ; (vector-ref a 2)
    20 PUSH
    21 CONST -3.0
    23 NUMMUL2                  ; (* (vector-ref a 2) (vector-ref b 2))
    24 NUMADD2                  ; (+ (* (vector-ref a 0) (vector-ref b 0))
    25 RET

As an extreme case, if both arguments are compile-time constants, dot3 is completely computed at compile time:

gosh> (disasm (^[] (dot3 '#(1 2 3) '#(4 5 6))))
CLOSURE #<closure (#f)>
=== main_code (name=#f, code=0x2a2b8e0, size=2, const=0 stack=0):
signatureInfo: ((#f))
     0 CONSTI(32)
     1 RET

The same inlining behavior may be achieved by making dot3 a macro, but if you use define-inline, dot3 can still be used as a procedure when needed:

(map dot3 list-of-vectors1 list-of-vectors2)

If dot3 is a macro you can’t pass it as a higher-order procedure.
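For completeness, here is a self-contained sketch of that higher-order use. It repeats the definition of dot3 from above so the block can be run on its own. The name products and the sample vectors are hypothetical.

```scheme
(define-inline (dot3 a b)
  (+ (* (vector-ref a 0) (vector-ref b 0))
     (* (vector-ref a 1) (vector-ref b 1))
     (* (vector-ref a 2) (vector-ref b 2))))

;; dot3 is passed to map as an ordinary procedure value.
(define products
  (map dot3
       (list '#(1 2 3) '#(0 1 0))
       (list '#(4 5 6) '#(7 8 9))))

products   ; ⇒ (32 8)
```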

The inline expansion pass runs top-to-bottom. An inlinable procedure must be defined before it is used in order to be inlined.

If you redefine an inlinable binding, Gauche warns you, since the redefinition won’t affect call sites that have already been inlined. So use it with care: either keep it internal to a module, or use it for procedures that won’t change in the future. Inlining pays off in performance-critical code. If a procedure is called only rarely, there’s no point in making it inlinable.

Special Form: define-in-module module variable expression

Special Form: define-in-module module (variable . formals) body …

This form must appear in the toplevel. It creates a global binding of variable in module, which must be either a symbol of the module name or a module object. If module is a symbol, the named module must exist.

Expression is evaluated in the current module.

The second form is merely syntactic sugar for:

(define-in-module module variable (lambda formals body …))

Note: to find out if a symbol has a definition (global binding) in the current module, you can use module-binds? (see Module introspection).
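The following sketch shows both forms in use. It assumes a hypothetical module named scratch, created beforehand with define-module because the named module must already exist. It then reads the new bindings back with with-module.

```scheme
;; The target module must exist before define-in-module refers to it.
(define-module scratch)

;; First form: the expression is evaluated in the current module.
(define-in-module scratch answer (* 6 7))

;; Second form: defines a procedure in scratch.
(define-in-module scratch (double n) (* 2 n))

;; Evaluate an expression inside scratch to see the bindings.
(with-module scratch (double answer))   ; ⇒ 84
```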

4.10.1 Into the Scheme-Verse

Multiple toplevels are multiple scopes

Once upon a time, the Scheme world was simple. We had one single global space we called the toplevel. Toplevel definitions could be understood as side effects on this global space: if the name didn’t exist there yet, create a new binding; otherwise, overwrite the existing one.

The problem was that this was hard to scale, so many implementations introduced their own module systems. One of the main agenda items of R6RS was to have a module system (which is called “library” in RnRS) consistent with the design of Scheme. In particular, since Scheme’s hygienic macro system captures lexical scope, it is desirable that it interacts with the module system in the same way.

In modern Scheme, “toplevel” of each module creates its own lexical scope, and the definitions are understood in letrec* semantics. Hence, macro systems can consistently treat identifiers as a name associated with a scope.

Suppose you see these toplevel definitions:

(define (odd? n)  (if (zero? n) #f (even? (- n 1))))
(define (even? n) (if (zero? n) #t (odd? (- n 1))))

The first appearance of even? in the first line is understood as the one defined in the second line. It becomes apparent when we compare it with internal defines:

(let ((even? error))
  (define (odd? n)  (if (zero? n) #f (even? (- n 1))))
  (define (even? n) (if (zero? n) #t (odd? (- n 1))))
  ...)

The even? in the definition of odd? refers to the one defined in the next line, never to the one bound by let.
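A runnable sketch of this point, using a hypothetical wrapper named parity-demo: if even? inside odd? referred to the let binding, the call would invoke error. Instead it reaches the internal definition and returns a boolean.

```scheme
(define (parity-demo k)
  (let ((even? error))   ; outer binding, shadowed by the define below
    (define (odd? n)  (if (zero? n) #f (even? (- n 1))))
    (define (even? n) (if (zero? n) #t (odd? (- n 1))))
    (odd? k)))

(parity-demo 5)   ; ⇒ #t
(parity-demo 4)   ; ⇒ #f
```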

So far, so good.

Now, consider the following toplevel code:

;; Invalid in RnRS, n >= 6
(import (scheme base) (scheme write))
(define orig-error error)
(define (error . args)
  (write args) (newline)
  (apply orig-error args))

The intention is to save the original value of error, which is imported from (scheme base), into a variable orig-error, then redefine error to add a logging feature. This technique was popular in pre-R6RS Scheme.

However, with our new toplevel-as-a-scope Scheme, the error in (define orig-error error) must refer to the one defined in the same scope, which is the new definition below; otherwise lexical scoping gets broken. The value of inner error hasn’t been calculated when orig-error’s value is calculated, so the above form is an invalid program in terms of RnRS.

In fact, to avoid confusion, R6RS prohibits defining a toplevel variable that conflicts with an imported name (in R7RS the behavior of such a program is undefined). In the example above, the name error is imported from (scheme base) and also defined in the toplevel, hence it’s a violation.

The modern way to do such augmentation is to use a renaming import:

(import (except (scheme base) error)
        (rename (scheme base) (error r7rs:error))
        (scheme write))
(define (error . args)
  (write args) (newline)
  (apply r7rs:error args))

Gauche’s take

Gauche’s module system predates R6RS and R7RS; it regards a module as a first-class entity and supports class-like inheritance. It is upward-compatible with R7RS libraries, but we take some liberty in interpreting behaviors that R7RS leaves undefined.

First, you can define toplevel variables that conflict with imported or inherited bindings. The new definition simply shadows the old one.

Second, if multiple toplevel forms are processed at once, e.g. they are enclosed in begin or the file is read by include, we treat them as one scope. That is, if the above orig-error example is read by include, the first error refers to the to-be-defined error below. Since the value of error hasn’t been calculated by the time it’s used, you’ll get the following error:

*** ERROR: uninitialized variable: error

Third, Gauche compiles and executes toplevel forms (forms that are not enclosed in other S-expressions) one at a time. This is the same as REPL semantics. If each form of the orig-error example appears individually at the toplevel, the error in (define orig-error error) refers to the R7RS error, whose value is assigned to orig-error, since we don’t know yet whether error will be defined in the same scope.

The third rule is necessary to support REPL semantics, but note that the result would differ when the same file is included. If you can, avoid writing such ambiguous code.

Note: The second behavior was clarified in release 0.9.9 for better compatibility with R7RS. Before that, the behavior in such cases was undefined, but some code might have expected REPL semantics (the third rule).

In order to support the transition, if you set the environment variable GAUCHE_LEGACY_DEFINE, Gauche treats definitions in the same way as 0.9.8 and before. Note that if you do that, Gauche may be unable to include some valid R7RS code that has multiple libraries in one file.