Gauche has two primitive lazy evaluation mechanisms.
The first one is an explicit mechanism, defined in the Scheme
standard: You mark
an expression to be evaluated lazily by delay
,
and you use force
to make the evaluation happen when
needed. Gauche also support another primitive lazy
,
as defined in SRFI-45, for space-efficient tail-recursive
lazy algorithms.
The second one is a lazy sequence, in which evaluation happens
implicitly. From a Scheme program, a lazy sequence just looks
as a list—you can take its car
and cdr
, and
you can apply map
or other list procedures on it. However,
internally, its element isn’t calculated until it is required.
• Delay force and lazy: | ||
• Lazy sequences: |
Scheme has traditionally provided an explicit delayed evaluation mechanism
using delay
and force
. After R5RS, however,
it is found that it didn’t mix well with tail-recursive
algorithms: It required unbound memory, despite that the
body of the algorithm could be expressed in iterative manner.
SRFI-45 showed that introducing another primitive syntax lazy
addresses the issue.
For the detailed explanation please look at the SRFI-45 document.
Here we explain how to use those primitives.
[R7RS lazy][SRFI-45]
These forms creates a promise that delays the evaluation
of expression. Expression will be evaluated
when the promise is passed to force
.
If expression itself is expected to yield a promise,
you should use lazy
. Otherwise, you should use delay
.
If you can think in types, the difference may be clearer.
lazy : Promise a -> Promise a delay : a -> Promise a
Since we don’t have static typing, we can’t enforce this usage.
The programmer has to choose appropriate one from the context.
Generally, lazy
appears only to surround the entire
body of function that express a lazy algorithm.
NB: In R7RS, lazy
is called delay-force
, for the operation
is conceptually similar to (delay (force expr))
(note that the
type of force
is Promise a -> a
).
For the real-world example of use of lazy
,
you may want to check the implementation of util.stream
(see util.stream
- Stream library).
Note that expression may be evaluated more than once;
it may require the value of its own promise recursively, or more than one
thread may evaluate expression simultaneously.
In general, you shouldn’t have side-effecting computation
in expression—even if it is only evaluated once,
you may not exactly know when the evaluation occur, so
having side-effects there is a bad ides.
(See force
below for multi-thread semantics).
[SRFI-45][SRFI-226]
Returns a promise that returns obj ….
The name eager
is introduced in SRFI-45. It takes single argument,
but Gauche enhanced it to take multiple values.
SRFI-226 defines make-promise
, which is the same
as multi-value eager
.
(Note that R7RS-small also defines make-promise
in
scheme.lazy
library (see scheme.lazy
- R7RS lazy evaluation).
It is slightly different that it only takes one argument,
and if the argument is a promise, it will be returned as is.
This change is intentional; see SRFI-226 for the details.)
Since that these are procedures, arguments are evaluated
before they are called; so they work as a type converter
(a -> Promise a
) without delaying the evaluation.
Used mainly to construct promise-returning functions.
[R7RS lazy][SRFI-226] If promise is not a promise, it is just returned.
Otherwise, if promise’s value hasn’t been computed,
force
makes promise’s encapsulated expression
be evaluated in the same parameterization where the promise
is created, and returns the result.
Once promise’s value is computed, it is memorized in it
so that subsequent force
on it won’t cause the computation.
In Gauche, if the delayed expression yields multiple values, force
yields multiple values as well. (It is unspecified in R7RS).
Note that if uncomputed promise is forced simultaneously by more than one thread, its expression may be evaluated concurrently. The first delivered value becomes the value of the promise; if another thread finishes computation and see the value of promise is already determined, it discards its computation result and returns the determined value.
The following example shows the promise is evaluated with the
parameterization of delay
:
(define p (make-parameter 1)) (let1 v (parameterize ((p 2)) (delay (p))) (parameterize ((p 3)) (force v))) ⇒ 2
Note: R7RS defines force
to evaluate the unevaluated promise’s body
in the dynamic environment of the caller of force
, but it is
regarded as a mistake. In general, you shouldn’t care which force
call
actually evaluates the promise’s body, so the result shouldn’t depend
on the context of force
call. SRFI-226 addressed it.
[R7RS lazy]
Returns #t
iff obj is a promise object.
A lazy sequence is a list-like structure
whose elements are calculated lazily.
Internally we have a special type of pairs, whose cdr
is evaluated on demand.
However, in Scheme level, you’ll never see a distinct
“lazy-pair” type. As soon as you try to access a
lazy pair, Gauche automatically force the delayed
calculation, and the lazy pair turns into an ordinary pair.
It means you can pass lazy sequences to ordinary list-processing
procedures such as car
, cdr
or map
.
Look at the following example; generator->lseq
takes a procedure that generates one value at a time, and
returns a lazy sequence that consists of those values.
(with-input-from-file "file" (^[] (let loop ([cs (generator->lseq read-char)] [i 0]) (match cs [() #f] [(#\c (or #\a #\d) #\r . _) i] [(c . cs) (loop cs (+ i 1))]))))
It returns the position of the first occurrence of
character sequence “car” or “cdr” in the file file.
The loop treats the lazy sequence just like an ordinary list, but
characters are read as needed, so once the sequence is
found, the rest of the file won’t be read. If we do
it eagerly, we would have to read the entire file first no matter how
big it is, or to give up using the mighty match
macro and
to write a basic state machine that reads one character one at a time.
Other than implicit forcing, Gauche’s lazy sequences are slightly different than the typical lazy stream implementations in Scheme in the following ways:
cdr
side of the lazy pair is lazily evaluated;
the car
side is evaluated immediately. On the other hand, with
stream-cons
in util.stream
(see util.stream
- Stream library),
both car
and cdr
sides won’t be evaluated until
it is absolutely needed.
car
part is already
calculated, even if you don’t use it. In most cases
you don’t need to care, for calculating one item more
is a negligible overhead. However, when you create
a self-referential lazy structure, in which the earlier
elements of a sequence is used to calculate the latter
elements of itself, a bit of caution is needed;
a valid code for fully lazy circular structure may not
terminate in Gauche’s lazy sequences. We’ll show a
concrete example later. This bit of eagerness
is also visible when side effects are involved;
for example, lazy character sequence reading from a port
may read one character ahead.
Note: R7RS scheme.lseq
(SRFI-127)
provides a portable alternative of lazy sequence
(see scheme.lseq
- R7RS lazy sequences). It uses
dedicated APIs (e.g. lseq-cdr
) to operate on lazy sequences
so that portable implementation is possible. In Gauche, we just
use our built-in lazy sequence as SRFI-127 lazy sequence; if you
want your code to be portable, consider using SRFI-127, but be careful
not to mix lazy sequences and ordinary lists; Gauche won’t complain,
but other Scheme implementation may choke on it.
[R7RS lseq]
Creates a lazy sequence that consists of items produced
by generator, which is just a procedure with zero arguments that yields
an item at a time. Returning EOF marks the end of the sequence
(EOF itself isn’t included in the sequence).
For example, read-char
can work as a generator.
Gauche has a set of convenient utilities to deal with generators
(see gauche.generator
- Generators).
In the second form, the returned lazy sequence is prepended by
item …. Since there’s no way to distinguish lazy
pairs and ordinary pairs, you can write it as
(cons* item … (generator->lseq generator))
,
but that’s more verbose.
Internally, Gauche’s lazy sequence is optimized to be built on top of generators, so this procedure is the most efficient way to build lazy sequences.
Note: SRFI-127 also has generator->lseq
, which is exactly
the same as this in Gauche.
Returns a lazy pair consists of car and cdr.
The expression car is evaluated at the call of lcons
,
but evaluation of cdr is delayed.
You can’t distinguish a lazy pair from an ordinary pair. If you
access either its car
or cdr
, or even you ask pair?
to it, its cdr part is implicitly forced and you get an ordinary pair.
Unlike cons
, cdr should be an expression that yields
a (lazy or ordinary) list, including an empty list.
In other words, lazy sequences can always be a null-terminated list
when entirely forced; there are no “improper lazy sequences”.
(Since Scheme isn’t statically typed, we can’t force the cdr
expression to be a proper list before actually evaluating it.
Currently if cdr expression yields non-list, we just ignore
it and treat as if it yielded an empty list.)
(define z (lcons (begin (print 1) 'a) (begin (print 2) '()))) ⇒ ; prints '1', since the car part is evaluated eagerly. (cdr z) ⇒ () ;; and prints '2' ;; This also prints '2', for accessing car of a lazy pair forces ;; its cdr, even the cdr part isn't used. (car (lcons 'a (begin (print 2) '()))) ⇒ a ;; So as this; asking pair? to a lazy pair causes forcing its cdr. (pair? (lcons 'a (begin (print 2) '()))) ⇒ #t ;; To clarify: This doesn't print '2', because the second lazy ;; pair never be accessed, so its cdr isn't evaluated. (pair? (lcons 'a (lcons 'b (begin (print 2) '())))) ⇒ #t
Now, let me show you a case where “one item ahead” evaluation becomes
an issue. The following is an elegant definition of infinite
Fibonacci sequence using self-referential lazy structure
(lmap
is a lazy map, defined in gauche.lazy
module):
(use gauche.lazy) ;; for lmap (define *fibs* (lcons* 0 1 (lmap + *fibs* (cdr *fibs*)))) ;; BUGGY
Unfortunately, Gauche can’t handle it well.
(car *fibs*) ⇒ 0 (cadr *fibs*) ⇒ *** ERROR: Attempt to recursively force a lazy pair.
When we want to access the second argument (cadr
) of *fibs*
,
we take the car of the second pair,
which is a lazy pair of 1
and (lmap ...)
. The lazy pair
is forced and its cdr part needs to be calculated. The first thing
lmap
returns needs to see the first and second element of *fibs*
,
but the second element of *fibs*
is what we’re calculating now!
We can workaround this issue by avoiding accessing the immediately preceding value. Fibonacci numbers F(n) = F(n-1) + F(n-2) = 2*F(n-2) + F(n-3), so we can write our sequence as follows.
(define *fibs* (lcons* 0 1 1 (lmap (^[a b] (+ a (* b 2))) *fibs* (cdr *fibs*))))
And this works!
(take *fibs* 20) ⇒ (0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181)
Many lazy algorithms are defined in terms of fully-lazy cons
at the bottom. When you port such algorithms to Gauche using lcons
,
keep this bit of eagerness in mind.
Note also that lcons
needs to create a thunk to delay
the evaluation. So the algorithm to construct lazy list using
lcons
has an overhead of making closure for each item.
For performance-critical part,
you want to use generator->lseq
whenever possible.
A lazy version of cons*
(see List constructors).
Both lcons*
and llist*
do the same thing; both
names are provided for the symmetry to cons*
/list*
.
The tail argument should be an expression that yields a (possibly lazy) list. It is evaluated lazily. Note that the preceding elements x … are evaluated eagerly. The following equivalences hold.
(lcons* a) ≡ a (lcons* a b) ≡ (lcons a b) (lcons* a b ... y z) ≡ (cons* a b ... (lcons y z))
Creates a lazy sequence of numbers starting from start,
increasing by step (default 1), to the maximum value that doesn’t
exceed end. The default of end is +inf.0
,
so it creates an infinite list. (Don’t type just
(lrange 0)
in REPL, or it won’t terminate!)
If any of start or step
is inexact, the resulting sequence
has inexact numbers.
(take (lrange -1) 3) ⇒ (-1 0 1) (lrange 0.0 5 0.5) ⇒ (0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5) (lrange 1/4 1 1/8) ⇒ (1/4 3/8 1/2 5/8 3/4 7/8)
A lazy version of iota
(see List constructors); returns
a lazy sequence of count integers (default: positive infinity),
starting from start (default: 0), stepping by step (default: 1).
Just like iota
, the result consists of exact numbers
if and only if both start and step are exact; otherwise
the result consists of inexact numbers.
These are the same as the following expressions, respectively. They are provided for the convenience, since this pattern appears frequently.
(generator->lseq (cut read-char port)) (generator->lseq (cut read-byte port)) (generator->lseq (cut read-line port)) (generator->lseq (cut read port))
If port is omitted, the current input port is used.
Note that the lazy sequence may buffer some items, so once you make an lseq from a port, only use the resulting lseq and don’t ever read from port directly.
Note that the lazy sequence terminates when EOF is read from the port, but the port isn’t closed. The port should be managed in larger dynamic extent where the lazy sequence is used.
You can also convert input data into various lists by
the following expressions (see Input utility functions).
Those procedures read the port eagerly until EOF and returns
the whole data in a list, while lseq
versions read
the port lazily.
(port->list read-char port) (port->list read-byte port) (port->string-list port) (port->sexp-list port)
Those procedures make (lazy) lists out of ports. The opposite can be
done by open-input-char-list
and open-input-byte-list
;
See gauche.vport
- Virtual ports, for the details.
See also gauche.lazy
- Lazy sequence utilities, for more utility procedures
that creates lazy sequences.
Let’s consider calculating an infinite sequence of prime numbers.
(Note: If you need prime numbers in your application, you don’t
need to write one; just use math.prime
. see math.prime
- Prime numbers).
Just pretend we already have some prime numbers
calculated in a variable *primes*
, and you need
to find a prime number equal to or grater than n
(for simplicity, we assume n is an odd number).
(define (next-prime n) (let loop ([ps *primes*]) (let1 p (car ps) (cond [(> (* p p) n) n] [(zero? (modulo n p)) (next-prime (+ n 2))] [else (loop (cdr ps))]))))
This procedure loops over the list of prime numbers, and
if no prime number p less than or equal to (sqrt n)
divides n, we can say n is prime. (Actual test
is done by (> (* p p) n)
instead of (> p (sqrt n))
,
for the former is faster.)
If we find some p divides n, we try a new value
(+ n 2)
with next-prime
.
Using next-prime
, we can make a generator that keeps generating
prime numbers. The following procedure returns a generator
that returns primes above last.
(define (gen-primes-above last) (^[] (set! last (next-prime (+ last 2))) last))
Using generator->lseq
, we can turn the generator returned
by gen-primes-above
into a lazy list, which can be used
as the value of *prime*
. The only caveat is that we need
to have some pre-calculated prime numbers:
(define *primes* (generator->lseq 2 3 5 (gen-primes-above 5)))
Be careful not to evaluate *primes*
directly on REPL,
since it contains an infinite list and it’ll blow up your REPL.
You can look at the first 20 prime numbers instead:
(take *primes* 20) ⇒ (2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71)
Or find what the 10000-th prime number is:
(~ *primes* 10000) ⇒ 104743
Or count how many prime numbers there are below 1000000:
(any (^[p i] (and (>= p 1000000) i)) *primes* (lrange 0)) ⇒ 78498
Note: If you’re familiar with the lazy functional approach, this example may look strange. Why do we use side-effecting generators while we can define a sequence of prime numbers in pure functional way, as follows?
(use gauche.lazy) (define (prime? n) (not (any (^p (zero? (mod n p))) (ltake-while (^k (<= (* k k) n)) *primes*)))) (define (primes-from k) (if (prime? k) (lcons k (primes-from (+ k 2))) (primes-from (+ k 2)))) (define *primes* (llist* 2 3 5 (primes-from 7)))
(The module gauche.lazy
provides ltake-while
, which
is a lazy version of take-while
. We don’t need lazy version
of any
, since it immediately stops when the predicate returns
a true value.)
The use of lcons
and co-recursion in primes-from
is a typical idiom in functional programming. It’s perfectly ok
to do so in Gauche; except that the generator version is much faster
(when you take first 5000 primes, generator version ran 17 times faster
than co-recursion version on the author’s machine).
It doesn’t mean you should avoid co-recursive code; if an algorithm can be expressed nicely in co-recursion, it’s perfectly ok. However, watch out the subtle semantic difference from lazy functional languages—straightforward porting may or may not work.