[ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
Gauche has two primitive lazy evaluation mechanisms.
The first one is an explicit mechanism, defined in the Scheme
standard: You mark
an expression to be evaluated lazily by delay
,
and you use force
to make the evaluation happen when
needed. Gauche also support another primitive lazy
,
as defined in srfi-45, for space-efficient tail-recursive
lazy algorithms.
The second one is a lazy sequence, in which evaluation happens
implicitly. From a Scheme program, a lazy sequence just looks
as a list—you can take its car
and cdr
, and
you can apply map
or other list procedures on it. However,
internally, its element isn’t calculated until it is required.
6.19.1 Delay, force and lazy | ||
6.19.2 Lazy sequences |
[ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
Scheme has traditionally provided an explicit delayed evaluation mechanism
using delay
and force
. After R5RS, however,
it is found that it didn’t mix well with tail-recursive
algorithms: It required unbound memory, despite that the
body of the algorithm could be expressed in iterative manner.
Srfi-45 showed that introducing another primitive syntax lazy
addresses the issue.
For the detailed explanation please look at the srfi-45 document.
Here we explain how to use those primitives.
[SRFI-45][R5RS]
These forms creates a promise that delays the evaluation
of expression. Expression will be evaluated
when the promise is passed to force
.
If expression itself is expected to yield a promise,
you should use lazy
. Othewise, you should use delay
.
If you can think in types, the difference may be clearer.
lazy : Promise a -> Promise a delay : a -> Promise a |
Since we don’t have static typing, we can’t enforce this usage.
The programmer has to choose appropriate one from the context.
Generally, lazy
appearers only to surround the entire
body of function that express a lazy algorithm.
For the real-world example of use of lazy
,
you may want to check the implementation of util.stream
(See section util.stream
- Stream library).
[R5RS] If promise is not a promise, it is just returned.
Otherwise, if promise’s value hasn’t been computed,
force
makes promise’s encapsulated expression
be evaluated, and returns the result.
Once promise’s value is computed, it is memorized in it
so that subsequent force
on it won’t cause the computation.
Returns #t
iff obj is a promise object.
[ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
A lazy sequence is a list-like structure
whose elements are calculated lazily.
Internally we have a special type of pairs, whose cdr
is evaluated on demand.
However, in Scheme level, you’ll never see a distinct
“lazy-pair” type. As soon as you try to access the
lazy pair, Gauche automaticlaly force the delayed
calculation, and the lazy pair turns into an ordinary pair.
It means you can pass lazy sequences to ordinary list-processing
procedures such as car
, cdr
or map
.
Look at the following example; generator->lseq
takes a procedure that generates one value at a time, and
returns a lazy sequence consists of those values.
(with-input-from-file "file" (^[] (let loop ([cs (generator->lseq read-char)] [i 0]) (match cs [() #f] [(#\c (or #\a #\d) #\r . _) i] [(c . cs) (loop cs (+ i 1))])))) |
It returns the position of the first occurrence of
character sequence “car” or “cdr” in the file ‘file’.
Characters are read as needed, so once the sequence is
found, the rest of the files won’t be read. If we do
it eagerly, we would have to read entire file first no matter how
big it is, or to give up using the mighty match
macro and
to write a basic state machine that reads one character one at a time.
Other than implicit forcing, Gauche’s lazy sequences are slightly different than the typical lazy stream implementations in Scheme in the following ways:
cdr
side of the lazy pair is lazily evaluated;
the car
side is evaluated immediately. On the other hand, with
stream-cons
in util.stream
(See section util.stream
- Stream library),
both car
and cdr
sides won’t be evaluated until
it is absolutely needed.
car
part is already
calculated, even if you don’t use it. In most cases
you don’t need to care, for calculating one item more
is a negligible overhead. However, when you create
a self-referential lazy structure, in which the earlier
elements of a sequence is used to caculate the latter
elements of itself, a bit of caution is needed;
a valid code for fully lazy circular structure may not
terminate in Gauche’s lazy sequences. We’ll show a
concrete example later. This bit of eagerness
is also visible when side effects are involved;
for example, lazy character sequence reading from a port
may read one character ahead.
Creates a lazy sequence that consists of items produced
by generator, which is just a procedure with zero arguments that yields
an item at a time. Returning EOF marks the end of the sequence
(EOF itself isn’t included in the sequence).
For example, read-char
can work as a generator.
Gauche has a set of convenient utilities to deal with generators
(See section gauche.generator
- Generators).
In the second form, the returned lazy sequence is prepended by
item …. Since there’s no way to distinguish lazy
pairs and ordinary pairs, you can write it as
(cons* item … (generator->lseq generator))
,
but that’s more verbose.
Internally, Gauche’s lazy sequence is optimized to be built on top of generators, so this procedure is the most efficient way to build lazy sequences.
Returns a lazy pair consists of car and cdr.
The expression car is evaluated at the call of lcons
,
but evaluation of cdr is delayed.
You can’t distinguish a lazy pair from an ordinary pair. If you
access either its car
or cdr
, or even you ask pair?
to it, its cdr part is implicitly forced and you get an ordinary pair.
Unlike cons
, cdr should be an expression that yields
a (lazy or ordinary) list, including an empty list.
In other words, lazy sequences can always be a null-terminated list
when entirely forced; there are no “improper lazy sequences”.
(Since Scheme isn’t statically typed, we can’t force the cdr
expression to be a proper list before actually evaluating it.
Currently if cdr expression yields non-list, we just ignore
it and treat as if it yielded an empty list.)
(define z (lcons (begin (print 1) 'a) (begin (print 2) '()))) ⇒ ; prints '1', since the car part is evaluated eagerly. (cdr z) ⇒ () ;; and prints '2' ;; This also prints '2', for accessing car of a lazy pair forces ;; its cdr, even the cdr part isn't used. (car (lcons 'a (begin (print 2) '()))) ⇒ a ;; So as this; asking pair? to a lazy pair causes forcing its cdr. (pair? (lcons 'a (begin (print 2) '()))) ⇒ #t ;; To clarify: This doesn't print '2', because the second lazy ;; pair never be accessed, so its cdr isn't evaluated. (pair? (lcons 'a (lcons 'b (begin (print 2) '())))) ⇒ #t |
Now, let me show you a case where “one item ahead” evaluation becomes
an issue. The following is an elegant definition of infinite
Fibonacci sequence using self-referential lazy structure
(lmap
is a lazy map, defined in gauche.lazy
module):
(use gauche.lazy) ;; for lmap (define *fibs* (lcons* 0 1 (lmap + *fibs* (cdr *fibs*)))) ;; BUGGY |
Unfortunately, Gauche can’t handle it well.
(car *fibs*) ⇒ 0 (cadr *fibs*) ⇒ *** ERROR: Attempt to recursively force a lazy pair. |
When we want to access the second argument (cadr
) of *fibs*
,
we take the car of the second pair,
which is a lazy pair of 1
and (lmap ...)
. The lazy pair
is forced and its cdr part needs to be calculated. The first thing
lmap
returns needs to see the first and second element of *fibs*
,
but the second element of *fibs*
is what we’re calculating now!
We can workaround this issue by avoiding accessing the immedietely preceding value. Fibonacci numbers F(n) = F(n-1) + F(n-2) = 2*F(n-2) + F(n-3), so we can write our sequence as follows.
(define *fibs* (lcons* 0 1 1 (lmap (^[a b] (+ a (* b 2))) *fibs* (cdr *fibs*)))) |
And this works!
(take *fibs* 20) ⇒ (0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181) |
Many lazy algorithms are defined in terms of fully-lazy cons
at the bottom. When you port such algorithms to Gauche using lcons
,
keep this bit of eagerness in mind.
Note also that lcons
needs to create a thunk to delay
the evaluation. So the algorithm to construct lazy list using
lcons
has an overhead of making closure for each item.
For performance-critical part,
you want to use generator->lseq
whenever possible.
A lazy version of cons*
(See section List constructors).
Both lcons*
and llist*
do the same thing; both
names are provided for the symmetry to cons*
/list*
.
The tail argument should be an expression that yields a (possibly lazy) list. It is evaluated lazily. Note that the preceding elements x … are evaluated eagerly. The following equivalences hold.
(lcons* a) ≡ a (lcons* a b) ≡ (lcons a b) (lcons* a b ... y z) ≡ (cons* a b … (lcons y z)) |
Creates a lazy sequence of numbers starting from start,
increasing by step (default 1), to the maximum value that doesn’t
exceed end. The default of end is +inf.0
,
so it creates an infinite list. (Don’t type just
(lrange 0)
in REPL, or it won’t terminate!)
If any of start or step
is inexact, the resulting sequence
has inexact numbers.
(take (lrange -1) 3) ⇒ (-1 0 1) (lrange 0.0 5 0.5) ⇒ (0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5) (lrange 1/4 1 1/8) ⇒ (1/4 3/8 1/2 5/8 3/4 7/8) |
A lazy version of iota
(See section List constructors); returns
a lazy sequence of count integers, starting from start
(default 0), stepping by step (default 1).
See also gauche.lazy
- Lazy sequence utilities, for more utility procedures
that creates lazy sequences.
Let’s consider calculating an infinite sequence of prime numbers.
Just pretend we already have some prime numbers
calculated in a variable *primes*
, and you need
to calculate a prime number equal to or grater than n
(for simplicity, we assume n is an odd number).
(define (next-prime n) (let loop ([ps *primes*]) (let1 p (car ps) (cond [(> (* p p) n) n] [(zero? (modulo n p)) (next-prime (+ n 2))] [else (loop (cdr ps))])))) |
This procedure loops over the list of prime numbers, and
if no prime number p less than or equal to (sqrt n)
divides n, we can say n is prime. (Actual test
is done by (> (* p p) n)
instead of (> p (sqrt n))
,
for the former is faster.)
If we find some p divides n, we try a new value
(+ n 2)
with next-prime
.
Using next-prime
, we can make a generator that keeps generating
prime numbers. The following procedure returns a generator
that returns primes above last.
(define (gen-primes-above last) (^[] (set! last (next-prime (+ last 2))) last)) |
Using generator->lseq
, we can turn the generator returned
by gen-primes-above
into a lazy list, which can be used
as the value of *prime*
. The only caveat is that we need
to have some pre-calculated prime numbers:
(define *primes* (generator->lseq 2 3 5 (gen-primes-above 5))) |
Be careful not to evaluate *primes*
directly on REPL,
since it contains an infinite list and it’ll blow up your REPL.
You can look the first 20 prime numbers instead:
(take *primes* 20) ⇒ (2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71) |
Or find what the 10000-th prime number is:
(~ *primes* 10000) ⇒ 104743 |
Or count how many prime numbers there are below 1000000:
(any (^[p i] (and (>= p 1000000) i)) *primes* (lrange 0)) ⇒ 78498 |
Note: If you’re familiar with a lazy functional approach, this example may look strange. Why do we use side-effecting generators while we can define a sequence of prime numbers in pure functional way, as follows?
(use gauche.lazy) (define (prime? n) (not (any (^p (zero? (mod n p))) (ltake-while (^k (<= (* k k) n)) *primes*)))) (define (primes-from k) (if (prime? k) (lcons k (primes-from (+ k 2))) (primes-from (+ k 2)))) (define *primes* (llist* 2 3 5 (primes-from 7))) |
(The module gauche.lazy
provides ltake-while
, which
is a lazy version of take-while
. We don’t need lazy version
of any
, since it immediately stops when the predicate returns
a true value.)
The use of lcons
and co-recursion in primes-from
is a typical idiom in functional programming. It’s perfectly ok
to do so in Gauche; except that the generator version is much faster
(when you take first 5000 primes, generator version ran 17 times faster
than co-recursion version in the author’s machine).
It doesn’t mean you should avoid co-recursive code; if an algorithm can be expressed nicely in co-recursion, it’s perfectly ok. However, watch out the subtle semantic difference from lazy functional languages—straightforward porting may or may not work.
[ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] |
This document was generated by Shiro Kawai on May 28, 2012 using texi2html 1.82.