For Gauche 0.9.5


Next: , Previous: , Up: Library modules - Gauche extensions   [Contents][Index]

9.10 gauche.generator - Generators

Module: gauche.generator

A generator is merely a procedure with no arguments and works as a source of a series of values. Every time it is called, it yeilds a value. The EOF value indicates the generator is exhausted. For example, read-char can be seen as a generator that generates characters from the current input port.

It is common practice to abstract the source of values in such a way, so it is useful to define utility procedures that work on the generators. This module provides them.

Srfi-121 (Generators) is a subset of this module. Since gauche.generator predates srfi-121, we have different names for some procedures; for the compatibility, we provide both names.

A generator is very lightweight, and handy to implement simple on-demand calculations. However, keep in mind that it is side-effecting construct; you can’t safely backtrack, for example. For more functional on-demand calculation, you can use lazy sequences (see Lazy sequences), which is actually built on top of generators.

The typical pattern of using generator is as follows: First you create a source or sources of the values, using one of generator constructors (see Generator constructors) or rolling your own one. You may connect generator operators that modifies the stream of generated items as you wish (see Generator operations). Eventually you need to extract actual values from the geneator to consume; there are utitlity procedures provided (see Generator consumers). Overall, you create a pipeline (or DAG) of generators that works as lazy value-propagation network.


Next: , Previous: , Up: Generators   [Contents][Index]

9.10.1 Generator constructors

A generator isn’t a special datatype but just an ordinary procedure, so you can make a generator with lambdas. This module provides some common generator constructors for the convenience.

If you want to use your procedure as a generator, note that a generator can be invoked many times even after it returns EOF once. You have to code it so that once it returns EOF, it keeps returning EOF for the subsequent calls.

The result of generator constructors is merely a procedure, and printing it doesn’t show much. In the examples in this section we use generator->list to convert the generator to the list. See Generator consumers for the description of generator->list.

Function: null-generator

An empty generator. Returns just an EOF object when called.

Function: circular-generator arg …

Returns an infinite generator that repeats the given arguments.

(generator->list (circular-generator 1 2 3) 10)
  ⇒ (1 2 3 1 2 3 1 2 3 1)

Note that the above example limits the length of the converted list by 10; otherwise generator->list won’t return.

Function: giota :optional (count +inf.0) (start 0) (step 1)

Like iota (see List constructors), creates a generator of a series of count numbers, starting from start and increased by step.

(generator->list (giota 10 3 2))
  ⇒ (3 5 7 9 11 13 15 17 19 21)

If both start and step are exact, the generator yields exact numbers; otherwise it yields inexact numbers.

(generator->list (giota +inf.0 1/2 1/3) 6)
  ⇒ (1/2 5/6 7/6 3/2 11/6 13/6)
(generator->list (giota +inf.0 1.0 2.0) 5)
  ⇒ (1.0 3.0 5.0 7.0 9.0)
Function: grange start :optional (end +inf.0) (step 1)

Similar to giota, creates a generator of a series of numbers. The series begins with start, increased by step, and continues while the number is below end.

(generator->list (grange 3 8))
  ⇒ (3 4 5 6 7)
Function: generate proc

Creates a generator from coroutine.

The proc argument is a procedure that takes one argument, yield. When called, generate immediately returns a generator G. When G is called, the proc runs until it calls yield. Calling yield causes to suspend the execution of proc and G returns the value passed to yield.

Once proc returns, it is the end of the series—G returns eof object from then on. The return value of proc is ignored.

The following code creates a generator that produces a series 0, 1, and 2 (effectively the same as (giota 3) and binds it to g.

(define g
  (generate
   (^[yield] (let loop ([i 0])
               (when (< i 3) (yield i) (loop (+ i 1)))))))

(generator->list g) ⇒ (0 1 2)
Function: list->generator lis :optional start end
Function: vector->generator vec :optional start end
Function: reverse-vector->generator vec :optional start end
Function: string->generator str :optioanl start end
Function: uvector->generator uvec :optional start end
Function: bytevector->generator u8vector :optional start end

[SRFI-121+] Returns a generator that yields each item in the given argument. A generator returned from reverse-* procedures runs in reverse order. Srfi-121 defines these except uvector->generator, which can take any type of uniform vectors. The srfi-121 version, bytevector->generator, limits the argument to u8vector.

(generator->list (list->generator '(1 2 3 4 5)))
  ⇒ (1 2 3 4 5)
(generator->list (vector->generator '#(1 2 3 4 5)))
  ⇒ (1 2 3 4 5)
(generator->list (reverse-vector->generator '#(1 2 3 4 5)))
  ⇒ (5 4 3 2 1)
(generator->list (string->generator "abcde"))
  ⇒ (#\a #\b #\c #\d #\e)
(generator->list (uvector->generator '#u8(1 2 3 4 5)))
  ⇒ (1 2 3 4 5)

The generator is exhausted once all items are retrieved; the optional start and end arguments can limit the range the generator walks across; start specifies the left bound and end specifies the right bound.

For forward generators, the first value the generator yields is start-th element, and it ends right before end-th element. For reverse generators, the first value is the item right next to the end-th element, and the last value is the start-th element. at the last element, and reverse generators ends at the first element.

(generator->list (vector->generator '#(a b c d e) 2))
  ⇒ (c d e)
(generator->list (vector->generator '#(a b c d e) 2 4))
  ⇒ (c d)
(generator->list (reverse-vector->generator '#(a b c d e) 2))
  ⇒ (e d c b)
(generator->list (reverse-vector->generator '#(a b c d e) 2 4))
  ⇒ (d c)
(generator->list (reverse-vector->generator '#(a b c d e) #f 2))
  ⇒ (b a)
Function: bits->generator n :optional start end
Function: reverse-bits->generator n :optional start end

These procedures take an exact integer and treat it as a sequence of boolean values (0 for false and 1 for true), as integer->list does (see Integers as bits). Bits->generator takes bits from LSB, while reverse-bits->generator takes them from MSB.

(generator->list (bits->generator #b10110))
 ⇒ (#f #t #t #f #t)
(generator->list (reverse-bits->generator #b10110))
 ⇒ (#t #f #t #t #f)

The optional start and/or end arguments are used to specify the range of bitfield, LSB being 0. Unlike list->generator etc, start specifies the rightmost position (inclusive) and end specfies the leftmost position (exclusive). It is consistent with other procedures that accesses bit fields in integers (see Integers as bits).

(generator->list (bits->generator #x56 0 4)
  ⇒ (#f #t #t #f)  ; takes bit 0, 1, 2 and 3
(generator->list (bits->generator #x56 4 8)
  ⇒ (#t #f #t #f)  ; takes bit 4, 5, 6 and 7

(generator->list (reverse-bits->generator #x56 4 8)
  ⇒ (#f #t #f #t)  ; takes bit 7, 6, 5 and 4 
Function: port->sexp-generator input-port
Function: port->line-generator input-port
Function: port->char-generator input-port
Function: port->byte-generator input-port

Returns a generator that reads characters or bytes from the given port, respectively. They’re just (cut read input-port), (cut read-line input-port), (cut read-char input-port) and (cut read-byte input-port), respectively, but we provide them for completeness.

Generic function: x->generator obj

A generic version to convert any collection obj to a generator that walks across the obj. Besides, if obj is an input port, port->char-generator is called.

Function: file->generator filename reader . open-args

Opens a file filename, and returns a generator that reads items from the file by a procedure reader, which takes one argument, an input port. The arguments open-args are passed to open-input-file

The file is closed when the generator is exhausted. If a generator is abandoned before being read to the end, then the file is kept open until the generator is garbage-collected. If you want to make sure the file is closed by a certain point of time, you might want to use a reader procedure as a generator within the dynamic extent of with-input-from-file etc.

Function: file->sexp-generator filename . open-args
Function: file->char-generator filename . open-args
Function: file->line-generator filename . open-args
Function: file->byte-generator filename . open-args

Returns a generator that reads a series of sexps, characters, lines and bytes from a file filename, respectively. These are versions of file->generator specialized by read, read-char, read-line and read-byte as the reader argument.

Like file->generator, open-args are passed to open-input-file (see File ports). The file is closed when the generator is exhausted.

Function: gunfold p f g seed :optional tail-gen

A generator constructor similar to unfold (see SRFI-1 List utilities).

P is a predicate that takes a seed value and determines where to stop. F is a procedure that calculates a value from a seed value. G is a procedure that calculates the next seed value from the current seed value. Tail-gen is a procedure that takes the last seed value and returns a generator that generates the tail.

For each call of the resulting generator, p is called with the current seed value. If it returns a true, then we see we’ve done, and tail-gen is called (if it is given) to get a generator for the tail. Otherwise, we apply f on the current seed value to get the value to generate, and use g to update the seed value.

(generator->list (gunfold (^s (> s 5)) (^s (* s 2)) (^s (+ s 1)) 0))
  ⇒ '(0 2 4 6 8 10)

SRFI-121 compatible procedures

Function: generator item …

[SRFI-121] Returns a generator that generates item ….

Function: make-iota-generator count :optional start step

[SRFI-121] Same as giota, except that the count argument is required.

Function: make-range-generator start :optional end stop

[SRFI-121] Same as grange.

Function: make-coroutine-generator proc

[SRFI-121] Same as generate.

Function: make-bits-generator n

[SRFI-121] (make-bits-generator n) is the same as (bits->generator n). This doesn’t take optional start/end arguments, though. Srfi-121 doesn’t have corresponding procedure for reverse-bits->generator.

Function: make-for-each-generator for-each obj

[SRFI-121] Given collection obj and walker for-each, creates a generator that retrieves one item at a time from the collection. Trivially defined as follows:

(define (make-for-each-generator for-each coll)
  (generate (^[yield] (for-each yield coll))))

If obj is mutated before the returned generator walks all the values, the behavior depends on how the for-each procedure handles the situation; it may or may not be safe. In general it’s better to avoid mutation until the generator returns EOF. Once the generator is exhausted, though, it is safe to mutate obj.

Function: make-unfold-generator stop? mapper successor seed

[SRFI-121] This is the same as gunfold, except it doesn’t take optional tail-gen argument.


Next: , Previous: , Up: Generators   [Contents][Index]

9.10.2 Generator operations

The following procedures take generators (noted as gen and gen2) and return a generator. For the convenience, they also accept any collection to gen and gen2 parameters; if a collection is passed where a generator is expected, it is implicitly coerced into a generator.

(NB: This is Gauche’s extension. For portable srfi-121 programs, you shouldn’t rely on this behavior; instead, explicitly convert collections to generators.)

Function: gcons* item … gen

[SRFI-121] Returns a generator that adds items in front of gen.

(generator->list (gcons* 'a 'b (giota 2)))
 ⇒ (a b 0 1)
Function: gappend gen …

[SRFI-121] Returns a generator that yields the items from the first given generator, and once it is exhausted, use the second generator, and so on.

(generator->list (gappend (giota 3) (giota 2)))
 ⇒ (0 1 2 0 1)

(generator->list (gappend))
 ⇒ ()
Function: gconcatenate gen

The gen argument should generate generators and/or sequences. Returns a generator that yields elements from the first generator/sequence, then the second one, then the third, etc.

It is similar to (apply gappend (generator->list gen)), except that gconcatenate can work even gen generates infinite number of generators.

($ generator->list $ gconcatenate
   $ list->generator `(,(giota 3) ,(giota 2)))
 ⇒ (0 1 2 0 1)
Function: gflatten gen

The argument gen is a generator that yields lists. This procedure returns a generator that’s yield each element of the lists one at a time.

Example: The game Tetris determines the next dropping piece (tetrimino) by the following algorithm: Take a bag of tetriminos with one for each kind (O, I, T, S, Z, L, J), permute it, and draw one by one; once the bag is empty, take another bag and repeat. The algorithm can be implemented by a pipeline of generates as follows. (Tetris is a registered trademark of The Tetris Company).

(use gauche.generator)
(use data.random) ; for permutations-of

(define g
  ($ gflatten $ permutations-of
     $ (circular-generator '(O I T S Z L J))))

(generator->list g 21)
  ⇒
  (L O Z T J S I J L Z T I O S T L Z S I J O)

Note the subtle difference of this example and the example in gconcatenate above—gconcatenate takes a generator of generators, while gflatten takes a generator of lists.

If we use Haskell-ish type notation, you can see the subtle differences of those similar procedures:

gappend            :: (Generator a, Generator a, ...) -> Generator a
(pa$ apply append) :: [(Generator a)] -> Generator a
gconcatenate       :: Generator Generator a -> Generator a
gflatten           :: Generator [a] -> Generator a
Function: gmerge less-than gen gen2 …

Creates a generator that yields elements out of input generators, with the order determined by a procedure less-than. The procedure is called as (less-than a b) and must return #t iff the element a must precede the element b.

Each input generator must yield an ordered elements by itself; otherwise the result won’t be ordered.

If only one generator is given, it is just returned (after coercing the input to a generator). In that case, less-than won’t be called at all.

(generator->list (gmerge < '(1 3 8) '(5) '(2 4)))
  ⇒ '(1 2 3 4 5 8)
Function: gmap proc gen gen2 …

Returns a generator that yields a value returned by proc applied on the values from given generators. The returned generator terminates when any of the given generator is exhausted.

NB: This differs from generator-map (see Folding generated values) which consumes all values at once and returns the results as a list, while gmap returns a generator immediately without consuming input.

Function: gmap-accum proc seed gen gen2 …

A generator version of map-accum (see Mapping over collection), mapping with states.

The proc argument is a procedure that takes as many arguments as the input generators plus one. It is called as (proc v v2 … seed) where v, v2, … are the values yielded from the input generators, and seed is the current seed value. It must return two values, the yielding value and the next seed.

NB: This is called gcombine in srfi-121.

Function: gcombine proc seed gen gen2 …

[SRFI-121] An alias of gmap-accum, provided for the compatibility of srfi-121.

Function: gfilter pred gen
Function: gremove pred gen

[SRFI-121] Returns a generator that yields the items from the source generator gen, except those who makes pred answers false (gfilter) or those who makes pred answers a true value (gremove)

(generator->list (gfilter odd? (grange 0)) 6)
 ⇒ (1 3 5 7 9 11)
(generator->list (gremove odd? (grange 0)) 6)
 ⇒ (0 2 4 6 8 10)
Function: gdelete item gen :optional =

[SRFI-121] Return a generator that yields the items from the source generator gen, except those are the same as item. The comparison is done by the procedure passed to =, which defaults to equal?.

;; Note: This example relies on auto-coercing list to generator.
;; SRFI-121 requires list->generator for the second argument.
(generator->list (gdelete 3 '(1 2 3 4 3 2 1)))
  ⇒  (1 2 4 2 1)
Function: gdelete-neighbor-dups gen :optional =

[SRFI-121] Returns a generator that yields the items from the source generator gen, but the consecutive items of the same value is omitted. The comparison is done by the procedure passed to =, which defaults to equal?.

;; Note: This example relies on auto-coercing list to generator.
;; SRFI-121 requires string->generator for the second argument.
(generator->list (gdelete-neighbor-dups "mississippi"))
  ⇒ (#\m #\i #\s #\i #\s #\i #\p #\i)
Function: gfilter-map proc gen gen2 …

Works the same as (gfilter values (gmap proc gen gen2 …)), only slightly efficiently.

Function: gstate-filter proc seed gen

This allows stateful filtering of a series. The proc argument must be a procedure that takes a value V from the source generator and a seed value. It should return two values, a boolean flag and the next seed value. If it returns true for the boolean flag, the generator yields V. Otherwise, the generator keeps calling proc, with updating the seed value, until it sees the true flag value or the source generator is exhausted.

The following example takes a generator of oscillating values and yields only values that are greater than their previous value.

(generator->list
 (gstate-filter (^[v s] (values (< s v) v)) 0
                (list->generator '(1 2 3 2 1 0 1 2 3 2 1 0 1 2 3))))
 ⇒ (1 2 3 1 2 3 1 2 3)
Function: gbuffer-filter proc seed gen :optional tail-gen

This procedure allows n-to-m mapping between elements in input and output— that is, you can take a look at several input elements to generate one or more output elements.

The procedure proc receives the next input element and accumulated seed value. It returns two values: A list of output values, and the next seed value. If you need to look at more input to generate output, you can return an empty list as the first value.

If the input reaches the end, tail-gen is called with the current seed value; it should return a list of last output values. If omitted, the output ends when the output of the last call to proc is exhausted (the last seed value is discarded).

Suppose you have a text file. Each line contains a command, but if the line ends with backslash, next line is treated as a continuation of the current line. The following code creates a generator that returns logical lines, that is, the lines after such line continuations are taken care of.

(gbuffer-filter (^[v s]
                  (if-let1 m (#/\\$/ v)
                    (values '() (cons (m 'before) s))
                    (values `(,(string-concatenate-reverse (cons v s))) '())))
                '()
                (file->line-generator "input-file.txt")
                (^[s] `(,(string-concatenate-reverse s))))
Function: gtake gen k :optional padding
Function: gdrop gen k

[SRFI-121] Returns a generator that takes or drops initial k elements from the source generator gen.

Those won’t complain if the source generator is exhausted before generating k items. By default, the generator returned by gtake terminates as the source ends, but if you give the optional padding argument, then the returned generator does yield k items, using the value given to padding to fill the rest.

Note: If you pass padding, gtake always returns a generator that generates exactly k elements even the input generator is already exhausted—there’s no general way to know whether you’ve reached the end of the input. If you need to take k items repeatedly from the input generator, you may want to use gslices below.

Note for the compatibility: Until 0.9.4, gtake takes two optional arguments, fill? and padding. That is consistent with Gauche’s builtin take*, but incompatible to srfi-121’s gtake. We think srfi-121’s interface is more compact and intuitive, so we renamed the original one to gtake* (emphasizing the similarity to take*), and made gtake compatible to srfi-121. To ease transition, the current gtake allows two optional arguments (four in total), in which case we assume the caller wants to call gtake*; so the code that gives two optional arguments to gtake would work in both pre-0.9.4 and 0.9.5.

Function: gtake* gen k :optional fill? padding

A variation of gtake; instead of single optional padding argument, this takes two optional arguments just like take* (See List accessors and modifiers.) Up to 0.9.4 this version is called gtake. This is provided for the backward compatibility.

Function: gtake-while pred gen
Function: gdrop-while pred gen

[SRFI-121] The generator version of take-while and drop-while (see List accessors and modifiers). The generator returned from gtake-while yields items from the source generator as far as pred returns true for each. The generator returned from gdrop-while first reads items from the source generator while pred returns true for them, then start yielding items.

Function: gslices gen k :optional (fill? #f) (padding #f)

The generator version of slices (see List accessors and modifiers). This returns a generator, that yields a list of k items from the input generator gen at a time.

(generator->list (gslices (giota 7) 3))
  ⇒ ((0 1 2) (3 4 5) (6))

The fill? and padding arguments controls how the end of input is handled, just like gtake. When fill? is #f (default), the last item from output generator may not have k items if the input is short to fill them, as shown in the above example. If fill? is true and the input is short to complete k items, padding argument is used to fill the rest.

(generator->list (gslices (giota 6) 3 #t 'x))
  ⇒ ((0 1 2) (3 4 5))
(generator->list (gslices (giota 7) 3 #t 'x))
  ⇒ ((0 1 2) (3 4 5) (6 x x))
Function: grxmatch regexp gen

The gen argument must be, after coerced, a generator that yields characters.

A generator returned from this procedure tries to match regexp from the character sequence generated by gen, and once it matches, remember the position after the match and returns #<rxmatch> object. If no more match is found, the generator is exhausted.

($ generator->list
   $ gmap rxmatch-substring
   $ grxmatch #/\w+/ "The quick brown fox jumps over the lazy dog.")
 ⇒ ("The" "quick" "brown" "fox" "jumps" "over" "the" "lazy" "dog")

Note: This procedure is efficient if gen is a string, in which case we actually bypass coercing it to a generator. If gen is other than a string, the current implementation may need to apply regexp as many times as O(n^2) where n is the entire length of the character sequence generated by gen, although the coefficient is small. This may be improved in future, but be careful using this function on very large input.

Note also that, when gen is not a string, rxmatch is applied on some buffered partial input. So rxmatch-after of the returned match does not represents the whole “rest of input” after the match, but merely the rest of strings within the buffer.

Function: gindex vgen igen

[SRFI-121] Both arguments are generators. The igen generator must yield monotonically increasing series of exact nonnegative integers.

Returns a generator that generates items from vgen indexed by the numbers from igen, exhausted when either source generator is exhausted.

An error is thrown when igen yields a value that doesn’t conform the condition.

;; This example takes advantage of Gauche's auto-coercing
;; list to generator.  For portable srfi-121 programs,
;; you need list->generator for each argument:
(generator->list (gindex '(a b c d e) '(0 2 3)))
  ⇒ (a c d)
Function: gselect vgen bgen

[SRFI-121] Both arguments are generators. Creates and returns a generator that yields a value from vgen but only the corresponding value from bgen is true.

The returned generator is exhausted when one of the source generators is exhausted.

;; This example takes advantage of Gauche's auto-coercing
;; list to generator.  For portable srfi-121 programs,
;; you need list->generator for each argument:
(generator->list (gselect '(a b c d e) '(#t #t #f #t #f)))
  ⇒ (a b d)

Combined with a bitgenerator, you can use gselect to extract items using bitmask:

(generator->list (gselect '(a b c d e)
                           (reverse-bits->generator #x1a)))
  ⇒ (a b d)

Previous: , Up: Generators   [Contents][Index]

9.10.3 Generator consumers

Some generator consumers are built-in. See Folding generated values, for generator-fold, generator-fold-right, generator-for-each, generator-map, and generator-find.

Function: generator->list generator :optional k
Function: generator->reverse-list generator :optional k

[SRFI-121] Reads items from generator and returns a list of them (or a reverse list, in case of generator->reverse-list). By default, this reads until the generator is exhausted. If an optional argument k is given, it must be a nonnegative integer, and the list ends either k items are read, or the generator is exhausted.

Be careful not to pass an infinite generator to this without specifying k—this procedure won’t return but hogs all the memory before crash.

Function: generator->vector gen :optional k
Function: generator->string gen :optional k

[SRFI-121] Extracts items from the generator gen up to k items or until it exhausts, and create a fresh vector or string from the extracted items.

For generator->string, gen must yield a character, or an error is reported.

Function: generator->vector! vector at gen

[SRFI-121] Fill vector from index at with the value yielded from gen. It terminates when gen is exhausted or the index reaches at the end of the vector. Returns the number of items generated.

(define v (vector 'a 'b 'c 'd 'e))

(generator->vector! v 2 (giota))
  ⇒ 3

v ⇒ #(a b 0 1 2)
Macro: glet* (binding …) body body2 …

This captures a monadic pattern frequently appears in the generator code. It is in a similar spirit of and-let*, but returns as soon as the evaluating expression returns EOF, instead of #f as and-let* does.

The binding part can be either (var expr) or ( expr ). The actual definition will explain this syntax clearly.

(define-syntax glet*
  (syntax-rules ()
    [(_ () body body2 ...) (begin body body2 ...)]
    [(_ ([var gen-expr] more-bindings ...) . body)
     (let1 var gen-expr
       (if (eof-object? var)
         var
         (glet* (more-bindings ...) . body)))]
    [(_ ([ gen-expr ] more-bindings ...) . body)
     (let1 var gen-expr
       (if (eof-object? var)
         var
         (glet* (more-bindings ...) . body)))]))
Macro: glet1 var expr body body2 …

This is to glet* as let1 is to let*. In other words, it is (glet* ([var expr]) body body2 …).

Macro: do-generator (var gexpr) body …

This is a generator version of dolist and dotimes (see Binding constructs).

Gexpr is an expression that yields a generator. It is evaluated once. The resulting generator is called repeatedly until it returns EOF. Every time the generator is called, body … are evaluated in the scope where var is bound to the value yielded from the generator.

Like dolist and dotimes, this macro exists for side-effects. You can write the same thing with for-each families, but sometimes this macro makes the imperative code more readable:

(do-generator [line (file->line-generator "filename")]
  ;; do some side-effecting stuff with line
  )
Function: generator-any pred gen
Function: generator-every pred gen

[SRFI-121] Like any and every (see Walking over lists), but works on a generator.

Function: generator-count pred gen

[SRFI-121] Returns the number of items in a generator gen that satisfies pred. As a side effect, gen is exhausted.

Function: generator-unfold gen unfold arg …

[SRFI-121]


Previous: , Up: Generators   [Contents][Index]