srfi-42- Eager comprehensions
This module provides a generic comprehension mechanism, which some other languages (e.g. Haskell and Python) offer as a built-in mechanism. It provides a rich set of operators so it can be used not only as a list generator but as a generic loop construct (actually, some may say it is as powerful/evil as Common Lisp’s loop macro).
It runs eagerly as the name suggests, that is, if it generates a list, it creates the entire list when evaluated, instead of generate the elements on demand. Thus it can’t represent an infinite sequence, which Haskell’s comprehension naturally does. Gauche offers a few alternatives to deal with lazy, possibly infinite, sequences: See Lazy sequences, Generators and Stream library.
Let’s begin with some examples.
Generate a list of squares for the first five integers:
(list-ec (: i 5) (* i i)) ⇒ (0 1 4 9 16)
list-ec is a comprehension macro that generates a list.
The first form
(: i 5) is called a qualifier, which
specifies a set of values to repeat over (here it is each integer
from 0 below 5).
The last form
(* i i) is called a body, which is ordinary Scheme expression
evaluated for each values specified by the qualifier.
A comprehension can have more than one qualifiers.
Next example generate set of pair of numbers
(x y), where
is between 2 (inclusive) and 5 (exclusive),
y is between 1 (inclusive) and x (exclusive).
(list-ec (: x 2 5) (: y 1 x) (list x y)) ⇒ ((2 1) (3 1) (3 2) (4 1) (4 2) (4 3))
The qualifiers works as nested; that is,
(: x 2 5)
specifies to repeat the rest of the clauses—
(: y 1 x) and
(list x y).
The above two examples can be written in Haskell as the followings:
[ i*i | i <- [0..4] ] [ (x,y) | x <- [2..4], y <- [1..x-1] ]
Note the differences: (1) In Haskell, the body expression to yield the elements comes first, followed by qualifiers (selectors). In srfi-42, the body expression comes last. (2) In srfi-42, range operator’s lower bound is inclusive but its upper bound is exclusive.
List a set of numbers
(a b c d), where
a^3+b^3 = c^3+d^3:
(define (taxi-number n) (list-ec (: a 1 n) (: b (+ a 1) n) (: c (+ a 1) b) (: d (+ c 1) b) (if (= (+ (expt a 3) (expt b 3)) (+ (expt c 3) (expt d 3)))) (list a b c d)))
If you want to change values of more than one variable simultaneously, instead of nesting, you can bundle the qualifiers like this:
(list-ec (:parallel (: x '(a b c d)) (: y '(1 2 3 4))) (list x y)) ⇒ ((a 1) (b 2) (c 3) (d 4))
You can generate not only a list, but other sequences:
(vector-ec (: i 5) i) ⇒ #(0 1 2 3 4) (string-ec (: i 5) (integer->char (+ i 65))) ⇒ "ABCDE"
Or apply folding operations:
(sum-ec (: i 1 100) i) ⇒ 4950 ;; sum of integers from 1 below 100. (product-ec (: i 1 10) i) ⇒ 362880 ;; ... and product of them.
Each comprehension takes the following form.
(comprehension-macro qualifier … body)
It evaluates body repeatedly as specified by qualifier …. Depending on the type of comprehension, the results of body may be either collected to create an aggregate (list, vector, string, ...), folded by some operator (sum, product, min, max, ...), or simply discarded.
Each qualifier specifies how to repeat the following qualifiers and body. A qualifier can be a generational qualifier that yields a set of values to loop over, or a control qualifier that specify a condition to exclude some values. See the Qualifiers heading below.
A few comprehensions takes extra values before qualifiers or after body.
[SRFI-42] Repeats body. The results of body is discarded. This is for side-effecting operations.
[SRFI-42] Repeats body and collects the results into a list.
[SRFI-42] Repeats body, which must yield a list. Returns a list which is the concatenation of all lists returned by body.
Repeats body, which must yield a character (in
or a string (in
string-append-ec). Returns a string that
consists of the results of body.
[SRFI-42] Repeats body and collects the results into a vector.
This is like
vector-ec, except that the length of the
result vector is known to be k. It can be more efficient
vector-ec. Unless the comprehension repeats exactly
k times, an error is signaled.
[SRFI-42] body must yield a numeric value. Returns sum of and product of the results, respectively.
[SRFI-42] body must yield a numeric value. Returns maximum and minimum value of the results, respectively. body must be evaluated at least once, or an error is signaled.
Evaluates test for each iteration, and returns
#t as soon as it yields non-
#f as soon as it yields
Unlink the comprehensions introduced above, these stop evaluating
test as soon as the condition meets.
If the qualifiers makes no iteration,
First initializes the result by the value of the expression default,
then start iteration, and returns the value of the first and last
evaluation of body, respectively. In fact,
only evaluates body at most once.
These procedures are most useful when used with control qualifiers.
For example, the following
first-ec returns the first
set of distinct integers (x, y, z),
where x*x+y*y+z*z becomes a square
of another integer w.
(first-ec #f (:integers w) (: z 1 w) (: y 1 z) (: x 1 y) (if (= (* w w) (+ (* x x) (* y y) (* z z)))) (list x y z w))
Note that the first qualifier,
(:integers w), generates
infinite number of integers; if you use
list-ec instead of
first-ec it won’t stop.
[SRFI-42] Reduces the values produced by expr.
Suppose expr produces a sequence of values
x0, x1, …, xN.
calculates the following value:
(proc xN (…(proc x1 (proc x0 seed))…))
It’s similar to
fold, except that proc is evaluated
within the scope of qualifier … so you can refer to the
variables introduced by them. On the other hand,
seed is outside of the scope of qualifiers.
Fold-ec3 is almost the same but the initial value calculation.
fold-ec3, seed is only used when qualifiers
makes no iteration. Otherwise it calculates the following value:
(proc xN (…(proc x1 (init x0))…))
This type of qualifiers generates (possibly infinite) values over which the rest of clauses iterate.
In the following descriptions, vars refers to either
a single identifier, or a series of identifier and a form
(index identifier2). The single identifier
in the former case and the first identifier in the latter case
name the variable to which each generated value is bound.
The identifier2 in the latter case names a variable
to which a series of integers, increasing with each generated
element, is bound. See the following example:
(list-ec (: x '(a b c)) x) ⇒ (a b c) (list-ec (: x (index y) '(a b c)) (cons x y)) ⇒ ((a . 0) (b . 1) (c . 2))
A generic dispatcher of generational qualifiers. An appropriate generational qualifier is selected based on the types of arg1 args ….
Arg1 args … should be all lists, vectors, or strings, respectively. Repeats the subsequent clauses while binding each element from those args bound to vars.
(list-ec (:string c "ab" "cd") c) ⇒ (#\a #\b #\c #\d)
If the arguments given to the generic qualifier
all lists, vectors or strings, then these qualifiers are used.
Generates infinite series of increasing integers, starting from 0.
Generates a series of exact integers, starting from start (defaults to 0) and stops below stop, stepping by step (defaults to 1). Giving a negative integer to step makes a decreasing series.
(list-ec (:range v 5) v) ⇒ (0 1 2 3 4) (list-ec (:range v 3 8) v) ⇒ (3 4 5 6 7) (list-ec (:range v 1 8 2) v) ⇒ (1 3 5 7) (list-ec (:range v 8 1 -2) v) ⇒ (8 6 4 2)
If one, two or three exact integer(s) is/are given to the generic
:, this qualifier is used.
Generates a series of real numbers, starting from start (defaults to 0) and stops below stop, stepping by step (defaults to 1). If all the arguments are exact numbers, the result consists of exact numbers; if any of the arguments are inexact, the result consists of inexact numbers.
(list-ec (:real-range v 5.0) v) ⇒ (0.0 1.0 2.0 3.0 4.0) (list-ec (:real-range v 1 4 1/3) v) ⇒ (1 4/3 5/3 2 7/3 8/3 3 10/3 11/3) (list-ec (:real-range v 1 5.0 1/2) v) ⇒ (1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5)
If one, two or three real numbers is/are given to the generic
:, and any one of them isn’t an exact integer,
then this qualifier is used.
Generates a series of characters, starting from min and
ending at max (inclusive). The characters are enumerated
in the order defined by
char<=? (see Characters).
(list-ec (:char-range v #\a #\e) v) ⇒ (#\a #\b #\c #\d #\e)
If two characters are given to the generic
:, this qualifier is used.
Generates a series of values read from an input port port,
by the procedure read-proc (defaults to
series terminates when EOF is read.
(call-with-input-string "a \"b\" :c" (^p (list-ec (:port v p) v))) ⇒ (a "b" :c)
If one or two arguments are given to the generic
: and the first one is an input port,
then this qualifier is used.
This is Gauche’s extension and not defined in SRFI-42. gen must be a procedure with zero arguments. This qualifier repeats until gen returns EOF.
Gauche has a set of utilities to create and operate on such procedures; see Generators.
(use gauche.generator) (list-ec (:generator v (grange 1 8)) v) ⇒ (1 2 3 4 5 6 7)
If one argument is given to the generic
: and it is applicable without arguments,
then this qualifier is used.
This is used to run through mutiple generators in parallel. It terminates when any one of generator is exhausted.
(list-ec (:parallel (: x '(a b c)) (: y "defg")) (cons x y)) ⇒ ((a . #\d) (b . #\e) (c . #\f)) ;; Compare with this: (list-ec (: x '(a b c)) (: y "defg") (cons x y)) ⇒ ((a . #\d) (a . #\e) (a . #\f) (a . #\g) (b . #\d) (b . #\e) (b . #\f) (b . #\g) (c . #\d) (c . #\e) (c . #\f) (c . #\g))