For Development HEAD DRAFTSearch (procedure/syntax/module):

11.11 srfi.42 - Eager comprehensions

Module: srfi.42

This module provides a generic comprehension mechanism, which some other languages (e.g. Haskell and Python) offer as a built-in mechanism. It provides a rich set of operators so it can be used not only as a list generator but as a generic loop construct (actually, some may say it is as powerful/evil as Common Lisp’s loop macro).

It runs eagerly as the name suggests, that is, if it generates a list, it creates the entire list when evaluated, instead of generate the elements on demand. Thus it can’t represent an infinite sequence, which Haskell’s comprehension naturally does. Gauche offers a few alternatives to deal with lazy, possibly infinite, sequences: See Lazy sequences, gauche.generator - Generators and util.stream - Stream library.

Eager comprehension examples

Let’s begin with some examples.

Generate a list of squares for the first five integers:

(list-ec (: i 5) (* i i)) ⇒ (0 1 4 9 16)

list-ec is a comprehension macro that generates a list. The first form (: i 5) is called a qualifier, which specifies a set of values to repeat over (here it is each integer from 0 below 5). The last form (* i i) is called a body, which is an ordinary Scheme expression evaluated for each values generated by the qualifier.

A comprehension can have more than one qualifiers. Next example generate set of pair of numbers (x y), where x is between 2 (inclusive) and 5 (exclusive), and y is between 1 (inclusive) and x (exclusive).

(list-ec (: x 2 5) (: y 1 x) (list x y))
  ⇒ ((2 1) (3 1) (3 2) (4 1) (4 2) (4 3))

The qualifiers works as nested; that is, (: x 2 5) specifies to repeat the rest of the clauses—(: y 1 x) and (list x y).

The above two examples can be written in Haskell as the followings:

[ i*i   | i <- [0..4] ]
[ (x,y) | x <- [2..4], y <- [1..x-1] ]

Note the differences: (1) In Haskell, the body expression to yield the elements comes first, followed by qualifiers (selectors). In SRFI-42, the body expression comes last. (2) In SRFI-42, range operator’s lower bound is inclusive but its upper bound is exclusive.

List a set of numbers (a b c d), where a^3+b^3 = c^3+d^3:

(define (taxi-number n)
  (list-ec (: a 1 n)
           (: b (+ a 1) n)
           (: c (+ a 1) b)
           (: d (+ c 1) b)
           (if (= (+ (expt a 3) (expt b 3))
                  (+ (expt c 3) (expt d 3))))
           (list a b c d)))

If you want to change values of more than one variable simultaneously, instead of nesting, you can bundle the qualifiers like this:

(list-ec (:parallel (: x '(a b c d)) (: y '(1 2 3 4)))
         (list x y))
  ⇒ ((a 1) (b 2) (c 3) (d 4))

You can generate not only a list, but other sequences:

(vector-ec (: i 5) i) ⇒ #(0 1 2 3 4)
(string-ec (: i 5) (integer->char (+ i 65))) ⇒ "ABCDE"

Or apply folding operations:

(sum-ec (: i 1 100) i)
  ⇒ 4950    ;; sum of integers from 1 below 100.
(product-ec (: i 1 10) i)
  ⇒ 362880 ;; ... and product of them.

Comprehension macros

Each comprehension takes the following form.

(comprehension-macro qualifier ... body)

It evaluates body repeatedly as specified by qualifier …. Depending on the type of comprehension, the results of body may be either collected to create an aggregate (list, vector, string, ...), folded by some operator (sum, product, min, max, ...), or simply discarded.

Each qualifier specifies how to repeat the following qualifiers and body. A qualifier can be a generational qualifier that yields a set of values to loop over, or a control qualifier that specify a condition to exclude some values. See the Qualifiers heading below.

A few comprehensions takes extra values before qualifiers or after body.

Macro: do-ec qualifier … body

[SRFI-42]{srfi.42} Repeats body. The results of body is discarded. This is for side-effecting operations.

Macro: list-ec qualifier … body

[SRFI-42]{srfi.42} Repeats body and collects the results into a list.

Macro: append-ec qualifier … body

[SRFI-42]{srfi.42} Repeats body, which must yield a list. Returns a list which is the concatenation of all lists returned by body.

Macro: string-ec qualifier … body
Macro: string-append-ec qualifier … body

[SRFI-42]{srfi.42} Repeats body, which must yield a character (in string-ec) or a string (in string-append-ec). Returns a string that consists of the results of body.

Macro: vector-ec qualifier … body

[SRFI-42]{srfi.42} Repeats body and collects the results into a vector.

Macro: vector-of-length-ec k qualifier … body

[SRFI-42]{srfi.42} This is like vector-ec, except that the length of the result vector is known to be k. It can be more efficient than vector-ec. Unless the comprehension repeats exactly k times, an error is signaled.

Macro: sum-ec qualifier … body
Macro: product-ec qualifier … body

[SRFI-42]{srfi.42} body must yield a numeric value. Returns sum of and product of the results, respectively.

Macro: min-ec qualifier … body
Macro: max-ec qualifier … body

[SRFI-42]{srfi.42} body must yield a real number. Returns maximum and minimum value of the results, respectively. body must be evaluated at least once, or an error is signaled.

Macro: any?-ec qualifier … test
Macro: every?-ec qualifier … test

[SRFI-42]{srfi.42} Evaluates test for each iteration, and returns #t as soon as it yields non-#f (for any?-ec), or returns #f as soon as it yields #f (for every?-ec). Unlink the comprehensions introduced above, these stop evaluating test as soon as the condition meets. If the qualifiers makes no iteration, #f and #t are returned, respectively.

Macro: first-ec default qualifier … body
Macro: last-ec default qualifier … body

[SRFI-42]{srfi.42} First initializes the result by the value of the expression default, then start iteration, and returns the value of the first and last evaluation of body, respectively. In fact, first-ec only evaluates body at most once.

These forms are most useful when used with control qualifiers. For example, the following first-ec returns the first set of distinct integers (x, y, z), where x*x+y*y+z*z becomes a square of another integer w.

(first-ec #f (:integers w) (: z 1 w) (: y 1 z) (: x 1 y)
          (if (= (* w w) (+ (* x x) (* y y) (* z z))))
          (list x y z w))

Note that the first qualifier, (:integers w), generates infinite number of integers; if you use list-ec instead of first-ec it won’t stop.

Macro: fold-ec seed qualifier … expr proc
Macro: fold3-ec seed qualifier … expr init proc

[SRFI-42]{srfi.42} Reduces the values produced by expr.

Suppose expr produces a sequence of values x0, x1, …, xN. Fold-ec calculates the following value:

(proc xN (...(proc x1 (proc x0 seed))...))

It’s similar to fold, except that proc is evaluated within the scope of qualifier … so you can refer to the variables introduced by them. On the other hand, seed is outside of the scope of qualifiers.

Fold-ec3 is almost the same but the initial value calculation. In fold-ec3, seed is only used when qualifiers makes no iteration. Otherwise it calculates the following value:

(proc xN (...(proc x1 (init x0))...))

Qualifiers

Generational qualifiers

This type of qualifiers generates (possibly infinite) values over which the rest of clauses iterate.

In the following descriptions, vars refers to either a single identifier, or a series of identifier and a form (index identifier2). The single identifier in the former case and the first identifier in the latter case name the variable to which each generated value is bound. The identifier2 in the latter case names a variable to which a series of integers, increasing with each generated element, is bound. See the following example:

(list-ec (: x '(a b c)) x)
  ⇒ (a b c)
(list-ec (: x (index y) '(a b c)) (cons x y))
  ⇒ ((a . 0) (b . 1) (c . 2))
EC Qualifier: : vars arg1 args …

A generic dispatcher of generational qualifiers. An appropriate generational qualifier is selected based on the types of arg1 args ….

EC Qualifier: :list vars arg1 args …
EC Qualifier: :vector vars arg1 args …
EC Qualifier: :uvector vars arg1 args …
EC Qualifier: :string vars arg1 args …

Arg1 args … should be all lists, vectors, uniform vectors or strings, respectively. Repeats the subsequent clauses while binding each element from those args bound to vars. (The :uvector qualifier is Gauche’s extension.)

(list-ec (:string c "ab" "cd") c) ⇒ (#\a #\b #\c #\d)

If the arguments given to the generic qualifier : are all lists, vectors, uniform vectors or strings, then these qualifiers are used.

EC Qualifier: :integers vars

Generates infinite series of increasing integers, starting from 0.

EC Qualifier: :range vars stop
EC Qualifier: :range vars start stop
EC Qualifier: :range vars start stop step
EC Qualifiter: :range vars range

The first three forms generates a series of exact integers, starting from start (defaults to 0) and stops below stop, stepping by step (defaults to 1). Giving a negative integer to step makes a decreasing series.

(list-ec (:range v 5) v)      ⇒ (0 1 2 3 4)
(list-ec (:range v 3 8) v)    ⇒ (3 4 5 6 7)
(list-ec (:range v 1 8 2) v)  ⇒ (1 3 5 7)
(list-ec (:range v 8 1 -2) v) ⇒ (8 6 4 2)

If one, two or three exact integer(s) is/are given to the generic qualifier :, this qualifier is used.

If a range object (see data.range - Range) is given to this qualifier, as in the fourth form, this generates each element in the range sequentially.

EC Qualifier: :real-range vars stop
EC Qualifier: :real-range vars start stop
EC Qualifier: :real-range vars start stop step

Generates a series of real numbers, starting from start (defaults to 0) and stops below stop, stepping by step (defaults to 1). If all the arguments are exact numbers, the result consists of exact numbers; if any of the arguments are inexact, the result consists of inexact numbers.

(list-ec (:real-range v 5.0) v)
  ⇒ (0.0 1.0 2.0 3.0 4.0)
(list-ec (:real-range v 1 4 1/3) v)
  ⇒ (1 4/3 5/3 2 7/3 8/3 3 10/3 11/3)
(list-ec (:real-range v 1 5.0 1/2) v)
  ⇒ (1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5)

If one, two or three real numbers is/are given to the generic qualifier :, and any one of them isn’t an exact integer, then this qualifier is used.

EC Qualifier: :char-range vars min max

Generates a series of characters, starting from min and ending at max (inclusive). The characters are enumerated in the order defined by char<=? (see Characters).

(list-ec (:char-range v #\a #\e) v)
  ⇒ (#\a #\b #\c #\d #\e)

If two characters are given to the generic qualifier :, this qualifier is used.

EC Qualifier: :port vars port
EC Qualifier: :port vars port read-proc

Generates a series of values read from an input port port, by the procedure read-proc (defaults to read). The series terminates when EOF is read.

(call-with-input-string "a \"b\" :c"
  (^p (list-ec (:port v p) v)))
  ⇒ (a "b" :c)

If one or two arguments are given to the generic qualifier : and the first one is an input port, then this qualifier is used.

EC Qualifier: :generator vars gen

This is Gauche’s extension and not defined in SRFI-42. gen must be a procedure with zero arguments. This qualifier repeats until gen returns EOF.

Gauche has a set of utilities to create and operate on such procedures; see gauche.generator - Generators.

(use gauche.generator)
(list-ec (:generator v (grange 1 8)) v)
  ⇒ (1 2 3 4 5 6 7)

If one argument is given to the generic qualifier : and it is applicable without arguments, then this qualifier is used.

EC Qualifier: :collection vars coll

This is Gauche’s extension and not defined in SRFI-42. coll must be an instance of <collection> or its subclass. (see gauche.collection - Collection framework). This qualifier repeats over the elements in the collection.

If srfi-42 has a specialized qualifier, it is faster to use it (e.g. a vector is also a collection, but using :vector is faster than :collection).

If one argument is given to the generic qualifier : and it is a collection other than the specific types supported directly in SRFI-42, this qualifer is used.

EC Qualifier: :parallel generator …

This is used to run through multiple generators in parallel. It terminates when any one of generator is exhausted.

(list-ec (:parallel (: x '(a b c))
                    (: y "defg"))
  (cons x y))
 ⇒ ((a . #\d) (b . #\e) (c . #\f))

;; Compare with this:
(list-ec (: x '(a b c))
         (: y "defg")
  (cons x y))
 ⇒ ((a . #\d) (a . #\e) (a . #\f) (a . #\g)
     (b . #\d) (b . #\e) (b . #\f) (b . #\g)
     (c . #\d) (c . #\e) (c . #\f) (c . #\g))
EC Qualifier: :let vars expr

Evaluate expr, and bind the result to vars and execure the following clauses once. If vars has an index var, it is bound to 0. It is effectively the same as (:list vars (list expr)).

EC Qualifier: :while g-qualifier expr
EC Qualifier: :until generator expr

Generates values from g-qualifier while/until expr evaluates true. Here, g-qualifier is one of generative EC qualifiers. The variable bound in g-qualifier is visible from expr.

(use math.prime)
(sum-ec (:while (: p (index k) *primes*) (<= k 100)) p)
EC Qualifier: :dispatched vars dispatch arg1 args …
EC Qualifier: :do (lb …) ne1? (ls …)
EC Qualifier: :do (let (ob …) oc …) (lb …) ne1? (let (ib …) ic …) ne2? (ls …)

Control qualifiers

EC Qualifier: if test

Evaluates test, and if it yields false, stops that iteration and start the next iteration.

The following examples returns a list of pythagorian triplets (a b c), which satisfies a^2 + b^2 = c^2, less than 100.

(list-ec (: c 1 100) (: a 1 c) (: b a c)
         (if (= (square c) (+ (square a) (square b))))
         (list a b c))
  ⇒ ((3 4 5) (6 8 10) (5 12 13) (9 12 15) (8 15 17) ...)
EC Qualifier: not test
EC Qualifier: and test …
EC Qualifier: or test …

A shorthand of (if (not test)), (if (and test …)), and (if (or test …)).

EC Qualifier: begin command … expr

During iteration, evaluates command … for side effects before evalutes expr. The result of commands are discarded.

EC Qualifier: nested qualifier …

Splices qualifier …. For example, (list-ec (: a 2) (nested (: b 2) (: c 2)) (: d 2) (list a b c d)) is equivalent to (list-ec (: a 2) (: b 2) (: c 2) (: d 2) (list a b c d)).



For Development HEAD DRAFTSearch (procedure/syntax/module):
DRAFT