gauche.sequence
- Sequence framework ¶Provides a generic operations on sequences. A sequence is a collection with ordered elements. Besides all the operations applicable on collections, you can associate integer index to each element, and apply order-aware operations on the elements.
This module inherits gauche.collection
(see gauche.collection
- Collection framework).
All the collection generic operations can be applied to a sequence as well.
Among Gauche builtin class, lists, vectors and strings are sequences and the specialized methods are defined for them. Other extension types, such as uniform vectors, have the methods as well.
• Fundamental sequence accessors: | ||
• Slicing sequence: | ||
• Mapping over sequences: | ||
• Other operations over sequences: | ||
• Implementing sequences: |
{gauche.sequence
}
Returns index-th element of the sequence seq.
This method enables uniform access for any sequence types.
When index is less than zero, or greater than or equal to the size of the sequence, fallback is returned if provided, or an error is signaled if not.
(ref '(a b c) 1) ⇒ b (ref '#(a b c) 1) ⇒ b (ref "abc" 1) ⇒ #\b
{gauche.sequence
}
Sets value to the index-th element of the sequence seq.
This is the uniform sequence modifier.
Note: Some sequences may not support arbitrary modification by index. For example, if you have a sequence representing a set of sorted integers, you cannot modify i-th element with arbitrary value. Yet such sequence may provide other means of modification, such as inserting or deleting elements.
(let ((x (list 'a 'b 'c))) (set! (ref x 1) 'z) x) ⇒ (a z c) (let ((x (vector 'a 'b 'c))) (set! (ref x 1) 'z) x) ⇒ #(a z c) (let ((x (string #\a #\b #\c))) (set! (ref x 1) #\z) x) ⇒ "azc"
{gauche.sequence
}
{gauche.sequence
}
{gauche.sequence
}
Retrieve a subsequence of the sequence seq, from
start-th element (inclusive) to end-th element (exclusive).
If end is omitted, up to the end of sequence is taken.
The type of the returned sequence is the same as seq.
(subseq '(a b c d e) 1 4) ⇒ (b c d) (subseq '#(a b c d e) 1 4) ⇒ #(b c d) (subseq "abcde" 1 4) ⇒ "bcd" (subseq '(a b c d e) 3) ⇒ (d e)
{gauche.sequence
}
Sets the elements of value-seq from the start-th element
(inclusive) to the end-th element (exclusive) of the sequence seq.
Value-seq can be any sequence, but its size
must be larger than (end - start).
In the second form, end is figured out by the length of value-seq.
(define s (vector 'a 'b 'c 'd 'e)) (set! (subseq s 1 4) '(4 5 6)) s ⇒ #(a 4 5 6 e) (set! (subseq s 0) "ab") s ⇒ #(#\a #\b 5 6 e)
You can use extended fold
, map
, for-each
and other
generic functions on sequences, since a sequence is also a collection.
However, sometimes you want to have index as well as the element itself
during iteration. There are several generic functions for it.
{gauche.sequence
}
Like generic fold
, except kons is given
the index within seq, as the first argument,
as well as each element from seqs and the accrued value.
(fold-with-index acons '() '(a b c)) ⇒ ((2 . c) (1 . b) (0 . a))
{gauche.sequence
}
Like map
, map-to
and for-each
, except proc
receives the index as the first argument.
(map-with-index list '(a b c d) '(e f g h)) ⇒ ((0 a e) (1 b f) (2 c g) (3 d h)) (map-to-with-index <vector> cons '(a b c d)) ⇒ #((0 . a) (1 . b) (2 . c) (3 . d))
{gauche.sequence
}
Finds the first element in seq that satisfies pred
like find
, but returns two values, the index of the element
and the element itself. If no element satisfies pred,
two #f
’s are returned.
(find-with-index char-upper-case? "abraCadabra") ⇒ 4 and #\C (find-with-index char-numeric? "abraCadabra") ⇒ #f and #f
{gauche.sequence
}
Like find
, but returns the index of the first element
that satisfies pred in seq, instead of the element itself.
If no element in seq satisfies pred, #f
is returned.
(find-index char-upper-case? "abraCadabra") ⇒ 4 (find-index char-numeric? "abraCadabra") ⇒ #f
See also list-index
in scheme.list
(see scheme.list
- R7RS lists).
{gauche.sequence
}
Generalization of fold-right
on lists.
Like fold
, this method applies a higher-order function
kons over given sequence(s), passing the "seed" value
whose default is knil. The difference between fold
and fold-right
is the associative order of elements on
which kons is applied.
When we have one sequence, [E0, E1, ..., En]
, fold
and fold-right
work as follows, respectively.
fold: (kons En (kons En-1 (kons ... (kons E1 (kons E1 knil)) ...))) fold-right (kons E0 (kons E1 (kons ... (kons En-1 (kons En knil)) ...)))
This method isn’t defined on <collection>
, since
collections don’t care the order of elements.
{gauche.sequence
}
Both needle and haystack must be sequences.
Searches needle from haystack from the beginning of haystack.
If needle is found, the index in haystack where it begins is
returned. Otherwise #f
is returned. The keyword argument
test is used to compare elements; its default is eqv?
.
(sequence-contains '#(a b r a c a d a b r a) '#(b r a)) ⇒ 1 (sequence-contains '#(a b r a c a d a b r a) '#(c r a)) ⇒ #f
This can be regarded as generalization of string-contains
in SRFI-13
(see String searching).
{gauche.sequence
}
Searches a sequence needle from list,
and if found, breaks list
to two parts—the prefix of list up to right before needle
begins, and the rest—and returns them.
List must be a list, but needle can be any sequence.
Elements are compared by test, defaulted to eqv?
.
(break-list-by-sequence '(a b r a c a d a b r a) '(c a d))
⇒ (a b r a) and (c a d a b r a)
If needle isn’t found in list, it returns list itself
and ()
. This behavior is aligned to span
and break
(see scheme.list
- R7RS lists), which split a list by predicate but
returns the whole list if split condition isn’t met.
(break-list-by-sequence '(a b r a c a d a b r c a) '(c a z))
⇒ (a b r a c a d a b r c a) and ()
The linear update version break-list-by-sequence!
modifies list to create the return value if necessary,
so list must be mutable. The caller must use the return
value instead of relying on side-effects, though,
for list may not be modified.
{gauche.sequence
}
This is an internal routine to search subsequence (needle) inside
larger sequence, using Knuth-Morris-Pratt (KMP) algorithm.
It is used in sequence-contains
,
break-list-by-sequence
and break-list-by-sequence!
.
Returns a procedure that performs one step of KMP search.
The procedure takes two arguments, an element elt
and an index k. It compares elt with
(~ needle k)
, and returns two values—the next index and
a flag indicating the match is completed. When the match is
completed, the next index is equal to the length of needle.
As an edge case, if needle is an empty sequence,
sequence->kmp-stepper
returns #f
.
Elements are compared using test, which is defaulted to eqv?
.
The following is a skeleton of searcher using
sequence->kmp-stepper
. Here we assume haystack
is a list, and we just return whether the needle is
found or not, or needle is empty; you might want to carry
around other info in the loop (e.g. sequence-contains
tracks the current index of haystack in order to
return the found index.)
(if-let1 stepper (sequence->kmp-stepper needle) (let loop ([haystack haystack] [k 0]) (if (null? haystack) 'not-found (receive (k found) (stepper (car haystack) k) ; KMP step (if found 'found (loop (cdr haystack) k))))) 'needle-is-empty)
Note that selection and searching methods for collections can also be applied to sequences. See Selection and searching in collection.
{gauche.sequence
}
Groups consecutive elements in a sequence seq which
have the common key value. A key value of an element is
obtained by applying the procedure key to the element;
the default procedure is identity
.
For each element in seq, key is applied exactly once.
The equal-ness of keys are compared by test procedure,
whose default is eqv?
.
(group-sequence '(1 1 1 2 3 4 4 2 2 3 1 1 3)) ⇒ ((1 1 1) (2) (3) (4 4) (2 2) (3) (1 1) (3)) (group-sequence '(1 1 1 2 3 4 4 2 2 3 1 1 3) :key (cut modulo <> 2))) ⇒ ((1 1 1) (2) (3) (4 4 2 2) (3 1 1 3)) (group-sequence '#("a" "a" "b" "b" "c" "d" "d") :test string=?) ⇒ (("a" "a") ("b" "b") ("c") ("d" "d")) (group-sequence "aabbcdd" :test char=?) ⇒ ((#\a #\a) (#\b #\b) (#\c) (#\d #\d))
This method is similar to Haskell’s group
.
If you want to group elements that are not adjacent,
use group-collection
(see Selection and searching in collection).
If you simply need to reduce each group for one instance,
that is, removing adjacent duplicated elements,
you can use delete-neighbor-dups
below.
{gauche.sequence
}
Group contiguous elements in seq.
(group-contiguous-sequence '(1 2 3 4 7 8 9 11 13 14 16)) ⇒ ((1 2 3 4) (7 8 9) (11) (13 14) (16))
If the keyword argument squeeze is true, each subsequence is represented with its first and last elements, except when the subsequence has only one element.
(group-contiguous-sequence '(1 2 3 4 7 8 9 11 13 14 16) :squeeze #t) ⇒ ((1 4) (7 9) (11) (13 14) (16))
The keyword argument key must be a procedure taking one argument,
and it is applied to every element in the sequence once, to construct
the result. Its default is identity
.
The keyword argument next must be a procedure
taking one argument, which is the key value (whatever key procedure
returns) and must return the “next” key value.
Its default is (^n (+ n 1))
.
The test argument must be a procedure taking two argument
and used to compare two key values. Its default is eqv?
.
(group-contiguous-sequence "AbCdFgH" :key char-upcase :next (^c (integer->char (+ 1 (char->integer c))))) ⇒ ((#\A #\B #\C #\D) (#\F #\G #\H))
{gauche.sequence
}
Returns a sequence of the same type as seq, in which
elements in seq are included in the original order, except
duplicate adjacent elements.
The type of seq must has a builder.
(delete-neighbor-dups '(1 1 1 2 3 4 4 2 2 3 1 1 3)) ⇒ (1 2 3 4 2 3 1 3) (delete-neighbor-dups '#(1 1 1 2 3 4 4 2 2 3 1 1 3)) ⇒ #(1 2 3 4 2 3 1 3) (delete-neighbor-dups "1112344223113") ⇒ "12342313"
Elements are compared with eqv?
by default. You can pass
alternative procedure to test keyword argument; it is always
called as (test x y)
,
where x and y are the contiguous elements in seq.
If elements are compared equal, the first one is kept:
(delete-neighbor-dups "AaaAbBBbCCcc" :test char-ci=?) ⇒ "AbC"
If key is provided, it must be a procedure that takes one arguments. It is applied to each element of seq at most once, and each resulting value is used for the comparison instead of elements themselves.
(delete-neighbor-dups '((1 . "a") (1 . "b") (2 . "c") (2 . "d")) :key car) ⇒ ((1 . "a") (2 . "c"))
The start and end arguments specify indexes in the seq to limit the range to look at. Where start is inclusive, end is exclusive.
(delete-neighbor-dups "1112344223113" :start 3 :end 9) ⇒ "2342"
{gauche.sequence
}
Scan seq from left to right, dropping consecutive duplicated
elements. The result is stored into seq, packed to left.
Note seq must be modifiable by index, i.e. modifier
method
must be defined. The rest of seq will be untouched.
Returns the next index after the last modified entry.
(let1 v (vector 1 1 2 2 3 3 2 2 4 4) (list (delete-neighbor-dups! v) v)) ⇒ (5 #(1 2 3 2 4 3 2 2 4 4))
The semantics of keyword arguments key, test,
start and end are the same as
delete-neighbor-dups
.
(let1 v (vector 1 1 2 2 3 3 2 2 4 4) (list (delete-neighbor-dups! v :start 2) v)) ⇒ (6 #(1 1 2 3 2 4 2 2 4 4))
Note: This method works on any sequence with modifier
method, but it’s not necessarily more efficient than
delete-neighbor-dups
, which creates a new sequence.
If seq is a list or a string, each modification by index takes O(n)
time (for a string even it costs O(n) extra storage), so the
total cost is O(n^2), whereas delete-neighbor-dups
needs O(n) time and storage.
This works best for vectors and alike,
with which it doesn’t cost extra allocation and runs in O(n) time.
{gauche.sequence
}
Operates like delete-neighbor-dups
but reuses storage of seq,
which will be resized by dropping duplicated elements.
Returns the sequence after dupes are removed.
Not all sequences are resizable, so this method won’t be defined
for such sequences. The gauche.sequence
module only
provides this method for <list>
, in which dropping the
middle of the sequence is very efficient as it is just a
single set-cdr!
.
(delete-neighbor-dups-squeeze! (list 1 1 1 2 2 2 3 3 3)) ⇒ (1 2 3) (delete-neighbor-dups-squeeze! (list 1 1 1 2 2 2 3 3 3 4 4) :start 3 :end 9) ⇒ (2 3)
The semantics of keyword arguments key, test,
start and end are the same as
delete-neighbor-dups
.
{gauche.sequence
}
Returns a new sequence of the same type of a which contains
the common prefix of sequences a and b. The types of
a and b doesn’t need to match.
The type of a must have a builder.
For each corresponding element in a and b, the key
procedure is applied (default identity
), then compared with
test procedure (default eqv?
).
(common-prefix '(a b c d e) '(a b c e f)) ⇒ (a b c) (common-prefix "abcef" '#(#\a #\b #\c #\d #\e)) ⇒ "abc"
For strings, SRFI-13 has a specific function with
related feature: string-prefix-length
(see String prefixes & suffixes).
{gauche.sequence
}
Returns a new sequence of the type class which contains
the common prefix of sequences a and b. The types of
a and b doesn’t need to match, and neither
needs to have a builder. The class must be a sequence class
with a builder.
The meanings of keyword arguments are the same as common-prefix
.
(common-prefix-to <list> "abcde" "ABCEF" :test char-ci=?) ⇒ '(#\a #\b #\c)
{gauche.sequence
}
Returns a newly created sequence of the same type as src, in which
the elements are permuted from src according to permuter.
Permuter is a sequence of exact integers. When the k-th element
of permuter is i, the k-th element of the result
is (ref src i)
. Therefore, the size of the result
sequence is the same as the size of permuter. Permuter
can be any kind of sequence, unrelated to the type of src.
It is allowed that the same index i can appear more than once in permuter.
(permute '(a b c d) '(3 2 0 1)) ⇒ (d c a b) (permute '(a b c d) '(0 2)) ⇒ (a c) (permute '(a b c d) '(0 0 1 1 2 2)) ⇒ (a a b b c c)
If an integer in permuter is out of the valid range as the index
of src, then an error is signaled unless fallback is given.
If fallback is given, what value is used depends on the result of
(ref src i fallback)
—which usually returns
fallback for the out-of-range index i.
(permute '#(a b c) '(3 2 1 0) 'foo) ⇒ #(foo c b a) (permute "!,HWdelor" #(2 5 6 6 7 1 -1 3 7 8 6 4 0) #\space) ⇒ "Hello, World!"
{gauche.sequence
}
Like permute
, but the result will be an instance of the given class
instead of the class of src.
(permute-to <string> '(#\a #\b #\c #\d #\r) '(0 1 4 0 2 0 3 0 1 4 0)) ⇒ "abracadabra"
{gauche.sequence
}
Also like permute
, but the result is stored back to src.
Src must be a mutable sequence, and the length of src and
permuter must be the same.
{gauche.sequence
}
Returns a new sequence of the same type and size as src,
in which elements are randomly permuted.
(shuffle '(a b c d e)) ⇒ (e b d c a) (shuffle "abcde") ⇒ "bacde"
This generic function uses SRFI-27 (see srfi.27
- Sources of Random Bits).
By default it uses default-random-source
, but you can pass
an alternative random source by the optional argument.
{gauche.sequence
}
Like shuffle
, except that the result will be an instance of
class instead of the class of src.
{gauche.sequence
}
Like shuffle
, but the result is stored back to src.
Src must be a mutable sequence.
When you define your own aggregate class that holds ordered objects, you can follow the sequence protocol below to make it sequence.
Inherit <sequence>
: You need to derive your class
from <sequence>
, so that sequence-related methods can be uniformly
used. Note that <sequence>
inherits <collection>
, so your
class will also be a collection.
Implement collection protocol: A sequence is also a collection,
and you need to implement a collection protocol
(see Implementing collections).
For the very least, you need to have call-with-iterator
method.
If you want your sequence to be buildable,
you also need to have call-with-builder
method.
All other collection generic functions have
default method on top of these two, so they’ll work; though
you can specialize other collection generic functions if it’s
more efficient. We strongly recommend to specialize size-of
,
for its default method walks over the entire sequence to find its size,
which can be very ineffient (O(N) time).
Implement sequence protocol: You need to define referencer
,
modifier
, at least.
If your sequence is immutable, you can leave modifier
undefined
(the default method raises an error). Again, you may want to
specialize other generic functions if it’s more efficient.
It is recommended to define call-with-reverse-iterator
as well,
which is used for operations that walks the sequence in the reverse order
(e.g. fold-right)
. If call-with-reverse-iterator
is not
specialized, the default method uses call-with-iterator
and keeps all the elements temporarily during execution, which
takes O(N) space.