For Gauche 0.9.5


11.6 srfi-13 - String library

Module: srfi-13

Defines a large set of string-related functions. In Gauche, those functions are split into a number of files, and the form (use srfi-13) merely sets up autoloading of those files, so it is unlikely to slow down script startup. See SRFI-13 for the detailed specification and discussion of design issues. This manual serves as a reference for the function API. Some SRFI-13 functions are built into Gauche and are not listed here. Note: The SRFI-13 document suggests that the modules implementing these functions be named “string-lib” and “string-lib-internals”. Gauche uses the name “srfi-13” for consistency.


11.6.1 General conventions

There are a few common factors in the string library API, which I don’t repeat in each function description.

argument convention

The following argument names imply their types.

s, s1, s2

Those arguments must be strings.

char/char-set/pred

This argument can be a character, a character-set object, or a predicate that takes a single character and returns a boolean value. “Applying char/char-set/pred to a character” means: if char/char-set/pred is a character, it is compared to the given character; if char/char-set/pred is a character set, it is checked whether the character set contains the given character; if char/char-set/pred is a procedure, it is applied to the given character. “A character satisfies char/char-set/pred” means that such an application to the character yields a true value.

start, end

Many SRFI-13 functions take these two optional arguments, which limit the operation to the part of the input string from the start-th character (inclusive) to the end-th character (exclusive). When specified, the condition 0 <= start <= end <= length of the string must be satisfied. The default values of start and end are 0 and the length of the string, respectively.

shared variant

Some functions have variants with “/shared” attached to their names. SRFI-13 defines those functions so that they may share part of the input string, for better performance. Gauche doesn’t have a concept of shared strings, and these functions are mere synonyms of their non-shared variants. However, Gauche internally shares the storage of strings, so generally you don’t need to worry about the overhead of copying substrings.

right variant

Most functions work from left to right on the input string. Some functions have variants with “-right” appended to their names, which work from right to left.
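
As a rough illustration of these conventions, the sketch below passes each kind of char/char-set/pred to string-index (described later in this section) and then narrows the search with start and end. The sample string is made up for the example.

```scheme
(use srfi-13)

;; char/char-set/pred given as a character, a char-set, and a predicate.
(string-index "hello world" #\o)              ; ⇒ 4
(string-index "hello world" #[aeiou])         ; ⇒ 1
(string-index "hello world" char-whitespace?) ; ⇒ 5

;; start and end restrict the search to "world" (indices 6 to 11);
;; the result is still an index into the whole string.
(string-index "hello world" #\o 6 11)         ; ⇒ 7

;; The -right variant scans from the right end.
(string-index-right "hello world" #\o)        ; ⇒ 7
```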


11.6.2 String predicates

Function: string-null? s

[SRFI-13] Returns #t if s is an empty string, "".

Function: string-every char/char-set/pred s :optional start end

[SRFI-13] Sees if every character in s satisfies char/char-set/pred. If so, string-every returns the value that is returned at the last application of char/char-set/pred. If any of the applications returns #f, string-every returns #f immediately.

Function: string-any char/char-set/pred s :optional start end

[SRFI-13] Sees if any character in s satisfies char/char-set/pred. If so, string-any returns the value that is returned by the application. If no character satisfies char/char-set/pred, #f is returned.
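
A small sketch of both predicates follows. The lambda passed to string-any is a hypothetical example that returns a number instead of a plain boolean, to show that the value of the application is what comes back.

```scheme
(use srfi-13)

;; Every character is alphabetic, so the result of the last
;; application of the predicate is returned.
(string-every char-alphabetic? "abc")   ; ⇒ #t

;; #\C is not in the char-set, so the scan stops with #f.
(string-every #[a-z] "abC")             ; ⇒ #f

;; string-any returns whatever the predicate returned for the first hit.
(string-any (^c (and (char-numeric? c) (digit->integer c))) "ab7c")  ; ⇒ 7
(string-any char-numeric? "abc")        ; ⇒ #f
```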


11.6.3 String Constructors

Function: string-tabulate proc len

[SRFI-13] proc must be a procedure that takes an integer argument and returns a character. string-tabulate creates a string of length len, whose i-th character is calculated by (proc i).

(string-tabulate
  (lambda (i) (integer->char (+ i #x30))) 10)
 ⇒ "0123456789"

Function: reverse-list->string char-list

[SRFI-13] ≡ (list->string (reverse char-list)).


11.6.4 String selection

Function: substring/shared s start :optional end

[SRFI-13] In Gauche, this is the same as substring, except that the end argument is optional.

(substring/shared "abcde" 2) ⇒ "cde"

Function: string-copy! target tstart s :optional start end

[SRFI-13] Copies a string s into a string target from the position tstart. The target string must be mutable. Optional start and end arguments limit the range of s. If the copied string runs over the end of target, an error is signaled.

(define s (string-copy "abcde"))
(string-copy! s 2 "ZZ")
s ⇒ "abZZe"

It is ok to pass the same string to target and s; this always works even if the regions of source and destination are overlapping.

Function: string-take s nchars
Function: string-drop s nchars
Function: string-take-right s nchars
Function: string-drop-right s nchars

[SRFI-13] Returns the first nchars characters of s as a string (string-take) or the string without the first nchars characters (string-drop). The *-right variations count from the end of the string. It is guaranteed that the returned string is always a copy of s, even if no character is dropped.

(string-take "abcde" 2) ⇒ "ab"
(string-drop "abcde" 2) ⇒ "cde"

(string-take-right "abcde" 2) ⇒ "de"
(string-drop-right "abcde" 2) ⇒ "abc"

Function: string-pad s len :optional char start end
Function: string-pad-right s len :optional char start end

[SRFI-13] If a string s is shorter than len, returns a string of length len where char is padded to the left or right, respectively. If s is longer than len, the rightmost or leftmost len chars are taken. Char defaults to #\space. If start and end are provided, the substring of s is used as the source.

(string-pad "abc" 10)    ⇒ "       abc"
(string-pad "abcdefg" 3) ⇒ "efg"

(string-pad-right "abc" 10) ⇒ "abc       "

(string-pad "abcdefg" 10 #\+ 2 5)
  ⇒ "+++++++cde"

Function: string-trim s :optional char/char-set/pred start end
Function: string-trim-right s :optional char/char-set/pred start end
Function: string-trim-both s :optional char/char-set/pred start end

[SRFI-13] Removes characters that match char/char-set/pred from s. String-trim removes the characters from left of s, string-trim-right does from right, and string-trim-both does from both sides. Char/char-set/pred defaults to #[\s], i.e. a char-set of whitespaces. If start and end are provided, the substring of s is used as the source.

(string-trim "   abc  ")       ⇒ "abc  "
(string-trim-right "   abc  ") ⇒ "   abc"
(string-trim-both "   abc  ")  ⇒ "abc"
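
The examples above rely on the default whitespace char-set. As a further sketch, the calls below pass an explicit char/char-set/pred and, in the last one, a start and end range; the input strings are invented for illustration.

```scheme
(use srfi-13)

;; Trim a specific character instead of whitespace.
(string-trim-both "--abc--" #\-)             ; ⇒ "abc"
(string-trim "007" #\0)                      ; ⇒ "7"

;; Trim with a predicate.
(string-trim-right "abc123" char-numeric?)   ; ⇒ "abc"

;; Only the substring from index 1 to 8, "  abc  ", is used as the source.
(string-trim-both "[  abc  ]" #[\s] 1 8)     ; ⇒ "abc"
```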

11.6.5 String comparison

Function: string-compare s1 s2 proc< proc= proc> :optional start1 end1 start2 end2
Function: string-compare-ci s1 s2 proc< proc= proc> :optional start1 end1 start2 end2

[SRFI-13] Compares two strings s1 and s2 codepoint-wise from the left. When a mismatch is found at the index k of s1, calls proc< with k if s1’s codepoint is smaller than the corresponding codepoint of s2, or calls proc> with k if s1’s codepoint is greater than s2’s. If the two strings are the same, calls proc= with the index of the last compared position in s1.

(string-compare "abcd" "abzd"
                (^i `(< ,i)) (^i `(= ,i)) (^i `(> ,i)))
  ⇒ (< 2)

(string-compare "abcd" "abcd"
                (^i `(< ,i)) (^i `(= ,i)) (^i `(> ,i)))
  ⇒ (= 3)

The optional arguments restrict the range of the input strings; however, the index passed to one of the procedures is always an index from the beginning of s1.

(string-compare "zzabcdyy" "abcz"
   (^i `(< ,i)) (^i `(= ,i)) (^i `(> ,i)) 2 6 0 4)
 ⇒ (< 5)

(string-compare "zzabcdyy" "abcz"
   (^i `(< ,i)) (^i `(= ,i)) (^i `(> ,i)) 2 5 0 3)
 ⇒ (= 4)

The case-insensitive variant, string-compare-ci, compares each codepoint with character-wise case-folding. It won’t consider special case folding such as German eszett.

Function: string= s1 s2 :optional start1 end1 start2 end2
Function: string<> s1 s2 :optional start1 end1 start2 end2
Function: string< s1 s2 :optional start1 end1 start2 end2
Function: string<= s1 s2 :optional start1 end1 start2 end2
Function: string> s1 s2 :optional start1 end1 start2 end2
Function: string>= s1 s2 :optional start1 end1 start2 end2

[SRFI-13] Compares two strings s1 and s2. Optional arguments can limit the portion of the strings to be compared. Comparison is done character-wise.

Note: The builtin procedures string=? etc. can also be used for character-wise string comparison, but they take arguments differently. See String Comparison.
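
A brief sketch of these comparison functions follows. The comments only say “true” for the ordering predicates, since the exact true value returned is not stated above; the strings are arbitrary examples.

```scheme
(use srfi-13)

(string= "abc" "abc")               ; true
(string<> "abc" "abd")              ; true
(string< "abc" "abd")               ; true
(string>= "abd" "abc")              ; true

;; Without ranges, the whole strings differ.
(string= "xxabc" "abcyy")           ; ⇒ #f

;; With ranges, only "abc" of each string is compared:
;; characters 2 to 5 of s1 against characters 0 to 3 of s2.
(string= "xxabc" "abcyy" 2 5 0 3)   ; true
```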

Function: string-ci= s1 s2 :optional start1 end1 start2 end2
Function: string-ci<> s1 s2 :optional start1 end1 start2 end2
Function: string-ci< s1 s2 :optional start1 end1 start2 end2
Function: string-ci<= s1 s2 :optional start1 end1 start2 end2
Function: string-ci> s1 s2 :optional start1 end1 start2 end2
Function: string-ci>= s1 s2 :optional start1 end1 start2 end2

[SRFI-13] Compares two strings s1 and s2 in a case-insensitive way. Optional arguments can limit the portion of the strings to be compared. Case folding and comparison are done character-wise, so they don’t consider case folding that affects multiple characters.

Note: We have two other sets of string comparison operations, both named string-ci=? etc. The builtin version (see String Comparison) does character-wise comparison. The one in gauche.unicode uses full-string case conversion (see Full string case conversion). The R7RS version is the latter.
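
A minimal sketch of the SRFI-13 case-insensitive comparisons, using ASCII-only sample strings so that character-wise case folding is sufficient:

```scheme
(use srfi-13)

(string-ci= "Gauche" "GAUCHE")          ; true
(string-ci< "apple" "BANANA")           ; true
(string-ci<> "Gauche" "Scheme")         ; true

;; The optional ranges work as in string=: "ABC" against "abc".
(string-ci= "xxABC" "abcyy" 2 5 0 3)    ; true
```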

Function: string-hash s :optional bound start end
Function: string-hash-ci s :optional bound start end

[SRFI-13] (Note: Gauche has builtin string-hash and string-ci-hash according to SRFI-128. See Hashing for the details. SRFI-13’s API is upward compatible with SRFI-128’s. The underlying hash algorithm is the same as the builtin ones, so string-hash returns the same value as the builtin one for the same string if the optional arguments are omitted. On the other hand, the builtin string-ci-hash uses string case folding (e.g. German eszett and SS are the same), while SRFI-13’s string-hash-ci uses character-wise case folding. Unless there’s a strong reason, we recommend that new code use the builtin SRFI-128 version instead of SRFI-13’s.)

Calculates hash value of a string s. For string-hash-ci, character-wise case folding is done before calculating the hash value.

If the optional bound argument is given, it must be a positive exact integer, and the return value is limited to be less than it. The optional start and end arguments restrict the calculation to that portion of s.


11.6.6 String Prefixes & Suffixes

Function: string-prefix-length s1 s2 :optional start1 end1 start2 end2
Function: string-suffix-length s1 s2 :optional start1 end1 start2 end2
Function: string-prefix-length-ci s1 s2 :optional start1 end1 start2 end2
Function: string-suffix-length-ci s1 s2 :optional start1 end1 start2 end2

[SRFI-13] Returns the length of the longest common prefix/suffix of two strings, s1 and s2. The optional arguments restrict the range of search. The *-ci variations use case-folding character comparison.

(string-prefix-length "abacus" "abalone")   ⇒ 3
(string-prefix-length "machine" "umbrella") ⇒ 0
(string-suffix-length "peeking" "poking")   ⇒ 4

(string-prefix-length "obvious" "oblivious" 2 7 4 9)
  ⇒ 5

Function: string-prefix? s1 s2 :optional start1 end1 start2 end2
Function: string-suffix? s1 s2 :optional start1 end1 start2 end2
Function: string-prefix-ci? s1 s2 :optional start1 end1 start2 end2
Function: string-suffix-ci? s1 s2 :optional start1 end1 start2 end2

[SRFI-13] Returns true iff s1 is a prefix or suffix of s2, respectively. The optional arguments limit the range of s1 and s2 to look at. The *-ci variations use case-folding character comparison.

(string-prefix? "sch" "scheme")   ⇒ #t
(string-prefix? "lisp" "scheme")  ⇒ #f

(string-prefix? "mit-scheme" "scheme" 4) ⇒ #t

11.6.7 String searching

Function: string-index s char/char-set/pred :optional start end
Function: string-index-right s char/char-set/pred :optional start end

[SRFI-13] Looks for the first element in a string s that matches char/char-set/pred, and returns its index; string-index-right searches from the right end instead. If char/char-set/pred is not found in s, returns #f. Optional start and end limit the range of s to search.

(string-index "Aloha oe" #\a) ⇒ 4
(string-index "Aloha oe" #[Aa]) ⇒ 0
(string-index "Aloha oe" #[\s]) ⇒ 5
(string-index "Aloha oe" char-lower-case?) ⇒ 1
(string-index "Aloha oe" #\o 3) ⇒ 6

See also the Gauche built-in procedure string-scan (String utilities), if you need speed over portability.

Function: string-skip s char/char-set/pred :optional start end
Function: string-skip-right s char/char-set/pred :optional start end

[SRFI-13] Looks for the first element that does not match char/char-set/pred and returns its index. If such element is not found, returns #f. Optional start and end limit the range of s to search.
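
For instance, string-skip can locate where leading or trailing padding ends. This is only a sketch with invented inputs:

```scheme
(use srfi-13)

;; Index of the first non-space character.
(string-skip "   abc" #\space)                 ; ⇒ 3

;; Index of the last non-whitespace character.
(string-skip-right "abc   " char-whitespace?)  ; ⇒ 2

;; Every character matches, so nothing is found.
(string-skip "aaa" #\a)                        ; ⇒ #f
```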

Function: string-count s char/char-set/pred :optional start end

[SRFI-13] Counts the number of elements in s that match char/char-set/pred. Optional start and end limit the range of s to search.
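
A short sketch of string-count with each kind of criterion:

```scheme
(use srfi-13)

(string-count "banana" #\a)                      ; ⇒ 3
(string-count "Hello, World!" char-upper-case?)  ; ⇒ 2

;; Count within "nana" only (from index 2 to the end).
(string-count "banana" #[an] 2)                  ; ⇒ 4
```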

Function: string-contains s1 s2 :optional start1 end1 start2 end2
Function: string-contains-ci s1 s2 :optional start1 end1 start2 end2

[SRFI-13] Looks for a string s2 inside another string s1. If found, returns the index in s1 where the matching string begins. Returns #f otherwise. Optional start1, end1, start2 and end2 limit the range of s1 and s2.

See also the Gauche built-in procedure string-scan (String utilities), if you need speed over portability.
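
A minimal sketch of substring search with string-contains and its case-insensitive variant; the strings are arbitrary:

```scheme
(use srfi-13)

(string-contains "hello world" "o w")       ; ⇒ 4
(string-contains "hello world" "xyz")       ; ⇒ #f

;; The -ci variant ignores case differences.
(string-contains-ci "Hello World" "WORLD")  ; ⇒ 6
```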


11.6.8 String case mapping

Function: string-titlecase s :optional start end
Function: string-titlecase! s :optional start end
Function: string-upcase s :optional start end
Function: string-upcase! s :optional start end
Function: string-downcase s :optional start end
Function: string-downcase! s :optional start end

[SRFI-13] Converts a string s to titlecase, upcase or downcase, respectively. These operations use the character-by-character mapping provided by char-upcase etc. That is, string-upcase and string-downcase can be understood as follows:

(string-upcase s)
  ≡ (string-map char-upcase s)
(string-downcase s)
  ≡ (string-map char-downcase s)

If you need full case mapping that handles the case when a character is mapped to more than one character, use the procedures with the same names in the gauche.unicode module (see Full string case conversion).

The linear-update versions string-titlecase!, string-upcase! and string-downcase! destroy s to store the result. Note that in Gauche, using those procedures doesn’t save anything, since string mutation is expensive by design. They are provided merely for completeness.
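
For plain ASCII input, where character-by-character mapping is all that is needed, the three conversions can be sketched as:

```scheme
(use srfi-13)

(string-upcase "hello, world")     ; ⇒ "HELLO, WORLD"
(string-downcase "Hello, World")   ; ⇒ "hello, world"
(string-titlecase "hello, world")  ; ⇒ "Hello, World"
```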


11.6.9 String reverse & append

Function: string-reverse s :optional start end
Function: string-reverse! s :optional start end

[SRFI-13] Returns a string in which the character positions are reversed from s. string-reverse! modifies s.

(string-reverse "mahalo") ⇒ "olaham"
(string-reverse "mahalo" 3) ⇒ "ola"
(string-reverse "mahalo" 1 4) ⇒ "aha"

(let ((s (string-copy "mahalo")))
  (string-reverse! s 1 5)
  s)
  ⇒ "mlahao"

Function: string-concatenate string-list

[SRFI-13] Concatenates a list of strings.

(string-concatenate '("humuhumu" "nukunuku" "apua" "`a"))
  ⇒ "humuhumunukunukuapua`a"

Function: string-concatenate/shared string-list
Function: string-append/shared s …

[SRFI-13] “Shared” version of string-concatenate and string-append. In Gauche, these are just synonyms of them.

Function: string-concatenate-reverse string-list
Function: string-concatenate-reverse/shared string-list

[SRFI-13] Reverses string-list before concatenation. “Shared” version works the same in Gauche.
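
string-concatenate-reverse fits the common pattern of accumulating pieces by consing them onto the front of a list, which leaves them in reverse order. The sketch below shows the basic call and a hypothetical helper, join-words, that is not part of SRFI-13 or Gauche.

```scheme
(use srfi-13)

(string-concatenate-reverse '("c" "b" "a"))   ; ⇒ "abc"

;; Hypothetical helper: accumulate words and separators in reverse
;; order, then join them with a single call at the end.
(define (join-words words)
  (let loop ((ws words) (acc '()))
    (cond ((null? ws) (string-concatenate-reverse acc))
          ((null? acc) (loop (cdr ws) (list (car ws))))
          (else (loop (cdr ws) (cons (car ws) (cons " " acc)))))))

(join-words '("aloha" "kakahiaka"))   ; ⇒ "aloha kakahiaka"
```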


11.6.10 String mapping

Function: string-map proc s :optional start end
Function: string-map! proc s :optional start end

[SRFI-13] string-map applies proc on every character of s, and collects the results into a string and returns it. On the other hand, string-map! modifies s.

(string-map char-upcase "wikiwiki") ⇒ "WIKIWIKI"
(string-map char-upcase "wikiwiki" 4) ⇒ "WIKI"

(let ((s (string-copy "wikiwiki")))
  (string-map! char-upcase s 4)
  s)
  ⇒ "wikiWIKI"

Function: string-fold kons knil s :optional start end
Function: string-fold-right kons knil s :optional start end

[SRFI-13] Like fold and fold-right (see Walking over lists), but works on a string instead of a list.

(string-fold cons '() "abcde")
  ⇒ (#\e #\d #\c #\b #\a)
(string-fold-right cons '() "abcde")
  ⇒ (#\a #\b #\c #\d #\e)

Function: string-unfold p f g seed :optional base make-final

[SRFI-13] A fundamental string builder. The p, f and g are procedures, each taking the current seed value. The stop predicate p determines when to stop: if it returns a true value, string building stops. The mapping function f returns a character from the current seed value. The next-seed function g returns the next seed value from the current seed value. The seed argument gives the initial seed value.

(string-unfold (^n (= n 10))
               (^n (integer->char (+ n 48)))
               (^n (+ n 1))
               0)
  ⇒ "0123456789"

The optional argument base is, when given, prepended to the result string. Another optional argument make-final is a procedure that takes the last return value of g and returns a string that becomes the suffix of the result string.

(string-unfold (^n (= n 10))
               (^n (integer->char (+ n 48)))
               (^n (+ n 1))
               0 "foo" x->string)
  ⇒ "foo012345678910"

Function: string-unfold-right p f g seed :optional base make-final

[SRFI-13] Another fundamental string builder. The meanings of the arguments are the same as in string-unfold. The only difference is that the string is built right-to-left. The optional base, if given, becomes the suffix of the result, and the result of make-final becomes the prefix.

(string-unfold-right (^n (= n 10))
                     (^n (integer->char (+ n 48)))
                     (^n (+ n 1))
                     0 "foo" x->string)
  ⇒ "109876543210foo"

Function: string-for-each proc s :optional start end

[SRFI-13] Applies proc to each character of string s, from left to right. Optional start and end arguments limit the range of the input string.

Function: string-for-each-index proc s :optional start end

[SRFI-13] Calls proc on each index of the string s.
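
Both procedures are called for side effects. The sketch below records what each one passes to proc, using Gauche’s push! macro to accumulate into a local list; it assumes string-for-each-index visits indices in increasing order, as SRFI-13 describes.

```scheme
(use srfi-13)

;; proc receives each character in turn.
(let ((acc '()))
  (string-for-each (^c (push! acc (char-upcase c))) "abc")
  (reverse acc))          ; ⇒ (#\A #\B #\C)

;; proc receives each index rather than the character.
(let ((acc '()))
  (string-for-each-index (^i (push! acc i)) "abc")
  (reverse acc))          ; ⇒ (0 1 2)
```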


11.6.11 String rotation

Function: xsubstring s from :optional to start end

[SRFI-13] Takes a substring of the infinite repetition of string s between index from (inclusive) and index to (exclusive).

For example, if s is "abcde", we repeat it infinitely in both directions. So the 5n-th character for any integer n is always #\a, which extends to negative n as well.

(xsubstring "abcde" 2 10)
  ⇒ "cdeabcde"
(xsubstring "abcde" -9 -2)
  ⇒ "bcdeabc"

Function: string-xcopy! target tstart s sfrom :optional sto start end

[SRFI-13]


11.6.12 Other string operations

Function: string-replace s1 s2 start1 end1 :optional start2 end2

[SRFI-13] Returns a new string whose content is a copy of a string s1, except that the part beginning at the index start1 (inclusive) and ending at the index end1 (exclusive) is replaced by a string s2. When the optional start2 and end2 arguments are given, s2 is trimmed first according to them. The size of the gap, (- end1 start1), doesn’t need to be the same as the size of the inserted string. Effectively, this is the same as the following code.

(string-append (substring s1 0 start1)
               (substring s2 start2 end2)
               (substring s1 end1 (string-length s1)))
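
To make the equivalence concrete, here are a few sample calls; the strings are made up for the sketch.

```scheme
(use srfi-13)

;; Replace "def" (indices 3 to 6) with "ZZZ".
(string-replace "abcdefghi" "ZZZ" 3 6)       ; ⇒ "abcZZZghi"

;; An empty gap (start1 = end1) turns the call into an insertion.
(string-replace "abcdefghi" "ZZZ" 3 3)       ; ⇒ "abcZZZdefghi"

;; Only "XY" (indices 1 to 3 of s2) is inserted.
(string-replace "abcdefghi" "WXYZ" 3 6 1 3)  ; ⇒ "abcXYghi"
```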

Function: string-tokenize s :optional token-set start end

[SRFI-13] Splits the string s into a list of substrings, where each substring is a maximal non-empty contiguous sequence of characters from the character set token-set. The default of token-set is char-set:graphic (see SRFI-14 Predefined character-set).

See also Gauche’s built-in string-split (see String utilities), which provides similar features but different criteria.
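
A short sketch of string-tokenize, first with the default token-set and then with an explicit char-set of digits. The first input is the example sentence used in the SRFI-13 document.

```scheme
(use srfi-13)

;; Default token-set: runs of graphic (non-whitespace) characters.
(string-tokenize "Help make programs run, run, RUN!")
  ; ⇒ ("Help" "make" "programs" "run," "run," "RUN!")

;; Explicit token-set: only runs of digits become tokens.
(string-tokenize "a1b22c333" #[0-9])
  ; ⇒ ("1" "22" "333")
```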


11.6.13 String filtering

Function: string-filter char/char-set/pred s :optional start end
Function: string-delete char/char-set/pred s :optional start end

[SRFI-13] Returns a string consisting of the characters in a string s that pass (or don’t pass) the test indicated by char/char-set/pred, respectively.

(string-filter char-upper-case? "Hello, World!")
  ⇒ "HW"

(string-delete char-upper-case? "Hello, World!")
  ⇒ "ello, orld!"

(string-delete #\l "Hello, World!")
  ⇒ "Heo, Word!"

(string-filter #[\w] "Hello, World!")
  ⇒ "HelloWorld"

Note: Srfi-13 was revised after finalization to switch the order of the arguments char/char-set/pred and s. At the time of finalization, the order was (string-filter s pred) and Gauche implemented it accordingly. However, most existing implementations follow the revised order, since that was what the srfi-13 reference implementation had.

So, from 0.9.4, we revised the API to comply with the current srfi-13 spec, but we also accept the old order so as not to break old code. We recommend that new code use the new order.


11.6.14 Low-level string procedures

Function: string-parse-start+end proc s args
Function: string-parse-final-start+end proc s args

[SRFI-13]

Macro: let-string-start+end (start end [rest]) proc-exp s-exp args-exp body …

[SRFI-13]

Function: check-substring-spec proc s start end
Function: substring-spec-ok? s start end

[SRFI-13]

Function: make-kmp-restart-vector s :optional c= start end

[SRFI-13]

Function: kmp-step pat rv c i c= p-start

[SRFI-13]

Function: string-kmp-partial-search pat rv s i :optional c= p-start s-start s-end

[SRFI-13]

