[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

12.61 util.trie - Trie

Module: util.trie

This module provides Trie, a dictionary-like data structure that maps keys to values, where a key is an arbitrary sequence. Internally it stores the data as a tree where each node corresponds to each element in the key sequence. Key lookup is O(n) where n is the length of the key, and not affected much by the number of total entries. Also it is easy to find a set of values whose keys have a common prefix.

The following example may give you the idea.

 
(define t (make-trie))   ;; create a trie

(trie-put! t "pho" 3)    ;; populate the trie
(trie-put! t "phone" 5)
(trie-put! t "phrase" 6)

(trie-get t "phone")  ⇒ 5  ;; lookup

(trie-common-prefix t "pho")       ;; common prefix search
  ⇒ (("phone" . 5) ("pho" . 3))
(trie-common-prefix-keys t "ph")
  ⇒ ("phone" "pho" "phrase")

Tries are frequently used with string keys, but you are not limited to do so; any sequence (See section gauche.sequence - Sequence framework) can be a key. If the types of keys differ, they are treated as different keys:

 
(trie-put! t '(#\p #\h #\o) 8)  ;; different key from "pho"

Trie inheris <collection> and implements collection framework including the builder. So you can apply generic collection operations on a trie (See section gauche.collection - Collection framework). When iterated, each element of a trie appears as a pair of a key and a value.

Class: <trie>

A class for Trie. No slots are intended for public. Use the following procedures to operate on tries.

This class also implements the dictionary interface (See section Generic functions for dictionaries).

Function: make-trie :optional tab-make tab-get tab-put! tab-fold

Creates and returns an empty trie. The optional four arguments are procedures to customize how the nodes of the internal tree are managed.

Each node can have a table to store its child nodes, indexed by an element of the key sequence (e.g. if the trie uses strings as keys, a node’s table is indexed by characters).

tab-make

A procedure with no arguments. When called, creates and returns an empty table for a node.

tab-get tab elt

Returns a child node indexed by elt, or returns #f if the table doesn’t have a child for elt.

tab-put! tab elt child-node

If child-node isn’t #f, stores a child-node with index elt. If child-node is #f, removes the entry with index elt. In both cases, this procedure should return the updated table.

tab-fold tab proc seed

Calls proc for every index and node in tab, while passing a seed value, whose initial value is seed. That is, proc has a type of (index, node, seed) -> seed. Should return the last result of proc.

The default assumes eqv?-hashtables, i.e. the following procedures are used.

 
tab-make: (lambda () (make-hash-table 'eqv?))

tab-get:  (lambda (tab k) (hash-table-get tab k #f))

tab-put!: (lambda (tab k v)
            (if v
              (hash-table-put! tab k v)
              (hash-table-delete! tab k))
            tab)

tab-fold: hash-table-fold

The following example creates a trie using assoc list to manage children, while comparing string keys with case-insensitive way:

 
(make-trie list
           (cut assoc-ref <> <> #f char-ci=?)
           (lambda (t k v)
             (if v
               (assoc-set! t k v char-ci=?)
               (alist-delete! k t char-ci=?)))
           (lambda (t f s) (fold f s t)))

It is important that tab-put! must return an updated table—by that, you can replace the table structure on the fly. For example, you may design a table which uses assoc list when the number of children are small, and then switches to a vector (indexed by character code) once the number of children grows over a certain threshold.

Function: trie params kv …

Construct a trie with the initial contents kv …, where each kv is a pair of a key and a value. Params are a list of arguments which will be given to make-trie to create the trie. The following example creates a trie with two entries and the default table procedures.

 
(trie '() '("foo" . a) '("bar" . b))
Function: trie-with-keys params key …

A convenient version of trie when you only concern the keys. Each value is the same as its key. The following example creates a trie with two entries and the default table procedures.

 
(trie-with-keys '() "foo" "bar")
Function: trie? obj

Returns #t if obj is a trie, or #f otherwise.

Function: trie-num-entries trie

Returns the number of entries in trie.

Function: trie-exists? trie key

Returns #t if trie contains an entry with key, or returns #f otherwise.

Function: trie-get trie key :optional fallback

Returns the value associated with key in trie, if such an entry exists. When there’s no entry for key, if fallback is given, it is returned; otherwise, an error is signaled.

Function: trie-put! trie key value

Puts value associated to key into trie.

Function: trie-update! trie key proc :optional fallback

Works like the following code, except that the lookup of entry in trie is done only once.

 
(let ((val (trie-get trie key fallback)))
  (trie-put! trie key (proc val)))
Function: trie-delete! trie key

Removes an entry associated with key from trie. If there’s no such entry, this procedure does nothing.

Function: trie->list trie

Makes each entry in trie to a pair (key . value) and returns a list of pairs of all entries. The order of entries are undefined.

Function: trie-keys trie
Function: trie-values trie

Returns a list of all keys and values in trie, respectively. The order of keys/values are undefined.

Function: trie->hash-table trie ht-type

Creates a hash table with type ht-type (see Hashtables, about hash table types), and populates it with every key and value pair in trie.

Function: trie-common-prefix trie prefix
Function: trie-common-prefix-keys trie prefix
Function: trie-common-prefix-values trie prefix

Gathers all entries whose keys begin with prefix; trie-common-prefix returns those entries in a list of pairs (key . value); trie-common-prefix-keys returns a list of keys; and trie-common-prefix-values returns a list of values. The order of entries in a returned list is undefined. If trie contains no entry whose key has prefix, an empty list is returned.

Note that prefix matching doesn’t consider the type of sequence; if trie has entries for "foo" and (#\f #\o #\o), (trie-common-prefix trie "foo") will return both entries.

Function: trie-common-prefix-fold trie prefix proc seed

For each entry whose key begins with prefix, calls proc with three argments, the entry’s key, its value, and the current seed value. Seed is used for the first seed value, and the value proc returns is used for the seed value of the next call of proc. The last returned value from proc is returned from trie-common-prefix-fold. The order of entries on which proc is called is undefined. If trie contains no entry whose key has prefix, proc is never called and seed is returned.

Function: trie-common-prefix-map trie prefix proc
Function: trie-common-prefix-for-each trie prefix proc

These are to trie-common-prefix-fold as map and for-each are to fold; trie-common-prefix-map calls proc with key and value for matching entries and gathers its result to a list; trie-common-prefix-for-each also applies proc, but discards its results.

Function: trie-fold trie proc seed
Function: trie-map trie proc
Function: trie-for-each trie proc

These procedures are like their common-prefix versions, but traverse entire trie instead.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]

This document was generated on July 19, 2014 using texi2html 1.82.