For Development HEAD DRAFTSearch (procedure/syntax/module):

9.10 gauche.dictionary - Dictionary framework

Module: gauche.dictionary

A dictionary is an abstract class for objects that can map a key to a value. This module provides some useful generic functions for dictionaries, plus generic dictionary classes built on top of other dictionary classes.


9.10.1 Generic functions for dictionaries

The gauche.dictionary module provides the following generic functions as the dictionary protocol. (Note that call-with-iterator and size-of are from gauche.collection).

The functions marked with * should be specialized by each dictionary’s implementation. The functions marked with + return default value if not specialized. Other functions have fallback implementation that uses the *-marked generic functions, but the implementation may provide specialized method for efficiency.

Predicates

dict-immutable?(+), dict-transparent?(+)

Accessors

dict-get(*), dict-exists?, size-of(*), dict-comparator(*)

Mutators

dict-put!(*), dict-delete!(*), dict-push!, dict-pop!, dict-update!, dict-clear! (If the dictionary is immutable, these methods are not specialized and the default methods throw an error.)

Walkers

call-with-iterator(*), dict-find, dict-any, dict-fold, dict-fold-right, dict-for-each, dict-map, dict-keys, dict-values, dict->alist

These generic functions are useful to implement algorithms common to any dictionary-like objects, a data structure that maps discrete, finite set of keys to values. (Theoretically we can think of continuous and/or infinite set of keys, but implementation-wise it is cleaner to limit the dictionary

Among built-in classes, <hash-table> and <tree-map> implement the dictionary interface. All the <dbm> classes provided by dbm module also implement it.

Predicates

Generic fucntion: dict-immutable? (dict <dictionary>)

{gauche.dictionary} Returns #t if dict is immutable, #f otherwise. Methods that tries to mutate an immutable dictionary throws an error.

If the dictionary implementation does not specialize this method, the default method returns #t.

Generic Function: dict-transparent? (dict <dictionary>)

{gauche.dictionary} If this method returns a true value, the pretty printer displays the dictionary with their contents. Built-in hashtables, for example, are transparent:

gosh$ (alist->hash-table (map-with-index cons "abcdefghijkl") eqv-comparator)
#<hash-table eqv[12]( (1 . #\b) (8 . #\i) (9 . #\j) (2 . #\c) (3 . #\d)
  (10 . #\k) (11 . #\l) (4 . #\e) (5 . #\f) (0 . #\a) (6 . #\g) (7 . #\h))>

Not all dictionaries are like this. For example, a database-backed dictionary may not want to run the query every time the instance is printed.

If the dictionary implementation does not specialize this method, the default method returns #f.

Accessors

Generic function: dict-get (dict <dictionary>) key :optional default

{gauche.dictionary} Returns the value corresponding to the key. If the dictionary doesn’t have an entry with key, returns default when it is provided, or raises an error if not.

Generic function: dict-exists? (dict <dictionary>) key

{gauche.dictionary} Returns #t if the dictionary has an entry with key, #f if not.

Generic function: dict-comparator (dict <dictionary>)

{gauche.dictionary} Should return a comparator used to compare keys.

Note: This method may return #f if a dictionary implementation doesn’t need to compare keys as a whole (e.g. <trie> compares keys element-wise, so no direct key comparator is used. See data.trie - Trie). The caller need to handle such cases.

Mutators

Generic function: dict-put! (dict <dictionary>) key value

{gauche.dictionary} Puts the mapping from key to value into the dictionary.

Generic function: (setter dict-get) (dict <dictionary>) key value

{gauche.dictionary} This works the same as dict-put!.

Generic function: dict-delete! (dict <dictionary>) key

{gauche.dictionary} Removes an entry with key form the dictionary. If the dictionary doesn’t have such an entry, this function is noop.

Generic function: dict-clear! (dict <dictionary>)

{gauche.dictionary} Empties the dictionary. Usually this is much faster than looping over keys to delete them one by one.

Generic function: dict-push! (dict <dictionary>) key value

{gauche.dictionary} A shorthand way to say (dict-put! dict key (cons value (dict-get dict key '()))). A concrete implementation may be more efficient (e.g. it may not search key twice.)

Generic function: dict-pop! (dict <dictionary>) key :optional fallback

{gauche.dictionary} If (dict-get dict key) is a pair p, the entry value is replaced with (cdr p) and the procedure returns (car p). If no entry for key is in the table, or the entry isn’t a a pair, the table isn’t modified, and fallback is returned if given, or an error is raised.

Generic function: dict-update! (dict <dictionary>) key proc :optional fallback

{gauche.dictionary} Works like the following code, except that the concrete implementation may be more efficient by looking up key only once.

(rlet1 x (proc (dict-get dict key fallback))
  (dict-put! dict key x))

Walkers

Generic function: dict-fold (dict <dictionary>) proc seed

{gauche.dictionary} Calls a procedure proc over each entry in a dictionary dict, passing a seed value. Three arguments are given to proc; an entry’s key, an entry’s value, and a seed value. Initial seed value is seed. The value returned from proc is used for the seed value of the next call of proc. The result of the last call of proc is returned from dict-fold.

If dict is <ordered-dictionary>, proc is called in the way to keep the following associative order, where the key is ordered from K0 (minimum) to Kn (maximum), and the corresponding values is from V0 to Vn:

(proc Kn Vn (proc Kn-1 Vn-1 ... (proc K0 V0 seed)))
Generic function: dict-fold-right (dict <ordered-dictionary>) proc seed

{gauche.dictionary} Like dict-fold, but the associative order of applying proc is reversed as follows:

(proc K0 V0 (proc K1 V1 ... (proc Kn Vn seed)))

This generic function is only defined on <ordered-dictionary>.

Generic function: dict-for-each (dict <dictionary>) proc

{gauche.dictionary} Calls proc with a key and a value of every entry in the dictionary dict. For ordered dictionaries, proc is guaranteed to be called in the increasing order of keys.

Generic function: dict-map (dict <dictionary>) proc

{gauche.dictionary} Calls proc with a key and a value of every entry in the dictionary dict, and gathers the result into a list and returns it. For ordered dictionaries, the result is in the increasing order of keys (it doesn’t necessarily mean proc is called in that order).

Generic function: dict-keys (dict <dictionary>)
Generic function: dict-values (dict <dictionary>)

{gauche.dictionary} Returns a list of all keys or values of a dictionary dict, respectively. For ordered dictionaries, the returned list is in the increasing order of keys.

Generic function: dict->alist (dict <dictionary>)

{gauche.dictionary} Returns a list of pairs of key and value in the dictionary. The order of pairs is undefined.

9.10.1.1 Defining dictionary protocol

Macro: define-dict-interface dict-class method proc method2 proc2 …

{gauche.dictionary} Many dictionary-like datatypes already has their own procedures that directly corresponds to the generic dictionary API, and adding dictionary interface tends to become a simple repetition of define-methods, like this:

(define-method dict-put! ((dict <my-dict>) key value)
  (my-dict-put! key value))

The define-dict-interface macro is a convenient way to define those methods in a batch. Each method argument is a keyword that corresponds to dict-method, and proc is the name of the datatype-specific procedure. Here’s the definition of dict interface for <tree-map> and you’ll get the idea. You don’t need to provide every dictionary interface.

(define-dict-interface <tree-map>
  :get        tree-map-get
  :put!       tree-map-put!
  :delete!    tree-map-delete!
  :clear!     tree-map-clear!
  :comparator tree-map-comparator
  :exists?    tree-map-exists?
  :fold       tree-map-fold
  :fold-right tree-map-fold-right
  :for-each   tree-map-for-each
  :map        tree-map-map
  :keys       tree-map-keys
  :values     tree-map-values
  :pop!       tree-map-pop!
  :push!      tree-map-push!
  :update!    tree-map-update!
  :->alist    tree-map->alist
  :transparent? (^_ #t))

9.10.2 Generic dictionaries

Bimap

Class: <bimap>

{gauche.dictionary} Provides a bidirectional map (bimap), a relation between two set of values, of which you can lookup both ways.

Internally, a bimap consists of two dictionaries, left map and right map. Think a bimap as a relation between xs and ys. The left map takes an x as a key and returns corresponding y as its value. The right map takes an y as a key and returns corresponding x as its value.

Currently, <bimap> only supports strict one-to-one mapping. Mutating interface (bimap-*-put!, bimap-*-delete! etc.) modifies both left and right maps to maintain this one-to-one mapping. (In future, we may provide an option to make many-to-one and many-to-many mappings).

A bimap can be used as a dictionary, with the generic dictionary functions such as dict-get. In such cases, the left map takes precedence; that is, the key given to dict-get etc. is regarded as the key to the left map.

Function: make-bimap left-map right-map :key on-conflict

{gauche.dictionary} Creates a new bimap consists of two dictionaries, left-map and right-map. It is the caller’s responsibility to choose appropriate type of dictionaries; for example, if you want to create a relation between a string and a number, you man want to create it like this:

(make-bimap (make-hash-table 'string=?)  ; string -> number
            (make-hash-table 'eqv?))     ; number -> string

The keyword argument on-conflict specifies what will happen when the added entry would conflict the existing entries. The following values are allowed:

:supersede

This is the default behavior. Duplicate relations are silently removed in order to maintain one-to-one mapping. For example, suppose a bimap between strings and numbers has had ("foo", 1) and ("bar", 2). When you try to put ("bar", 2) with this option, the first two entries are removed. Returns #t.

:error

Raises an error when duplicate relations can occur.

#f

When duplicate relations can occur, does nothing and returns #f.

Note: At this moment, an attempt to add a relation exactly same as the existing one is regarded as a conflict. This limitation may be lifted in future.

Function: bimap-left bimap
Function: bimap-right bimap

{gauche.dictionary} Returns the left or right map of bimap, respectively. Do not mutate the returned map, or you’ll break the consistency of the bimap.

Function: bimap-left-get bimap key :optional default
Function: bimap-right-get bimap key :optional default

{gauche.dictionary} Lookup the value corresponding to the key in the left or right map of bimap. If no entry is found for key, default is returned if provided, otherwise an error is raised.

Function: bimap-left-exists? bimap key
Function: bimap-right-exists? bimap key

{gauche.dictionary} Returns #f if the left or right map of bimap has an entry of the key, #t otherwise.

Function: bimap-put! bimap x y :key on-conflict

{gauche.dictionary} Put a relation (x, y) into the bimap. After this, (bimap-left-get x) will return y, and (bimap-right-get y) will return x.

If the bimap already have relations with x and/or y, the conflict is handled according to the value of on-conflict; see make-bimap for the possible values and their meanings. The on-conflict keyword argument can override the bimap’s default setting specified at its creation time.

Function: bimap-left-delete! bimap key
Function: bimap-right-delete! bimap key

{gauche.dictionary} Deletes an relation with the given left key or right key from bimap. Both left and right maps are modified so that the consistency is maintained. If there’s no relations with given key, these are noop.

Stacked map

Class: <stacked-map>

{gauche.dictionary} Stacked map allows you to stack (layer) multiple maps (dictionaries) and treat as if they are a single map. The upper map "shadows" the lower maps. Think of nested scopes, for example.

As a dictionary, it behaves as follows:

  • The dict-get operation searches the given key from the top dictionary to the bottom dictionary, returns the first found entry.
  • The dict-put! operation always put the entry to the topmost dictionary. If there’s an entry with the same key exists in a lower dictionary, it will be shadowed. If you want to alter the existing entry of non-top dictionary, use stacked-map-entry-update!.
  • The dict-delete! operations deletes all the entries with the given key from all the dictionaries. This is for consistency as a dictionary—you don’t want an entry pop up after deleting it. If you want to delete only from the first entry in the stack, use stacked-map-entry-delete!.
  • The dict-fold operation traverse all the dictionaries from top to bottom. When there are duplicate keys, the second and after will be skipped.
Function: make-stacked-map dic dic2 …
Function: make-stacked-map key-comparator dic dic2 …

{gauche.dictionary} Creates a new stacked map with the given dictionaries dic dic2 …, where dic is the topmost dictionary. You can give a comparator key-comparator to be used to compare keys during traversal with dict-fold. If key-comparator is omitted, the comparator from dic is used. The key comparator is returned from dict-comparator on the stacked map.

Method: stacked-map-stack (smap <stacked-map>) (dic <dictionary>)

{gauche.dictionary} Creates a new stacked map, adding dic on top of the existing stacked map smap. The key comparator is inherited from smap.

Method: stacked-map-push! (smap <stacked-map>) (dic <dictionary>)

{gauche.dictionary} Adds dic on top of the dictionaries smap holds.

Method: stacked-map-pop! (smap <stacked-map>)

{gauche.dictionary} Remove the topmost dictionary smap holds. An error is signaled if you attempt to pop from an smap that has no dictionaries.

Method: stacked-map-depth (smap <stacked-map>)

{gauche.dictionary} Returns a number of dictionaries smap has.

Method: stacked-map-entry-update! (smap <stacked-map>) key proc :optional fallback

{gauche.dictionary} Run (dict-update! d key proc fallback) on the first dictionary d that has the given key.

This differs from running dict-update! on smap; if the topmost dictionary doesn’t have an entry with key but some lower dictionary has, dict-update! takes the existing value and applies proc to it, then insert the result to the topmost dictionary, while the original entry remains intact.

Method: stacked-map-entry-delete! (smap <stacked-map>) key

{gauche.dictionary} Deletes the first entry with key. If there’s another entry with key in a lower dictionary, it will become visible.

This differs from running dict-delete! on smap; it deletes all entries with key, so that dict-exists? after dict-delete! is guaranteed to return #f.



For Development HEAD DRAFTSearch (procedure/syntax/module):
DRAFT