For Development HEAD DRAFT

12.28 dbm - Generic DBM interface

Module: dbm

DBM-like libraries provide an easy way to store values in a file, indexed by keys. You can think of such a database as a persistent associative memory.

This module defines the <dbm> abstract class, which provides a common interface to various DBM-type database packages. As long as you operate on an already opened database, importing the dbm module is enough.

To create or open a database, you need a concrete implementation of the database. With the default build-time configuration, the following implementations are included in Gauche. Bindings to various other dbm-like libraries are available as extension packages. Each module defines its own low-level accessing functions as well as the common interface. Note that your system may not have one or more of those DBM libraries; Gauche defines only what the system provides.

dbm.fsdbm

file-system dbm (see dbm.fsdbm - File-system dbm).

dbm.gdbm

GDBM library (see dbm.gdbm - GDBM interface).

dbm.ndbm

NDBM library (see dbm.ndbm - NDBM interface).

dbm.odbm

DBM library (see dbm.odbm - Original DBM interface).

The following code shows a typical usage of the database.

(use dbm)         ; dbm abstract interface
(use dbm.gdbm)    ; dbm concrete interface

; open the database
(define *db* (dbm-open <gdbm> :path "mydb" :rw-mode :write))

; put the value to the database
(dbm-put! *db* "key1" "value1")

; get the value from the database
(define val (dbm-get *db* "key1"))

; iterate over the database
(dbm-for-each *db* (lambda (key val) (foo key val)))

; close the database
(dbm-close *db*)

The <dbm> abstract class implements the collection and dictionary frameworks (see gauche.collection - Collection framework and gauche.dictionary - Dictionary framework, respectively).
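The typical-usage example above requires the GDBM library. As a minimal sketch of the same flow that runs on a default build, the following uses dbm.fsdbm instead. The scratch path under the temporary directory is an assumption made for illustration; any writable path works.

```scheme
(use dbm)
(use dbm.fsdbm)
(use file.util)   ; build-path, temporary-directory

;; Hypothetical scratch location for this sketch.
(define usage-path
  (build-path (temporary-directory)
              (format "dbm-usage-~a" (sys-getpid))))

;; :create starts from an empty database.
(define db (dbm-open <fsdbm> :path usage-path :rw-mode :create))

(dbm-put! db "key1" "value1")
(define val (dbm-get db "key1"))

;; Print every key/value pair.
(dbm-for-each db (lambda (k v) (print k " => " v)))

(dbm-close db)

;; Clean up the scratch database.
(dbm-db-remove <fsdbm> usage-path)
```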


12.28.1 Opening and closing a dbm database

Class: <dbm>

{dbm} An abstract class for dbm-style databases. It inherits <dictionary> (see gauche.dictionary - Dictionary framework) and defines the common database operations. This class has the following instance slots, which must be set before the database is actually opened by dbm-open.

A concrete class may add more slots for finer control over the database, such as locking.

Instance Variable of <dbm>: path

Pathname of the dbm database. Some dbm implementations may append suffixes to this.

Instance Variable of <dbm>: rw-mode

Specifies the read/write mode. It must be one of the following keywords:

:read

The database will be opened in read-only mode. The database file must exist when dbm-open is called.

:write

The database will be opened in read-write mode. If the database file does not exist, dbm-open creates one.

:create

The database will be created and opened in read-write mode. If the database file exists, dbm-open truncates it.

Instance Variable of <dbm>: file-mode

Specifies the file permissions (in the form sys-chmod takes) used when the database is created. The default value is #o664.

Instance Variable of <dbm>: key-convert
Instance Variable of <dbm>: value-convert

By default, you can use only strings for both keys and values. With these options, however, you can specify how to convert other Scheme values to and from the strings stored in the database. The possible values are as follows:

#f

The default value. Keys (values) are not converted. They must be strings.

#t

Keys (values) are converted to their string representation, using write, to be stored in the database, and converted back to Scheme values, using read, when retrieved from the database. The data must have an external representation that can be read back. (This is not checked when the data is written; you'll get an error when you read the data.) Key comparison is done at the string level, so the external representation of the same key must match.

a list of two procedures

Both procedures must take a single argument. The first procedure must receive a Scheme object and return a string. It is used to convert the keys (values) to be stored in the database. The second procedure must receive a string and return a Scheme object. It is used to convert the data stored in the database back to a Scheme object. Key comparison is done at the string level, so the external representation of the same key must match.
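A minimal sketch of the conversion options, assuming dbm.fsdbm is available: keys use #t (converted with write and read), and values use a pair of procedures, here number->string and string->number. The path and the stored data are made up for illustration.

```scheme
(use dbm)
(use dbm.fsdbm)
(use file.util)   ; build-path, temporary-directory

;; Hypothetical scratch location for this sketch.
(define conv-path
  (build-path (temporary-directory)
              (format "dbm-convert-~a" (sys-getpid))))

(define db
  (dbm-open <fsdbm>
            :path conv-path
            :rw-mode :create
            ;; Keys: any object with a readable external representation.
            :key-convert #t
            ;; Values: numbers, stored as their decimal strings.
            :value-convert (list number->string string->number)))

(dbm-put! db 'apples 3)
(dbm-put! db 'pears 4)

;; The value comes back as a number, not as the string "3".
(define apples (dbm-get db 'apples))

;; Because values are numbers again, they can be summed directly.
(define total (dbm-fold db (lambda (k v r) (+ v r)) 0))

(dbm-close db)
(dbm-db-remove <fsdbm> conv-path)
```

Since keys are compared as strings, two keys are the same entry only if write produces identical text for both.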

Metaclass: <dbm-meta>

{dbm} A metaclass of <dbm> and its subclasses.

Method: dbm-open (dbm <dbm>)

{dbm} Opens a dbm database. dbm must be an instance of one of the concrete classes that derive from the <dbm> class, and its slots must be set appropriately. On success, it returns dbm itself. On failure, it signals an error.

Method: dbm-open (dbm-class <dbm-meta>) options …

{dbm} A convenience method that creates a dbm instance and opens it. It is defined as follows.

(define-method dbm-open ((class <class>) . initargs)
  (dbm-open (apply make class initargs)))

The database file is closed when the dbm instance is garbage collected. However, to ensure that modifications are properly synchronized, you should close the database explicitly.

Method: dbm-close (dbm <dbm>)

{dbm} Closes a database dbm. Once the database is closed, any operation to access the database content raises an error.

Method: dbm-closed? (dbm <dbm>)

{dbm} Returns true if a database dbm is already closed, false otherwise.
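One way to make sure a database is closed explicitly, even when an error escapes, is to wrap the work in unwind-protect. The helper call-with-dbm below is hypothetical (it is not part of the dbm module), and the sketch assumes dbm.fsdbm and a scratch path chosen for illustration.

```scheme
(use dbm)
(use dbm.fsdbm)
(use file.util)   ; build-path, temporary-directory

;; Hypothetical helper: open a database, pass it to PROC, and always
;; close it afterwards, whether PROC returns normally or not.
(define (call-with-dbm class path proc)
  (let ((db (dbm-open class :path path :rw-mode :write)))
    (unwind-protect
        (proc db)
      (unless (dbm-closed? db)
        (dbm-close db)))))

;; Hypothetical scratch location for this sketch.
(define scoped-path
  (build-path (temporary-directory)
              (format "dbm-scoped-~a" (sys-getpid))))

(define escaped-db #f)   ; kept only to inspect the state afterwards

(define result
  (call-with-dbm <fsdbm> scoped-path
    (lambda (db)
      (set! escaped-db db)
      (dbm-put! db "greeting" "hello")
      (dbm-get db "greeting"))))

;; The database is already closed at this point.
(define closed-after? (dbm-closed? escaped-db))

(dbm-db-remove <fsdbm> scoped-path)
```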

Function: dbm-type->class dbmtype

{dbm} Sometimes you don’t know which type of dbm implementation you need to use in your application beforehand, but rather you need to determine the type according to the information given at run-time. This procedure fulfills the need.

The dbmtype argument is a symbol that names the type of dbm implementation; for example, gdbm for dbm.gdbm, and fsdbm for dbm.fsdbm. We assume that the dbm implementation of type foo is provided as a module dbm.foo, and that its class is named <foo>.

This procedure first checks if the required module has been loaded, and if not, it tries to load it. If the module loads successfully, it returns the class object of the named dbm implementation. If it can’t load the module, or can’t find the dbm class, this procedure returns #f.

(use dbm)

(dbm-type->class 'gdbm)
  ⇒ #<class <gdbm>>

(dbm-type->class 'nosuchdbm)
  ⇒ #f
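Because dbm-type->class returns #f on failure, callers usually need to check the result before opening anything. A minimal sketch follows; the helper name open-dbm-by-type and the error message are made up for illustration.

```scheme
(use dbm)

;; Hypothetical helper: open a database whose implementation type is
;; known only at run time, e.g. read from a configuration file.
(define (open-dbm-by-type type path)
  (let ((class (dbm-type->class type)))
    (unless class
      (errorf "dbm type ~s is not available on this system" type))
    (dbm-open class :path path :rw-mode :write)))

;; Example call (not executed here):
;;   (open-dbm-by-type 'fsdbm "mydb")
```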

12.28.2 Accessing a dbm database

Once a database is opened, you can use the following methods to access individual key/value pairs.

Method: dbm-put! (dbm <dbm>) key value

{dbm} Stores value in the database, associated with key.

Method: dbm-get (dbm <dbm>) key :optional default

{dbm} Get a value associated with key. If no value exists for key and default is specified, it is returned. If no value exists for key and default is not specified, an error is signaled.

Method: dbm-exists? (dbm <dbm>) key

{dbm} Return true if a value exists for key, false otherwise.

Method: dbm-delete! (dbm <dbm>) key

{dbm} Delete a value associated with key.
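The four accessors can be sketched together as follows, assuming dbm.fsdbm and a scratch path chosen for illustration. The guard form is used only to show that dbm-get without a default signals an error for a missing key.

```scheme
(use dbm)
(use dbm.fsdbm)
(use file.util)   ; build-path, temporary-directory

;; Hypothetical scratch location for this sketch.
(define access-path
  (build-path (temporary-directory)
              (format "dbm-access-~a" (sys-getpid))))

(define db (dbm-open <fsdbm> :path access-path :rw-mode :create))

(dbm-put! db "color" "blue")

;; Existing key: the stored value is returned.
(define found (dbm-get db "color"))

;; Missing key with a default: the default is returned.
(define missing (dbm-get db "shape" #f))

;; Missing key without a default: an error is signaled.
(define raised?
  (guard (e [else #t])
    (dbm-get db "shape")
    #f))

;; After deletion the key no longer exists.
(dbm-delete! db "color")
(define still-there? (dbm-exists? db "color"))

(dbm-close db)
(dbm-db-remove <fsdbm> access-path)
```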


12.28.3 Iterating on a dbm database

To walk over the entire database, the following methods are provided.

Method: dbm-fold (dbm <dbm>) procedure knil

{dbm} The basic iterator. For each key/value pair, procedure is called as (procedure key value r), where r is knil for the first call of procedure, and the return value of the previous call for subsequent calls. Returns the result of the last call of procedure. If the database contains no data, knil is returned.

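The following expression returns the sum of all the integer values. (It assumes that value-convert is set so that values are read back as numbers; with the default #f, every value is a string.)

(dbm-fold dbm (lambda (k v r) (if (integer? v) (+ v r) r)) 0)

Method: dbm-for-each (dbm <dbm>) procedure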

{dbm} For each key/value pair in the database dbm, procedure is called. Two arguments are passed to procedure—a key and a value. The result of procedure is discarded.

Method: dbm-map (dbm <dbm>) procedure

{dbm} For each key/value pair in the database dbm, procedure is called. Two arguments are passed to procedure—a key and a value. The result of procedure is accumulated to a list which is returned as a result of dbm-map.
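A short sketch of the three iterators on a small database, assuming dbm.fsdbm and a scratch path chosen for illustration. The text above does not specify an iteration order, so the sketch sorts the keys collected by dbm-map before using them.

```scheme
(use dbm)
(use dbm.fsdbm)
(use file.util)   ; build-path, temporary-directory

;; Hypothetical scratch location for this sketch.
(define iter-path
  (build-path (temporary-directory)
              (format "dbm-iter-~a" (sys-getpid))))

(define db (dbm-open <fsdbm> :path iter-path :rw-mode :create))

(dbm-put! db "a" "1")
(dbm-put! db "b" "22")
(dbm-put! db "c" "333")

;; dbm-map: collect all keys into a list, then sort for a stable order.
(define keys (sort (dbm-map db (lambda (k v) k)) string<?))

;; dbm-fold: accumulate the total length of all values.
(define total-length
  (dbm-fold db (lambda (k v r) (+ (string-length v) r)) 0))

;; dbm-for-each: called for side effects only.
(dbm-for-each db (lambda (k v) (print k ": " v)))

(dbm-close db)
(dbm-db-remove <fsdbm> iter-path)
```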


12.28.4 Managing dbm database instance

Each dbm implementation has its own way to store the database. Legacy dbm uses two files, whose names are generated by adding .dir and .pag to the value of the path slot. Fsdbm creates a directory under path. If a dbm database is backed by a database server, path may be used only as a key to the database in the server. The following methods hide such variations and provide a convenient way to manage the database itself. You have to pass a class that implements a concrete dbm database as their first argument.

Generic Function: dbm-db-exists? class name

{dbm} Returns #t if a database of class class specified by name exists.

;; Returns #t if testdb.dir and testdb.pag exist
(dbm-db-exists? <odbm> "testdb")

Generic Function: dbm-db-remove class name

{dbm} Removes an entire database of class class specified by name.

Generic Function: dbm-db-copy class from to

{dbm} Copies a database of class class specified by from to to. The integrity of from is guaranteed if the class's dbm implementation supports locking (i.e. you won't get a corrupted database even if some other process is trying to write to from during the copy). If the destination database to exists, its content is destroyed. If this function is interrupted, whether to is left in an incomplete state depends on the dbm implementation. The implementation usually tries its best to provide transactional behavior, that is, to recover the original to when the copy fails. However, for robust operation the caller has to check the state of to if dbm-db-copy fails.

(dbm-db-copy <gdbm> "testdb.dbm" "backup.dbm")

Generic Function: dbm-db-move class from to

{dbm} Moves or renames a database of class class specified by from to to. As with dbm-db-copy, database integrity is guaranteed as long as the class's dbm implementation supports locking. If the destination database to exists, its content is destroyed.
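The management functions take the class, not an open instance. The following sketch walks a database through copy, move and remove, assuming dbm.fsdbm implements all four generic functions; the three scratch paths are made up for illustration.

```scheme
(use dbm)
(use dbm.fsdbm)
(use file.util)   ; build-path, temporary-directory

;; Hypothetical scratch locations for this sketch.
(define (scratch name)
  (build-path (temporary-directory)
              (format "dbm-~a-~a" name (sys-getpid))))
(define src-path   (scratch "src"))
(define bak-path   (scratch "bak"))
(define moved-path (scratch "moved"))

;; Create a small source database and close it before managing it.
(let ((db (dbm-open <fsdbm> :path src-path :rw-mode :create)))
  (dbm-put! db "k" "v")
  (dbm-close db))

;; Copy, then check that the copy exists.
(dbm-db-copy <fsdbm> src-path bak-path)
(define copy-exists? (dbm-db-exists? <fsdbm> bak-path))

;; Move the copy; the old name should no longer exist.
(dbm-db-move <fsdbm> bak-path moved-path)
(define old-name-exists? (dbm-db-exists? <fsdbm> bak-path))

;; The moved database still holds the original content.
(define moved-value
  (let* ((db (dbm-open <fsdbm> :path moved-path :rw-mode :read))
         (v  (dbm-get db "k")))
    (dbm-close db)
    v))

;; Clean up both databases.
(dbm-db-remove <fsdbm> src-path)
(dbm-db-remove <fsdbm> moved-path)
```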


12.28.5 Dumping and restoring dbm database

Most dbm implementations use some kind of binary format, and some of them are architecture dependent. That makes it difficult to pass around dbm databases between different machines. A safe way is to write out the content of a dbm database into some portable format on the source machine, and rebuild another dbm database from it on the destination machine.

This operation is so common that Gauche provides convenience scripts that do the job. They are installed into the standard Gauche library directory, so they can be invoked by gosh <scriptname>.

To write out the content of a dbm database named by dbm-name, you can use the dbm/dump script:

$ gosh dbm/dump [-o outfile][-t type] dbm-name

The outfile argument names the output file. If omitted, the output is written out to stdout. The type argument specifies the implementation type of the dbm database; e.g. gdbm or fsdbm. The program calls dbm-type->class (see Opening and closing a dbm database) on the type argument to load the necessary dbm implementation.

The dumped format is simply a series of S-expressions, each of which is a dotted pair of string key and string value. Character encodings are assumed to be the same as gosh’s native character encoding.

The dumped output may contain S-expressions other than dotted pairs of strings, to include meta information. For now, programs that deal with dumped output should just ignore S-expressions other than dotted pairs.
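A program that consumes dumped output can follow this rule with a simple read loop. The sketch below is not the code of the dbm/restore script; the procedure name read-dump-entries and the sample text, including its (meta-info 1) entry, are made up for illustration.

```scheme
;; Read every S-expression from PORT and keep only dotted pairs whose
;; car and cdr are both strings. Anything else is skipped.
(define (read-dump-entries port)
  (let loop ((acc '()))
    (let ((x (read port)))
      (cond [(eof-object? x) (reverse acc)]
            [(and (pair? x) (string? (car x)) (string? (cdr x)))
             (loop (cons x acc))]
            [else (loop acc)]))))

;; Hypothetical dump text with one non-pair entry mixed in.
(define sample-dump
  "(\"k1\" . \"v1\")\n(meta-info 1)\n(\"k2\" . \"v2\")\n")

(define entries
  (call-with-input-string sample-dump read-dump-entries))
;; entries is (("k1" . "v1") ("k2" . "v2"))
```

Each retained entry can then be stored with (dbm-put! db (car entry) (cdr entry)) on an opened database.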

To read back the dumped dbm format, you can use the dbm/restore script:

$ gosh dbm/restore [-i infile][-t type] dbm-name

The infile argument names the dumped file to be read. If omitted, the script reads from stdin. The type argument specifies the dbm type, as in the dbm/dump script. The dbm-name argument names the dbm database; if the database already exists, its content is cleared, so be careful.


12.28.6 Writing a dbm implementation

When you write an extension module that behaves like a persistent hashtable, it is a good idea to adapt it to the dbm interface, so that the application can use the module in a generic way.

The minimum set of procedures needed to conform to the dbm interface is as follows:

Besides the above, you may define the following methods.

It is generally recommended to name the implementation module dbm.foo, and the class of the implementation <foo>. With this convention it is easier to write an application that dynamically loads and uses a dbm implementation specified at run time.


