util.relation - Relation framework
Provides a set of common operations for relations.
Given sets of values S1, S2, ..., Sn, a relation R is a set of tuples such that the first element of each tuple is from S1, the second from S2, ..., and the n-th from Sn. In other words, R is a subset of the Cartesian product of S1, ..., Sn. (The definition, as well as the term relation, is taken from Codd’s 1970 paper, "A Relational Model of Data for Large Shared Data Banks", CACM 13(6), pp. 377–387.)
This definition can be applied to various datasets. A set of Gauche object system instances is a relation if you view each instance as a tuple and each slot value as an element of that tuple. A list of lists can be a relation. A stream that reads from a CSV table produces a relation. Thus it is useful to have a module that implements generic operations on relations, regardless of their actual representation.
From the operational point of view, we can treat any data structure as a relation if it provides the following four methods: relation-rows, which retrieves a collection of tuples (rows); and relation-column-names, relation-accessor, and relation-modifier, which provide the means to access meta-information. All other relational operations are built on top of these primitive methods.
A concrete implementation of a relation can use duck typing, i.e. it doesn’t need to inherit from a particular base class to use the relation methods. For convenience, however, a base class <relation> is provided in this module. It works as a mixin class: a concrete class typically inherits <relation> together with <collection> or <sequence>. See the sample implementations in lib/util/relation.scm in the source tree if you’re curious.
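As an illustration of the four primitive methods, here is a minimal sketch of a relation whose rows are vectors. The class <vector-relation>, the helper column-index, and the sample data are hypothetical names invented for this example; they are not part of util.relation, and the bundled sample implementations may be organized differently.

```scheme
(use gauche.sequence)   ; for find-index
(use util.relation)

;; <vector-relation> is a hypothetical class, not part of util.relation.
;; Each row is a vector; column names are symbols.
(define-class <vector-relation> (<relation>)
  ((columns :init-keyword :columns)   ; list of column names
   (rows    :init-keyword :rows)))    ; list of row vectors

;; Hypothetical helper: position of COLUMN within a row.
;; Signals an error for an unknown column name.
(define (column-index r column)
  (or (find-index (cut equal? column <>) (slot-ref r 'columns))
      (error "unknown column:" column)))

;; The four primitive methods.
(define-method relation-rows ((r <vector-relation>))
  (slot-ref r 'rows))

(define-method relation-column-names ((r <vector-relation>))
  (slot-ref r 'columns))

(define-method relation-accessor ((r <vector-relation>))
  (lambda (row column)
    (vector-ref row (column-index r column))))

(define-method relation-modifier ((r <vector-relation>))
  (lambda (row column value)
    (vector-set! row (column-index r column) value)))

;; Sample data (made up for this sketch).
(define fruits
  (make <vector-relation>
    :columns '(name amount)
    :rows (list (vector "apple" 3) (vector "pear" 5))))

;; Read the amount column of the first row through the accessor.
(define first-amount
  ((relation-accessor fruits) (car (relation-rows fruits)) 'amount))
```

Here relation-rows returns a plain list, which is itself a sequence, so the class in this sketch inherits only <relation>.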
This module is still under development. The plan is to build useful relational operations on top of the common methods.
Class: <relation> {util.relation}
An abstract base class of relations.
Method: relation-column-names (r <relation>) {util.relation}
A subclass must implement this method.
It should return a sequence of the column names.
The type of column names is up to the relation; we place no restriction on it, as long as the names are distinct from each other in terms of equal?.
Method: relation-accessor (r <relation>) {util.relation}
A subclass must implement this method.
It should return a procedure that takes two arguments, a row from the relation r and a column name, and returns the value of the specified column.
Method: relation-modifier (r <relation>) {util.relation}
A subclass must implement this method. It should return a procedure that takes three arguments, a row from the relation r, a column name, and a value to set, and stores the value in that column of the row.
If the relation is read-only, this method returns #f.
Method: relation-rows (r <relation>) {util.relation}
A subclass must implement this method.
It should return the underlying instance of <collection> or its subclass (e.g. <sequence>).
The rest of the methods are built on top of the above four methods. A subclass of <relation> may nevertheless override some of the methods below for better performance.
Method: relation-column-name? (r <relation>) column {util.relation}
Returns true iff column is a valid column name for the relation r.
Method: relation-column-getter (r <relation>) column {util.relation}
Method: relation-column-setter (r <relation>) column {util.relation}
Returns a procedure to access the specified column of a row from the relation r. relation-column-getter should return a procedure that takes one argument, a row. relation-column-setter should return a procedure that takes two arguments, a row and a new value to set.
If the relation is read-only, relation-column-setter returns #f.
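The following sketch shows how a getter and a setter might be obtained once and then applied to every row. The class <alist-relation> and the data are hypothetical and exist only for this example; it assumes the default relation-column-getter and relation-column-setter methods inherited from <relation> behave as described above.

```scheme
(use util.relation)

;; <alist-relation> is a hypothetical class used only in this example.
;; Each row is an association list keyed by column name.
(define-class <alist-relation> (<relation>)
  ((columns :init-keyword :columns)
   (rows    :init-keyword :rows)))

(define-method relation-rows ((r <alist-relation>))
  (slot-ref r 'rows))
(define-method relation-column-names ((r <alist-relation>))
  (slot-ref r 'columns))
(define-method relation-accessor ((r <alist-relation>))
  (lambda (row column) (cdr (assoc column row))))
(define-method relation-modifier ((r <alist-relation>))
  (lambda (row column value) (set-cdr! (assoc column row) value)))

;; Rows are built with list and cons so that they are mutable.
(define stock
  (make <alist-relation>
    :columns '(name amount)
    :rows (list (list (cons 'name "apple") (cons 'amount 3))
                (list (cons 'name "pear")  (cons 'amount 5)))))

;; Resolve the column once, then reuse the procedures for many rows.
(define get-amount  (relation-column-getter stock 'amount))
(define set-amount! (relation-column-setter stock 'amount))

(define amounts-before (map get-amount (relation-rows stock)))

;; Double every amount in place.
(for-each (lambda (row) (set-amount! row (* 2 (get-amount row))))
          (relation-rows stock))

(define amounts-after (map get-amount (relation-rows stock)))
```

Fetching the getter outside the loop avoids looking up the column for every row, which is the main reason to prefer it over repeated calls that name the column each time.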
Method: relation-ref (r <relation>) row column :optional default {util.relation}
Row is a row from the relation r. Returns the value of the column in row. If column is not a valid column name, default is returned if it is given; otherwise an error is signaled.
Method: relation-set! (r <relation>) row column value {util.relation}
Row is a row from the relation r. Sets value as the value of column in row. This may signal an error if the relation is read-only.
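Method: relation-column-getters (r <relation>) {util.relation}
Method: relation-column-setters (r <relation>) {util.relation}
Returns the full list of getters or setters. Usually the default method is sufficient, but an implementation may want to cache the list of getters, for example.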
Method: relation-coercer (r <relation>) {util.relation}
Returns a procedure that coerces a row into a sequence.
If the relation already uses a sequence to represent a row, the procedure can return the row as is.
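Method: relation-insertable? (r <relation>) {util.relation}
Returns true iff new rows can be inserted into the relation r.
Method: relation-insert! (r <relation>) row {util.relation}
Inserts a row row into the relation r.
Method: relation-deletable? (r <relation>) {util.relation}
Returns true iff rows can be deleted from the relation r.
Method: relation-delete! (r <relation>) row {util.relation}
Deletes a row row from the relation r.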
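As a rough sketch of how a concrete relation might support insertion and deletion, the example below defines all four methods for a hypothetical class <list-relation> whose rows are lists in column order. The class, its slots, and the choice to prepend new rows and to delete every row that is equal? to the argument are assumptions made for illustration only.

```scheme
(use util.relation)

;; <list-relation> is a hypothetical class; rows are lists in column order.
(define-class <list-relation> (<relation>)
  ((columns :init-keyword :columns)
   (rows    :init-keyword :rows :init-value '())))

(define-method relation-rows ((r <list-relation>))
  (slot-ref r 'rows))
(define-method relation-column-names ((r <list-relation>))
  (slot-ref r 'columns))
(define-method relation-accessor ((r <list-relation>))
  (lambda (row column)
    ;; Walk the column names and the row values in parallel.
    (let loop ((names (slot-ref r 'columns)) (vals row))
      (cond ((null? names) (error "unknown column:" column))
            ((equal? (car names) column) (car vals))
            (else (loop (cdr names) (cdr vals)))))))
;; Individual rows are treated as read-only in this sketch.
(define-method relation-modifier ((r <list-relation>)) #f)

;; Insertion and deletion replace the list stored in the rows slot.
(define-method relation-insertable? ((r <list-relation>)) #t)
(define-method relation-insert! ((r <list-relation>) row)
  (slot-set! r 'rows (cons row (slot-ref r 'rows))))      ; prepends the row
(define-method relation-deletable? ((r <list-relation>)) #t)
(define-method relation-delete! ((r <list-relation>) row)
  (slot-set! r 'rows (delete row (slot-ref r 'rows))))    ; drops equal? rows

(define basket (make <list-relation> :columns '(name amount)))

;; Check the capability before mutating the relation.
(when (relation-insertable? basket)
  (relation-insert! basket (list "apple" 3))
  (relation-insert! basket (list "pear" 5)))
(when (relation-deletable? basket)
  (relation-delete! basket (list "apple" 3)))
```

Generic code that may receive a read-only relation can test relation-insertable? or relation-deletable? first, as the last two forms do.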
Method: relation-fold (r <relation>) proc seed column … {util.relation}
Applies proc to the values of column … of each row, passing seed as the state value. That is, for each row in r, proc is called as follows:
(proc v_0 v_1 ... v_i seed) where v_k = (relation-ref r row column_k)
The result of the call becomes the new seed value, and the final result is returned from relation-fold.
For example, if a relation r has a column named amount and you want to sum up all its values, you can write:
(relation-fold r + 0 'amount)
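When more than one column is given, proc receives the column values in the order the columns are listed, followed by the seed. The sketch below computes a total over two columns; the class <order-relation>, its column names, and the data are hypothetical, and the example assumes the default relation-fold method inherited from <relation>.

```scheme
(use util.relation)

;; <order-relation> is a hypothetical class used only in this example.
;; Each row is an association list keyed by column name.
(define-class <order-relation> (<relation>)
  ((columns :init-keyword :columns)
   (rows    :init-keyword :rows)))

(define-method relation-rows ((r <order-relation>))
  (slot-ref r 'rows))
(define-method relation-column-names ((r <order-relation>))
  (slot-ref r 'columns))
(define-method relation-accessor ((r <order-relation>))
  (lambda (row column) (cdr (assoc column row))))
(define-method relation-modifier ((r <order-relation>)) #f)  ; read-only

(define orders
  (make <order-relation>
    :columns '(item price quantity)
    :rows '(((item . "apple") (price . 100) (quantity . 3))
            ((item . "pear")  (price . 150) (quantity . 2)))))

;; proc takes one argument per listed column, then the seed.
(define total
  (relation-fold orders
                 (lambda (price quantity sum)
                   (+ sum (* price quantity)))
                 0
                 'price 'quantity))
```

With the sample rows above, total is 100*3 + 150*2 = 600.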