Copyright (C) 2000 Shiro Kawai (shiro@acm.org)
This package provides an interface for a serializer, which converts Scheme objects to an external representation that can later be read back to reconstruct a structure topologically equal to the original one.
Object serialization is important when applications exchange data with each other through persistent storage or over networks. Some programming languages, such as Java and Python, support serialization in their standard libraries. Scheme does have a clearly defined external representation, but it doesn't handle shared substructures, circular references, or user-defined classes.
Having a standard serializer is handy, but it also has a drawback. The requirements for a serializer may differ among applications: some applications may need a human-readable ASCII representation, while others need a binary one, preferring serialization speed even at the cost of the portability of the serialized data.
Instead of providing a single implementation, this package defines an abstract class <serializer> for the common interface. Libraries which need a serializer can use the <serializer> class without knowing the actual implementation. You can implement your own serializer independently and just plug it in to your application.
A couple of implementations, <aserializer> and <dserializer>, come with this package.
The most recent version of this document and the package can be obtained at the following URL. (The package tarball includes the document as well).
You need STk 4.0.1 or above. STk is available at http://kaolin.unice.fr/stk/.
1. Unpack the package tarball. It creates the serializer directory.
2. cd to the directory.
3. Run stk Makefile.stk. It creates a Makefile tailored to your STk installation parameters. Tweak it if necessary.
4. make test performs some primitive testing. You can skip this step.
5. make install copies the Scheme files to your site-scheme directory.
A serializer interface is available by this `require' form:
(require "serializer")
If you actually create an instance of a serializer, you need to use one of the actual implementations instead of the serializer module. An implementation called aserializer comes with the serializer package. Thus you can say the following instead of the above.
(require "aserializer")
Class: <serializer>
This class has three slots.
Slot of <serializer>: port
Slot of <serializer>: direction
:in if this is an input serializer, or :out if this is an output serializer.
Slot of <serializer>: preserve-equality
A serializer is instantiated by the generic method make.
make creates a serializer when class is a subclass of <serializer>.
The keyword argument :port is mandatory. It specifies the port the serializer is associated with.
The keyword argument :direction must be either :in or :out, specifying the direction of the serializer. If omitted, it is guessed from the direction of the port: the direction is :in if the port is an input port, and :out if the port is an output port. If the port is bidirectional, the direction should be specified (there is no bidirectional serializer). If the direction of the port and :direction contradict each other, an error is signalled.
In general, serialized forms may or may not be equal even if the original objects are equal, since a serializer may insert extra information such as a timestamp in the serialized form. In other words, even if (equal? a b) is true for two objects a and b, (equal? (write-to-string-with-serializer a) (write-to-string-with-serializer b)) need not be true (of course, if you deserialize the serialized forms using read-from-string-with-serializer, they must produce objects equal? to each other).
However, the keyword argument :preserve-equality advises the serializer that, when it serializes objects that are equal? to each other, their serialized forms should be equal, too. This feature is used, for example, by a dbm interface which passes a serialized form as a key to the dbm database.
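As an illustration, the following sketch serializes two objects that are equal? but not eq? and compares the resulting strings. It assumes that :preserve-equality takes a true value such as #t and that <aserializer> honours the hint; neither is stated above, so treat the final comparison as something to check rather than a guarantee.

```scheme
(require "aserializer")

;; Two distinct lists with the same contents: equal? but not eq?.
(define a (list 1 "two" 'three))
(define b (list 1 "two" 'three))

;; Extra keyword arguments are passed on to make.
;; ASSUMPTION: the flag is switched on with #t.
(define key-a
  (write-to-string-with-serializer <aserializer> a :preserve-equality #t))
(define key-b
  (write-to-string-with-serializer <aserializer> b :preserve-equality #t))

;; If the implementation follows the advice, the two serialized
;; forms can be used interchangeably as database keys.
(define same-key? (equal? key-a key-b))
```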
Two methods are defined to perform the actual serialization. Subclasses will overload these methods.
write-to-serializer (<serializer>) obj
The types of object that can be serialized depend on the implementation. For a general purpose serializer, however, the application programmer can assume that most standard Scheme objects, keywords, hash tables and STklos objects are serializable. See section 4. Serializer behavior for details.
This method may throw an error when it encounters an unserializable object.
read-from-serializer (<serializer>)
If there are no more objects in the input serializer, an eof-object is returned.
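A minimal sketch of how these two methods might be used together: several objects are written to a file through an output serializer, then read back one by one until an eof-object appears. The file name objects.ser and the helper read-all are hypothetical, and the sketch assumes the serializer writes straight to its port, so closing the port is enough to finish the file.

```scheme
(require "aserializer")

;; Write three objects. :direction is omitted, so it is guessed
;; from the port (an output port gives :out).
(define out-port (open-output-file "objects.ser"))
(define out (make <aserializer> :port out-port))
(for-each (lambda (obj) (write-to-serializer out obj))
          (list 42 "hello" '(a b c)))
(close-output-port out-port)

;; Hypothetical helper: collect objects until the eof-object.
(define (read-all ser)
  (let loop ((obj (read-from-serializer ser)) (acc '()))
    (if (eof-object? obj)
        (reverse acc)
        (loop (read-from-serializer ser) (cons obj acc)))))

;; Read them back; the direction is guessed as :in.
(define in-port (open-input-file "objects.ser"))
(define in (make <aserializer> :port in-port))
(define objects (read-all in))
(close-input-port in-port)
```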
The following utility methods are also defined.
input-serializer? returns true iff obj is a serializer and its direction is input. output-serializer? returns true iff obj is a serializer and its direction is output.
Two accessor methods, each taking a <serializer>, return the port and the direction (:in or :out) of a serializer obj.
write-to-string-with-serializer (serializer-class <class>) obj &rest options
write-to-file-with-serializer (serializer-class <class>) obj filename &rest options
Convenience functions to create a serialized form of a Scheme object obj. write-to-string-with-serializer returns the serialized form as a Scheme string. write-to-file-with-serializer writes the serialized form to the file specified by filename.
A temporary output serializer of class serializer-class is created to serialize the object. Its port and direction are set appropriately. If extra arguments options are provided, they are passed to the constructor of the serializer (make) as well.
read-from-string-with-serializer (serializer-class <class>) str &rest options
read-from-file-with-serializer (serializer-class <class>) filename &rest options
Convenience functions to read a Scheme object from its serialized form. read-from-string-with-serializer takes the serialized form from a string str, and read-from-file-with-serializer from the file specified by filename.
A temporary input serializer of class serializer-class is created to deserialize the object. Its port and direction are set appropriately. If extra arguments options are provided, they are passed to the constructor of the serializer (make) as well.
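The convenience functions can be sketched as a round trip, first through a string and then through a file. The file name original.ser is hypothetical; everything else uses only the functions described above.

```scheme
(require "aserializer")

(define original (list 1 "two" (vector 'three 4)))

;; Round trip through a string.
(define str  (write-to-string-with-serializer <aserializer> original))
(define copy (read-from-string-with-serializer <aserializer> str))

;; Round trip through a file.
(write-to-file-with-serializer <aserializer> original "original.ser")
(define copy2 (read-from-file-with-serializer <aserializer> "original.ser"))

;; Both copies are new objects, expected to be equal? to the original.
(define round-trip-ok?
  (and (equal? original copy) (equal? original copy2)))
```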
A simple extension mechanism is provided so that the user can customize serializer behavior on STklos objects without knowing the actual implementation.
get-serializable-slots (<object>)
Returns the list of slots to be serialized. The output serializer retrieves the values of those slots by slot-ref. The input serializer uses slot-set! on those slots, setting the slot values in the order of the list get-serializable-slots returns.
The default method returns all slots except virtual ones.
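For example, a class may keep a slot that holds derived data and is not worth saving. The sketch below specializes get-serializable-slots to leave that slot out. The class <account> and its slots are hypothetical, and the sketch assumes the method returns a list of slot names as symbols. Every slot has an :initform so that an instance can be created without initialization parameters.

```scheme
(require "aserializer")

;; Hypothetical class: cache is derived data we do not want to save.
(define-class <account> ()
  ((owner   :init-keyword :owner   :initform "")
   (balance :init-keyword :balance :initform 0)
   (cache   :initform #f)))

;; ASSUMPTION: slots are named by symbols in the returned list.
;; The input serializer sets the slots in this order.
(define-method get-serializable-slots ((self <account>))
  '(owner balance))

(define acct (make <account> :owner "alice" :balance 100))
(slot-set! acct 'cache '(some derived data))

(define acct-copy
  (read-from-string-with-serializer
   <aserializer>
   (write-to-string-with-serializer <aserializer> acct)))
```

With this method in place, only owner and balance should travel through the serializer; the cache slot of acct-copy keeps whatever value a freshly created <account> has.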
Alternatively, an implementation of a serializer may define its own extension protocol. This may be more efficient, although it can be used only with that implementation.
Various serialization algorithms can be implemented by subclassing <serializer> and overloading the write-to-serializer and read-from-serializer methods. We don't set any restriction on serializer behavior. The application programmer may implement a serializer that supports only a limited set of object types if that is all the application needs.
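A minimal sketch of such a limited serializer, built on the standard write and read procedures. The class name <plain-serializer> is hypothetical, and the sketch assumes the port can be fetched from the port slot with slot-ref. It handles only objects that write and read can round-trip and makes no attempt to preserve shared structure.

```scheme
(require "serializer")

;; Hypothetical subclass; it inherits the port and direction slots.
(define-class <plain-serializer> (<serializer>) ())

;; Write the object, followed by a newline so that consecutive
;; objects (two symbols, say) stay separated in the output.
(define-method write-to-serializer ((self <plain-serializer>) obj)
  (write obj (slot-ref self 'port))
  (newline (slot-ref self 'port)))

;; read returns an eof-object at end of input, which matches the
;; behavior expected from read-from-serializer.
(define-method read-from-serializer ((self <plain-serializer>))
  (read (slot-ref self 'port)))

;; The convenience functions work with any serializer class.
(define plain-copy
  (read-from-string-with-serializer
   <plain-serializer>
   (write-to-string-with-serializer <plain-serializer> '(a "b" 3))))
```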
However, to ease the programming, we define a few general guidelines for the behavior of serializers. If you aim to implement a general purpose serializer, it is recommended to follow these guidelines.
A general purpose serializer should support the following STk primitive types:
The items marked by (*) may have certain restrictions described in the following sections.
When a serializer is created, it creates an internal object dictionary. For each object which goes through the serializer, it registers the object in the dictionary, so that when it encounters the same (eq?) object again it can use a reference to the object to preserve eq?-ness.
Each serializer keeps its own dictionary, which is kept until the serializer is garbage-collected. Thus object eq?-ness is not preserved among different serializers.
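The effect of the dictionary can be sketched with a list that contains the same pair twice. A serializer following these guidelines, as <aserializer> is said to do, should reconstruct a single shared pair rather than two copies. The variable names are for illustration only.

```scheme
(require "aserializer")

;; original holds the very same pair in both positions.
(define shared   (list 'a 'b))
(define original (list shared shared))

(define copy
  (read-from-string-with-serializer
   <aserializer>
   (write-to-string-with-serializer <aserializer> original)))

;; Within one serialized form, sharing is expected to survive.
(define sharing-kept? (eq? (car copy) (cadr copy)))

;; The copy is built from fresh objects, so it is equal? to the
;; original but shares no pairs with it.
(define still-same-object? (eq? (car copy) shared))
```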
There are other cases where object eq?-ness is not preserved:
A general purpose serializer may take an STklos instance. The instance does not need to belong to a subclass of any particular class. However, to ease the implementation of serializers, it is desirable for the instances to be serialized to follow these guidelines.
First, it must be possible to create the instance without any initialization parameters, and to set all the serializable slots by slot-set!. The input serializer may not know the initialization protocol of your object, so it may create an instance without initialization parameters and then fill in the slot values using slot-set!.
Second, full information about metaobjects may not be serialized. For the input serializer to work, the classes (and metaclasses) of the serialized object should be defined before calling read-from-serializer. With this assumption, the output serializer needs to put only minimal class information for the instances to be serialized.
The implementation may check whether the class in memory matches the one in the serialized form and throw an error if it does not (like Java), or it may try to fill the slots of the new class with the values read from the serialized form as far as possible.
A couple of simple implementations are available in the package.
A simple implementation, aserializer, is provided with the serializer package. It follows the serializer guidelines. The serialized data is represented by ASCII characters and is portable among different architectures. It is not particularly optimized for speed or space.
<aserializer>
Objects are serialized as follows:
#f. A later version will properly treat it as an unbound slot.
When an instance is read back, the slot values are recovered according to the classinfo. If the class implementation has been changed since the instance was serialized, the serializer fills only the slots whose names match the ones in the classinfo. If a slot is missing from the class implementation in memory, the value of the slot is simply discarded.
<dserializer>
This serializer is built on write* and read. Its capability is limited to the objects those functions can deal with: booleans, numbers, symbols, strings, lists and vectors. The output serializer doesn't complain if an object of any other type is passed, but such an object cannot be read back by the input serializer.
Eq?-ness of strings is not preserved. Circular references and shared substructures are, however, properly treated.
In spite of these restrictions, this serializer is useful when you know that the objects to be serialized meet these conditions, and it is fast.
This document was generated on 5 December 2000 using texi2html 1.56k.