gauche.vport - Virtual ports
Virtual ports, or procedural ports, are ports whose behavior can be programmed in Scheme.
This module provides two kinds of virtual ports: fully virtual ports, in which every I/O operation invokes user-provided procedures, and virtual buffered ports, in which I/O operations are done on an internal buffer and user-provided procedures are called only when the buffer needs to be filled or flushed.
This module also provides virtual buffered ports backed by a uniform vector, as an example of the feature.
Fully virtual ports are realized by the classes <virtual-input-port> and <virtual-output-port>. You can customize the port behavior by setting the appropriate slots to procedures.
Class: <virtual-input-port> {gauche.vport}
An instance of this class can be used as an input port. The behavior of the port depends on the values of the instance’s slots.
To work as a meaningful input port, at least one of the getb or getc slots must be set. Otherwise, the port returns EOF for all input requests.
<virtual-input-port>: getb
If set, the value must be a procedure that takes no arguments. Every time binary input is required, the procedure is called.
The procedure must return an exact integer between 0 and 255 inclusive, #f, or an EOF object. If it returns an integer, that integer becomes the value read from the port. If it returns any other value, the port returns EOF.
If character input is requested from the port and the port doesn’t have a getc procedure, the port calls this procedure, possibly multiple times, to construct a whole character.
<virtual-input-port>: getc
If set, the value must be a procedure that takes no arguments. Every time character input is required, the procedure is called.
The procedure must return a character, #f, or an EOF object. If it returns a character, that character becomes the value read from the port. If it returns any other value, the port returns EOF.
If binary input is requested from the port and the port doesn’t have a getb procedure, the port calls this procedure, converts the returned character into a byte sequence, and uses it as the binary value(s) read from the port.
<virtual-input-port>: gets
If set, the value must be a procedure that takes one argument, a positive exact integer. It is called when block binary input, such as read-uvector, is requested.
It must return a (possibly incomplete) string of up to the specified size, #f, or an EOF object. If it returns an empty string, #f, or an EOF object, the port treats it as EOF. If it returns any other string, that string is used as the result of the block read. It shouldn’t return a string larger than the given size (note: the size is counted in bytes, not in characters).
This procedure exists for efficiency; if it is not provided, the port calls the getb procedure repeatedly to prepare the block of data. In some cases, providing block input can be much more efficient (e.g. when you’re reading from a chunk of memory). You can leave this slot unset if you don’t need that advantage.
<virtual-input-port>: ready
If set, the value must be a procedure that takes one boolean argument. It is called when char-ready? or byte-ready? is called on the port. The value returned from your procedure becomes the result of those procedures.
The boolean argument is #t if char-ready? is called, or #f if byte-ready? is called.
If unset, char-ready? and byte-ready? always return #t on the port.
<virtual-input-port>: close
If set, the value must be a procedure that takes no arguments. It is called when the port is closed. The return value is discarded. You can leave this unset if you don’t need to take any action when the port is closed.
This procedure may be called from a finalizer, so you have to write it carefully. See the note on finalization below.
<virtual-input-port>: seek
If set, the value must be a procedure that takes two arguments, offset and whence. Their meaning is the same as that of the arguments to port-seek (see Common port operations). The procedure must adjust the port’s internal read pointer so that the next read begins from the new pointer. It should return the updated pointer (the byte offset from the beginning of the port).
If unset, calls to port-seek and port-tell on this port return #f.
Note that this procedure may be called merely to query the current position, with 0 as offset and SEEK_CUR as whence. If your port knows the read pointer but cannot move it, you can still provide this procedure; it should return the current pointer position for such queries and #f for other arguments.
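As a rough illustration of the slots described above, the following sketch builds a fully virtual input port that only sets getc. The helper name make-string-vport is hypothetical and not part of gauche.vport; it simply serves the characters of a string.

```scheme
(use gauche.vport)

;; make-string-vport is a hypothetical helper, not part of gauche.vport.
;; It returns an input port that yields the characters of STR one by one.
(define (make-string-vport str)
  (let ((chars (string->list str)))
    (make <virtual-input-port>
      ;; getc: return the next character, or an EOF object when none is left
      :getc (lambda ()
              (if (null? chars)
                  (eof-object)
                  (pop! chars))))))

(define p (make-string-vport "hi"))
(read-char p)   ; #\h
(read-char p)   ; #\i
(read-char p)   ; EOF object
```

Only getc is provided here, so binary reads on such a port would fall back to decomposing characters into bytes, as described for the getc slot.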
Class: <virtual-output-port> {gauche.vport}
An instance of this class can be used as an output port. The behavior of the port depends on the values of the instance’s slots.
To work as an output port, at least one of the putb or putc slots must be set.
<virtual-output-port>: putb
If set, the value must be a procedure that takes one argument, a byte value (exact integer between 0 and 255, inclusive). Every time binary output is required, the procedure is called. The return value of the procedure is ignored.
If this slot is not set and binary output is requested, the port may signal an <io-unit-error>.
<virtual-output-port>: putc
If set, the value must be a procedure that takes one argument, a character. Every time character output is required, the procedure is called. The return value of the procedure is ignored.
If this slot is not set but the putb slot is set, the virtual port decomposes the character into a sequence of bytes and passes them to the putb procedure.
<virtual-output-port>: puts
If set, the value must be a procedure that takes a (possibly incomplete) string. The return value of the procedure is ignored.
This slot exists for efficiency. If it is not set, the virtual port calls putb or putc repeatedly to output a chunk of data. If your code can perform chunked output efficiently, you can provide this procedure.
<virtual-output-port>: flush
If set, the value must be a procedure that takes no arguments. It is called when flushing the port is required (e.g. flush is called on the port, or the port is being closed).
This procedure is useful when your port does some sort of buffering or needs to keep some state. If your port doesn’t perform stateful operations, you can leave this unset.
This procedure may be called from a finalizer and needs special care. See the note on finalization below.
<virtual-output-port>: close
The same as the close slot of <virtual-input-port>.
<virtual-output-port>: seek
The same as the seek slot of <virtual-input-port>.
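The output side can be sketched in the same way. The example below sets only putc; the helper make-upcasing-port is a hypothetical name used for illustration, and it forwards an upcased copy of each character to another port.

```scheme
(use gauche.vport)

;; make-upcasing-port is a hypothetical helper, not part of gauche.vport.
;; It returns an output port that upcases each character and forwards it
;; to the output port SINK.
(define (make-upcasing-port sink)
  (make <virtual-output-port>
    :putc (lambda (c) (write-char (char-upcase c) sink))))

(define sink (open-output-string))
(define up (make-upcasing-port sink))
(display "hello" up)        ; no puts slot is set, so putc runs per character
(get-output-string sink)    ; "HELLO"
```

Because neither putb nor puts is set, this port is only meant for character output; a puts procedure could be added if per-character calls turn out to be too slow.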
Virtual buffered ports are realized by the classes <buffered-input-port> and <buffered-output-port>. You can customize the port behavior by setting the appropriate slots to procedures.
These ports have an internal buffer and call Scheme procedures only when the buffer needs to be filled or flushed. This is generally far more efficient than calling Scheme procedures for every I/O operation. In fact, the internal buffering mechanism is the same as that of Gauche’s file I/O ports.
These ports use a u8vector as the buffer. See Uniform vectors for the details.
Class: <buffered-input-port> {gauche.vport}
An instance of this class behaves as an input port. It has the following instance slots. For a meaningful input port, you have to set at least the fill slot.
<buffered-input-port>: fill
If set, it must be a procedure that takes one argument, a u8vector. It must fill the vector with data, starting from the beginning of the vector. It doesn’t need to fill the entire vector if not enough data is available. However, if any data remains, it must fill at least one byte; if the data isn’t readily available, it has to wait until some data becomes available.
The procedure must return the number of bytes it actually filled. It may return 0 or an EOF object to indicate that the port has reached EOF.
<buffered-input-port>: ready
If set, it must be a procedure that takes no arguments. The procedure must return a true value if some data is readily available to read, or #f otherwise. Unlike fully virtual ports, you don’t need to distinguish binary and character I/O.
If this slot is not set, the port is regarded as always having data ready.
<buffered-input-port>: close
If set, it must be a procedure that takes no arguments. The procedure is called when the virtual buffered port is closed. You don’t need to set this slot unless you need some cleanup when the port is closed.
This procedure may be called from a finalizer, and needs special care. See the note on finalization below.
<buffered-input-port>: filenum
If set, it must be a procedure that returns the underlying file descriptor number (an exact nonnegative integer). The procedure is called when port-file-number is called on the port. If there’s no such underlying file descriptor, you can return #f or leave this slot unset.
<buffered-input-port>: seek
If set, it must be a procedure that takes two arguments, offset and whence. It works the same way as the seek procedure of <virtual-input-port>; see above.
This procedure may be called from a finalizer, and needs special care. See the note on finalization below.
Besides those slot values, you can pass an exact nonnegative integer as the :buffer-size keyword argument to the make method to set the size of the port’s internal buffer. If :buffer-size is omitted or zero, the system’s default buffer size (something like 8K) is used. :buffer-size is not an instance slot, and you cannot set it after the buffered port instance is created. The following example makes a buffered port with a 64K buffer:
(make <buffered-input-port> :buffer-size 65536 :fill my-filler)
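The example above leaves my-filler undefined. A fill procedure might look like the following sketch, which serves bytes from an in-memory u8vector; the helper name make-u8-source-port is hypothetical and is used only for illustration.

```scheme
(use gauche.vport)
(use gauche.uvector)

;; make-u8-source-port is a hypothetical helper, not part of gauche.vport.
;; It returns a buffered input port that serves the bytes of SRC (a u8vector).
(define (make-u8-source-port src)
  (let ((pos 0)
        (len (u8vector-length src)))
    (define (fill! buf)
      ;; n = how many bytes we can hand over in this call
      (let ((n (min (u8vector-length buf) (- len pos))))
        (if (<= n 0)
            (eof-object)                             ; no data left
            (begin
              ;; copy src[pos, pos+n) to the beginning of buf
              (u8vector-copy! buf 0 src pos (+ pos n))
              (set! pos (+ pos n))
              n))))                                  ; number of bytes filled
    (make <buffered-input-port> :fill fill!)))

(define in (make-u8-source-port #u8(104 105)))
(read-byte in)   ; 104
(read-char in)   ; #\i
```

Since the port buffers data, fill! is typically called far less often than the read operations on the port.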
Class: <buffered-output-port> {gauche.vport}
An instance of this class behaves as an output port. It has the following instance slots. You have to set at least the flush slot.
<buffered-output-port>: flush
If set, it must be a procedure that takes two arguments, a u8vector buffer and a flag. The procedure must output the data in the buffer somewhere and return the number of bytes actually output.
If the flag is false, the procedure may output less than the entire buffer (but at least one byte). If the flag is true, the procedure must output the entire buffer.
<buffered-output-port>: close
Same as the close slot of <buffered-input-port>.
<buffered-output-port>: filenum
Same as the filenum slot of <buffered-input-port>.
<buffered-output-port>: seek
Same as the seek slot of <buffered-input-port>.
Besides those slot values, you can pass an exact nonnegative integer as the :buffer-size keyword argument to the make method to set the size of the port’s internal buffer. See the description of <buffered-input-port> above for the details.
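As an illustration of the flush slot, the following sketch collects every flushed buffer into a list. The helper make-chunk-collector is a hypothetical name, and the sketch assumes the buffer passed to the flush procedure belongs to the port, so it copies the buffer before keeping it.

```scheme
(use gauche.vport)
(use gauche.uvector)

;; make-chunk-collector is a hypothetical helper, not part of gauche.vport.
;; It returns two values: a buffered output port, and a thunk that returns
;; the list of byte chunks flushed so far (oldest first).
(define (make-chunk-collector)
  (let ((chunks '()))
    (define (flush! buf flag)
      ;; Assumption: buf is storage owned by the port, so keep a copy.
      (push! chunks (u8vector-copy buf))
      ;; Report the whole buffer as consumed, whatever the flag is.
      (u8vector-length buf))
    (values (make <buffered-output-port> :flush flush!)
            (lambda () (reverse chunks)))))

(define-values (out get-chunks) (make-chunk-collector))
(display "abc" out)
(flush out)        ; pushes the buffered bytes through flush!
(get-chunks)       ; a list of u8vectors holding the bytes 97 98 99
```

Always consuming the whole buffer satisfies both values of the flag; a procedure that writes to a slow device could instead return a smaller count when the flag is false.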
The following two procedures return a buffered input/output port backed by a uniform vector. The source or destination vector can be any type of uniform vector, but it will be aliased to a u8vector (see uvector-alias in Uvector conversion operations).
Used together with pack/unpack (see binary.pack - Packing binary data), these ports are useful for parsing or constructing binary data structures. They are also an example of using virtual ports; read gauche/vport.scm (or ext/vport/vport.scm in the source tree) if you’re curious about the implementation.
Function: open-input-uvector uvector {gauche.vport}
Returns an input port that reads the content of the given uniform vector uvector from its beginning. When reading reaches the end of uvector, EOF is returned. Seeking is also supported.
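A short usage sketch, reading from a literal u8vector and repositioning the read pointer with port-seek:

```scheme
(use gauche.vport)
(use gauche.uvector)

(define in (open-input-uvector #u8(1 2 3 4)))
(define first-byte (read-byte in))   ; 1
(port-seek in 3 SEEK_SET)            ; move the read pointer to byte offset 3
(define last-byte (read-byte in))    ; 4
(define at-end (read-byte in))       ; EOF object
```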
Function: open-input-bytevector u8vector [R7RS base] {gauche.vport}
Similar to open-input-uvector, but the argument must be a u8vector. This is an R7RS base procedure.
Function: open-output-uvector :optional uvector :key extendable {gauche.vport}
Returns an output port that uses the given uvector as the storage for the data output to the port.
If uvector is completely filled, what happens next depends on extendable: if it is false (the default), the rest of the data is silently discarded. If it is true, the storage is extended automatically to accommodate more data.
If you give a true value to extendable, you have to retrieve the result with get-output-uvector below, since the uvector you passed in won’t contain the spilled data.
As a special case, you can omit the uvector argument; then a u8vector is used as the storage. In that case you can’t specify the extendable keyword argument, but it is assumed to be true, since it wouldn’t make sense otherwise. Use get-output-uvector to retrieve the stored result.
Seeking is also supported. Note that the meaning of the SEEK_END whence differs between extendable and fixed-size uvector ports. For extendable ports, the end position is just past the largest offset of the data ever written; if you open a port and write just one byte, the end position is the second byte, no matter how big the existing buffer is. For fixed-size uvector ports, on the other hand, the end position is fixed just past the end of the given buffer, no matter how much data you’ve written to it. In the latter case, you can’t seek to or past the end (you need to pass a negative offset along with SEEK_END to port-seek).
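The difference between fixed-size and extendable storage can be sketched as follows. The explicit flush calls are a precaution taken in this sketch, on the assumption that data may still sit in the port’s internal buffer; they are harmless if not needed.

```scheme
(use gauche.vport)
(use gauche.uvector)

;; Fixed-size storage: per the description above, output beyond the
;; 4 bytes of the given vector is discarded.
(define fixed (open-output-uvector (make-u8vector 4 0)))
(display "abcdef" fixed)
(flush fixed)
(define fixed-result (get-output-uvector fixed))

;; Extendable storage: the result may be larger than the original vector.
(define ext (open-output-uvector (make-u8vector 4 0) :extendable #t))
(display "abcdef" ext)
(flush ext)
(define ext-result (get-output-uvector ext))   ; all six bytes of "abcdef"
```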
Function: open-output-bytevector [R7RS base] {gauche.vport}
Same as open-output-uvector without arguments. Uses an extendable u8vector as the buffer. This is an R7RS base procedure.
Function: get-output-uvector port :key shared {gauche.vport}
If port is a port created by open-output-uvector, returns a uvector that contains the accumulated data. If port is not a port created by open-output-uvector, #f is returned.
The returned uvector has the same type as the one passed to open-output-uvector and contains only the data actually written; it may be smaller than the uvector passed to open-output-uvector, and it can be larger if the port is extendable.
If the type of uvector is other than s8vector and u8vector, written data that doesn’t fill up a whole element won’t be in the result. For example, if you use an s32vector to create the port and then write 7 bytes to it, get-output-uvector returns a single-element s32vector, because the last 3 bytes do not constitute a whole 32-bit integer.
By default, the returned vector is a fresh copy of the contents. Passing a true value to shared may avoid copying and allow the returned vector to share storage with the one being used by port. If you do so, keep in mind that if you seek back and write to port subsequently, the content of the returned vector may change.
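The s32vector case described above can be written out as a small sketch. Only the length of the result is inspected here, since the element value depends on the byte order of the platform; the explicit flush is a precaution assumed by this sketch.

```scheme
(use gauche.vport)
(use gauche.uvector)

(define p (open-output-uvector (make-s32vector 4 0)))
(dotimes (i 7) (write-byte i p))   ; write 7 bytes: one full element + 3 bytes
(flush p)
(define r (get-output-uvector p))
(s32vector? r)          ; same type as the vector passed in
(s32vector-length r)    ; 1, per the description above
```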
Function: get-output-bytevector port [R7RS base] {gauche.vport}
Extracts the data written to a bytevector output port as a u8vector. The port must be created by open-output-bytevector or open-output-uvector. This is an R7RS base procedure.
The following procedures allow you to use a list of characters or octets as the source of an input port. They are (in a sense) the opposite of the port->list family (see Input utility functions) or the port->char-lseq family (see Lazy sequences).
Function: open-input-char-list char-list {gauche.vport}
Function: open-input-byte-list byte-list {gauche.vport}
Creates and returns an input port that uses the given list of characters or bytes as the source.
(read (open-input-char-list '(#\a #\b))) ⇒ ab
Function: get-remaining-input-list port {gauche.vport}
If port was created by open-input-char-list or open-input-byte-list, returns a list of the remaining data that hasn’t been read yet. If the port has already read everything, or the port was not created by open-input-char-list or open-input-byte-list, an empty list is returned.
A caveat: Gauche allows mixing binary input and textual input from the same port. If you read or even peek a byte from a port created from a character list, the port buffers a character and disassembles it into bytes; the disassembled character may not be included in the remaining input list.
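A brief sketch of reading part of a character list and then retrieving what is left:

```scheme
(use gauche.vport)

(define lp (open-input-char-list '(#\a #\b #\c #\d)))
(define c (read-char lp))                        ; #\a
(define remaining (get-remaining-input-list lp)) ; (#\b #\c #\d)
```

Only read-char is used here, so the caveat about mixing binary and textual input does not apply.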
The following procedures allow you to use character generators or byte generators as the source of an input port. They are (in a sense) the opposite of the port->char-generator family (see Generator constructors).
Function: open-input-char-generator cgen {gauche.vport}
Function: open-input-byte-generator bgen {gauche.vport}
Creates and returns an input port that uses the given generator as the source. The cgen argument must be a generator that yields characters. The bgen argument must be a generator that yields bytes (exact integers between 0 and 255, inclusive). An error is raised if the given generator yields an object of the wrong type.
(read (open-input-char-generator (string->generator "foo"))) ⇒ foo
Since the generators are objects relying on side effects, you shouldn’t use cgen or bgen after you pass them to those procedures; if you use them afterwards, the result is undefined.
Function: get-remaining-input-generator port {gauche.vport}
If port was created by open-input-char-generator or open-input-byte-generator, returns a generator that yields the characters or bytes that haven’t been read yet. If the port has already read everything, an empty generator is returned.
Once you take the remaining input generator, you should no longer read from the generator input port; the two share internal state, and mixing them is likely to cause unexpected behavior. If side-effect-safe behavior is desired, use lazy sequences and input list ports.
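A brief sketch with a byte generator. After the remaining generator is taken, the port itself is not read again, following the advice above.

```scheme
(use gauche.vport)
(use gauche.generator)

(define gp (open-input-byte-generator (list->generator '(1 2 3))))
(define b (read-byte gp))                          ; 1
(define rest-gen (get-remaining-input-generator gp))
(define rest-bytes (generator->list rest-gen))     ; (2 3)
```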
Accumulators are dual to generators; an accumulator is a procedure that accepts one value at a time, and the end of the values is indicated by EOF. See scheme.generator - R7RS generators for the basic operations on accumulators.
The following procedures turn an accumulator that accepts characters or octets into an output port.
Function: open-output-char-accumulator acc {gauche.vport}
Returns an output port that sends the output characters to an accumulator acc, which takes a character as its argument. When the returned output port is closed, EOF is passed to acc.
Note: The behavior is undefined if you try to perform binary output to the returned output port.
Function: open-output-byte-accumulator acc {gauche.vport}
Returns an output port that sends the output bytes to an accumulator acc, which takes a byte as its argument. When the returned output port is closed, EOF is passed to acc.
A character sent to the output port is converted to octets in Gauche’s native encoding.
Function: open-output-accumulator acc {gauche.vport}
The accumulator acc must accept a byte, a character or a string. Returns an output port that sends the output data to acc. When the returned output port is closed, EOF is passed to acc.
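To illustrate, the sketch below uses a hand-written character accumulator. The names collected and char-acc are hypothetical; the accumulator stores each character in a list and does nothing special on EOF except return the characters in order.

```scheme
(use gauche.vport)

;; collected and char-acc are hypothetical names used for illustration.
(define collected '())

;; An accumulator: takes one character at a time, or EOF at the end.
(define (char-acc x)
  (if (eof-object? x)
      (reverse collected)
      (push! collected x)))

(define aout (open-output-char-accumulator char-acc))
(display "abc" aout)
(close-output-port aout)                ; passes EOF to char-acc
(list->string (reverse collected))      ; "abc"
```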
If an unclosed virtual port is garbage collected, its close procedure is called (in case of virtual buffered ports, its flush procedure may also be called before close procedure). It is done by a finalizer of the port. Since it is a part of garbage-collection process (although the Scheme procedure itself is called outside of the garbage collector main part), it requires special care.
An object that the virtual port refers to may already have been finalized. For example, suppose the flush procedure of a virtual port X sends its output to another port Y. If the flush procedure is called from a finalizer, Y’s finalizer may already have been called and Y may be closed. So X’s flush procedure has to check that Y has not been closed.
If a procedure such as close or flush of a virtual port needs to lock or access a global resource, it must take extra care to avoid deadlock or conflicting access.
Even in single-threaded programs, the finalizer can run anywhere in a Scheme program, so effectively it should be considered as running in a different thread.
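One way to guard a flush procedure against an already-closed destination is sketched below. The helper make-forwarding-port is a hypothetical name; the sketch silently drops data when the destination is closed, which is only one possible policy.

```scheme
(use gauche.vport)
(use gauche.uvector)

;; make-forwarding-port is a hypothetical helper, not part of gauche.vport.
;; It returns a buffered output port that forwards its bytes to DEST.
(define (make-forwarding-port dest)
  (make <buffered-output-port>
    :flush (lambda (buf flag)
             ;; DEST may already be closed if we run from a finalizer,
             ;; so check before writing to it.
             (unless (port-closed? dest)
               (write-uvector buf dest))
             ;; Report the buffer as consumed even when it was dropped,
             ;; so that the port does not keep retrying.
             (u8vector-length buf))))

(define dest (open-output-string))
(define fwd (make-forwarding-port dest))
(display "hi" fwd)
(flush fwd)
(define forwarded (get-output-string dest))   ; "hi"
```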