srfi.181 - Custom ports
This SRFI defines a way to implement a port in Scheme.
Gauche has native support for such ports (see gauche.vport - Virtual ports),
but this SRFI is useful for portable code.
The interface is upward compatible with R6RS.
Note that R7RS Scheme distinguishes binary and textual ports, while Gauche ports can handle both.
• Creating custom ports
• Transcoded ports
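Function: make-custom-binary-input-port id read! get-pos set-pos! close
Function: make-custom-textual-input-port id read! get-pos set-pos! close
[SRFI-181]{srfi.181}
Creates a new binary or textual input port, respectively, and returns it.
The id argument is an arbitrary Scheme object. The SRFI doesn’t
specify how it is used. In Gauche, id is returned by the port-name
procedure (see Common port operations).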
The read! argument is a procedure to be called
as (read! buffer start count).
For make-custom-binary-input-port, buffer is a
bytevector (u8vector).
For make-custom-textual-input-port, buffer is either
a string or a vector of characters (Gauche’s implementation always
uses a vector, but portable code should handle both).
The procedure should generate up to count bytes/characters of data and store them in buffer beginning at start, then return the number of bytes/characters generated. It should generate at least 1 byte/character if there’s still data. To indicate the end of the data, it stores nothing and returns 0.
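The read! protocol for a binary port can be sketched as follows. This is a minimal illustration, not a definitive implementation: the helper name open-u8vector-source is hypothetical, the port does not support positions (get-pos and set-pos! are #f), and the byte accessors come from Gauche’s gauche.uvector module.

```scheme
(use srfi.181)
(use gauche.uvector)

;; Hypothetical helper: returns a binary input port that yields the
;; bytes of the u8vector SRC.
(define (open-u8vector-source src)
  (let ((pos 0))                          ; index of the next unread byte
    (define (read! buffer start count)
      ;; Copy at most COUNT bytes, but no more than what is left in SRC.
      (let ((n (min count (- (u8vector-length src) pos))))
        (dotimes (i n)
          (u8vector-set! buffer (+ start i) (u8vector-ref src (+ pos i))))
        (set! pos (+ pos n))
        n))                               ; n is 0 once SRC is exhausted
    (make-custom-binary-input-port "u8vector-source" read! #f #f #f)))
```

With this sketch, repeated calls to read-byte on (open-u8vector-source (u8vector 10 20 30)) should yield the three bytes in order and then an EOF object.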
The get-pos argument is a procedure to be called without arguments;
it returns an implementation-dependent object that indicates
the current position of the input stream. The ‘current position’ is
where the next read! will generate data from. The argument can be #f
if the port doesn’t provide positions.
For make-custom-binary-input-port, there’s a special rule:
if get-pos returns an exact integer, it should be a byte position
in the stream.
The set-pos! argument is a procedure to be called with one argument,
a new position.
It should set the source’s position so that the next read! starts
generating data from there. The passed position is something that has
been returned by get-pos, or, for make-custom-binary-input-port,
an exact integer that indicates the byte offset from the beginning of
the input. This argument can be #f if the port doesn’t
support setting positions. The value returned by set-pos! is
ignored.
If the position passed to set-pos! is invalid,
an error that satisfies i/o-invalid-position-error? should be
thrown. Portably, this can be done by throwing a condition created
by make-i/o-invalid-position-error (see srfi.192 - Port positioning).
In Gauche-specific code, you can throw an
<io-invalid-position-error> condition.
The close argument is a procedure to be called without
arguments when the custom port is closed. It can be #f if you
don’t need any special cleanup.
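For textual ports, the note about the buffer type matters in portable code. The following sketch, with the hypothetical helper name open-string-source, stores characters through a small dispatcher so that read! works whether the implementation passes a string or a vector of characters. Positions and cleanup are omitted by passing #f.

```scheme
(use srfi.181)

;; Hypothetical helper: returns a textual input port that yields the
;; characters of the string STR.
(define (open-string-source str)
  (let ((pos 0))                          ; index of the next unread character
    ;; BUFFER may be a string or a vector of characters.
    (define (store! buffer k ch)
      (if (string? buffer)
          (string-set! buffer k ch)
          (vector-set! buffer k ch)))
    (define (read! buffer start count)
      (let ((n (min count (- (string-length str) pos))))
        (dotimes (i n)
          (store! buffer (+ start i) (string-ref str (+ pos i))))
        (set! pos (+ pos n))
        n))                               ; 0 signals the end of the data
    (make-custom-textual-input-port "string-source" read! #f #f #f)))
```

A port created this way can be passed to ordinary textual input procedures such as read-line or read-char.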
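Function: make-custom-binary-output-port id write! get-pos set-pos! close :optional flush
Function: make-custom-textual-output-port id write! get-pos set-pos! close :optional flush
[SRFI-181]{srfi.181}
Creates a new binary or textual output port, respectively, and returns it.
The id argument is an arbitrary Scheme object. The SRFI doesn’t
specify how it is used. In Gauche, id is returned by the port-name
procedure (see Common port operations).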
The write! argument is a procedure to be called as
(write! buffer start count).
For make-custom-binary-output-port, buffer is a bytevector.
For make-custom-textual-output-port, buffer is either
a string or a vector of characters (Gauche’s implementation always
uses a vector, but portable code should handle both).
The write! procedure writes data in buffer, starting from start and up to count items, to the external sink. It must return the number of items actually written.
The get-pos argument should be a procedure taking no arguments,
or #f. If it is a procedure, it should return the position
of the sink where the next write! writes to. The position
can be an arbitrary Scheme object, but for
make-custom-binary-output-port, a position represented as
an exact integer should correspond to the byte offset in the port.
The set-pos! argument is a procedure to be called with one argument,
a new position, or #f if the port doesn’t support setting positions.
The procedure should set the sink’s position so that the next write! starts
writing data from there. The passed position is something that has
been returned by get-pos, or, for make-custom-binary-output-port,
an exact integer that indicates the byte offset from the beginning of
the output. The value returned by set-pos! is
ignored.
If the position passed to set-pos! is invalid,
an error that satisfies i/o-invalid-position-error? should be
thrown. Portably, this can be done by throwing a condition created
by make-i/o-invalid-position-error (see srfi.192 - Port positioning).
In Gauche-specific code, you can throw an
<io-invalid-position-error> condition.
The close argument is a procedure to be called without
arguments when the port is closed. It can be #f if you
don’t need any special cleanup.
The flush argument, if provided and not #f,
should be a procedure taking no arguments. It is called when
the port is requested to flush the data buffered in the sink, if any.
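As an illustration of the write! protocol, here is a minimal sketch of a binary output port that collects every byte written to it. The helper name open-byte-collector and the accessor it returns are hypothetical. The sketch assumes the port may buffer output internally, so it calls Gauche’s flush before inspecting the collected bytes.

```scheme
(use srfi.181)
(use gauche.uvector)

;; Hypothetical helper: returns two values, a binary output port and a
;; thunk that returns the list of bytes written to the port so far.
(define (open-byte-collector)
  (let ((collected '()))                  ; bytes, most recent first
    (define (write! buffer start count)
      (dotimes (i count)
        (push! collected (u8vector-ref buffer (+ start i))))
      count)                              ; all COUNT items were consumed
    (values (make-custom-binary-output-port "byte-collector" write! #f #f #f)
            (lambda () (reverse collected)))))
```

A typical use is to bind both values with receive, write to the port with write-byte, call (flush port), and then call the thunk to obtain the bytes in the order they were written.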
Function: make-custom-binary-input/output-port id read! write! get-pos set-pos! close :optional flush
[SRFI-181]{srfi.181}
Creates a bidirectional binary i/o port. Since Gauche doesn’t distinguish
binary and textual ports, you can use the returned port for textual i/o as well,
but portable code must avoid it.
(The reason a textual input/output port is not defined in the SRFI is that it is difficult to define consistent semantics that are agnostic to the internal representation of the character stream. In Gauche, we immediately convert characters to an octet stream in the internal character encoding.)
The arguments id, read!, write!,
get-pos, set-pos!, close and flush are
the same as in make-custom-binary-input-port and
make-custom-binary-output-port.
A transcoded port is a portable way to read/write characters in an encoding other than the system’s default one. This API was first defined in R6RS and then adopted in SRFI-181.
In the SRFI-181 (and R6RS) world, strings and characters are abstract entities without the concept of encodings (internally, you can think of them as being encoded in the system’s native encoding), and explicit encodings only matter when you refer to an outside resource, e.g. files or binary data represented in a bytevector. Therefore, conversions are only defined between binary ports (external world) and textual ports (internal), or between a bytevector (external world) and a string (internal).
Here are some terms:
Codec
A codec specifies a character encoding scheme.
Eol-style
An eol-style specifies whether and how EOL character(s) are converted.
Transcoder
A transcoder bundles a codec, an eol-style, and an error handling mode.
NB: For Gauche-specific code, you can use the gauche.charconv module
(see gauche.charconv - Character Code Conversion), which provides more options.
Function: make-transcoder codec eol-style error-handling
[SRFI-181]{srfi.181}
Creates and returns a transcoder with the given parameters.
A transcoder is an immutable object.
The codec argument must be a codec object, either created by
make-codec or one of the predefined codecs; see below.
The eol-style argument is one of the following symbols, specifying the end-of-line style:
none
End-of-line characters are passed as-is.
lf
An output port converts #\newline to an LF octet.
An input port converts any line ending to #\newline.
crlf
An output port converts #\newline to a CR-LF octet sequence.
An input port converts any line ending to #\newline.
The error-handling argument is a symbol that specifies the behavior when an encoding or decoding error occurs. It can be one of the following symbols:
replace
Octets that do not constitute a valid character encoding are replaced
with #\xFFFD (or #\? if the target encoding of an output
transcoded port does not contain #\xFFFD).
raise
Raise an error with a condition satisfying i/o-encoding-error?.
The erroneous octets are consumed, and the next I/O on the port continues
with the next character.
Function: native-transcoder
[SRFI-181]{srfi.181}
Returns a singleton transcoder representing the system’s native
(internal) codec and eol-style. In Gauche, the native codec
is the codec that uses Gauche’s native encoding
(returned by gauche-character-encoding; see Characters),
and the eol-style is none.
Function: transcoded-port binary-port transcoder
[SRFI-181]{srfi.181}
Creates a transcoded port wrapping binary-port, performing the
conversion specified by transcoder.
If binary-port is an input port, it returns an input port that converts from the character encoding scheme (CES) specified in transcoder to the system’s native encoding.
If binary-port is an output port, it returns an output port that converts from the system’s native encoding to the CES specified in transcoder.
In Gauche, the conversion is done by conversion ports. See Conversion ports for the details.
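A minimal sketch of reading Latin-1 encoded bytes as text through a transcoded port. The names latin1 and latin1-bytes->string are hypothetical. The sketch relies on two Gauche facilities outside this SRFI, open-input-bytevector (loaded here via gauche.vport) and port->string, and assumes the Gauche build supports ISO8859-1 conversion.

```scheme
(use srfi.181)
(use gauche.vport)                        ; for open-input-bytevector

;; A transcoder for Latin-1 with no EOL conversion; invalid input
;; is replaced rather than raising an error.
(define latin1 (make-transcoder (latin-1-codec) 'none 'replace))

;; Hypothetical helper: decodes the u8vector BYTES as Latin-1 by
;; wrapping a binary input port in a transcoded port.
(define (latin1-bytes->string bytes)
  (let ((in (transcoded-port (open-input-bytevector bytes) latin1)))
    (port->string in)))
```

For example, passing (u8vector #x63 #x61 #x66 #xe9) should decode the final octet #xE9 as the single character é.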
Function: bytevector->string bytevector transcoder
[SRFI-181]{srfi.181}
Decodes the binary data in bytevector as the CES specified
by transcoder, and returns a string in the native encoding.
It is a wrapper of Gauche’s ces-convert; see Conversion ports.
Function: string->bytevector string transcoder
[SRFI-181]{srfi.181}
Encodes string in the CES specified by transcoder,
and returns a bytevector.
It is a wrapper of Gauche’s ces-convert-to; see Conversion ports.
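The two conversion procedures can be combined into a round trip, sketched below. The variable names utf8, encoded and decoded are only for illustration.

```scheme
(use srfi.181)

;; A UTF-8 transcoder with no EOL conversion.
(define utf8 (make-transcoder (utf-8-codec) 'none 'replace))

;; Encode a one-character string (U+03BB) into a bytevector ...
(define encoded (string->bytevector "λ" utf8))

;; ... and decode the bytevector back into a string.
(define decoded (bytevector->string encoded utf8))
```

Since U+03BB is encoded in UTF-8 as the two octets #xCE #xBB, encoded should be a two-element bytevector, and decoded should be equal to the original string.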
Function: make-codec name
[SRFI-181]{srfi.181}
Returns a codec representing the character encoding scheme named by
name. Portable code should only use a string for name,
while Gauche accepts a symbol as well.
If name isn’t recognized as a supported codec name,
a condition that satisfies unknown-encoding-error? is thrown.
Function: unknown-encoding-error? obj
[SRFI-181]{srfi.181}
If the system sees an unknown or unsupported codec, a condition
that satisfies this predicate is thrown.
Function: unknown-encoding-error-name obj
[SRFI-181]{srfi.181}
The argument must be an unknown encoding error condition
that satisfies unknown-encoding-error?. It returns
the name that caused the condition to be thrown.
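The following sketch shows one way to probe whether a codec name is supported by catching the condition described above. The helper name codec-status and the symbols it returns are hypothetical.

```scheme
(use srfi.181)

;; Hypothetical helper: returns the symbol supported if NAME is
;; accepted by make-codec; otherwise returns a list of the symbol
;; unknown and the name reported by the condition.
(define (codec-status name)
  (guard (e ((unknown-encoding-error? e)
             (list 'unknown (unknown-encoding-error-name e))))
    (make-codec name)
    'supported))
```

Calling (codec-status "utf-8") should take the normal path, while a made-up name should be caught by the guard clause.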
Function: latin-1-codec
Function: utf-8-codec
Function: utf-16-codec
[SRFI-181]{srfi.181}
Predefined codecs for latin-1 (ISO8859-1), utf-8, and utf-16.
The utf-16 codec recognizes
a BOM when used for input; if no BOM is found, UTF-16BE is assumed.
When used for output, utf-16 always attaches a BOM.
Function: native-eol-style
[SRFI-181]{srfi.181}
Returns the default eol-style. In Gauche, it is none.
Function: i/o-decoding-error? obj
[SRFI-181]{srfi.181}
When an input transcoded port encounters a sequence that’s not valid
for the input codec, a condition that satisfies this predicate
is thrown.
In Gauche, such a condition is an <io-decoding-error>.
Function: i/o-encoding-error? obj
[SRFI-181]{srfi.181}
When an output transcoded port encounters a character that can’t be
encoded in the output codec, and the handling mode is raise,
a condition that satisfies this predicate is thrown.
In Gauche, such a condition is an <io-encoding-error>.
Function: i/o-encoding-error-char obj
[SRFI-181]{srfi.181}
Retrieves the character that caused the <io-encoding-error>
to be thrown.