For Development HEAD DRAFTSearch (procedure/syntax/module):

11.40 srfi.181 - Custom ports

Module: srfi.181

This srfi defines a way to implement a port in Scheme. Gauche has native support of such ports (see gauche.vport - Virtual ports), but this srfi is useful for portable code.

The interface is upward compatible with R6RS.

Note that R7RS Scheme distinguishes binary and textual ports, while Gauche ports can handle both.


11.40.1 Creating custom ports

Function: make-custom-binary-input-port id read! get-pos set-pos! close
Function: make-custom-textual-input-port id read! get-pos set-pos! close

[SRFI-181]{srfi.181} Creates and returns a new binary or textual input port, respectively.

The id argument is an arbitrary Scheme object. This SRFI doesn’t specify how it is used. In Gauche, id will be returned by the port-name procedure (see Common port operations).

The read! argument is a procedure to be called as (read! buffer start count). For make-custom-binary-input-port, buffer is a bytevector (u8vector). For make-custom-textual-input-port, buffer is either a string or a vector of characters (Gauche’s implementation always uses a vector, but portable code should handle both).

It should generate up to count bytes/characters of data and fill buffer beginning from start, then return the number of bytes/characters generated. It should generate at least 1 byte/character if there’s still data. To indicate the end of the data, it writes no data and returns 0.

The get-pos argument is a procedure to be called without arguments. It returns an implementation-dependent object that indicates the current position of the input stream. The ‘current position’ is where the next read! will generate data. This argument can be #f if the port doesn’t provide positions.

For make-custom-binary-input-port, there’s a special rule that if get-pos returns an exact integer, it should be a byte position in the stream.

The set-pos! argument is a procedure to be called with one argument, a new position. It should set the source’s position so that the next read! starts generating data from there. The passed position is something that has been returned by get-pos, or, for make-custom-binary-input-port, an exact integer that indicates the byte offset from the beginning of the input. This argument can be #f if the port doesn’t support setting positions. The returned value of set-pos! is ignored.

If the position passed to set-pos! is invalid, an error that satisfies i/o-invalid-position-error? should be thrown. Portably, this can be done by throwing a condition created by make-i/o-invalid-position-error (see srfi.192 - Port positioning). In Gauche-specific code, you can throw an <io-invalid-position-error> condition.
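The portable route can be sketched as follows, assuming SRFI 192 is importable as (srfi 192). The helper make-checked-set-pos!, its limit argument, and the wrapped setter procedure are hypothetical names introduced only for this illustration; the validity rule (an exact integer between 0 and limit) is likewise just an assumption about what a particular port might accept.

```scheme
(import (scheme base) (srfi 192))

;; Hypothetical helper: wraps SETTER (a one-argument procedure) so that
;; any position outside [0, LIMIT] is rejected with a condition that
;; satisfies i/o-invalid-position-error?.
(define (make-checked-set-pos! limit setter)
  (lambda (pos)
    (unless (and (exact-integer? pos) (<= 0 pos limit))
      (raise (make-i/o-invalid-position-error pos)))
    (setter pos)))
```

The procedure returned by this helper could then be passed as the set-pos! argument of a custom port constructor.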

The close argument is a procedure to be called without arguments when the custom port is closed. It can be #f if you don’t need any special cleanup.
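As an illustration of how these arguments fit together, here is a minimal sketch of a binary input port that serves bytes from an in-memory bytevector. The helper name open-bytes-input and the id string are hypothetical; positioning is reported through get-pos but set-pos! is left as #f for brevity.

```scheme
(import (scheme base) (srfi 181))

;; Hypothetical helper: returns a binary input port that yields the
;; bytes of DATA (a bytevector) in order.
(define (open-bytes-input data)
  (let ((pos 0))                          ; offset of the next unread byte
    (define (read! buf start count)
      ;; Copy as many bytes as fit.  n becomes 0 once the data is
      ;; exhausted, which signals the end of input.
      (let ((n (min count (- (bytevector-length data) pos))))
        (bytevector-copy! buf start data pos (+ pos n))
        (set! pos (+ pos n))
        n))
    (make-custom-binary-input-port
     "bytes-input"                        ; id
     read!
     (lambda () pos)                      ; get-pos: byte offset
     #f                                   ; set-pos!: not supported here
     #f)))                                ; close: no cleanup needed

(define in (open-bytes-input (bytevector 10 20 30)))
(define first-byte (read-u8 in))          ; 10
```

Subsequent calls to read-u8 on the same port return the remaining bytes and then an end-of-file object.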

Function: make-custom-binary-output-port id write! get-pos set-pos! close :optional flush
Function: make-custom-textual-output-port id write! get-pos set-pos! close :optional flush

[SRFI-181]{srfi.181} Creates and returns a new binary or textual output port, respectively.

The id argument is an arbitrary Scheme object. This SRFI doesn’t specify how it is used. In Gauche, id will be returned by the port-name procedure (see Common port operations).

The write! argument is a procedure to be called as (write! buffer start count). For make-custom-binary-output-port, buffer is a bytevector. For make-custom-textual-output-port, buffer is either a string or a vector of characters (Gauche’s implementation always uses a vector, but portable code should handle both).

The write! procedure writes data in buffer, starting from start and up to count items, to the external sink. It must return the number of items actually written.

The get-pos argument should be a procedure taking no arguments, or #f. If it is a procedure, it should return the position in the sink where the next write! will write. The position can be an arbitrary Scheme object, but for make-custom-binary-output-port, a position represented as an exact integer should correspond to the byte offset in the port.

The set-pos! argument is either #f or a procedure to be called with one argument, a new position. It should set the sink’s position so that the next write! starts writing data from there. The passed position is something that has been returned by get-pos, or, for make-custom-binary-output-port, an exact integer that indicates the byte offset from the beginning of the output. Pass #f if the port doesn’t support setting positions. The returned value of set-pos! is ignored.

If the position passed to set-pos! is invalid, an error that satisfies i/o-invalid-position-error? should be thrown. Portably, this can be done by throwing a condition created by make-i/o-invalid-position-error (see srfi.192 - Port positioning). In Gauche-specific code, you can throw an <io-invalid-position-error> condition.

The close argument is a procedure to be called without arguments when the port is closed. It can be #f if you don’t need any special cleanup.

The flush argument, if provided and not #f, should be a procedure taking no arguments. It is called when the port is requested to flush the data buffered in the sink, if any.
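A minimal sketch of a textual output port that collects whatever is written to it. The helper name open-collector and the id string are hypothetical. The write! procedure accepts both a string and a vector of characters, as portable code is advised to do; get-pos, set-pos!, close and flush are all omitted.

```scheme
(import (scheme base) (srfi 181))

;; Hypothetical helper: returns two values, a textual output port and a
;; thunk that returns everything written to the port so far as a string.
(define (open-collector)
  (let ((chars '()))                      ; collected chars, newest first
    (define (write! buf start count)
      (do ((i 0 (+ i 1)))
          ((= i count))
        ;; BUF may be a string or a vector of characters.
        (let ((c (if (string? buf)
                     (string-ref buf (+ start i))
                     (vector-ref buf (+ start i)))))
          (set! chars (cons c chars))))
      count)                              ; every item was consumed
    (values (make-custom-textual-output-port "collector" write! #f #f #f)
            (lambda () (list->string (reverse chars))))))

(define-values (out get-text) (open-collector))
(write-string "hello" out)
(flush-output-port out)                   ; push out anything the port buffered
(define collected (get-text))
```

Flushing before reading the collected text is a precaution so that the sketch does not depend on whether the port buffers its output.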

Function: make-custom-binary-input/output-port id read! write! get-pos set-pos! close :optional flush

[SRFI-181]{srfi.181} Creates a bidirectional binary i/o port. Since Gauche doesn’t distinguish binary and textual ports, you can use the returned port for textual i/o as well, but portable code must avoid it.

(The SRFI does not define a textual input/output port because it is difficult to define consistent semantics that are agnostic to the internal representation of the character stream. In Gauche, we immediately convert characters to an octet stream in the internal character encoding.)

The arguments id, read!, write!, get-pos, set-pos!, close and flush are the same as those of make-custom-binary-input-port and make-custom-binary-output-port.


11.40.2 Transcoded ports

A transcoded port is a portable way to read/write characters in an encoding other than the system’s default one. This API was first defined in R6RS and then adopted in SRFI-181.

In the SRFI-181 (and R6RS) world, strings and characters are abstract entities without the concept of encodings (internally, you can think of them as being encoded in the system’s native encoding), and explicit encodings only matter when you refer to outside resources, e.g. files or binary data represented in a bytevector. Therefore, conversions are only defined between binary ports (external world) and textual ports (internal), or between a bytevector (external world) and a string (internal).

Here are some terms:

Codec

A codec specifies a character encoding scheme.

EOL-style

Specifies (non-)conversion of EOL character(s).

Transcoder

A transcoder bundles a codec, an eol-style, and an error-handling mode.

NB: For Gauche-specific code, you can use the gauche.charconv module (see gauche.charconv - Character Code Conversion), which provides more options.

Transcoders

Function: make-transcoder codec eol-style error-handling

[SRFI-181]{srfi.181} Creates and returns a transcoder with the given parameters. A transcoder is an immutable object.

The codec argument must be a codec object, either created by make-codec or one of the predefined codecs; see below.

The eol-style argument is one of the following symbols, specifying the end-of-line style:

none

End-of-line character is passed as-is.

lf

Output port converts #\newline to LF octet. Input port converts any line ending to #\newline.

crlf

Output port converts #\newline to CR-LF octet sequence. Input port converts any line ending to #\newline.

The error-handling argument is a symbol to specify the behavior when an encoding or decoding error occurs. It can be one of the following symbols.

replace

Octets that do not constitute a valid character encoding are replaced with #\xFFFD (or #\? if the target encoding of an output transcoding port does not contain #\xFFFD).

raise

Raise an error with a condition satisfying i/o-decoding-error? (on input) or i/o-encoding-error? (on output). The erroneous octets are consumed, and the next I/O on the port continues with the next character.

Function: native-transcoder

[SRFI-181]{srfi.181} Returns a singleton transcoder representing the system’s native (internal) codec and eol-style. In Gauche, the native codec is the codec that uses Gauche’s native encoding (returned by gauche-character-encoding, see Characters), and the eol-style is none.

Function: transcoded-port binary-port transcoder

[SRFI-181]{srfi.181} Creates a transcoded port wrapping binary-port, performing the conversion specified by transcoder.

If binary-port is an input port, it returns an input port, converting the CES specified in transcoder to the system’s native encoding.

If binary-port is an output port, it returns an output port, converting the system’s native encoding to the CES specified in transcoder.

In Gauche, conversion is done by conversion ports. See Conversion ports, for the details.
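A minimal sketch of the input direction: three Latin-1 octets are read through a transcoded port and arrive as characters. The variable names are only for illustration, and the binary source is an R7RS bytevector port rather than a file.

```scheme
(import (scheme base) (srfi 181))

;; Three Latin-1 octets: e-acute, t, e-acute.
(define raw (open-input-bytevector (bytevector #xe9 #x74 #xe9)))

;; No EOL conversion; undecodable octets would be replaced.
(define latin-1 (make-transcoder (latin-1-codec) 'none 'replace))

;; Wrapping a binary input port yields a textual input port.
(define in (transcoded-port raw latin-1))
(define text (read-string 3 in))          ; a string of three characters
```

The output direction works the same way, wrapping a binary output port so that characters written to the result are encoded on their way to the sink.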

Function: bytevector->string bytevector transcoder

[SRFI-181]{srfi.181} Decodes the binary data in bytevector as the CES specified by transcoder, and returns a string in the native encoding.

It is a wrapper of Gauche’s ces-convert; see Conversion ports.

Function: string->bytevector string transcoder

[SRFI-181]{srfi.181} Encodes the string in the CES specified by transcoder, and returns a bytevector.

It is a wrapper of Gauche’s ces-convert-to; see Conversion ports.
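The two procedures can be used together for in-memory conversion. The following sketch encodes the same four-character string with two different codecs and then decodes one result back; the variable names are illustrative only.

```scheme
(import (scheme base) (srfi 181))

(define utf8    (make-transcoder (utf-8-codec)   'none 'replace))
(define latin-1 (make-transcoder (latin-1-codec) 'none 'replace))

;; "cafe" with an acute accent on the final letter (U+00E9).
(define s (string #\c #\a #\f (integer->char #xe9)))

(define as-utf8    (string->bytevector s utf8))     ; 5 octets
(define as-latin-1 (string->bytevector s latin-1))  ; 4 octets

;; Decoding with the same transcoder recovers the original string.
(define round-trip (bytevector->string as-latin-1 latin-1))
```

The accented character takes two octets in UTF-8 and one in Latin-1, which accounts for the difference in length.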

Codecs

Function: make-codec name

[SRFI-181]{srfi.181} Returns a codec representing the character encoding scheme named by name. Portable code should only use a string for name, while Gauche accepts a symbol as well.

If name isn’t recognized as a supported codec name, a condition that satisfies unknown-encoding-error? is thrown.

Function: unknown-encoding-error? obj

[SRFI-181]{srfi.181} If the system sees an unknown or unsupported codec, a condition that satisfies this predicate is thrown.

Function: unknown-encoding-error-name obj

[SRFI-181]{srfi.181} The argument must be an unknown encoding error condition that satisfies unknown-encoding-error?. It returns the name that caused the condition to be thrown.
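A minimal sketch of probing for codec support with guard. The helper name describe-codec and the shape of its return value are hypothetical, and the second encoding name in the usage lines is deliberately made up.

```scheme
(import (scheme base) (srfi 181))

;; Hypothetical helper: reports whether NAME is accepted by make-codec.
;; Returns (supported NAME) or (unsupported OFFENDING-NAME).
(define (describe-codec name)
  (guard (e ((unknown-encoding-error? e)
             (list 'unsupported (unknown-encoding-error-name e))))
    (make-codec name)
    (list 'supported name)))

(define known   (describe-codec "utf-8"))
(define unknown (describe-codec "no-such-encoding-xyz"))
```

Only the first element of each result is meant to be relied on here; the exact form of the name reported by unknown-encoding-error-name is left to the implementation.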

Function: latin-1-codec
Function: utf-8-codec
Function: utf-16-codec

[SRFI-181]{srfi.181} Predefined codecs for latin-1 (ISO8859-1), utf-8, and utf-16.

The utf-16 codec recognizes a BOM when used for input; if no BOM is found, UTF-16BE is assumed. When used for output, utf-16 always attaches a BOM.
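The input-side rule can be sketched with bytevector->string. Going by the description above, all three inputs below should decode to the same one-character string "A"; the variable names are illustrative, and the sketch assumes the BOM itself is not delivered as a character.

```scheme
(import (scheme base) (srfi 181))

(define utf16 (make-transcoder (utf-16-codec) 'none 'replace))

;; Big-endian BOM (FE FF) followed by U+0041.
(define with-be-bom
  (bytevector->string (bytevector #xfe #xff #x00 #x41) utf16))

;; Little-endian BOM (FF FE) followed by U+0041.
(define with-le-bom
  (bytevector->string (bytevector #xff #xfe #x41 #x00) utf16))

;; No BOM: the octets are interpreted as UTF-16BE.
(define without-bom
  (bytevector->string (bytevector #x00 #x41) utf16))
```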

EOL style

Function: native-eol-style

[SRFI-181]{srfi.181} Returns the default eol style. In Gauche, it is none.

Transcoding errors

Function: i/o-decoding-error? obj

[SRFI-181]{srfi.181} When an input transcoded port encounters a sequence that’s not valid for the input codec, a condition that satisfies this predicate is thrown.

In Gauche, such a condition is <io-decoding-error>.

Function: i/o-encoding-error? obj

[SRFI-181]{srfi.181} When an output transcoded port encounters a character that can’t be encoded in the output codec, and the handling mode is raise, a condition that satisfies this predicate is thrown.

In Gauche, such a condition is <io-encoding-error>.

Function: i/o-encoding-error-char i/o-encoding-condition

[SRFI-181]{srfi.181} Retrieves the character that caused the <io-encoding-error> to be thrown.


