rfc.zlib - zlib compression library
This module provides bindings to the zlib compression library. Most features of zlib can be used through this module.
Zlib supports reading and writing of the zlib compressed data format (RFC1950), the DEFLATE compressed data format (RFC1951), and the GZIP file format (RFC1952). It also provides procedures to calculate CRC32 and Adler32 checksums.
Compression and decompression are done through specialized ports. There are a number of parameters to fine-tune compression; refer to the zlib documentation for details.
The following condition types are defined to represent errors during processing by zlib.
{rfc.zlib
}
Subclass of <error> and superclass of the following condition types. This class is an abstract class to catch any of the zlib-specific errors. Zlib-specific errors raised by procedures in rfc.zlib are always an instance of (or a compound condition including) one of the following specific classes.
{rfc.zlib
}
Subclasses of <zlib-error>. These condition types correspond to zlib’s Z_NEED_DICT, Z_STREAM_ERROR, Z_DATA_ERROR, Z_MEM_ERROR, and Z_VERSION_ERROR return codes.
When an error occurs while reading data, a compound condition of a subclass of <zlib-error> and <io-read-error> is raised.
When an error occurs without I/O, a simple condition of a subclass of <zlib-error> is raised.
Errors unrelated to zlib, such as an invalid argument error, are raised as a simple <error> condition.
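The condition hierarchy above can be used to separate zlib failures from other errors. The following is a minimal sketch, not a definitive pattern: `inflate-or-false` is a hypothetical helper name, and it relies on `condition-has-type?` so that both simple and compound conditions are matched.

```scheme
(use rfc.zlib)

;; Hypothetical helper: returns the decompressed string, or #f when
;; zlib itself rejects the input. Conditions that are not zlib-specific
;; (for example a plain <error> for a bad argument) are not caught here.
(define (inflate-or-false data)
  (guard (e ((condition-has-type? e <zlib-error>) #f))
    (inflate-string data)))

;; Valid zlib data round-trips.
(print (inflate-or-false (deflate-string "payload")))

;; Input that is not zlib data is expected to take the #f branch.
(print (inflate-or-false "this is not zlib data"))
```

Catching <zlib-error> rather than one specific subclass keeps the handler valid whichever zlib error is reported.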
{rfc.zlib
}
Compression and decompression functions are provided via ports. A deflating port is an output port that compresses the output data. An inflating port is an input port that reads compressed data and decompresses it.
When an inflating port encounters corrupted compressed data, a compound condition of <io-read-error> and <zlib-data-error> is raised during the read operation.
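As an illustration of the port model, here is a small round trip through a deflating port and an inflating port, using string ports as the drain and the source. The helper names `compress` and `decompress` are made up for this sketch.

```scheme
(use rfc.zlib)

;; Hypothetical helper: compress STR by writing it through a deflating
;; port whose drain is an output string port.
(define (compress str)
  (call-with-output-string
    (lambda (drain)
      (let ((out (open-deflating-port drain)))
        (display str out)
        ;; Closing the deflating port flushes buffered data and the trailer.
        (close-output-port out)))))

;; Hypothetical helper: decompress by reading everything an inflating
;; port produces from an input string port.
(define (decompress compressed)
  (call-with-input-string compressed
    (lambda (source)
      (port->string (open-inflating-port source)))))

(define sample "hello hello hello hello")
(print (equal? (decompress (compress sample)) sample))
```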
{rfc.zlib
}
Creates and returns an instance of <deflating-port>, an output port that compresses the output data and sends the compressed data to another output port drain. This combines the functionality of zlib’s deflateInit2() and deflateSetDictionary().
You can specify an exact integer between 1 and 9 (inclusive) as compression-level. A larger integer yields a higher compression ratio at the cost of speed. When omitted, a default compression level is used, which is usually 6.
The following constants are defined to specify compression-level conveniently:
{rfc.zlib
}
The buffer-size argument specifies the buffer size of the port in bytes. The default is 4096.
The window-bits argument specifies the size of the window as an exact integer. Typically the value should be between 8 and 15, inclusive, and it specifies the base two logarithm of the window size used in compression. A larger number yields a better compression ratio but uses more memory. The default value is 15.
There are a couple of special modes selectable by window-bits. When an integer between -8 and -15 is given to window-bits, the port produces raw deflated data that lacks the zlib header and trailer. In this case, the Adler32 checksum isn’t calculated. The actual window size is determined by the absolute value of window-bits.
When window-bits is between 24 and 31, the port uses GZIP encoding; that is, instead of the zlib wrapper, the compressed data is enveloped by a simple gzip header and trailer. The gzip header added in this case has no file name, comment, header CRC, or other extra data; its modification time is zero and its OS field is 255 (unknown). The zstream-adler32 procedure will return the CRC32 checksum instead of Adler32. The actual window size is determined by window-bits minus 16.
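The three window-bits ranges can be compared side by side. This sketch uses deflate-string and inflate-string, which pass their keyword arguments on to the port constructors; the variable names are only illustrative.

```scheme
(use rfc.zlib)

(define text "window-bits selects the container format")

(define zlib-data (deflate-string text :window-bits 15))   ; zlib wrapper
(define raw-data  (deflate-string text :window-bits -15))  ; raw deflate, no wrapper
(define gzip-data (deflate-string text :window-bits 31))   ; gzip wrapper (15 + 16)

;; Each stream must be read back with a matching window-bits mode.
(print (equal? (inflate-string zlib-data :window-bits 15) text))
(print (equal? (inflate-string raw-data  :window-bits -15) text))
(print (equal? (inflate-string gzip-data :window-bits 31) text))

;; The wrappers differ only in header and trailer, so the raw stream
;; is expected to be the shortest of the three.
(print (map string-size (list raw-data zlib-data gzip-data)))
```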
The memory-level argument specifies how much memory should be allocated to keep the internal state during compression. 1 uses the least memory, at the cost of slower and less effective compression. 9 uses the most memory and gives the fastest and best compression. The default value is 8.
To fine-tune the compression algorithm, you can use the strategy argument. The following constants are defined as valid values for strategy:
{rfc.zlib
}
The default strategy, suitable for most ordinary data.
{rfc.zlib
}
Suitable for data generated by filters. Filtered data consists mostly of small values with a random distribution, and this strategy makes the compression algorithm use more Huffman encoding and less string matching.
{rfc.zlib
}
Forces Huffman encoding only (no string matching).
{rfc.zlib
}
Limits match distance to 1 (that is, forces run-length encoding). It is as fast as Z_HUFFMAN_ONLY and gives better compression for PNG image data.
{rfc.zlib
}
Prohibits dynamic Huffman encoding. It allows a simple decoder for special applications.
The choice of strategy only affects compression ratio and speed. Any choice produces correct and decompressible data.
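To see how the tuning arguments affect output size without affecting correctness, one can compress the same input with different settings. This is only a sketch: `packed-size` is a hypothetical helper, the sample text is arbitrary, and the actual sizes depend on the input and the zlib version.

```scheme
(use rfc.zlib)

;; Arbitrary, highly repetitive sample input.
(define text (string-join (make-list 200 "tuning example line") "\n"))

;; Hypothetical helper: size in bytes of TEXT compressed with OPTIONS.
(define (packed-size . options)
  (string-size (apply deflate-string text options)))

(print "level 1:      " (packed-size :compression-level 1))
(print "level 9:      " (packed-size :compression-level 9))
(print "huffman only: " (packed-size :strategy Z_HUFFMAN_ONLY))
(print "memory 1:     " (packed-size :memory-level 1))

;; Whatever the settings, a default inflater restores the input.
(define tuned (deflate-string text :compression-level 1
                                   :strategy Z_HUFFMAN_ONLY))
(print (equal? (inflate-string tuned) text))
```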
You can give an initial dictionary to the dictionary argument to be used in compression. The compressor and decompressor must use exactly the same dictionary. See the zlib documentation for the details.
By default, a deflating port leaves drain open even after all conversion is done and the deflating port itself is closed. If you don’t want to bother closing drain, give a true value to the owner? argument; then drain is closed after the deflating port is closed and all data is written out.
Note: You must close a deflating port explicitly, or the compressed data can be truncated. When you leave a deflating port open to be GCed, the finalizer will close it; however, the order in which finalizers are called is nondeterministic, and it is possible that the drain port is closed before the deflating port is closed. In such cases, the deflating port’s attempt to flush the buffered data and trailer will fail.
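One way to follow the note above is to close the deflating port in a cleanup clause and let owner? take care of the drain. The sketch below assumes the caller supplies a writable file name; `write-compressed-file` and `path` are hypothetical names.

```scheme
(use rfc.zlib)

;; Hypothetical helper: write STR to the file PATH in zlib format.
(define (write-compressed-file path str)
  (let ((out (open-deflating-port (open-output-file path) :owner? #t)))
    (unwind-protect
      (display str out)
      ;; Always close explicitly so that buffered data and the trailer
      ;; are written. Because owner? is true, this also closes the
      ;; underlying file port.
      (close-output-port out))))
```

Closing in the cleanup clause avoids relying on finalizer order, which the note describes as nondeterministic.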
{rfc.zlib
}
Takes an input port source from which compressed data can be read, and creates and returns a new instance of <inflating-port>, that is, a port from which the decompressed data can be read.
This procedure covers the functionality of zlib’s inflateInit2() and inflateSetDictionary().
The meaning of buffer-size and owner? is the same as in open-deflating-port.
The meaning of window-bits is almost the same, except that if a value increased by 32 is given, the inflating port automatically detects whether the source stream is zlib or gzip by its header.
That is, you can specify a value from 8 to 15 to read zlib, from 24 to 31 to read gzip, or from 40 to 47 to use automatic detection.
The window bits must be equal to or greater than the window bits used to compress the source, or a <zlib-data-error> condition is raised. If you don’t know the compression parameters of the input (which is most likely the case), you need to specify the maximum value, i.e. 15 for zlib, 31 for gzip, or 47 to autodetect.
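When the format of the input is not known in advance, the autodetect setting can be wrapped in a small reader. This is a sketch under that assumption; `inflate-any` is a hypothetical name.

```scheme
(use rfc.zlib)

;; Hypothetical helper: accept either zlib or gzip input by asking the
;; inflating port to detect the wrapper (15 + 32 = 47).
(define (inflate-any data)
  (inflate-string data :window-bits 47))

(define text "same reader, two containers")

(print (equal? (inflate-any (deflate-string text)) text))                  ; zlib input
(print (equal? (inflate-any (deflate-string text :window-bits 31)) text))  ; gzip input
```

Note that raw deflated data has no header to detect, so it still needs an explicit negative window-bits value.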
If the input data was compressed with a dictionary, the same dictionary must be given to the dictionary argument. Otherwise, a compound condition of <io-read-error> and <zlib-need-dict-error> will be raised.
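The dictionary requirement can be illustrated as follows. This sketch assumes the dictionary is passed as a string; the dictionary contents and the message are invented for the example.

```scheme
(use rfc.zlib)

;; Invented preset dictionary and message; the assumption here is that
;; the dictionary argument is given as a string.
(define dict "GET POST HTTP/1.1 Content-Type: Content-Length: ")
(define message "POST /submit HTTP/1.1\r\nContent-Type: text/plain\r\n")

(define packed (deflate-string message :dictionary dict))

;; With the same dictionary, the data is restored.
(print (equal? (inflate-string packed :dictionary dict) message))

;; Without it, the description above says <zlib-need-dict-error> is
;; raised as part of a compound condition, so test with
;; condition-has-type? rather than a simple class check.
(print (guard (e ((condition-has-type? e <zlib-need-dict-error>) 'need-dict))
         (inflate-string packed)))
```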
{rfc.zlib
}
The xflating-port argument must be either an inflating port or a deflating port; otherwise an error is raised.
These procedures return the values of the total_in, total_out, adler32, and data_type fields, respectively, of the z_stream structure associated with the given inflating or deflating port.
The value of data_type can be one of the following constants:
{rfc.zlib
}
Changes the compression level and/or strategy while compression is in progress.
{rfc.zlib
}
When a dictionary is given to open-deflating-port, the dictionary’s Adler32 checksum is calculated. This procedure returns the checksum. If no dictionary has been given, this procedure returns #f.
{rfc.zlib
}
Flushes the data buffered in the deflating-port, and resets the compression state. The decompression routine can skip the data up to the full-flush point by inflate-sync.
{rfc.zlib
}
Skips the (possibly corrupted) compressed data up to the next full-flush point marked by deflating-port-full-flush. You may want to use this procedure when you get <zlib-data-error>. Returns the number of bytes skipped when the next full-flush point is found, or #f when the input reaches EOF before finding the next point.
{rfc.zlib
}
Returns zlib’s version as a string.
{rfc.zlib
}
Compresses the given string and returns zlib-compressed data in a string. All optional arguments are passed to open-deflating-port as they are.
{rfc.zlib
}
Takes zlib-compressed data in a string, and returns the decompressed data in a string. All optional arguments are passed to open-inflating-port as they are.
{rfc.zlib
}
Like deflate-string and inflate-string, but uses the gzip format instead. It is the same as giving a value between 24 and 31 to the window-bits argument of deflate-string and inflate-string.
{rfc.zlib
}
Returns the CRC32 checksum of string. If the optional checksum is given, the returned checksum is an update of checksum by string.
{rfc.zlib
}
Returns the Adler32 checksum of string. If the optional checksum is given, the returned checksum is an update of checksum by string.
Calculating Adler32 is faster than CRC32, but it is known to produce an uneven distribution of hash values for small input. See RFC3309 for a detailed description. If it matters, use CRC32 instead.
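The optional checksum argument allows a checksum to be computed piece by piece. The sketch below assumes the two procedures are exported as `crc32` and `adler32`, each taking a string and an optional running checksum as described above.

```scheme
(use rfc.zlib)

;; Assumed procedure names: crc32 and adler32, called as
;; (crc32 string [checksum]) and (adler32 string [checksum]).

;; Checksum of the whole input in one call.
(define crc-whole   (crc32 "hello world"))
(define adler-whole (adler32 "hello world"))

;; The same input fed in two chunks, threading the running checksum.
(define crc-chunked   (crc32 " world" (crc32 "hello")))
(define adler-chunked (adler32 " world" (adler32 "hello")))

(print (= crc-whole crc-chunked))
(print (= adler-whole adler-chunked))
```

Threading the running value this way is useful when the data arrives in blocks and cannot be held in memory as one string.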