A cache is similar to a dictionary, associating keys to values, but its entries may dissapear according to the policy of the cache algorithm. This module defines a common protocol for cache datatypes, and also provides several typical cache implementations.
Let’s start from simple examples to get the idea.
Suppose you want to read files and you want to
cache the frequently read ones. The following code defines
a cached version of
(use data.cache) (use file.util) (define file->string/cached (let1 file-cache (make-lru-cache 10 :comparator string-comparator) (^[path] (cache-through! file-cache path file->string))))
The procedure closes a variable
which is an LRU (least recently used) cache that associates
string pathnames to the file contents.
The actual logic is in
cache-through!, which first consults the
cache if it has an entry for the
path. If the cache has
the entry, its value (the file content) is returned. If not,
it calls file->string with the path to fetch the file content,
register it to the cache, and return it. The capacity of cache is
set to 10 (the first argument of
make-lru-cache), so when the
11th file is read, the least recently used file will be purged from the
The effect of cache isn’t very visible in the above example. You can insert some print stubs to see the cache is actually in action:
(define file->string/cached (let1 file-cache (make-lru-cache 10 :comparator string-comparator) (^[path] (print #"file->string/cached called on ~path") (cache-through! file-cache path (^[path] (print #"cache miss. fetching ~path") (file->string path))))))
Caveat: A cache itself isn’t MT-safe. If you are using it in multithreaded programs, you have to wrap it with an atom (see Synchronization primitives):
(use data.cache) (use file.util) (use gauche.threads) (define file->string/cached (let1 file-cache (atom (make-lru-cache 10 :comparator string-comparator)) (^[path] (atomic file-cache (cut cache-through! <> path file->string)))))
A cache of any kind has a comparator and a storage.
The comparator is used to compare keys; in the example above,
string-comparator to compare string pathnames
(See Basic comparators, for more about comparators).
The storage is a dictionary that maps keys to internal
structures of the cache. By default, a hashtable is
created automatically using the given comparator (or,
if a comparator is omitted, using
The comparator must have hash function.
Alternatively, you can give a pre-filled dictionary (copied from another instance of the same kind of cache) to start cache with some data already in it. Note that what the cache keeps in the dictionary totally depends on the cache algorithm, so you can’t just pass a random dictionary; it has to be created by the same kind of cache. If you pass in the storage, the comparator is taken from it.
Thus, the comparator constructors uniformly take keyword arguments
storage; you can specify either one,
or omit both to use the defaults.
comparator keyword arguments, see above.
Creates and returns a FIFO (first-in, first-out) cache that can hold up to capacity entries. If the number of entries exceeds capacity, the oldest entry is removed.
Creates and returns an LRU (least recently used) cache that can hold up to capacity entries. If the number of entries exceeds capacity, the least recently used entry is removed.
Creates and returns a TTL (time to live) cache with the timeout value timeout. Each entry is timestamped when it’s inserted, and it is removed when the current time passes timeout unit from the timestamp. The actual entry removal is done when the cache is accessed.
By default, the Unix system time (seconds from Epoch) is used as a timestamp, and timeout is in seconds. It may not be fine-grained enough if you add multiple entries in shorter intervals than seconds. You can customize it by giving a thunk to timestamper; the thunk is called to obtain a timestamp, which can be any monotonically increasing real number. If you give timestamper, the unit of timeout value should be the same as whatever timestamper returns.
A variation of TTL cache, but the entry’s timestamp is updated (refreshed) whenever the entry is read. Hence we call it TTL with refresh (TTLR). But you can also think it as a variation of LRU cache with timeout.
The unit of timeout, and the role of timestamper argument, are
the same as
The following APIs are for the users of a cache.
Look for an entry with key in cache, and returns its value if it exists. If there’s no entry, the procedure returns default if it is provided, or throws an error otherwise.
Some types of cache algorithms update cache by this operation, hence the bang is in the name.
Look for an entry with key in cache, and returns its value if it exists. If there’s no entry, a procedure value-fn is called with key as the argument, and its return value is inserted into cache and also returned.
This inserts association of key and value into cache. If there’s already an entry with key, it is overwritten. Otherwise a new entry is created.
The same effect can be achieved by calling
cache-through!, but cache algorithms may provide efficient
way through this method.
Removes an entry with key from cache, if it exists.
Removes all entries from cache.
Each cache algorithm must define a class inheriting
and implement the following two essential methods.
The higher-level API calls them.
Looks for an entry with key in cache.
If it exists, returns a pair of key and the associated value.
#f. It may update the cache,
for example, the timestamp of the entry for being read.
Add an entry with key and associated value into cache. This is called after key is confirmed not being in cache.
Additionally, the implementation should consider the following points.
next-methodfirst, which sets up the
storageslots. You should check if
storagehas pre-filled entries, and if so, set up other internal structures appropriately.
cache-clear!only takes care of the storage of the cache. You should implement them if your auxiliary structure needs to be taken care of.
cache-register!. You may provide alternative method if you can do it more efficiently, which is often the case.
There are several procedures that help implementing cache subclasses:
Returns the comparator and the storage of the cache, respectively.
Typical caches may be constructed with a storage (dictionary) and
a queue, where the storage maps keys to
(<n> . <value>),
and queues holds
(<key> . <n>),
<n> being a number
(timestamp, counter, etc.) Here are some common operations work
on this queue-and-dictionary scheme:
You can call this in the
initialize method to set up the
queue. This procedure walks storage to construct
(<key> . <n>)
pairs, sorts it in increasing order of
<n>, and pushes them
into the queue.
The queue may contain multiple pairs with the same key. Sometimes the queue gets to have too many duplicated entries (e.g. the same entry is read repeatedly). This scans the queue and removes duplicated entries but the up-to-date one. After this operation, the length of the queue and the number of entries in the storage should match.
This procedure renumbers
<n>s in the queue and the storage
starting from 0, without changing their order, and returns the
<n>. The duplicated entries in the queue is removed
When you’re using monotonically increasing counter for
you don’t want
<n> to get too big (i.e. bignums), you can
call this procedure occasionally to keep
<n>’s in reasonable