For Development HEAD DRAFT

12.10 control.pmap - Parallel map

Module: control.pmap

This module provides high-level utilities to run code in parallel using threads. For example, the pmap procedure applies the given procedure to each element in the collection and gathers the results into a list, just like map, but the application of the procedure is done in parallel.

The desired parallelization strategy differs from application to application, so we also provide mapper objects, which encapsulate how the work is distributed.

High-level API

Function: pmap proc collection :key mapper

{control.pmap} The proc argument must be a procedure that takes one argument, and collection must be a collection (see gauche.collection - Collection framework).

Applies proc on each element of collection, possibly concurrently using multiple threads. The result is gathered into a list and returned.

You can pass a mapper to the mapper keyword argument to specify how the tasks are distributed to multiple threads.
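
As a minimal sketch of the calls described above: the square procedure and the input data are hypothetical and chosen only for illustration. The first call relies on whatever mapper default-mapper currently holds; the second passes a mapper explicitly. Only the sum of the results is printed, so the example does not depend on anything beyond what is stated above.

```scheme
(use control.pmap)

;; Hypothetical task, used only for illustration.
(define (square n) (* n n))

;; Uses whatever mapper default-mapper currently holds.
(define squares (pmap square (iota 8)))

;; Same work on a vector, with an explicitly chosen mapper.
(define squares/static
  (pmap square '#(0 1 2 3 4 5 6 7) :mapper (make-static-mapper 2)))

(print (apply + squares))         ; prints 140
(print (apply + squares/static))  ; prints 140
```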

Function: pfind pred collection :key mapper
Function: pany pred collection :key mapper

{control.pmap} These are used to find one element that satisfies the predicate pred. As soon as such an element is found, the other tasks are cancelled.

pfind is like find that returns the element that satisfies pred, while pany is like any that returns the result of pred that isn’t #f.

If no element satisfies pred, #f is returned.

If more than one element satisfies pred, which one is picked depends on various factors, so you shouldn’t count on deterministic behavior.
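
The difference between the two can be sketched as follows. The predicates and the candidates list are hypothetical. Several elements satisfy the first two predicates, so the comments state only what must hold for any of them rather than a specific value.

```scheme
(use control.pmap)

;; Hypothetical input: the integers 0..999.
(define candidates (iota 1000))

;; pfind returns the element itself.  Several elements match, so
;; which positive multiple of 97 comes back is not specified.
(define found
  (pfind (lambda (n) (and (> n 0) (zero? (modulo n 97))))
         candidates))

;; pany returns the non-#f value produced by pred, here a list
;; of the element and its quotient by 97.
(define hit
  (pany (lambda (n)
          (and (> n 0)
               (zero? (modulo n 97))
               (list n (quotient n 97))))
        candidates))

;; No element satisfies pred, so the result is #f.
(define none (pfind (lambda (n) (> n 5000)) candidates))
```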

Mappers

A mapper is an object that encapsulates a strategy to run tasks in parallel. We provide the following mappers.

Static mapper

Creates several threads and distributes the tasks evenly among them. It is suitable when the number of tasks is large and each task is expected to take roughly the same amount of time, since it has less overhead than the other multi-threading mappers.

Pool mapper

Uses a thread pool to process the tasks. It is suitable when the number of tasks is large and/or the execution time of each task varies a lot. You can also reuse the pooled threads to reduce the overhead of thread creation.

Fully concurrent mapper

Creates one thread per task. It is suitable when the tasks involve blocking I/O calls and the number of tasks is not very large.

Sequential mapper

This runs tasks sequentially in the calling thread, with no concurrency involved. It serves two purposes: (1) on a single-core system, it is the strategy with the least overhead, and (2) you can test algorithmic correctness without the complications of concurrency. On single-core systems, this mapper is the default value of default-mapper.

Parameter: default-mapper

{control.pmap} A parameter keeping a mapper to be used by pmap etc. when no mapper is specified.

The default is a static mapper (with the number of threads equal to the number of available cores) if Gauche is running on a system with more than one core, or a sequential mapper otherwise.

The mapper set to this parameter is reused, and may even be used simultaneously by multiple pmap calls. A pool mapper with an external pool keeps the given thread pool in it, so you should be careful about using such a mapper as the default mapper.
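
Since default-mapper is a parameter, one way to change it for a limited extent is parameterize. The following is a sketch under that assumption; the helper sum-of-squares is hypothetical and exists only to show that pmap picks up the parameter when no mapper keyword is given.

```scheme
(use control.pmap)

;; Hypothetical helper: no :mapper is passed, so pmap consults
;; default-mapper at the time of the call.
(define (sum-of-squares n)
  (apply + (pmap (lambda (i) (* i i)) (iota n))))

;; With the mapper that the system chose as the default.
(define with-default (sum-of-squares 100))

;; With a two-thread static mapper, only within this dynamic extent.
(define with-static
  (parameterize ([default-mapper (make-static-mapper 2)])
    (sum-of-squares 100)))

(print with-default)  ; prints 328350
(print with-static)   ; prints 328350
```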

Function: sequential-mapper

{control.pmap} Returns a singleton instance of the sequential mapper.
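
A possible use when checking algorithmic correctness is to run the same call with the sequential mapper and compare it against plain map. The procedure collatz-steps below is a hypothetical workload used only for illustration.

```scheme
(use control.pmap)

;; Hypothetical workload: number of Collatz steps needed to reach 1.
(define (collatz-steps n)
  (let loop ([n n] [steps 0])
    (cond [(= n 1) steps]
          [(even? n) (loop (quotient n 2) (+ steps 1))]
          [else (loop (+ (* 3 n) 1) (+ steps 1))])))

(define inputs (iota 20 1))   ; 1, 2, ..., 20

;; No threads are involved here; tasks run in the calling thread.
(define sequential-result
  (pmap collatz-steps inputs :mapper (sequential-mapper)))

(define reference (map collatz-steps inputs))
```

If sequential-result and reference disagree, the problem lies in the task itself rather than in concurrency.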

Function: make-static-mapper :optional num-threads

{control.pmap} Returns a new instance of a static mapper, which spawns num-threads threads on execution, each of which handles an evenly divided share of the tasks. This mapper is suitable if you have a large number of small tasks with even load.

If num-threads is omitted, the number of available processors returned by sys-available-processors is used (see Environment inquiry).
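
A small sketch of both ways to create a static mapper, applied to many small tasks of even load. The input data is hypothetical, and the thread count of 4 is an arbitrary choice for illustration.

```scheme
(use control.pmap)

;; Hypothetical input: decimal representations of 0..999.
(define numerals (map number->string (iota 1000)))

;; Thread count taken from sys-available-processors.
(define auto-mapper (make-static-mapper))

;; Explicit thread count (4 is arbitrary).
(define four-mapper (make-static-mapper 4))

(define digits/auto
  (apply + (pmap string-length numerals :mapper auto-mapper)))
(define digits/four
  (apply + (pmap string-length numerals :mapper four-mapper)))

(print digits/auto)  ; prints 2890
(print digits/four)  ; prints 2890
```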

Function: make-pool-mapper :optional external-pool

{control.pmap} Returns a new instance of a pool mapper, which uses a thread pool (see control.thread-pool - Thread pools) to run the tasks. It is suitable when the load of tasks varies a lot.

If external-pool is not given, the mapper creates a thread pool, and shuts it down, every time a high-level mapping operation is called. This usage is local; that is, the thread pool is contained within one call of pmap etc., and won’t be shared.

Alternatively, you can pass an existing thread pool to external-pool to be used. The pool will be reused every time you use this mapper instance. Using an external pool eliminates the overhead of creating and shutting down a thread pool every time you run pmap; however, you have to be aware of the following caveats:

  • It’s your responsibility to shut down the thread pool after you’re done with the mapper.
  • The mapper keeps the given thread pool and reuses it every time it is passed to pmap etc., so you have to make sure that one mapper is not used simultaneously in more than one call of pmap etc. Be careful about using this type of pool mapper as the default mapper.
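
The external-pool usage and the first caveat can be sketched as follows. This assumes make-thread-pool and terminate-all! from control.thread-pool; the pool size of 4 and the helper run-batches are hypothetical choices for illustration. The two pmap calls run one after the other, so the mapper is never used by two calls at once.

```scheme
(use control.pmap)
(use control.thread-pool)

;; The caller owns the pool and is responsible for shutting it down.
(define pool (make-thread-pool 4))
(define mapper (make-pool-mapper pool))

;; Hypothetical helper: two consecutive pmap calls reusing one pool.
(define (run-batches)
  (let* ([doubled (pmap (lambda (n) (* n 2)) (iota 50) :mapper mapper)]
         [bumped  (pmap (lambda (n) (+ n 1)) (iota 50) :mapper mapper)])
    (list (apply + doubled) (apply + bumped))))

;; Shut the pool down even if one of the tasks raises an error.
(define sums
  (unwind-protect (run-batches)
    (terminate-all! pool)))

(print sums)  ; prints (2450 1275)
```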

Function: make-fully-concurrent-mapper :optional timeout timeout-val

{control.pmap} Returns a new instance of a fully-concurrent mapper, which spawns as many threads as there are elements in the given collection to perform the operation concurrently. It is suitable when you don’t have many tasks, but each task may perform blocking I/O calls. The overhead of creating threads is relatively large, but you may be able to utilize the CPU more while most of the threads are waiting for I/O.

The optional timeout argument can specify the timeout value for threads to run. As in thread-join!, it can be either a <time> object to specify an absolute point of time, a real number for the number of seconds from the current time, or #f to run it indefinitely. The default is #f.

If the timeout is reached, timeout-val, which defaults to #f, is used in place of the result of proc.

NB: If the timeout is reached, the running threads are terminated by thread-terminate!. If a thread is holding a mutex when this occurs, the mutex is left in the ‘abandoned’ state. The resources in use at that point may not be properly cleaned up. If you need to ensure proper cleanup in bounded time, you need to code it in proc explicitly.
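
The timeout behavior might be exercised as in the following sketch. The procedure fake-fetch is hypothetical and simulates blocking I/O with thread-sleep!; the 0.2 second timeout, the 2 second stall, and the symbol timed-out are arbitrary values for illustration. The stalled task holds no mutex or other resource, so terminating its thread is harmless here.

```scheme
(use control.pmap)
(use gauche.threads)

;; Hypothetical task: pretends to do blocking I/O.  The task for
;; id 2 stalls far longer than the timeout allows.
(define (fake-fetch id)
  (thread-sleep! (if (= id 2) 2 0.01))
  (* id 10))

;; Give up on a task after 0.2 seconds and use the symbol
;; timed-out in place of its result.
(define mapper (make-fully-concurrent-mapper 0.2 'timed-out))

(define results (pmap fake-fetch '(1 2 3) :mapper mapper))

;; Count how many tasks were replaced by the timeout value.
(define timeouts
  (length (filter (lambda (r) (eq? r 'timed-out)) results)))

(print timeouts)  ; prints 1
```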


