For Development HEAD DRAFT

12.8 control.pmap - Parallel map

Module: control.pmap

This module provides high-level utilities to run code in parallel using threads. For example, the pmap procedure applies the given procedure to each element in the collection and gathers the results into a list, just like map, but the application of the procedure is done in parallel.

The desired parallelization strategy differs from application to application, so we also provide mapper objects, which encapsulate how the work is distributed.

High-level API

Function: pmap proc collection :key mapper

{control.pmap} The proc argument must be a procedure that takes one argument, and collection must be a collection (see Collection framework).

Applies proc to each element of collection, possibly concurrently using multiple threads. The results are gathered into a list and returned.

You can pass a mapper to the mapper keyword argument to specify how the tasks are distributed to multiple threads.
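
As a minimal sketch of the calls described above (slow-square is a hypothetical stand-in for an expensive computation, and the thread count 2 is an arbitrary choice), the following runs the same mapping once with the default mapper and once with an explicit static mapper. Both results should equal what map would return.

```scheme
(use control.pmap)

;; Hypothetical stand-in for an expensive per-element computation.
(define (slow-square x)
  (* x x))

;; Uses whatever mapper default-mapper currently holds.
(define squares-default
  (pmap slow-square '(1 2 3 4 5)))

;; Any collection is accepted; here a vector, processed by an
;; explicitly chosen static mapper with two threads.
(define squares-static
  (pmap slow-square (vector 1 2 3 4 5) :mapper (make-static-mapper 2)))

(print squares-default)
(print squares-static)
```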

Function: pfind pred collection :key mapper
Function: pany pred collection :key mapper

{control.pmap} These procedures find one element that satisfies the predicate pred. As soon as such an element is found, the other tasks are cancelled.

pfind is like find and returns the element that satisfies pred, while pany is like any and returns the non-#f result of pred.

If no element satisfies pred, #f is returned.

If more than one element satisfies pred, which one is picked depends on various factors, so you shouldn’t count on deterministic behavior.
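
The difference between the two procedures can be sketched as follows. The list nums and the predicate passed to pany are made up for illustration. Because two elements satisfy the predicate, the comments list both possible answers rather than a single one.

```scheme
(use control.pmap)

(define nums '(3 8 15 22 31))

;; pfind returns a matching element: 8 or 22, depending on scheduling.
(define found (pfind even? nums))

;; pany returns the predicate's non-#f result: 80 or 220.
(define scaled
  (pany (^x (and (even? x) (* x 10))) nums))

;; No element is negative, so #f is returned.
(define none (pfind negative? nums))

(print found)
(print scaled)
(print none)
```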

Mappers

A mapper is an object that encapsulates a strategy to run tasks in parallel. We provide the following mappers.

Static mapper

Creates several threads and distributes the tasks evenly among them. It is suitable when the number of tasks is large and each task is expected to take roughly the same amount of time, because it incurs less overhead than the other multi-threading mappers.

Pool mapper

Uses a thread pool to process the tasks. It is suitable when the number of tasks is large and/or the execution time of each task varies a lot. You can also reuse the pooled threads to reduce the overhead of thread creation.

Fully concurrent mapper

Creates one thread per task. It is suitable when the tasks involve blocking I/O calls and the number of tasks is not so large.

Sequential mapper

Runs tasks sequentially in the calling thread. No concurrency is involved. It serves two purposes: (1) on a single-core system, it is the strategy with the least overhead, and (2) you can test algorithmic correctness without the complications of concurrency. On single-core systems, this mapper is the default value of default-mapper.

Parameter: default-mapper

{control.pmap} A parameter keeping a mapper to be used by pmap etc. when no mapper is specified.

The default is a static mapper (with as many threads as there are available cores) if Gauche is configured with threads and the running system has more than one core, or a sequential mapper otherwise.

The mapper set to this parameter is reused, and may even be used simultaneously by multiple pmap calls. A pool mapper with an external pool keeps the given thread pool in it, so you should be careful about using such a mapper as the default mapper.
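
Since default-mapper is a parameter, it can presumably be rebound for a dynamic extent with parameterize. The sketch below (add1 is a hypothetical helper) forces sequential execution for one pmap call without passing the mapper keyword argument.

```scheme
(use control.pmap)

;; Hypothetical helper used only for this illustration.
(define (add1 x) (+ x 1))

;; Within this dynamic extent, pmap picks up the sequential mapper
;; and runs in the calling thread.
(define result
  (parameterize ((default-mapper (sequential-mapper)))
    (pmap add1 '(1 2 3))))

(print result)
```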

Function: sequential-mapper

{control.pmap} Returns a singleton instance of the sequential mapper.
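
One way to use the sequential mapper for checking correctness is to compare its result with that of a concurrent mapper. This is only a sketch: same-under-concurrency? is a hypothetical helper, and the two-thread static mapper is an arbitrary choice that assumes Gauche was built with thread support.

```scheme
(use control.pmap)

;; Hypothetical helper: returns #t if PROC yields the same list whether
;; it is mapped sequentially or by a two-thread static mapper.
(define (same-under-concurrency? proc coll)
  (equal? (pmap proc coll :mapper (sequential-mapper))
          (pmap proc coll :mapper (make-static-mapper 2))))

(define ok?
  (same-under-concurrency? (^x (* x x)) '(1 2 3 4 5 6)))

(print ok?)
```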

Function: make-static-mapper :optional num-threads

{control.pmap} Returns a new instance of a static mapper, which spawns num-threads threads on execution, each of which handles an evenly divided share of the tasks. This mapper is suitable if you have a large number of small tasks with even load.

Function: make-pool-mapper :optional external-pool

{control.pmap} Returns a new instance of a pool mapper, which uses a thread pool (see Thread pools) to run the tasks. It is suitable when the load of tasks varies a lot.

If external-pool is not given, the mapper creates a thread pool and shuts it down every time a high-level mapping operation is called. This usage is local; that is, the thread pool is contained within one call of pmap etc., and won’t be shared.
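
A minimal sketch of this local usage follows. uneven-work is a hypothetical task whose running time depends on its input, simulated here with thread-sleep! from gauche.threads. The pool is created and shut down inside the single pmap call.

```scheme
(use control.pmap)
(use gauche.threads)

;; Hypothetical task whose running time varies with its input.
(define (uneven-work n)
  (thread-sleep! (* n 0.001))   ; sleep n milliseconds
  (* n 2))

;; No external pool is given, so the mapper manages its own pool
;; for the duration of this call only.
(define doubled
  (pmap uneven-work '(5 1 9 2) :mapper (make-pool-mapper)))

(print doubled)
```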

Alternatively, you can pass an existing thread pool to external-pool. The pool will be reused every time you use this mapper instance. Using an external pool eliminates the overhead of creating and shutting down a thread pool every time you run pmap; however, you have to be aware of these caveats:

Function: make-fully-concurrent-mapper :optional timeout timeout-val

{control.pmap} Returns a new instance of a fully-concurrent mapper, which spawns as many threads as there are elements in the given collection to perform the operation concurrently. It is suitable when you don’t have many tasks, but each task may perform blocking I/O calls. The overhead of creating threads is relatively large, but you may be able to utilize the CPU more while most of the threads are waiting for I/O.

The optional timeout and timeout-val arguments are passed to thread-join! (see Thread procedures). They are useful when I/O operations may take too long and you want to guarantee that the entire operation finishes within a certain time limit.
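
The following sketch illustrates the timeout arguments. fake-fetch is a hypothetical stand-in for a blocking I/O call, and the values 5 and timed-out are arbitrary. It assumes that a real number given as timeout is interpreted as relative seconds, as it is by thread-join!. Here every task finishes well within the limit, so timeout-val should not appear in the result.

```scheme
(use control.pmap)
(use gauche.threads)

;; Hypothetical stand-in for a blocking I/O call.
(define (fake-fetch id)
  (thread-sleep! 0.01)          ; pretend to wait for I/O
  (list 'fetched id))

;; One thread per element. The timeout (5) and timeout-val (timed-out)
;; are handed to thread-join! by the mapper.
(define results
  (pmap fake-fetch '(a b c)
        :mapper (make-fully-concurrent-mapper 5 'timed-out)))

(print results)
```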

