control.pmap - Parallel map
This module provides high-level utilities to run code in parallel using threads.
For example, the pmap procedure applies the given procedure to each element in the collection and gathers the results into a list, just like map, but the application of the procedure is done in parallel.
The desired parallelization strategy differs between applications, so we also provide mapper objects, which encapsulate how the work is distributed.
{control.pmap}
The proc argument must be a procedure that takes one argument,
and collection must be a collection (see gauche.collection - Collection framework).
Applies proc to each element of collection, possibly concurrently using multiple threads. The results are gathered into a list and returned.
You can pass a mapper to the mapper keyword argument to specify how the tasks are distributed to multiple threads.
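As a minimal sketch of the description above, the following compares pmap with the ordinary map on a CPU-bound task. The fib procedure and the input list are hypothetical, chosen only for illustration. The comparison prints #t if the results come back in the same order as map would produce, as the "just like map" wording suggests.

```scheme
(use control.pmap)

;; Hypothetical workload: a naive Fibonacci, picked only because it is CPU-bound.
(define (fib n)
  (if (< n 2)
      n
      (+ (fib (- n 1)) (fib (- n 2)))))

(define inputs '(20 21 22 23))

;; No mapper keyword is given, so the current default mapper is used.
(define parallel-results (pmap fib inputs))

;; The sequential counterpart, for comparison.
(define sequential-results (map fib inputs))

(print (equal? parallel-results sequential-results))
```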
{control.pmap} These are to be used to find one element that satisfies the predicate pred. As soon as the element is found, other tasks are cancelled.
pfind is like find in that it returns the element that satisfies pred, while pany is like any in that it returns the non-#f result of pred.
If no element satisfies pred, #f is returned.
If more than one element satisfies pred, which one is picked depends on various factors, so you shouldn’t count on deterministic behavior.
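The difference between the two can be sketched as follows. This assumes pfind and pany take the predicate first and the collection second, mirroring find and any; the argument order is not shown in this section. Because more than one element satisfies the predicate, the example checks membership rather than one specific answer.

```scheme
(use control.pmap)

(define numbers '(3 8 15 22 31))

;; pfind returns a matching element itself (here, one of the even numbers).
(define found (pfind even? numbers))

;; pany returns the non-#f value produced by the predicate.
(define squared
  (pany (lambda (n) (and (even? n) (* n n))) numbers))

;; Either 8 or 22 may be picked, so test membership instead of equality.
(print (if (memv found '(8 22)) 'ok 'unexpected))
(print (if (memv squared '(64 484)) 'ok 'unexpected))

;; No element is negative, so #f is returned.
(define missing (pfind negative? numbers))
(print missing)
```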
A mapper is an object that encapsulates a strategy to run tasks in parallel. We provide the following mappers.
Static mapper
Creates several threads and distributes the tasks evenly among them. It is suitable when the number of tasks is large and each task is expected to take roughly the same amount of time, since it has less overhead than the other multi-threading mappers.
Pool mapper
Uses a thread pool to process the tasks. It is suitable when the number of tasks is large and/or the execution time of each task varies a lot. You can also reuse the pooled threads to reduce the overhead of thread creation.
Fully concurrent mapper
Creates one thread per task. It is suitable when the tasks involve blocking I/O calls and the number of tasks is not so large.
Sequential mapper
Runs tasks sequentially in the calling thread. No concurrency is involved.
It serves two purposes: (1) on a single-core system, this is the strategy
with the least overhead, and (2) you can test algorithmic correctness
without the complications of concurrency. On single-core systems,
this mapper is the default value of default-mapper.
{control.pmap}
A parameter keeping a mapper to be used by pmap etc. when no
mapper is specified.
The default is a static mapper (with the number of threads equal to the number of available cores) if Gauche is running on a system with more than one core, or a sequential mapper otherwise.
The mapper set to this parameter is reused, and may even be used
simultaneously from multiple pmap calls.
A pool mapper with an external pool keeps the given thread pool in it,
so you should be careful when using such a mapper as the default mapper.
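A minimal sketch of overriding the default for a limited scope, for instance to test a procedure without concurrency. It assumes the sequential mapper is obtained by calling sequential-mapper with no arguments; that name is not shown in this section and should be checked against the entry for the sequential mapper. Since default-mapper is a parameter, parameterize restores the previous value on exit.

```scheme
(use control.pmap)

(define (square x) (* x x))

;; Inspect the current default; which class it is depends on the core count.
(print (class-of (default-mapper)))

;; Temporarily run pmap without concurrency.
;; Assumption: (sequential-mapper) returns the sequential mapper.
(define seq-result
  (parameterize ((default-mapper (sequential-mapper)))
    (pmap square '(1 2 3 4))))

(print seq-result)
```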
{control.pmap} Returns a singleton instance of the sequential mapper.
{control.pmap} Returns a new instance of a static mapper, which spawns num-threads threads on execution, each of which handles an evenly divided share of the tasks. This mapper is suitable if you have a large number of small tasks with even load.
If num-threads is omitted, the number of available processors returned by
sys-available-processors is used (see Environment inquiry).
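The following sketch shows a static mapper on many small, uniform tasks. It assumes the constructor is named make-static-mapper and takes num-threads as an optional positional argument; neither detail is shown in this section. The thread count of 2 is arbitrary.

```scheme
(use control.pmap)

;; Many small tasks with even load: the case this mapper targets.
(define inputs (iota 1000))

(define (add1 x) (+ x 1))

;; Assumption: num-threads is passed positionally.
(define two-way (make-static-mapper 2))

;; Omitting the argument falls back to the number of available processors.
(define per-core (make-static-mapper))

(define two-way-results (pmap add1 inputs :mapper two-way))
(define per-core-results (pmap add1 inputs :mapper per-core))

(print (equal? two-way-results (map add1 inputs)))
(print (equal? per-core-results (map add1 inputs)))
```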
{control.pmap}
Returns a new instance of a pool mapper, which uses a thread pool
(see control.thread-pool - Thread pools) to run the tasks. It is suitable
when the load of the tasks varies a lot.
If external-pool is not given, the mapper creates a thread pool and
shuts it down every time a high-level mapping operation is called.
This usage is local; that is, the thread pool is contained within
one call of pmap etc., and won’t be shared.
Alternatively, you can pass an existing thread pool to external-pool
to be used. The pool will be reused every time you use this mapper instance.
Using an external pool eliminates the overhead of creating and shutting
down a thread pool every time you run pmap; however, you have to be
aware of these caveats:
pmap etc., so you have to make sure that one mapper is not used simultaneously in more than one pmap etc.
Be careful using this type of pool mapper as the default mapper.
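The two usages can be sketched as follows. This assumes the constructor is named make-pool-mapper and takes external-pool as an optional positional argument, which this section does not show. make-thread-pool and terminate-all! come from control.thread-pool. The pool size of 4 and the count-up workload are hypothetical. The shared mapper is used from one pmap call at a time, in line with the caveat above.

```scheme
(use control.pmap)
(use control.thread-pool)

;; Hypothetical uneven workload: the cost grows with the argument.
(define (count-up n)
  (let loop ((i 0) (acc 0))
    (if (= i n)
        acc
        (loop (+ i 1) (+ acc 1)))))

(define inputs '(10 100000 50 200000 5))

;; Local usage: a pool is created and shut down inside this one call.
(define local-results
  (pmap count-up inputs :mapper (make-pool-mapper)))

;; External usage: the pool is created once and reused across calls.
(define pool (make-thread-pool 4))
(define shared-mapper (make-pool-mapper pool))

(define first-run  (pmap count-up inputs :mapper shared-mapper))
(define second-run (pmap count-up inputs :mapper shared-mapper))

;; We created the pool ourselves, so we terminate it when done.
(terminate-all! pool)

(print (equal? local-results inputs))
(print (equal? first-run second-run))
```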
{control.pmap} Returns a new instance of a fully-concurrent mapper, which spawns as many threads as there are elements in the given collection to perform the operation concurrently. It is suitable when you don’t have many tasks, but each task may perform blocking I/O calls. The overhead of creating threads is relatively large, but you may be able to utilize the CPU more while most of the threads are waiting for I/O.
The optional timeout argument specifies the timeout value for threads
to run. As in thread-join!, it can be either a <time> object to specify
an absolute point of time, a real number for the number of seconds from
the current time, or #f to run indefinitely. The default is #f.
If the timeout is reached, timeout-val is used in place of
the result of proc. It defaults to #f.
NB: If the timeout is reached, the running threads are terminated by
thread-terminate!. If a thread is holding a mutex when that happens,
the mutex enters the ‘abandoned’ state, and the resources being used
may not be properly cleaned up. If you need to ensure proper cleanup
in bounded time, you need to code it in proc explicitly.
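A minimal sketch of the timeout behavior, assuming the constructor is named make-fully-concurrent-mapper and takes timeout and timeout-val as optional positional arguments; this section does not show the signature. The slow-fetch procedure is a hypothetical stand-in for a blocking I/O call and holds no mutex, so terminating it should be harmless. Per the description above, the task that sleeps past the 0.5 second limit would be expected to yield the symbol timed-out in place of its result.

```scheme
(use control.pmap)
(use gauche.threads)

;; Hypothetical stand-in for blocking I/O: sleep, then return the delay.
(define (slow-fetch seconds)
  (thread-sleep! seconds)
  seconds)

;; Assumption: timeout and timeout-val are passed positionally.
;; Each task gets at most 0.5 seconds from the start of the mapping.
(define mapper (make-fully-concurrent-mapper 0.5 'timed-out))

;; The second task sleeps well past the limit.
(define results (pmap slow-fetch '(0.1 3 0.2) :mapper mapper))

(print results)
```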