For Gauche 0.9.5



9.24 gauche.process - High Level Process Interface

Module: gauche.process

This module provides a higher-level API for Unix process control, implemented on top of low-level system calls such as sys-fork and sys-exec. It also provides “process ports”, a convenient way to send information to and receive information from subprocesses.



9.24.1 Running subprocess

Function: run-process cmd/args :key redirects input output error fork wait directory host sigmask detached

Runs the command with the arguments given in cmd/args in a subprocess and returns a <process> object (see Process object). The cmd/args argument must be a list whose car specifies the command name and whose cdr holds the command-line arguments.

If the command name contains a slash, it is taken as the pathname of the executable. Otherwise the named command is searched for in the directories listed in the PATH environment variable.

For convenience, each element of cmd/args is converted to a string by x->string.

For example, the following expression runs ls -al.

(run-process '(ls -al))

If you run the above expression in the REPL, you’re likely to see its return value before the output of ls. By default, run-process does not wait for the child process to finish; it returns immediately. If you need to synchronize, pass the wait keyword argument.

(run-process '(ls -al) :wait #t)

Alternatively, you can keep the returned <process> object and call process-wait on it to wait for its termination. See Process object, for the details of process-wait.

(let1 p (run-process '(ls -al))
  ... do some other work ...
  (process-wait p))

Note that -i is read as an imaginary number, so be careful when passing -i as a command-line argument; you should use a string, or write |-i| to make it a symbol.

(run-process '(ls "-i"))

Note: An alternative way to run an external process is sys-system, which takes a command line as a single string (see Process management). The string is passed to the shell to be interpreted, so you can include redirections or pipe several commands together. It is handy for quick throwaway scripts.

On the other hand, with sys-system, if you want to change command parameters at runtime, you need to worry about escaping them properly (gauche.process provides a procedure for this; see shell-escape-string below). You also need to be aware that /bin/sh, used by sys-system via the system(3) call, may differ among platforms, and be careful not to rely on features specific to certain systems. As a rule of thumb, keep sys-system for really simple tasks with a constant command line, and use run-process for everything else.

Note: Older versions of this procedure took arguments differently, like (run-process "ls" "-al" :wait #t), which was compatible with STk. This is still supported but deprecated.

A large number of keyword arguments can be passed to run-process to control execution of the child process. We describe them by category.

Synchronization

run-process argument: wait flag

If flag is true, run-process waits until the subprocess terminates. Otherwise the subprocess runs asynchronously and run-process returns immediately, which is the default behavior.

Note that if the subprocess is running asynchronously, it is the caller’s responsibility to call process-wait at some point to collect its exit status.

;; This returns after wget terminates.
(define p (run-process '(wget http://practical-scheme.net/) :wait #t))

;; Check the exit status
(let1 st (process-exit-status p)
  (cond [(sys-wait-exited? st)
         (print "wget exited with status " (sys-wait-exit-status st))]
        [(sys-wait-signaled? st)
         (print "wget interrupted by signal " (sys-wait-termsig st))]
        [else
         (print "wget terminated with unknown status " st)]))

run-process argument: fork flag

If flag is true, run-process forks to run the subprocess, which is the default behavior. If flag is false, run-process directly calls sys-exec, so it never returns.

I/O redirection

run-process argument: redirects (iospec …)

Specifies how to redirect the child process’s I/O. Each iospec can be one of the following, where fd, fd0, and fd1 are nonnegative integers referring to file descriptors of the child process.

(Note: If you just want to run a command and get its output as a string take a look at process-output->string (see Process ports). If you want to pipe multiple commands together, see Running multiple processes.)

(< fd source)

source can be a string, a symbol, a keyword :null, an integer, or an input port.

If it is a string, it names a file to be opened for reading, and the child process can read the content of the file from fd. An error is signaled if the file does not exist or cannot be opened for reading.

If it is a symbol, a unidirectional pipe is created, whose reader end is connected to the child’s fd, and whose writer end is available as an output port returned from (process-input process source).

If it is :null, the child’s fd is connected to the null device.

If it is an integer, it should specify a parent’s file descriptor opened for read. The child sees the duped file descriptor as fd.

If it is an input port, the underlying file descriptor is duped into child’s fd. It is an error to pass an input port without associated file descriptor (See port-file-number in Common port operations).

(<< fd value)
(<<< fd obj)

Feeds value or obj to the input file descriptor fd of the child process.

With <<, value must be either a string or a uniform vector (see Uniform vectors). It is sent to the child process as is. Using a uniform vector is good to pass binary content.

With <<<, obj can be any Scheme object, and the result of (write-to-string obj) is sent to the child process.
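As a minimal sketch of the << form described above, the following feeds a string to the child's standard input and captures its standard output through a named pipe. The helper name run-with-input and the pipe name out are illustrative choices, not part of gauche.process.

```scheme
(use gauche.process)

;; run-with-input is a hypothetical helper for illustration only.
(define (run-with-input cmd/args str)
  (let1 p (run-process cmd/args
                       :redirects `((<< 0 ,str)   ; str is fed to the child's fd 0
                                    (> 1 out)))   ; pipe named 'out' carries fd 1
    ;; Read everything the child writes, then collect its exit status.
    (begin0 (port->string (process-output p 'out))
            (process-wait p))))

(run-with-input '(tr "a-z" "A-Z") "hello\n")
```

The redirects list is built with quasiquote so that the runtime value of str can be spliced into the iospec.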

(<& fd0 fd1)

Makes child process’s file descriptor fd0 refer to the same input as its file descriptor fd1. Note the difference from <; (< 3 0) makes the parent’s stdin (file descriptor 0) be read by the child’s file descriptor 3, while (<& 3 0) makes the child’s file descriptor 3 refer to the same input as child’s stdin (which may be redirected to a file or something else by another iospec).

See the note below on the order of processing <&.

(> fd sink)
(>> fd sink)

sink must be either a string, a symbol, a keyword :null, an integer or a file output port.

If it is a string, it names a file. The output of the child to the file descriptor fd is written to the file. If the named file already exists, > first truncates its content, while >> appends to the existing content.

For other arguments, > and >> work the same.

If sink is a symbol, a unidirectional pipe is created whose writer end is connected to the child’s fd, and whose reader end is available as an input port returned by (process-output process sink).

If sink is :null, child’s fd is connected to the system’s null device.

If sink is an integer, it must specify a parent’s file descriptor opened for output. The child sees the duped file descriptor as fd.

If sink is an output port, the underlying file descriptor is duped into fd in the child process.
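The difference between > and >> with a file name can be sketched as follows. The helper write-then-append and its path argument are hypothetical; the example assumes an echo command is available on PATH.

```scheme
(use gauche.process)
(use file.util)   ; for file->string

;; write-then-append is a hypothetical helper for illustration only.
(define (write-then-append path)
  ;; > truncates (or creates) the file before the child writes to it.
  (run-process '(echo first)  :redirects `((> 1 ,path))  :wait #t)
  ;; >> keeps the existing content and appends to it.
  (run-process '(echo second) :redirects `((>> 1 ,path)) :wait #t)
  (file->string path))
```

Both calls pass :wait #t so that the second command starts only after the first has finished writing.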

(>& fd0 fd1)

Makes child process’s file descriptor fd0 refer to the same output as its file descriptor fd1. Note the difference from >; (> 2 1) makes the child’s stderr go to parent’s stdout, while (>& 2 1) makes the child’s stderr go to the same output as child’s stdout (which may be redirected by another iospec).

;; Read both child's stdout and stderr
(let1 p (run-process '(command arg)
                     :redirects '((>& 2 1) (> 1 out)))
  (begin0 (port->string (process-output p 'out))
          (process-wait p)))

Note: You can’t use the same name (symbol) more than once for the pipe of source or sink. For example, the following code signals an error:

(run-process '(command) :redirects '((> 1 out) (> 2 out))) ; error!

You can use >& to “merge” the output to one sink, or <& to “split” the input from one source, instead:

(run-process '(command) :redirects '((> 1 out) (>& 2 1)))

You may give the same file name more than once, just as in the Unix shell. However, the file is opened separately for each file descriptor, so simply writing to them may not produce the desired result (for regular files, one output will most likely overwrite the other).

Note: I/O redirections are processed all at once, unlike in the Unix shell. For example, both of the following expressions work the same way: they redirect both stdout and stderr to a file out.

(run-process '(command arg) :redirects '((>& 2 1) (> 1 "out")))
(run-process '(command arg) :redirects '((> 1 "out") (>& 2 1)))

Most Unix shells process redirections in order, so the following two command lines work differently. The first one redirects the child’s stderr to the current stdout, which is the same as the parent’s stdout, then redirects the child’s stdout to a file out. So the error messages appear in the parent’s stdout. The second one first redirects the child’s stdout to a file out, so at the time of processing 2>&1, the child’s stderr also goes to the file.

$ command arg 2>&1 1>out
$ command arg 1>out 2>&1

You can say run-process always works like the latter, regardless of the order in redirects argument.

If you want to redirect child’s stderr to parent’s stdout, you can use > like the following:

(run-process '(command arg) :redirects '((> 2 1) (> 1 "out")))

run-process argument: input source
run-process argument: output sink
run-process argument: error sink

Redirects child’s standard i/o. source and sink may be either a string, a keyword :null, a keyword :pipe, an integer file descriptor or a symbol.

These are really shorthand notations of the redirects argument:

:input x   ≡ :redirects '((< 0 x))
:output x  ≡ :redirects '((> 1 x))
:error x   ≡ :redirects '((> 2 x))

The keyword :pipe as source or sink is supported only for backward compatibility. It works as if the symbol stdin, stdout or stderr were given, respectively:

:input :pipe   ≡ :redirects '((< 0 stdin))
:output :pipe  ≡ :redirects '((> 1 stdout))
:error :pipe   ≡ :redirects '((> 2 stderr))

That is, a pipe is created, one end of which is connected to the child process’s stdio, and the other end is available by calling (process-input process), (process-output process) or (process-error process). (This is because process-input and process-output use stdin and stdout respectively when the name argument is omitted, and (process-error p) is equivalent to (process-output p 'stderr).)

See the description of redirects above for the meanings of the argument values.
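As a small illustration of these shorthand arguments, the sketch below runs a command with all three standard streams connected to the null device and reports only whether it exited with status zero. The name command-succeeds? is a hypothetical helper, not part of gauche.process.

```scheme
(use gauche.process)

;; command-succeeds? is a hypothetical helper for illustration only.
(define (command-succeeds? cmd/args)
  (let* ([p  (run-process cmd/args
                          :input :null    ; same as (< 0 :null)
                          :output :null   ; same as (> 1 :null)
                          :error :null    ; same as (> 2 :null)
                          :wait #t)]
         [st (process-exit-status p)])
    ;; True only for a voluntary exit with code 0.
    (and (sys-wait-exited? st)
         (zero? (sys-wait-exit-status st)))))
```

For example, (command-succeeds? '(true)) and (command-succeeds? '(false)) distinguish the two standard utilities by exit status alone.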

Execution environment

run-process argument: directory directory

If a string is given to directory, the process starts with directory as its working directory. If directory is #f, this argument is ignored. An error is signaled if directory is an object of any other type, or if it is a string that does not name an existing directory.

When host keyword argument is also given, this argument specifies the working directory of the remote process.

Note: run-process checks the validity of directory, but actual chdir(2) is done just before exec(2), and it is possible that chdir fails in spite of previous checks. At the moment when chdir fails, there’s no reliable way to raise an exception to the caller of run-process, so it writes out an error message to standard error port and exits. A robust program may take this case into account.
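A minimal sketch of the directory argument, assuming a pwd command is available: the child reports its own working directory, which is read back through a pipe. The helper name pwd-in is hypothetical.

```scheme
(use gauche.process)

;; pwd-in is a hypothetical helper for illustration only.
(define (pwd-in dir)
  (let1 p (run-process '(pwd) :directory dir :output :pipe)
    ;; The child's working directory is dir; the parent's is unchanged.
    (begin0 (read-line (process-output p))
            (process-wait p))))
```

The printed path may differ textually from dir when dir involves symbolic links, since pwd reports the directory as the child sees it.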

run-process argument: sigmask mask

Mask must be either an instance of <sys-sigset>, a list of integers, or #f. If an instance of <sys-sigset> is given, the signal mask of the executed process is set to it. A list of integers is treated as a list of signals to mask. It is important to set an appropriate mask if you call run-process from a multithreaded application. See the description of sys-exec (Process management) for the details.

If the host keyword argument is specified, this argument merely sets the signal mask of the local process (ssh).

run-process argument: detached flag

When a true value is given, the new process is detached from the parent’s process group and belongs to its own group. This is useful when you run a daemon process. See sys-fork-and-exec (Process management) for a detailed description of the detached argument.

run-process argument: host hostspec

This argument is used to execute a command on a remote host. The full syntax of hostspec is protocol:user@hostname:port, where the protocol:, user@, or :port part can be omitted.

The protocol part specifies the protocol to communicate with the remote host; currently only ssh is supported, and it is also the default when protocol is omitted. The user part specifies the login name of the remote host. The hostname specifies the remote host name, and the port part specifies the alternative port number which protocol connects to.

The command line arguments are interpreted on the remote host. On the other hand, the I/O redirection is done on the local end. For example, the following code reads the file /foo/bar on the remote machine and copies its content into the local file baz in the current working directory.

(run-process '(cat "bar")
             :host "remote-host.example.com"
             :directory "/foo"
             :output "baz")


9.24.2 Running multiple processes

Function: run-process-pipeline commands :key input output error wait directory sigmask

A convenience routine to run a pipeline of processes at once. Example:

(run-process-pipeline '((ls "src/")
                        (grep "\\.c$")
                        (wc -l)))

This is equivalent to shell command pipeline ls src/ | grep '\.c$' | wc -l, i.e. shows the number of C source files in the src subdirectory.

The commands argument is a list of lists. Each list must be a cmd/args argument that run-process can accept. At least one command must be specified.

The specified commands are run concurrently, with the stdout of the first command connected to the stdin of the second, the stdout of the second to the stdin of the third, and so on. The stdin of the first command is fed from the source specified by the input keyword argument, and the stdout of the last command is sent to the sink specified by the output keyword argument. The default values of these are the calling process’s stdin and stdout, respectively. See run-process for the possible values of these arguments.

The stderr of all the processes is sent to the sink specified by the error keyword argument, which defaults to the calling process’s stderr.

The wait keyword argument specifies whether run-process-pipeline waits for the completion of the last process. If a true value is given, run-process-pipeline won’t return until the last process exits. If it is #f, run-process-pipeline returns immediately after all the processes are spawned.

The directory and sigmask keyword arguments are applied to all the processes; see run-process for the description of these arguments.

The return value of run-process-pipeline is a list of <process> objects, in the same order as given in the commands argument.

Note that the exit status of processes won’t be automatically taken (except for the last one, when wait is true), so the caller must call process-wait on those process objects to clean up the processes.
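The cleanup obligation described above can be sketched as follows: spawn the pipeline, call process-wait on every returned process object, and return the collected statuses. The helper name run-pipeline-and-reap is hypothetical.

```scheme
(use gauche.process)

;; run-pipeline-and-reap is a hypothetical helper for illustration only.
(define (run-pipeline-and-reap commands)
  (let1 ps (run-process-pipeline commands)
    ;; Collect the exit status of every process in the pipeline,
    ;; not just the last one.
    (for-each process-wait ps)
    ;; Raw statuses; decode them with sys-wait-exited? and friends.
    (map process-exit-status ps)))

(run-pipeline-and-reap '((ls "src/") (grep "\\.c$") (wc -l)))
```

The statuses come back in the same order as the commands, so a failure can be attributed to a specific stage of the pipeline.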



9.24.3 Process object

Class: <process>

An object that keeps the status of a child process. You can create a process object with the run-process procedure (see Running subprocess). The process ports explained in Process ports also use process objects.

The <process> class keeps track of the child processes spawned by high-level APIs such as run-process or open-input-process. The exit status of such children must be collected by process-wait or process-wait-any calls, which also do some bookkeeping. Using the low-level process calls such as sys-wait or sys-waitpid directly will cause inconsistent state.

Class: <process-abnormal-exit>

A condition type mainly used by the process port utility procedures. Inherits <error>. This type of condition is thrown when the high-level process port utilities detect that the child process exited with a non-zero status code.

Instance Variable of <process-abnormal-exit>: process

A process object.

Note: In Unix terms, exiting a process by calling exit(2) or returning from main() is a normal exit, regardless of the exit status. Some commands, such as grep(1), use a non-zero exit status to report one of the normal results of execution. However, a large number of commands use a non-zero exit status to indicate that they couldn’t carry out the requested operation, so we treat such statuses as exceptional situations.

Function: process? obj

(is-a? obj <process>)

Method: process-pid (process <process>)

Returns the process ID of the subprocess process.

Method: process-command (process <process>)

Returns the command invoked in the subprocess process.

Method: process-input (process <process>) :optional name
Method: process-output (process <process>) :optional name

Retrieves one end of a pipe whose other end is connected to the process’s input or output, respectively. name is a symbol given to the redirects argument of run-process to identify the pipe. See the following example:

(let1 p (run-process '(command arg)
                     :redirects '((< 3 aux-in)
                                  (> 4 aux-out)))
  (let ([auxin  (process-input p 'aux-in)]
        [auxout (process-output p 'aux-out)])
    ;; feed something to the child's input
    (display 'something auxin)
    ;; read data from the child's output
    (read-line auxout)
    …
    )
  (process-wait p))

The symbols aux-in and aux-out are used to identify the pipes. Note that process-input returns an output port, and process-output returns an input port.

When name is omitted, stdin is used for process-input and stdout is used for process-output. These are the names used if child’s stdin and stdout are redirected by :input :pipe and :output :pipe arguments, respectively.

If there’s no pipe with the given name, #f is returned.

(let* ((process (run-process '("date") :output :pipe))
       (line (read-line (process-output process))))
  (process-wait process)
  line)
 ⇒ "Fri Jun 22 22:22:22 HST 2001"

Method: process-error (process <process>)

This is equivalent to (process-output process 'stderr).

Function: process-alive? process

Returns true if process is alive. Note that Gauche can’t know the subprocess’ status until it is explicitly checked by process-wait.

Function: process-list

Returns a list of active processes. The process remains active until its exit status is explicitly collected by process-wait. Once the process’s exit status is collected and its state changed to inactive, it is removed from the list process-list returns.

Function: process-wait process :optional nohang error-on-nonzero-status

Obtains the exit status of the subprocess process, and stores it to process’s status slot. The status can be obtained by process-exit-status.

By default, this suspends execution until process exits. However, if a true value is given to the optional argument nohang, it returns immediately if process hasn’t exited.

If a true value is given to the optional argument error-on-nonzero-status, and the obtained status code is not zero, this procedure raises a <process-abnormal-exit> error.

Returns #t if this call actually obtains the exit status, or #f otherwise.
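The nohang argument makes it possible to poll for termination while doing other work. The following is a minimal sketch; the helper name wait-polling and the one-second interval are arbitrary choices for illustration.

```scheme
(use gauche.process)

;; wait-polling is a hypothetical helper for illustration only.
;; Returns #t once this call has collected the exit status, or #f if
;; the status of p had already been collected.
(define (wait-polling p)
  (let loop ()
    (cond [(not (process-alive? p)) #f]   ; nothing left to collect
          [(process-wait p #t) #t]        ; nohang: #t only when status is obtained
          [else
           ;; ... other work could be done here ...
           (sys-sleep 1)
           (loop)])))
```

The process-alive? check keeps the loop from spinning forever on a process whose status was collected elsewhere.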

Function: process-wait-any :optional nohang

Obtains the exit status of any of the subprocesses created by run-process. Returns a process object whose exit status is collected.

If a true value is given to the optional argument nohang, this procedure returns immediately; the return value is #f if no child process has exited. If nohang is omitted or #f, this procedure waits until one of the children exits.

If there are no child processes, this procedure immediately returns #f.
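Because process-wait-any returns #f when there are no child processes left, it can drive a simple loop that collects every outstanding child. This is a sketch under the assumption that all children were started through the high-level API; reap-all is a hypothetical helper name.

```scheme
(use gauche.process)

;; reap-all is a hypothetical helper for illustration only.
;; Blocks until every child tracked by gauche.process has exited and
;; returns the number of processes whose status was collected.
(define (reap-all)
  (let loop ([n 0])
    (if (process-wait-any)   ; a <process> object, or #f when none remain
      (loop (+ n 1))
      n)))
```

After reap-all returns, the list returned by process-list is expected to be empty.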

Function: process-exit-status process

Returns the exit status of process retrieved by process-wait. If this is called before process-wait is called on process, the result is undefined.

The meaning of exit status depends on the platform. You need to use sys-wait-exited? or sys-wait-signaled? to see if it is terminated voluntarily or by a signal, and use sys-wait-exit-status or sys-wait-termsig to extract the exit code or the terminating signal (see Process management).

Function: process-send-signal process signal

Sends a signal signal to the subprocess process. signal must be an exact integer for signal number. See Signal, for predefined variables of signals.

Function: process-kill process
Function: process-stop process
Function: process-continue process

Sends SIGKILL, SIGSTOP and SIGCONT to process, respectively.
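The following sketch combines process-kill with the status predicates: it starts a command, kills it, and returns the number of the terminating signal. The helper name kill-and-report is hypothetical.

```scheme
(use gauche.process)

;; kill-and-report is a hypothetical helper for illustration only.
;; Returns the terminating signal number, or #f if the process was
;; not terminated by a signal.
(define (kill-and-report cmd/args)
  (let1 p (run-process cmd/args)
    (process-kill p)    ; sends SIGKILL
    (process-wait p)    ; collect the status so no zombie is left
    (let1 st (process-exit-status p)
      (and (sys-wait-signaled? st)
           (sys-wait-termsig st)))))

(kill-and-report '(sleep 60))
```

The result can be compared against the predefined signal variables, such as SIGKILL.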



9.24.4 Process ports

Function: open-input-process-port command :key input error encoding conversion-buffer-size

Runs command asynchronously in a subprocess. Returns two values, an input port which is connected to the stdout of the running subprocess, and a process object.

Command can be a string or a list.

If it is a string, it is passed to /bin/sh. You can use shell metacharacters in this form, such as environment variable interpolation, globbing, and redirections. If you create the command line by concatenating strings, it’s your responsibility to escape special characters that you don’t want the shell to interpret. The shell-escape-string function described below may help.

If command is a list, each element is converted to a string by x->string and then passed directly to sys-exec (the car of the list is used as both the command path and the first element of argv, i.e. argv[0]). Use this form if you want to prevent the shell from interfering; you don’t need to escape special characters.

The subprocess’s stdin is redirected from /dev/null, and its stderr shares the calling process’s stderr by default. You can change these by giving file pathnames to input and error keyword arguments, respectively.

You can also give the encoding keyword argument to specify character encoding of the process output. If it differs from the Gauche’s internal encoding format, open-input-process-port inserts a character encoding conversion port. If encoding is given, the conversion-buffer-size keyword argument can control the conversion buffer size. See Character code conversion, for the details of character encoding conversions.

(receive (port process) (open-input-process-port "ls -l Makefile")
  (begin0 (read-line port)
          (process-wait process)))
 ⇒ "-rw-r--r--   1 shiro    users        1013 Jun 22 21:09 Makefile"

(receive (port process) (open-input-process-port '(ls -l "Makefile"))
  (begin0 (read-line port)
          (process-wait process)))
 ⇒ "-rw-r--r--   1 shiro    users        1013 Jun 22 21:09 Makefile"

(open-input-process-port "command 2>&1")
 ⇒ ;the port reads both stdout and stderr

(open-input-process-port "command 2>&1 1>/dev/null")
 ⇒ ;the port reads stderr

The exit status of subprocess is not automatically collected. It is the caller’s responsibility to issue process-wait, or the subprocess remains in a zombie state. If it bothers you, you can use one of the following functions.

Function: call-with-input-process command proc :key input error encoding conversion-buffer-size on-abnormal-exit

Runs command in a subprocess and pipes its stdout to an input port, then calls proc with the port as an argument. When proc returns, it collects the subprocess’s exit status, then returns the result proc returned. The cleanup is done even if proc raises an error.

The keyword argument on-abnormal-exit specifies what happens when the child process exits with non-zero status code. It can be either :error (default), :ignore, or a procedure that takes one argument. If it is :error, a <process-abnormal-exit> error condition is thrown by non-zero exit status; the process slot of the condition holds the process object. If it is :ignore, nothing is done for non-zero exit status. If it is a procedure, it is called with a process object; when the procedure returns, call-with-input-process returns normally.
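The two most common on-abnormal-exit policies can be sketched like this. The helper names first-line-or-false and first-line-lenient are hypothetical.

```scheme
(use gauche.process)

;; Hypothetical helper: relies on the default :error policy and turns
;; a <process-abnormal-exit> condition into #f.
(define (first-line-or-false command)
  (guard (e [(is-a? e <process-abnormal-exit>) #f])
    (call-with-input-process command read-line)))

;; Hypothetical helper: ignores the exit status and returns whatever
;; read-line produced, which is an EOF object if there was no output.
(define (first-line-lenient command)
  (call-with-input-process command read-line
                           :on-abnormal-exit :ignore))
```

The strict variant suits commands whose non-zero status means failure; the lenient one suits commands such as grep(1), where a non-zero status can be a normal result.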

The semantics of command and other keyword arguments are the same as open-input-process-port above.

(call-with-input-process "ls -l *"
  (lambda (p) (read-line p)))

Function: with-input-from-process command thunk :key input error encoding conversion-buffer-size on-abnormal-exit

Runs command in a subprocess, and calls thunk with its current input port connected to the command’s stdout. The command is terminated and its exit status is collected, after thunk returns or raises an error.

The semantics of command and keyword arguments are the same as call-with-input-process above.

(with-input-from-process "ls -l *" read-line)

Function: open-output-process-port command :key output error encoding conversion-buffer-size

Runs command in a subprocess asynchronously. Returns two values: an output port which is connected to the stdin of the subprocess, and the process object.

The semantics of command are the same as in open-input-process-port. The semantics of encoding and conversion-buffer-size are also the same.

The subprocess’s stdout is redirected to /dev/null by default, and its stderr shares the calling process’s stderr. You can change these by giving file pathnames to output and error keyword arguments, respectively.

The exit status of the subprocess is not automatically collected. The caller should call process-wait on the subprocess at an appropriate time.

Function: call-with-output-process command proc :key output error encoding conversion-buffer-size on-abnormal-exit

Runs command in a subprocess, and calls proc with an output port which is connected to the stdin of the command. The exit status of the command is collected after either proc returns or raises an error.

The semantics of keyword arguments are the same as open-output-process-port, except on-abnormal-exit, which is the same as described in call-with-input-process.

(call-with-output-process "/usr/sbin/sendmail"
  (lambda (out) (display mail-body out)))

Function: with-output-to-process command thunk :key output error encoding conversion-buffer-size on-abnormal-exit

Same as call-with-output-process, except that the output port which is connected to the stdin of the command is set to the current output port while executing thunk.

Function: call-with-process-io command proc :key error encoding conversion-buffer-size on-abnormal-exit

Runs command in a subprocess, and calls proc with two arguments: the first is an input port which is connected to the command’s stdout, and the second is an output port connected to the command’s stdin. The error output from the command shares the calling process’s stderr, unless an alternative pathname is given to the error keyword argument.

The exit status of the command is collected when proc returns or raises an error.
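A minimal sketch of call-with-process-io used as a filter: write the whole input, close the output port so the command sees end-of-file, then read the result. The helper name filter-through is hypothetical, and the approach assumes the text is small enough that the command does not block on a full pipe before its input is closed.

```scheme
(use gauche.process)

;; filter-through is a hypothetical helper for illustration only.
(define (filter-through cmd/args text)
  (call-with-process-io cmd/args
    (lambda (in out)
      (display text out)
      (close-output-port out)   ; signal EOF to the command's stdin
      (port->string in))))      ; read everything from its stdout

(filter-through '(tr "a-z" "A-Z") "hello\n")
```

Closing the output port before reading matters for commands that produce output only after they have consumed all of their input.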

Function: process-output->string command :key error encoding conversion-buffer-size on-abnormal-exit
Function: process-output->string-list command :key error encoding conversion-buffer-size on-abnormal-exit

Runs command, collects its output (to stdout) and returns it. process-output->string concatenates all the output from command into one string, replacing any sequence of whitespace characters with a single space. The action is similar to “command substitution” in shell scripts. process-output->string-list collects the output from command line by line and returns a list of the lines. Newline characters are stripped.

Internally, command is run by call-with-input-process, to which keyword arguments are passed.

(process-output->string '(uname -smp))
  ⇒ "Linux i686 unknown"

(process-output->string '(ls))
  ⇒ "a.out foo.c foo.c~ foo.o"

(process-output->string-list '(ls))
  ⇒ ("a.out" "foo.c" "foo.c~" "foo.o")

Function: shell-escape-string str :optional flavor

If str contains characters that affect the shell’s command-line argument parsing, escapes str to avoid the shell’s interpretation. Otherwise, returns str itself.

The optional flavor argument takes a symbol to specify the platform; currently windows and posix can be specified. Shells on these platforms handle escaping and quotation very differently; the windows flavor follows the MSVC runtime argument parsing behavior, while the posix flavor assumes IEEE Std 1003.1. When omitted, the default value is chosen according to the running platform. (Note: Cygwin is regarded as posix.)

Use this procedure when you need to build a command-line string by yourself. (If you pass a command-line argument list, instead of a single command-line string, you don’t need to escape them since we bypass the shell.)
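As a sketch of building a command-line string safely, the following passes an arbitrary string through the shell and reads it back unchanged. It assumes a POSIX shell with a printf utility; the helper name echo-through-shell is hypothetical.

```scheme
(use gauche.process)

;; echo-through-shell is a hypothetical helper for illustration only.
(define (echo-through-shell str)
  (call-with-input-process
      ;; The command is a single string, so it is interpreted by the
      ;; shell; escaping keeps str from being split or expanded.
      (string-append "printf %s " (shell-escape-string str))
    port->string))

(echo-through-shell "a b; echo $HOME")
```

Without shell-escape-string, the same input would be split at the space, and the part after the semicolon would run as a separate command.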

Function: shell-tokenize-string str :optional flavor

Splits a string str into arguments as the shell does. The optional flavor argument can be a symbol, either windows or posix, to specify the syntax. If it’s windows, we follow the MSVC runtime command-line argument parser behavior. If it’s posix, we follow the IEEE Std 1003.1 Shell Command Language. When omitted, the default value is chosen according to the running platform. (Note: Cygwin is regarded as posix.)

This procedure does not handle fancier shell features such as variable substitution. If it encounters a metacharacter that requires such interpretation, an error is signaled. In other words, metacharacters must be properly quoted in str.

(shell-tokenize-string "echo $foo" 'posix)
  ⇒ signals error

(shell-tokenize-string "echo \"$foo\"" 'posix)
  ⇒ still signals error

(shell-tokenize-string "echo '$foo'" 'posix)
  ⇒ ("echo" "$foo")

(shell-tokenize-string "echo \\$foo" 'posix)
  ⇒ ("echo" "$foo")
