For Development HEAD DRAFT

9.25 gauche.parseopt - Parsing command-line options

Module: gauche.parseopt

This module defines a convenient way to parse command-line options. The interface is inspired by Perl, and it conveniently handles long-format options with multiple option arguments.

You have a few choices for parsing command-line options in Gauche. SRFI-37 (see srfi.37 - args-fold: a program argument processor) provides a functional interface to parse POSIX/GNU compatible argument syntax. SLIB has a getopt-compatible utility. Required features may differ from application to application, so choose whichever fits your requirements.

High-level API

Macro: let-args args (bind-spec … [. rest]) body …

{gauche.parseopt} This macro captures the most common pattern of argument processing. It takes a list of arguments, args, and scans it to find Unix-style command-line options and binds their values to local variables according to bind-spec, then executes body ….

Let’s look at a simple example first, which gives you a good idea of what this form does. (See the “Examples” section below for more examples.)

(define (main args)
  (let-args (cdr args)
      ((verbose     "v|verbose" ? "Run verbosely.")
       (outfile     "o|outfile=s{FILE}" ? "Write output to {FILE}.")
       (debug-level "d|debug-level=i{LEVEL}" 0 ? "Set debug level.")
       (help        "h|help" => (cut show-help (car args))
                    ? "Show help message and exit.")
       . restargs
      )
    ....))

(define (show-help progname)
  (print "Usage: ....")
  (print "Options:")
  (print (option-parser-help-string))
  (exit 0))

The local variable verbose will be bound to #t if a command-line argument -v or --verbose is given, and to #f otherwise. The variable outfile is specified to take one option argument; if the command-line arguments are given like -o out.txt, outfile receives "out.txt". The debug-level one is similar, but the option argument is coerced to an integer, and it has the default value 0 when the option isn’t given. The help clause invokes an action rather than merely binding the value.

(Note: Currently let-args does not distinguish so-called short and long options, e.g. -v and --v have the same effect, as do -verbose and --verbose. In future we may add an option to make it compatible with getopt_long(3).)

The string after ? is used as a help string for the option. You can get a formatted help string with option-parser-help-string.

The final restargs variable after the dot receives a list of non-option command-line arguments.

Let’s look at bind-spec in detail. It must be one of the following forms.

1. (var option-spec [default] [? helpstr])
2. (var option-spec [default] => callback [? helpstr])

3. (else => fallback)
4. (else formals body ...)

The list of command-line arguments passed as args is parsed according to the option-specs. If the corresponding option is given, the option’s “value” is determined as follows:

(a) If the bind-spec is 1., then
  (a1) If option-spec doesn't require an argument, then #t;
  (a2) If option-spec requires one argument, then the value of
       the argument;
  (a3) If option-spec requires more than one argument,
       the list of the values of the arguments.
(b) If the bind-spec is 2., then callback is called with
  the value(s) of arguments, and its return value becomes
  the option's value.

An option can be singular, meaning at most one appearance is effective, or it can be plural, meaning multiple appearances are considered. If an option is singular, var is bound to the value determined as above. If an option is plural, var is bound to the list of above values from all occurrences.

We’ll explain the details of option-spec later.

As a special case, var can be #f, in which case the value is ignored. It is only useful for side effects in callback.

If the corresponding option is not given in args, var is bound to default if it is given, #f otherwise.
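
The interaction of default, callback, and a #f variable can be sketched as follows. This is a minimal illustration, not taken from the Gauche sources; the procedure name parse-mode-demo and the option names are assumptions made for the example.

```scheme
(use gauche.parseopt)

;; parse-mode-demo is a hypothetical helper used only for illustration.
;; It parses ARGS and returns the bound values as a list.
(define (parse-mode-demo args)
  (let-args args
      (;; form 2: the callback converts the octal text to a number;
       ;; the default #o755 is used as is when -m is absent.
       (mode  "m|mode=s" #o755 => (cut string->number <> 8))
       ;; form 1 with a default value
       (level "d|debug-level=i" 0)
       ;; var is #f: no binding is made, only the side effect matters
       (#f    "q|quiet" => (lambda () (print "quiet requested"))))
    (list mode level)))

(parse-mode-demo '("-m" "644"))   ; ⇒ (420 0)
(parse-mode-demo '("-d" "3"))     ; ⇒ (493 3)
```

In the first call the callback result (octal 644, i.e. 420) becomes the value of mode; in the second call mode falls back to its default without going through the callback.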

The last bind-spec may be of the form 3 or 4, in which case the clause is selected when no other option-spec matches a given command-line option. In the form 3, fallback will be called with three arguments: the given option, a list of remaining command-line arguments, and a continuation procedure. The fallback is supposed to handle the given option, and it may call the continuation procedure with the remaining arguments to continue processing, or it may return a list of arguments which will be treated as non-option command-line arguments. The form 4 is a shorthand notation of (else => (lambda formals body ...)).

The bind-spec list can be an improper list whose last cdr is a symbol, in which case a list of the rest of the command-line arguments is bound to the variable named by the symbol.

Note that the default, callback, and forms in the else clause are evaluated outside of the scope of the bindings of vars (as the name let-args implies).

Unlike typical getopt or getopt_long implementations in C, let-args does not permute the given command-line arguments. It stops parsing when it encounters a non-option argument (an argument that does not start with a minus sign).

If the parser encounters an argument with only two minus signs ‘--’, it stops argument parsing and treats the rest of the arguments as non-option command-line arguments.

After all the bindings are done, body … are evaluated. Body may begin with internal define forms.
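
As a sketch of the form 3 clause, the following passes an explicit fallback procedure that reports an unknown option and then resumes parsing. The names parse-lenient, opt, remaining and continue are hypothetical, chosen for this example only.

```scheme
(use gauche.parseopt)

;; parse-lenient is a hypothetical helper used only for illustration.
(define (parse-lenient args)
  (let-args args
      ((verbose "v|verbose")
       ;; form 3: the fallback receives the unmatched option, the
       ;; remaining arguments, and a continuation procedure.
       (else => (lambda (opt remaining continue)
                  (format #t "ignoring unknown option: ~a\n" opt)
                  ;; resume option parsing with the remaining arguments
                  (continue remaining)))
       . rest)
    (list verbose rest)))

(parse-lenient '("-x" "-v" "file.txt"))   ; ⇒ (#t ("file.txt"))
```

Returning remaining instead of calling continue would stop option processing at that point, so -v would end up in rest.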
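
The two stopping rules above can be observed with literal argument lists. This is a small sketch; parse-stop-demo is a hypothetical name.

```scheme
(use gauche.parseopt)

;; parse-stop-demo is a hypothetical helper used only for illustration.
(define (parse-stop-demo args)
  (let-args args
      ((verbose "v|verbose")
       (outfile "o|outfile=s")
       . rest)
    (list verbose outfile rest)))

;; Parsing stops at the first non-option argument, so the later -o
;; is left untouched in the rest list.
(parse-stop-demo '("-v" "input.txt" "-o" "out.txt"))
;; ⇒ (#t #f ("input.txt" "-o" "out.txt"))

;; ‘--’ ends option parsing and is itself consumed.
(parse-stop-demo '("-v" "--" "-o" "out.txt"))
;; ⇒ (#t #f ("-o" "out.txt"))
```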

Internally, let-args creates an “option-parser” object, which is available within the dynamic extent of let-args from the parameter current-option-parser. See below.

Parameter: current-option-parser

{gauche.parseopt} In the dynamic extent of let-args, this parameter holds an option parser object that parses the command line and stores the options’ appearance and argument values. It should be treated as an opaque object. Its only public use is to be passed to option-parser-help-string to get a formatted option help string, which that procedure does by default.

Function: option-parser-help-string :key option-parser omit-options-without-help option-indent description-indent width

{gauche.parseopt} Returns a formatted help string for the command-line options of an option parser object option-parser. Its default value is the value of the parameter current-option-parser, so if you invoke this procedure in the dynamic extent of let-args to get its help string, you don’t need to pass the option-parser argument explicitly.

With the example shown in the let-args above, calling this procedure produces the following string:

  -v, --verbose
               Run verbosely.
  -o, --outfile FILE
               Write output to FILE.
  -d, --debug-level LEVEL
               Set debug level.
  -h, --help   Show help message and exit.

If the name of an option argument is given as surrounded by curly braces in the option spec, e.g. "o|outfile=s{FILE}", the name is used in the help message (see “Option spec” below). Otherwise, the argument type character is used. In the help message, it is recommended to enclose the reference to the option argument with curly braces. Those braces are simply removed in the help string. However, in future, it may be rendered differently if the output device has richer presentation.

If the omit-options-without-help keyword argument is true, options without a help string aren’t included in the help string. It is useful if you have experimental options you don’t want to advertise to users. The default is #f, which shows (No help available) for such options.

The arguments option-indent, description-indent, and width control the formatting. Option names are indented with option-indent characters, and their descriptions are indented with description-indent. The overall width of text is governed by width. The help string is formatted with text.fill (see text.fill - Text filling).
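
A minimal sketch of passing these keyword arguments follows. The option names and the chosen width are assumptions for illustration, and the exact layout of the resulting string is not shown here because it depends on the formatter.

```scheme
(use gauche.parseopt)

;; help-demo is a hypothetical helper used only for illustration.
;; It returns the help string instead of printing it.
(define (help-demo)
  (let-args '()
      ((verbose "v|verbose" ? "Run verbosely.")
       ;; no help string: omitted from the output below
       (secret  "x|experimental"))
    ;; current-option-parser is used implicitly inside let-args
    (option-parser-help-string :omit-options-without-help #t
                               :width 60)))

(display (help-demo))
```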

Option spec

option-spec is a string that specifies the name of the option and how the option takes its arguments. Alphanumeric characters, underscores, plus signs, and minus signs are allowed in option names, except that a minus sign can’t be the first character; i.e. a valid option name matches the regexp #/[\w+][-\w+]*/.

If the option takes argument(s), specify them by appending an equal sign and one or more characters that represent the type of the argument(s) after the name. The option can take more than one argument. The following characters are recognized as type specifiers for the option’s arguments.

s

String.

n

Number.

f

Real number (coerced to flonum).

i

Exact integer.

e

S-expression.

y

Symbol (argument is converted by string->symbol).

Let’s see some examples of option-spec:

"name"

Specifies option name, that doesn’t take any argument.

"name=s"

Option name takes one argument, and it is passed as a string.

"name=i"

Option name takes one argument, and it is passed as an exact integer.

"name=ss"

Option name takes two arguments, both string.

"name=iii"

Option name takes three integer arguments.

"name=sf"

Option name takes two arguments, the first is a string and the second is a real number.

Each option argument letter can be followed by an option argument name, enclosed with curly braces. Those names don’t affect command-line argument parsing, but are used to generate the help string (see option-parser-help-string above).

--file=s{FILENAME}
--origin=f{XCOORD}f{YCOORD}f{ZCOORD}
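
As a brief sketch of a multi-argument option with named arguments, the following binds two integer arguments as a list. The option name size and the argument names WIDTH and HEIGHT are hypothetical.

```scheme
(use gauche.parseopt)

;; parse-size is a hypothetical helper used only for illustration.
(define (parse-size args)
  (let-args args
      ;; two integer arguments; the names in braces only affect help output
      ((size "size=i{WIDTH}i{HEIGHT}"))
    size))

(parse-size '("--size" "640" "480"))   ; ⇒ (640 480)
(parse-size '())                       ; ⇒ #f
```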

If the option has alternative names, they can be concatenated by "|". For example, an option spec "h|help" will match both "h" and "help".

If an asterisk * follows the name(s) of the option, the option is plural, that is, it can be specified multiple times and you can get the values as a list. For example, I|incdir*=s means the -I and -incdir options can appear more than once in the command line, and the final value would be a list of the string arguments given to all of the occurrences.

In the command line, the option may appear with preceding single or double minus signs. The option’s argument may be combined with the option itself with an equal sign. For example, all the following command line arguments match an option spec "prefix=s".
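
The plural behavior can be sketched like this; parse-incdirs and the sample paths are assumptions for illustration.

```scheme
(use gauche.parseopt)

;; parse-incdirs is a hypothetical helper used only for illustration.
(define (parse-incdirs args)
  (let-args args
      ((incdirs "I|incdir*=s")   ; plural: values from all occurrences
       (verbose "v|verbose")     ; singular
       . rest)
    (list incdirs verbose rest)))

(parse-incdirs '("-I" "/usr/include" "-v" "--incdir" "/opt/include" "main.c"))
;; ⇒ (("/usr/include" "/opt/include") #t ("main.c"))
```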

-prefix /home/shiro
-prefix=/home/shiro
--prefix /home/shiro
--prefix=/home/shiro
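
To check the four spellings side by side, a small sketch can run the same option spec over each argument list. The helper name parse-prefix is hypothetical.

```scheme
(use gauche.parseopt)

;; parse-prefix is a hypothetical helper used only for illustration.
(define (parse-prefix args)
  (let-args args ((prefix "prefix=s"))
    prefix))

;; Each argument list is one of the spellings shown above.
(map parse-prefix
     '(("-prefix" "/home/shiro")
       ("-prefix=/home/shiro")
       ("--prefix" "/home/shiro")
       ("--prefix=/home/shiro")))
;; ⇒ ("/home/shiro" "/home/shiro" "/home/shiro" "/home/shiro")
```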

If the option consists of a single letter and takes arguments, the first argument can follow immediately after the option letter without a whitespace. If you have an option spec "I=s", all the following command line arguments are recognized:

-I/foo
-I /foo
-I=/foo
--I/foo
--I /foo
--I=/foo

If there’s an ambiguity between a long option and a single-letter option plus an argument, the long option takes precedence. If you have option specs "long" and "l=s", the command-line argument -long is recognized as the former, not as the -l option with the argument ong.
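
The precedence rule can be sketched with the two option specs mentioned above. The helper name parse-long-demo and the variable names are assumptions for illustration.

```scheme
(use gauche.parseopt)

;; parse-long-demo is a hypothetical helper used only for illustration.
(define (parse-long-demo args)
  (let-args args
      ((long-flag "long")
       (l-arg     "l=s"))
    (list long-flag l-arg)))

;; -long matches the long option, not -l with the argument "ong".
(parse-long-demo '("-long"))   ; ⇒ (#t #f)

;; -lfoo has no matching long option, so it is -l with the argument "foo".
(parse-long-demo '("-lfoo"))   ; ⇒ (#f "foo")
```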

Error handling

Condition Type: <parseopt-error>

{gauche.parseopt} When let-args encounters an argument that cannot be processed as specified by option specs, an error of condition type <parseopt-error> is raised. The cases include when a mandatory option argument is missing, or when an option argument has a wrong type.

(let-args '("-a" "foo") ((a "a=i")) ; option a requires integer
  (list a))
 ⇒ parseopt-error

Note that this condition is about parsing the given args. If an invalid option-spec is given, an ordinary error is thrown.
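
A program will usually want to catch this condition and report it instead of showing a stack trace. The following is a minimal sketch using guard; the helper name parse-safely and the returned symbol are hypothetical.

```scheme
(use gauche.parseopt)

;; parse-safely is a hypothetical helper used only for illustration.
;; It returns the parsed value, or the symbol failed on a parse error.
(define (parse-safely args)
  (guard (e [(condition-has-type? e <parseopt-error>)
             ;; report the problem on the error port
             (format (current-error-port) "option error: ~a\n" (~ e 'message))
             'failed])
    (let-args args ((a "a=i"))   ; option a requires an integer
      (list a))))

(parse-safely '("-a" "42"))    ; ⇒ (42)
(parse-safely '("-a" "foo"))   ; ⇒ failed
```

In a real main procedure the handler would typically print a usage message and exit with a non-zero status.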

Examples

This example is taken from the gauche-install script. The mode option takes numbers in octal, so it uses the callback procedure to convert it. See also the else clause for how to handle an unrecognized option.

(define (main args)
  (let-args (cdr args)
      ([mkdir   "d|directory"
                ? "Creates directories listed in the arguments. \
                   (3rd format only)."]
       [mode    "m|mode=s{MODE}" #o755 => (cut string->number <> 8)
                ? "Change mode of the installed file."]
       [owner   "o|owner=s{OWNER}"
                ? "Change owner of the installed file(s)."]
       [group   "g|group=s{GROUP}"
                ? "Change group of the installed file(s)."]
       [csfx    "C|canonical-suffix"
                ? "If installed file has a suffix *.sci, replace it for \
                   *.scm.   This is Gauche specific convention."]
       [srcdir  "S|srcdir=s{DIR}"
                ? "Look for files within {DIR}; useful if VPATH is used."]
       [target  "T|target=s{DIR}"
                ? "Installs files to the {DIR}, creating paths if needed. \
                   Partial path of files are preserved. (4th format only)."]
       [utarget "U|uninstall=s{DIR}"
                ? "Reverse of -T, e.g. removes files from its destination."]
       [shebang "shebang=s{PATH}"
                ? "Adds #!{PATH} before the file contents. \
                   Useful to install scripts."]
       [verb    "v|verbose" ? "Work verbosely"]
       [dry     "n|dry-run" ? "Just prints what actions to be done."]
       [sprefix "p|strip-prefix=s{PREFIX}"
                ? "Strip prefix dirs from FILEs before \
                  installation. (4th/5th format only)."]
       [#f      "h|help" => usage ? "Show this help message."]
       [#f      "c" ? "This option is ignored.  Recognized for the \
                       compatibility."]
       [else (opt . _) (print "Unknown option : " opt) (usage)]
       . args)
    ...)
  )

The next example is a small test program to show the usage of else clause. It gathers all options into the variable r, except that when it sees -c it stops argument processing and binds the rest of the arguments to restargs.

(use gauche.parseopt)

(define (main args)
  (let1 r '()
    (let-args (cdr args)
      ((else (opt rest cont)
         (cond [(equal? opt "c") rest]
               [else (push! r opt) (cont rest)]))
       . restargs)
     (print "options: " r)
     (print "restargs: " restargs)
     0)))

Sample session of the above script (suppose it is saved as example).

$ ./example -a -b -c -d -e foo
options: (a b)
restargs: (-d -e foo)
$ ./example -a -b -d -e foo
options: (a b d e)
restargs: (foo)

Low-level API

Internally, let-args creates an option parser object and option spec objects which do the actual parsing work. Those classes are internal, but a few procedures are exported so that you can build your own parser.

Function: make-option-spec option-spec :key default callback help-string

{gauche.parseopt} Parses a string option-spec (see “Option spec” above) and returns an option spec object, which should be treated as opaque.

The keyword arguments correspond to the values in the bind-spec of the let-args form.

Function: build-option-parser specs :key fallback

{gauche.parseopt} Given a list of option spec objects, creates and returns an option parser object. Each option spec object can be created with make-option-spec.

The returned option parser object can be passed to run-option-parser to handle the actual command-line argument list.

The optional fallback argument must be a procedure if provided. See run-option-parser for how and when it is called.

Function: run-option-parser option-parser args :optional fallback

{gauche.parseopt} Handles the command-line argument list args, which must be a list of strings, with an option parser option-parser.

It handles the command-line options in args if any, sets up each option’s value in the option spec, and then returns the rest of the argument list.

It looks at the elements in args from left to right. If it encounters a string beginning with -, it treats the string as a command-line option and searches the option specs in option-parser. Note that a string -- marks the end of the option list; it is consumed, and the remaining arguments are immediately returned.

If there’s a matching option spec, the option itself and the option’s argument(s) (if the option takes any) are taken out from args. Then, the option’s value is determined as follows.

  • If the option spec has a callback, the callback is called with the option’s argument(s), and its return value becomes the option’s value.
  • Otherwise, if the option does not take an argument, #t.
  • Otherwise, if the option takes a single argument, the given argument itself.
  • Otherwise, the list of arguments.

Furthermore, if the option has the plural flag, the final value of the option is a list of the values from all of its occurrences. Otherwise, each occurrence overwrites the previous value, so the value from the last occurrence is kept.

If the given option-like string does not match any of the option specs, a fallback procedure is called. A fallback procedure can be given by the fallback optional argument, or by the fallback argument given to build-option-parser. The fallback procedure is called with three arguments: the unrecognized option string (without preceding hyphen(s)), the argument list following it, and a closure that takes an argument list to continue the parsing.

The fallback procedure may process the unknown option as it pleases. Whatever the fallback procedure returns becomes the return value of run-option-parser. If the fallback procedure wants to continue processing the rest of the argument list, it should tail-call the third argument.

If no fallback argument is given to either run-option-parser or build-option-parser, the default fallback procedure is called, which raises an “unrecognized option” <parseopt-error>.

Function: get-option-spec option-parser option-name

{gauche.parseopt} Returns an option spec object that has the option name option-name from option-parser. After run-option-parser, you need to get an option spec to retrieve the option’s value.

If no option spec matches option-name, #f is returned.

Function: option-spec-appeared? optspec

{gauche.parseopt} Returns true if an option spec optspec has appeared in the command-line argument list. This must be called after run-option-parser.

Function: option-spec-value optspec

{gauche.parseopt} Returns the option’s value of an option spec optspec. This must be called after run-option-parser.

The option’s value is determined as follows.

  • If the option didn’t appear in the command-line argument list, the “default” value given to the make-option-spec.
  • If the option did appear in the command-line argument list, the value of each occurrence is determined as follows:
    • If the option doesn’t take an argument, #t.
    • If the option takes one argument, the given argument.
    • If the option takes multiple arguments, the list of the given arguments.

    Then, if the option is plural, the values from all occurrences are gathered into a list in the order of appearance; otherwise, the value from the last occurrence is kept.
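
Putting the low-level procedures together, a hand-built parser might look like the following sketch. It assumes that get-option-spec accepts the option name as a string; the variable names demo-parser and demo-rest and the sample options are hypothetical.

```scheme
(use gauche.parseopt)

;; Build a parser from three option specs (names chosen for illustration).
(define demo-parser
  (build-option-parser
   (list (make-option-spec "v|verbose" :help-string "Run verbosely.")
         (make-option-spec "o|outfile=s{FILE}" :default "a.out")
         (make-option-spec "I|incdir*=s"))))

;; Run it; the return value is the list of remaining arguments.
(define demo-rest
  (run-option-parser demo-parser '("-v" "-I" "a" "-I" "b" "x.c")))

;; Retrieve values through the option spec objects.
;; Assumption: option names are looked up as strings.
(define (demo-value name)
  (option-spec-value (get-option-spec demo-parser name)))

(print "verbose: " (demo-value "verbose"))
(print "outfile: " (demo-value "outfile"))
(print "incdir:  " (demo-value "incdir"))
(print "rest:    " demo-rest)
```

Under the rules described above, verbose should be #t, outfile should fall back to its default because the option did not appear, incdir should collect both occurrences in order, and the remaining argument list should contain only x.c.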

Deprecated API

The following macros are superseded by let-args. New code should not use them.

Macro: parse-options args (option-clause …)

{gauche.parseopt} Deprecated. args is an expression that evaluates to a list of command-line arguments. This macro scans the command-line options (arguments that begin with ‘-’) and processes them as specified in option-clauses, then returns the remaining arguments.

Each option-clause consists of a pair of an option-spec and its action.

If a given command-line option matches one of the option-specs, then the associated action is evaluated. An action can be one of the following forms.

bind-spec body …

bind-spec is a proper or dotted list of variables like lambda-list. The option’s arguments are bound to bind-spec, then body … is evaluated.

=> proc

If a command-line option matches option-spec, calls a procedure proc with a list of the option’s arguments.

If the symbol else is at the position of option-spec, the clause is selected when no other option clause matches a given command-line option. Three “arguments” are associated with the clause: the unmatched option, the rest of the arguments, and a procedure that represents the option parser.

Macro: make-option-parser (option-clause …)

{gauche.parseopt} Deprecated. This is a lower-level interface. option-clauses are the same as parse-options. This macro returns a procedure that can be used later to parse the command line options.

The returned procedure takes one required argument and one optional argument. The required argument is a list of strings, for the given command-line arguments. The optional argument may be a procedure that takes more than three arguments, and if given, the procedure is used as if it were the body of the else option clause.


