Unit testing (Gauche Users’ Reference)

9.34 `gauche.test` - Unit Testing

Module: gauche.test

Defines a set of functions to write test scripts. A test script will look like this:

(use gauche.test)
(test-start "my feature")
(load "my-feature")  ; load your program
(import my-feature)  ; if your program defines a module.

(test-module 'my-feature) ; tests consistency in your module.

(test-section "feature group 1")
(test "feature 1-1" EXPECT (lambda () TEST-BODY))
(test "feature 1-2" EXPECT (lambda () TEST-BODY))
 …

(test-section "feature group 2")
(define test-data ...)
(test "feature 2-1" EXPECT (lambda () TEST-BODY))
(test "feature 2-2" (test-error) (lambda () TEST-THAT-SIGNALS-ERROR))
 …

(test-end :exit-on-failure #t)

With this convention, you can run tests either interactively or in batch. To run a test interactively, just load the file; it reports the result of each test, as well as a summary of failed tests at the end. To run a test in batch, it is convenient to redirect stdout to a file. If stdout is redirected to something other than a tty, all the verbose logs go there, and only a small number of messages go to stderr.

It is recommended to always have a "check" target in the Makefile of your module/program, so that users of your program can run the tests easily. The rule may look like this:

check :
        gosh my-feature-test.scm > test.log

For portable programs, there are a couple of srfis that provide testing frameworks (see srfi.78 - Lightweight testing, and see srfi.64 - A Scheme API for test suites). In Gauche, when you use these srfis together with gauche.test, the srfi’s tests work as a part of Gauche’s tests.

9.34.1 Structuring a test file

Function: test-start module-name

{gauche.test} Initializes internal state and prints a log header. This should be called before any tests. Module-name is used only for logging purposes.

If a test is already running, this procedure just emits a log message and updates the internal test nesting level. This allows a test script to include another test script that has its own test-start/test-end pair.

With this feature, you can split a lengthy test script into several sub test scripts, each of which is a stand-alone test script, and include them into the main test script. By doing so, you can run a single sub test script while developing, and run the main test script for full tests.
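The nesting behavior can be sketched in a single file as follows. In practice the inner test-start/test-end pair would live in a separate sub test script loaded from the main one; the suite names and the file name mentioned in the comment are made up for illustration.

```scheme
(use gauche.test)

(test-start "main suite")          ; outermost: initializes the test state

;; In a real project this inner block would be in its own file,
;; e.g. loaded with (load "./sub-test.scm") (hypothetical file name).
(test-start "sub suite")           ; nested: only logs and updates nesting level
(test* "sub: addition" 3 (+ 1 2))
(test-end)                         ; nested: only logs and adjusts nesting level

;; We are still between the outermost test-start and test-end here.
(test* "main: still running" #t (test-running?))

(test-end)                         ; outermost: prints the list of failed tests
```

Because the inner block is a complete test script by itself, the same sub script can also be run alone during development.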

Function: test-end :key exit-on-failure

{gauche.test} Prints out the list of failed tests. If exit-on-failure is #f or omitted, this procedure returns the number of failed tests.

Otherwise, this function terminates the gosh process by exit. If a fixnum is given to exit-on-failure, it becomes the process’s exit status; if any other true value is given, the exit status will be 1.

If the test is nested, that is, this test-end corresponds to an inner test-start while an outer test is ongoing, this procedure only emits a log message and adjusts the internal test nesting level.

Function: test-running?

{gauche.test} Returns #t if a test is running, that is, you’re between the outermost test-start and test-end. Returns #f otherwise.

Function: test-section section-name

{gauche.test} Marks the beginning of a group of tests. This is just for logging.

Function: test-log fmtstr args …

{gauche.test} This is also just for logging. Creates a formatted string with fmtstr and args just like format, then writes it to the current output port, with the prefix ;; and a newline at the end.

With the typical Makefile settings, where you redirect stdout of test scripts to a log file, the message only goes to the log file.

Using this, you can dump information that can’t be automatically tested but may be useful for troubleshooting. For example, suppose you get a mysterious test failure report that you can’t reproduce on your machine, and you suspect that some aspect of the running system may unpredictably affect the test result. You can put test-log in the test code to dump such parameters, ask the reporter to run the test again, and analyze the log.
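As a minimal sketch of this use, the following script logs a couple of environment details before running a test. The choice of what to log (here the results of gauche-version and gauche-architecture) is only an illustration; log whatever parameters you suspect are relevant.

```scheme
(use gauche.test)

(test-start "platform-dependent feature")

;; These lines go to the current output port, prefixed with ";;".
;; With stdout redirected to a log file, they appear only in the log.
(test-log "gauche version: ~a" (gauche-version))
(test-log "architecture: ~a" (gauche-architecture))

(test* "sanity check" 4 (* 2 2))

(test-end)
```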

Function: test-record-file file

{gauche.test} Suppose you have several test scripts. Normally you run them as a group and what you want to know is a concise summary of the whole results, instead of each result of individual test files.

A test record file is an auxiliary file used to gather summary of the result. It holds a one-line summary of tests like this:

Total:  9939 tests,  9939 passed,     0 failed,     0 aborted.

When a test record file exists, test-start reads and parses it, and remembers the numbers. Then test-end adds the count of the results and writes them back to the same test record file.

If you write the check target in your makefile as follows, you will get the final one-line summary every time you run make check, assuming that test1.scm, test2.scm, and test3.scm all have (test-record-file "test.record") before the call to test-start.

check:
        @rm -f test.record test.log
        gosh test1.scm >> test.log
        gosh test2.scm >> test.log
        gosh test3.scm >> test.log
        @cat test.record

Note that to make test-record-file work, it must be placed before the call to test-start.

Alternatively, you can use the environment variable GAUCHE_TEST_RECORD_FILE to specify the test record file.

Environment Variable: GAUCHE_TEST_RECORD_FILE

If this environment variable is set when the test script is run, its value is used as the name of the test record file.

If the test script calls test-record-file, it takes precedence and this environment variable is ignored.

Function: test-summary-check

{gauche.test} If the test record file is set (either by test-record-file or the environment variable GAUCHE_TEST_RECORD_FILE), reads it, and then exits with status 1 if the record has a nonzero failure count and/or a nonzero abort count. If the test record file isn’t set, this procedure does nothing.

This is useful when you have multiple test scripts and you want make to fail if any of the tests fails, but not before all test scripts have been run. If you make every test script use :exit-on-failure of test-end, then make stops immediately after the first script that fails. Instead, you can avoid :exit-on-failure, use the test record file, and call this function as the last step:

check:
        rm -f $(GAUCHE_TEST_RECORD_FILE) test.log
        gosh test1.scm >> test.log
        gosh test2.scm >> test.log
        cat $(GAUCHE_TEST_RECORD_FILE) /dev/null
        gosh -ugauche.test -Etest-summary-check -Eexit

This way, make runs all the test scripts no matter how many of them fail (since each gosh invocation exits with status 0), but still detects an error, since the last gosh call exits with status 1 if there has been any failure.

9.34.2 Individual test

Most test frameworks have various test procedures such as test-assert, test-equal, etc., depending on what you want to test. We have only one, test, and its thin wrapper test*. It takes a thunk (or an expression) to run, the expected result, and an optional check predicate to compare the expected result against the actual result. Various conditions, such as testing the actual result is a true value, or testing the expression raising a specific error, can be handled by the check predicate. The default check predicate handles typical cases:

Compare the expected result with the actual result.

(test* "one plus one equals two"
       2
       (+ 1 1))

Assert that the expression returns a true value (i.e. not #f).

(test* "'any' returns non-false value"
       (test-truthy)
       (any integer? '(1.2 3/4 5)))

Check if the actual result is one of possible values.

(test* "get random integer between 0 and 5"
       (test-one-of 0 1 2 3 4 5)
       (random-integer 6))

Check if the expression raises an expected error.

(test* "expects read error"
       (test-error <read-error>)
       (read-from-string "(a .)"))

This way, it is easier to parameterize tests, e.g. to loop over a list of expected results and test thunks.
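Such a parameterized test can be sketched as a loop over a table of inputs and expected results. The table contents and test names below are made up for illustration.

```scheme
(use gauche.test)

(test-start "parameterized tests")
(test-section "string->number")

;; Each entry is (input expected). One call to test is made per entry,
;; so every row is logged and counted as a separate test.
(for-each (lambda (entry)
            (let ([input    (car entry)]
                  [expected (cadr entry)])
              (test (format "string->number ~s" input)
                    expected
                    (lambda () (string->number input)))))
          '(("42"  42)
            ("-7"  -7)
            ("1/2" 1/2)
            ("abc" #f)))

(test-end)
```

The function test is used here rather than the macro test*, since the thunk is constructed explicitly for each row.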

Macro: test* name expected expr :optional check report hook

{gauche.test} A convenience macro that wraps expr in a lambda.

(test* name expected expr opt ...)
  ≡ (test name expected (lambda () expr) opt ...)

Function: test name expected thunk :optional check report hook

{gauche.test} Calls thunk, and checks whether its result agrees with expected using a procedure check, which is called as follows:

(check expected result-of-thunk)

It should return #t if the given result agrees with the expected value, or #f otherwise. The default check procedure is test-check, explained below. It compares expected and result-of-thunk with equal?, except when expected is one of the special test objects. (See the “testing ambiguous results” and “testing abnormal cases” paragraphs below for this special treatment.)

One typical usage of the custom check procedure is to compare inexact numbers tolerating small error.

(test "test 1" (/ 3.141592653589 4)
      (lambda () (atan 1))
      (lambda (expected result)
        (< (abs (- expected result)) 1.0e-10)))

Name is the name of the test, used for logging.

When thunk signals an uncaptured error, it is caught and yields a special error object <test-error>. You can check it with another error object created by test-error function to see if it is an expected type of error. See the entry of test-error below for the details.

The optional report argument must be #f or a procedure that takes three arguments. If it is a procedure, it is called after check returns false (but before hook is called). The first argument is name, the second argument is expected, and the third argument is either the result of thunk, or a <test-error> object when thunk raises an error. The default is the test-report-failure procedure, which simply uses write to display the result of thunk or a <test-error> object. By passing your own procedure, you can customize the message printed when the test fails.

Finally, the optional hook argument must be #f or a procedure that takes four arguments. If it is a procedure, it is called after the test procedure finishes the test. The first argument is a symbol, either pass or fail, the second argument is name, the third argument is expected, and the fourth argument is either the result of thunk, or a <test-error> object when thunk raises an error. The return value of hook is ignored.

It is mainly for libraries that wrap gauche.test and want to do their own bookkeeping.
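The following sketch passes all three optional arguments to test*. The procedures my-report and my-hook and the list results are hypothetical names introduced for this example; it assumes a Gauche version that has the report argument (that is, newer than 0.9.10).

```scheme
(use gauche.test)

(test-start "report and hook")

;; Hypothetical bookkeeping: a list of (name . verdict) pairs.
(define results '())

;; hook takes four arguments; verdict is the symbol pass or fail.
(define (my-hook verdict name expected actual)
  (push! results (cons name verdict)))

;; report takes three arguments and is called only when check returns false.
(define (my-report name expected actual)
  (format #t "  ~a: wanted ~s but got ~s\n" name expected actual))

;; Optional arguments are given in the order check, report, hook.
(test* "addition" 3 (+ 1 2) test-check my-report my-hook)

(test-end)
```

Since this test passes, my-report is never invoked here; my-hook is invoked once and records the verdict.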

Note: In 0.9.10, we didn’t have the report argument. Instead of adding report as the last optional argument, we made it the second one and shifted hook, because the hook argument will rarely be used. To keep backward compatibility, if a 4-argument procedure is passed as the report argument, we treat it as hook, with a warning. This compatibility feature will be removed in future releases.

Function: test-check expected actual :optional equal

{gauche.test} The default procedure that test and test* use to check whether the result of the test expression conforms to the expected value. By default, test-check just compares expected and actual with the procedure equal, which defaults to equal?. test-check behaves differently if expected is one of the special test objects described below.

Function: test-report-failure name expected actual

{gauche.test} The default procedure to report a test failure. It just writes actual with write. You can customize the failure report by passing your own reporting procedure as the report argument of test and test*. See test-report-failure-diff below, for example.

Testing ambiguous results

Function: test-one-of choice …

{gauche.test} Sometimes the result of a test expression depends on the external environment, and you cannot give an exact expected value. This procedure lets you write such tests conveniently.

Returns a special object representing any one of the choices. The default check procedure, test-check, recognizes the object when it is passed in the expected argument, and returns true if any one of choice … passes the check against the result.

For example, the following test passes if proc returns either 1 or 2.

(test* "proc returns either 1 or 2" (test-one-of 1 2) (proc))

Note that test-check compares the actual result against each of the choices using test-check itself, that is:

(test-check (test-one-of choice …) result equal)
 ≡ (or (test-check choice result equal) …)

Thus, if you want to compare each choice with a customized equivalence procedure, pass test-check with a specialized equivalence procedure as the check procedure. The following example compares each choice and the result case-insensitively:

(test* "Using one-of with case insensitive comparison"
       (test-one-of "abc" "def")
       "Abc"
       (cut test-check <> <> string-ci=?))

Function: test-none-of choice …

{gauche.test} Similar to test-one-of, but creates a special object representing none of the choices. The test succeeds if the test expression evaluates to a value that doesn’t match any of the choices.

Note: If you want to compare inexact numeric results, you can use approx=? (see Numerical comparison).

Function: test-truthy

{gauche.test} Returns a special object that expects a true value, i.e. anything but #f. Use it to assert that the test expression returns a true value but not necessarily #t.
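A short sketch of both points follows. The tolerance 1e-10 and the test names are arbitrary choices for illustration; the third argument to approx=? is assumed to be the relative tolerance, as described in the Numerical comparison section.

```scheme
(use gauche.test)

(test-start "ambiguous and inexact results")

;; Passes as long as the result matches none of the listed values.
(test* "modulo with positive divisor is never negative"
       (test-none-of -1 -2)
       (modulo -7 3))

;; Use approx=? as the check procedure to tolerate rounding error.
(test* "atan of 1 is pi/4"
       (/ 3.141592653589793 4)
       (atan 1)
       (cut approx=? <> <> 1e-10))

(test-end)
```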

Testing abnormal cases

Function: test-error :optional (condition-type <error>) (message #f)

{gauche.test} Returns a new <test-error> object that matches another <test-error> object of the given condition-type.

The test-check procedure treats <test-error> objects specially. When err-expected and err-actual are <test-error> objects, (test-check err-expected err-actual) returns #t if err-expected’s condition type is the same as, or a supertype of, err-actual’s.

For example, if you want to test that a call to foo raises an <io-error> (or one of its condition subtypes), you can write the following:

(test* "see if foo raises <io-error>" (test-error <io-error>) (foo))

Another optional argument, message, can be used to check whether the raised error has a message of the expected pattern. The argument may be a string, a regexp, or #f (default). If it is a string, test-check checks whether the message of the raised error exactly matches the string. If it is a regexp, test-check checks whether the message of the raised error matches that regexp. If it is #f, the message is not checked.
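The three forms of the message argument can be sketched as follows. The error messages are made up for this example; note that error appends its extra arguments to the message, so the second test uses a regexp rather than an exact string.

```scheme
(use gauche.test)

(test-start "expected errors")

;; message omitted (#f): only the condition type is checked.
(test* "any <error> will do"
       (test-error <error>)
       (error "something went wrong"))

;; message as a regexp: the error message must match it.
(test* "message matches a pattern"
       (test-error <error> #/out of range/)
       (error "index out of range:" 10))

;; message as a string: the error message must match exactly.
(test* "message matches exactly"
       (test-error <error> "bad thing")
       (error "bad thing"))

(test-end)
```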

Variable: *test-error*

{gauche.test} Deprecated. Bound to an instance of <test-error> with condition type <error>. This is only provided for backward compatibility; new code should use the test-error procedure above.

Variable: *test-report-error*

{gauche.test} If this variable is true, the test routine prints a stack trace to the current error port when it captures an error. It is useful when you get an unexpected test-error object and want to find out where the error is occurring.

This variable is initialized by the environment variable GAUCHE_TEST_REPORT_ERROR when the gauche.test module is loaded. For example, you can use the environment variable to check out an unexpected error from your test script as follows (the value of the environment variable doesn’t matter).

env GAUCHE_TEST_REPORT_ERROR=1 gosh mytest.scm

9.34.3 Incorporating external tests

Sometimes you implement an existing specification that comes with tests. If the tests are written in R7RS Scheme, you can run them by themselves; however, you might want to run them as a part of a larger test suite managed by gauche.test. By doing so, you can get consolidated test result reports, because Gauche integrates SRFI test frameworks such as srfi.64 and srfi.78.

If the test file is written in R7RS format, however, you may not be able to simply include the test file into Gauche’s test suite. As R7RS import differs from Gauche’s, we have some black magic to switch between the R7RS and Gauche namespaces based on whether the first form in the file is an R7RS import form or not (see Three import forms). If you include an R7RS program in a Gauche program, the import in the R7RS code is interpreted as Gauche’s and doesn’t work. The macro test-include-r7 can be used instead.

NB: If the external test is written for Chibi Scheme, you can use chibi-test instead (see compat.chibi-test - Running Chibi-scheme test suite).

Macro: test-include-r7 path [exclude-clause]

{gauche.test} Like the include form (see Inclusions), expands the content of the file specified by path in place of this form. If path is relative, it is resolved relative to the including file.

The code is included in the environment where import is bound to R7RS’s, so that a test script written for R7RS can be included as is.

A recommended usage is to create a submodule in Gauche’s test script:

(use gauche.test)
(test-start ...)
     :
(test-section "xxx-tests")
(define-module xxx-tests
  (use gauche.test)
  (test-include-r7 "xxx-tests"))

Sometimes the external script refers to a library that doesn’t correspond to what Gauche provides (e.g. tests/include/srfi-222-tests.scm imports the (compounds) library, but Gauche provides it as srfi.222). You can list such libraries in the exclude clause so that import won’t load them:

(define-module srfi-222-tests
  (use gauche.test)
  (use srfi.222)
  (test-include-r7 "include/srfi-222-tests" (exclude (compounds))))

The format of exclude-clause is as follows:

<exclude-clause> : (exclude r7rs-library-name …)

9.34.4 Better test failure reporting

As described in the test entry above, you can customize how a failure is reported by passing the optional report argument to test and test*. One useful customization is to show the difference between multi-line texts. It is useful enough that we provide a report procedure for it.

Here’s a contrived example. We pass test-report-failure-diff as the report procedure (and test-check-diff as the check procedure, which we’ll explain later). The expected text is given as a list of lines, while the actual result is a single string; both test-report-failure-diff and test-check-diff canonicalize the expected and actual results into a single multi-line string, so you can give them in whichever way is convenient for you.

(test* "Beatrice"
       ;; expected
       '("What fire is in mine ears?  Can this be true?"
         "Stand I condemned for pride and scorn so much?"
         "Contempt, farewell, and maiden pride, adieu!"
         "No glory lives behind the back of such.")
       ;; actual
       "What fire is in mine ears?  Can this be true?\n\
        Stand I condemn'd for pride and scorn so much?\n\
        Contempt, farewell! and maiden pride, adieu!\n\
        No glory lives behind the back of such.\n"
       test-check-diff           ; check
       test-report-failure-diff) ; report

 ⇒ Reports:
ERROR: GOT diffs:
--- expected
+++ actual
@@ -1,4 +1,4 @@
 What fire is in mine ears?  Can this be true?
-Stand I condemned for pride and scorn so much?
-Contempt, farewell, and maiden pride, adieu!
+Stand I condemn'd for pride and scorn so much?
+Contempt, farewell! and maiden pride, adieu!
 No glory lives behind the back of such.

As you see, the result is reported in a unified diff format (see text.diff - Calculate difference of text streams) so that you can spot the difference easily.

Function: test-check-diff expected actual :optional equal

{gauche.test} An alternative check procedure you can pass as the check argument of the test procedure / test* macro.

Before comparing expected and actual, it performs the following operations on each of expected and actual:

If it is a list of strings, join them with \n (with suffix syntax, so the last line is also appended with \n).
If it is a form (content-of <string>), then <string> is taken as a filename and the content of the file is used as a string. If the filename is relative, it is relative to the file currently being loaded. If the named file doesn’t exist, an empty string is used.
If it is a string or other object, it is used as is.

Then the two arguments are compared using equal, which defaults to equal?.

Function: test-report-failure-diff msg expected actual

{gauche.test} An alternative failure report procedure you can pass as the report argument of the test procedure / test* macro.

The expected and actual arguments are converted in the same way as test-check-diff; that is, if it is a list of strings (lines) or a form (content-of <filename>), it is converted to a single string.

Then the difference of the two is reported in a unified diff format (using diff-report/unified. See text.diff - Calculate difference of text streams).

If either expected or actual is not convertible to a single string, the result is reported in the same way as the standard test-report-failure.

Note: This procedure is called twice, once when the test fails, and again from test-end to report the summary of discrepancies. If you pass a (content-of <filename>) form, you have to make sure the named file exists until test-end returns. This is tricky if you generate text into a temporary file during a single test. In general, the (content-of <filename>) form is useful in the expected argument, where you can specify a prepared file.

Macro: test*/diff name expected expr

{gauche.test} This is a convenience version of test*, using test-check-diff and test-report-failure-diff as check and report procedures, respectively.

(test*/diff name expected expr)
 ≡
 (test* name expected expr test-check-diff test-report-failure-diff)
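A minimal usage sketch follows, with the expected text given as a list of lines and the actual result as a single string. The test name and texts are made up for illustration.

```scheme
(use gauche.test)

(test-start "diff reporting")

;; The list of lines is joined with a newline after each line, so it is
;; compared against "hello\nworld\n". If the texts differed, the failure
;; would be reported as a unified diff.
(test*/diff "two-line greeting"
            '("hello" "world")
            (string-append "hello\n" "world\n"))

(test-end)
```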

9.34.5 Quasi-static checks

Scheme is dynamically typed, which is convenient for incremental and experimental development on the REPL, but it tends to delay error detection until the code is actually run. It is very annoying to run your program for a while only to see it barf on a simple typo in a variable name.

Gauche addresses this issue by checking certain types of errors at the test phase. It isn’t purely a static check (we need to load a module or a script, which evaluates toplevel expressions), nor exhaustive (we can’t catch inconsistencies that span over multiple modules or about information that can be added at runtime). Nevertheless it can often catch some common mistakes, such as incorrect variable names or calling procedures with wrong number of arguments.

The two procedures, test-module and test-script, load the named module and the script files respectively (which compiles the Scheme code to VM instructions), then scan the compiled VM code to perform the following tests:

See if the global variables referenced within functions are all defined (either in the module, or in one of imported modules).
If a global variable is used as a function, see if the number of arguments given to it is consistent with the actual function.
See if the symbols set as autoload in the code can be resolved.
While testing module, see if the symbols declared in the export list are actually defined.

The check is somewhat heuristic and we may miss some errors and/or can have false positives. For false positives, you can enumerate symbols to be excluded from the test.

Function: test-module module :key allow-undefined bypass-arity-check

{gauche.test} Loads the module and runs the quasi-static consistency check. Module must be a symbol module name or a module.

Sometimes you have a global variable that may not be defined depending on compiler options or platforms, and you check its existence at runtime before using it. The undefined variable reference check by test-module doesn’t follow such logic, and reports an error whenever it finds your code referring to an undefined variable. In such a case, you can give a list of symbols to the allow-undefined keyword argument; the test will exclude them from the check.

The arity check may also raise false positives, if the module counts on the behavior of global procedures that will be modified after the module is loaded (e.g. a method with a different number of arguments can be added to a generic function after the module is loaded, which would make the code valid). If you know you’re doing the right thing and need to suppress the false positives, pass a list of names of the functions to the bypass-arity-check keyword argument.
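The use of allow-undefined can be sketched as below. The module my-feature, the procedure describe, and the variable platform-extra are hypothetical names; the module is defined inline only to keep the example self-contained, whereas a real module would normally live in its own file.

```scheme
(use gauche.test)

;; A hypothetical module that refers to platform-extra, which may or may
;; not be defined on a given platform, guarded by a runtime check.
(define-module my-feature
  (export describe)
  (define (describe)
    (if (global-variable-bound? 'my-feature 'platform-extra)
      (platform-extra)
      "generic")))

(test-start "quasi-static checks")

;; Without :allow-undefined, the reference to platform-extra in describe
;; would be reported as an undefined variable reference.
(test-module (find-module 'my-feature)
             :allow-undefined '(platform-extra))

(test-end)
```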

Function: test-script filename :key allow-undefined bypass-arity-check compile-only

{gauche.test} Loads the script named by filename into a fresh anonymous module and runs the quasi-static consistency check. Filename must be a string that names the script file.

The meaning of keyword arguments is the same as test-module.

Note that the toplevel forms in filename are evaluated, so scripts that rely on the actions of toplevel forms could cause unwanted side effects. This check works best for scripts written in the SRFI-22 convention, that is, those that perform actions from the main procedure instead of toplevel forms. R7RS scripts rely on actions in toplevel forms and can’t be tested with this procedure.

Scripts that rely on being loaded into the user module also won’t work well with this check, which loads the forms into an anonymous module.

If you need to test a script with toplevel side-effecting forms and you can’t change it, you may want to pass a true value to the compile-only keyword argument. Then test-script just compiles each toplevel form before running the static check, instead of loading the script (which not only compiles but also executes each toplevel form).
