For Gauche 0.9.5



12.23 file.util - Filesystem utilities

Module: file.util

Provides convenient utility functions for handling files and directories. These functions are built on top of the primitive system procedures described in Filesystems.

Many procedures in this module take a keyword argument follow-link?, which specifies the behavior when the procedure sees a symbolic link. If a true value is given to follow-link? (which is the default), the procedure operates on the file referenced by the link; if a false value is given, it operates on the link itself.

Note on the naming convention: Some Scheme implementations "create" new directories and files, while the others "make" them. Some implementations "delete" them, while the others "remove" them. It seems that both conventions are equally popular. So Gauche provides both.



12.23.1 Directory utilities

Function: current-directory :optional new-directory

When called with no argument, this returns the pathname of the current working directory. When called with a string argument new-directory, this sets the current working directory of the process to it. If the process can’t change directory to new-directory, an error is signaled.

This function is in ChezScheme, MzScheme and some other Scheme implementations.
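
A minimal sketch of the two calling forms, assuming a Unix-like system. The name original-dir is only illustrative. It switches to the temporary directory and then restores the original working directory.

```scheme
(use file.util)

;; Remember where we started (original-dir is an illustrative name).
(define original-dir (current-directory))

;; Change the working directory of the process, then come back.
(current-directory (temporary-directory))
(print "now in:  " (current-directory))
(current-directory original-dir)
(print "back in: " (current-directory))
```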

Function: home-directory :optional user

Returns the home directory of the given user, which may be a string user name or an integer user id. If user is omitted, the current user is assumed. If the given user cannot be found, or the home directory of the user cannot be determined, #f is returned.

On Windows native platforms, this function only supports querying the current user’s home directory.

Parameter: temporary-directory

A parameter that keeps the name of the directory that can be used to create temporary files. The default value is the one returned from sys-tmpdir (see Pathnames). The difference from sys-tmpdir is that, since this is a parameter, it can be overridden by the application during execution. Libraries are recommended to use this instead of sys-tmpdir for greater flexibility.
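
The override can be sketched with parameterize, as below. The directory name scratch-dir is a hypothetical name made up for this example, not something the module provides.

```scheme
(use file.util)

;; scratch-dir is a hypothetical directory used only for this demo.
(define scratch-dir
  (build-path (temporary-directory)
              (format "fileutil-demo-~a" (sys-getpid))))

(make-directory* scratch-dir)

;; Within the dynamic extent of parameterize, code that consults
;; (temporary-directory) sees scratch-dir instead of the default.
(define seen-inside
  (parameterize ((temporary-directory scratch-dir))
    (temporary-directory)))

(print seen-inside)
(remove-directory* scratch-dir)
```

Outside the parameterize form the parameter reverts to its previous value, so a library that reads (temporary-directory) follows whatever the application has chosen.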

Function: directory-list path :key children? add-path? filter filter-add-path?

Returns a list of entries in the directory path. The result is sorted by dictionary order.

By default, only the basename (the last component) of each entry is returned. If add-path? is given and true, path is prepended to each entry. If children? is given and true, "." and ".." are excluded from the result.

If filter is given, it must be a predicate that takes one argument. It is called on every entry of the directory, and only the entries on which filter returns true are included in the result. The argument passed to filter is the basename of the directory entry by default, but when filter-add-path? is true, path is prepended to the entry.

If path is not a directory, an error is signaled.

(directory-list "test")
 ⇒ ("." ".." "test.scm" "test.scm~")

(directory-list "test" :add-path? #t)
 ⇒ ("test/." "test/.." "test/test.scm" "test/test.scm~")

(directory-list "test" :children? #t)
 ⇒ ("test.scm" "test.scm~")

(directory-list "test" :children? #t :add-path? #t
   :filter (lambda (e) (not (string-suffix? "~" e))))
 ⇒ ("test/test.scm")

Function: directory-list2 path :key children? add-path? filter follow-link?

Like directory-list, but returns two values; the first one is a list of subdirectories, and the second one is a list of the rest. The keyword arguments children?, add-path? and filter are the same as directory-list.

Giving false value to follow-link? makes directory-list2 not follow the symbolic links; if the path contains a symlink to a directory, it will be included in the first list if follow-link? is omitted or true, while it will be in the second list if follow-link? is false.
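
A small sketch of receiving the two values, assuming a scratch directory (demo-dir, a made-up name) that contains one subdirectory and one regular file.

```scheme
(use file.util)

;; demo-dir is a hypothetical scratch directory created for this sketch.
(define demo-dir
  (build-path (temporary-directory)
              (format "dirlist2-demo-~a" (sys-getpid))))

(make-directory* (build-path demo-dir "sub"))   ; one subdirectory
(touch-file (build-path demo-dir "a.txt"))      ; one regular file

;; receive binds the two returned values; keep them as a two-element list.
(define listing
  (receive (dirs others) (directory-list2 demo-dir :children? #t)
    (list dirs others)))

(print "directories: " (car listing))
(print "others:      " (cadr listing))

(remove-directory* demo-dir)
```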

Function: directory-fold path proc seed :key lister follow-link?

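A fundamental directory traverser. Conceptually it works as follows, in a recursive way.

If path is not a directory, it calls (proc path seed) and returns the result.

If path is a directory, it calls (lister path seed). The procedure lister is expected to return two values: a list of pathnames, and the next seed value. Then directory-fold is called on each returned pathname, passing the returned seed value as the seed argument of the first call, then passing the value returned by the previous directory-fold call as the seed of the next call. The result of the last call is returned.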

The default procedure of lister is just a call to directory-list, as follows.

(lambda (path seed)
  (values (directory-list path :add-path? #t :children? #t)
          seed))

Note that lister shouldn’t return the given path itself (".") nor the parent directory (".."), or the recursion wouldn’t terminate. Also note lister is expected to return a path accessible from the current directory, i.e. if path is "/usr/lib/foo" and it contains "libfoo.a" and "libfoo.so", lister should return '("/usr/lib/foo/libfoo.a" "/usr/lib/foo/libfoo.so").

The keyword argument follow-link? is used to determine whether lister should be called on a symbolic link pointing to a directory. When follow-link? is true (default), lister is called with the symbolic link if it points to a directory. When follow-link? is false, proc is not called.

The following example returns a list of pathnames of the emacs backup files (whose name ends with "~") under the given path.

(use srfi-13) ;; for string-suffix?
(directory-fold path
                (lambda (entry result)
                  (if (string-suffix? "~" entry)
                      (cons entry result)
                      result))
                '())
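
As a further sketch, the same traversal can accumulate a number instead of a list. The helper tree-size and the scratch directory demo-dir are hypothetical names introduced only for illustration.

```scheme
(use file.util)

;; Sum the sizes of all non-directory entries under root.
;; file-size returns #f for a path that doesn't exist (e.g. a dangling
;; symbolic link), so such entries count as zero.
(define (tree-size root)
  (directory-fold root
                  (lambda (entry total)
                    (+ total (or (file-size entry) 0)))
                  0))

;; demo-dir is a hypothetical scratch directory.
(define demo-dir
  (build-path (temporary-directory)
              (format "fold-demo-~a" (sys-getpid))))

(make-directory* (build-path demo-dir "sub"))
(with-output-to-file (build-path demo-dir "a.txt")
  (lambda () (display "hello")))                ; 5 bytes
(with-output-to-file (build-path demo-dir "sub" "b.txt")
  (lambda () (display "abc")))                  ; 3 bytes

(define total (tree-size demo-dir))
(print total)

(remove-directory* demo-dir)
```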

The following example lists all the files and directories under the given pathname. Note the use of lister argument to include the directory path itself in the result.

(directory-fold path cons '()
  :lister (lambda (path seed)
            (values (directory-list path :add-path? #t :children? #t)
                    (cons path seed))))

Function: make-directory* name :optional perm
Function: create-directory* name :optional perm

Creates a directory name. If the intermediate directories on the path don’t exist, they are also created (like the mkdir -p command on Unix). If the directory name already exists, these procedures do nothing. Perm specifies the integer flag for the permission bits of the directory.

Function: remove-directory* name
Function: delete-directory* name

Deletes directory name and its content recursively (like rm -r command on Unix). Symbolic links are not followed.
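
A short sketch of creating and removing a nested directory. The paths root and deep are hypothetical and live under the temporary directory.

```scheme
(use file.util)

;; root and deep are hypothetical paths used only for illustration.
(define root
  (build-path (temporary-directory)
              (format "mkdir-demo-~a" (sys-getpid))))
(define deep (build-path root "a" "b" "c"))

(make-directory* deep)          ; creates root, a, a/b and a/b/c as needed
(make-directory* deep)          ; already exists: nothing happens
(define created? (file-is-directory? deep))

(remove-directory* root)        ; removes root and everything below it
(define removed? (not (file-exists? root)))

(print created? " " removed?)
```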

Function: copy-directory* src dst :key if-exists backup-suffix safe keep-timestamp keep-mode follow-link?

If src is a regular file, copies its content to dst, just like copy-file does. If src is a directory, recursively descends it and copies the file tree to dst. Basically it mimics the behavior of the cp -r command.

If there are any symbolic links under src, the link itself is copied instead of the file it points to, unless a true value is given to the follow-link? keyword argument; that is, the default value of follow-link? is #f. (Note that this is the opposite of copy-file, in which follow-link? is true by default.)

The meanings of the other keyword arguments are the same as copy-file. See the entry of copy-file for the details.

Function: create-directory-tree dir spec

Creates a directory tree under dir according to spec. This procedure is useful to set up a directory hierarchy all at once.

The spec argument is an S-expression with the following structure:

<spec> : <name>                             ; empty file
       | (<name> <option> ...)              ; empty file
       | (<name> <option> ... <string>)     ; file with content
       | (<name> <option> ... <procedure>)  ; file with generated content
       | (<name> <option> ... (<spec> ...)) ; directory

<name> : string or symbol

<option> ... : keyword-value alternating list

With the first and second form of spec, an empty file is created with the given name. With the third form of spec, the string becomes the content of the file.

With the fourth form of spec, the procedure is called with the pathname as an argument, and output to the current output port within the procedure is written to the created file. The pathname is relative to the dir argument. At the time the procedure is called, its parent directory is already created.

The last form of spec creates a named directory, then creates its children recursively according to the specs.

With options you can control attributes of created files/directories. Currently the following options are recognized.

:mode mode

Takes an integer as the permission mode bits.

:owner uid
:group gid

Takes an integer uid/gid of the owner/group of the file/directory. The calling process may need special privilege to change the owner and/or group.

:symlink path

This is only valid for file spec, and it causes create-directory-tree to create a named symbolic link whose content is path.

Function: check-directory-tree dir spec

Checks if a directory hierarchy according to spec exists under dir. Returns #t if it exists, or #f otherwise.

The format of spec is the same as create-directory-tree described above.

If spec contains options, the attributes of existing files/directories are also checked if they match the given options.
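
The following sketch builds a small tree and then checks it with the same spec. The scratch directory root and all file and directory names in spec are made up for this example.

```scheme
(use file.util)

;; root is a hypothetical scratch directory; the names in spec are
;; made up for this sketch.
(define root
  (build-path (temporary-directory)
              (format "tree-demo-~a" (sys-getpid))))
(make-directory* root)

(define spec
  '(project                                 ; directory "project"
    ((README "A demo project.\n")           ; file with content
     (src                                   ; subdirectory "src"
      (("main.scm" "(print \"hello\")\n")   ; file with content
       "empty.scm")))))                     ; empty file

(create-directory-tree root spec)

(define tree-ok? (check-directory-tree root spec))
(define readme (file->string (build-path root "project" "README")))

(print tree-ok?)
(display readme)

(remove-directory* root)
```

Keeping the spec in one variable lets the same S-expression serve both as the recipe and as the later check, which is convenient in test setups.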



12.23.2 Pathname utilities

Function: build-path base-path component …

Appends pathname components component to the base-path. A component can be the symbol up or same; on Unix, they are synonyms for ".." and ".", respectively. This API is taken from MzScheme.

Function: absolute-path? path
Function: relative-path? path

Returns #t if path is absolute or relative, respectively.

Function: expand-path path

Expands the tilde notation of path if it contains one. Otherwise, path is returned. This function does not check whether path exists or is readable.

Function: resolve-path path

Expands path like expand-path, then resolves symbolic links for every component of the path. If path does not exist, or contains a dangling link, or contains an unreadable directory, an error is signaled.

Function: simplify-path path

Removes ’up’ ("..") components and ’same’ (".") components from path as much as possible. This function does not access the filesystem.
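
A brief sketch combining build-path, simplify-path and the two predicates, assuming a Unix-style directory separator. The pathnames are arbitrary and need not exist.

```scheme
(use file.util)

;; Join components; the symbol up stands for ".." on Unix.
(define joined (build-path "src" 'up "doc" "manual.txt"))
(print joined)                    ; src/../doc/manual.txt

;; Purely textual clean-up; the filesystem is never consulted.
(define cleaned (simplify-path "/foo/bar/../baz/./x"))
(print cleaned)                   ; /foo/baz/x

(print (absolute-path? cleaned))  ; #t
(print (relative-path? joined))   ; #t
```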

Function: decompose-path path

Returns three values; the directory part of path, the basename without extension of path, and the extension of path. If the pathname doesn’t have an extension, the third value is #f. If the pathname ends with a directory separator, the second and third values are #f. (Note: This treatment of the trailing directory separator differs from sys-dirname/sys-basename; those follow popular shell’s convention, which ignores trailing slashes.)

(decompose-path "/foo/bar/baz.scm")
  ⇒ "/foo/bar", "baz", "scm"
(decompose-path "/foo/bar/baz")
  ⇒ "/foo/bar", "baz", #f

(decompose-path "baz.scm")
  ⇒ ".", "baz", "scm"
(decompose-path "/baz.scm")
  ⇒ "/", "baz", "scm"

;; Boundary cases
(decompose-path "/foo/bar/baz.")
  ⇒ "/foo/bar", "baz", ""
(decompose-path "/foo/bar/.baz")
  ⇒ "/foo/bar", ".baz", #f
(decompose-path "/foo/bar.baz/")
  ⇒ "/foo/bar.baz", #f, #f

Function: path-extension path
Function: path-sans-extension path

Returns the extension of path, and the pathname of path without the extension, respectively. If path doesn’t have an extension, #f and path are returned, respectively.

(path-extension "/foo/bar.c")       ⇒ "c"
(path-sans-extension "/foo/bar.c")  ⇒ "/foo/bar"

(path-extension "/foo/bar")         ⇒ #f
(path-sans-extension "/foo/bar")    ⇒ "/foo/bar"

Function: path-swap-extension path newext

Returns a pathname in which the extension of path is replaced by newext. If path doesn’t have an extension, "." and newext are appended to path.

If newext is #f, it returns path without extension.

(path-swap-extension "/foo/bar.c" "o")  ⇒ "/foo/bar.o"
(path-swap-extension "/foo/bar.c" "")   ⇒ "/foo/bar."
(path-swap-extension "/foo/bar.c" #f)   ⇒ "/foo/bar"

(path-swap-extension "/foo/bar" "o")  ⇒ "/foo/bar.o"
(path-swap-extension "/foo/bar" "")   ⇒ "/foo/bar."
(path-swap-extension "/foo/bar" #f)   ⇒ "/foo/bar"

Function: find-file-in-paths name :key paths pred

Looks for a file that has name name in the given list of pathnames paths and that satisfies a predicate pred. If found, the absolute pathname of the file is returned. Otherwise, #f is returned.

If name is an absolute path, only the existence of name and whether it satisfies pred are checked.

The default value of paths is taken from the environment variable PATH, and the default value of pred is file-is-executable? (see File attribute utilities). That is, find-file-in-paths searches for the named executable file in the command search paths by default.

(find-file-in-paths "ls")
  ⇒ "/bin/ls"

;; example of searching for the user preference file of my application
(find-file-in-paths "userpref"
  :paths `(,(expand-path "~/.myapp")
           "/usr/local/share/myapp"
           "/usr/share/myapp")
  :pred  file-is-readable?)

Function: null-device

Returns a name of the null device. On unix platforms (including cygwin) it returns "/dev/null", and on Windows native platforms (including mingw) it returns "NUL".

Function: console-device

Returns a name of the console device. On unix platforms (including cygwin) it returns "/dev/tty", and on Windows native platforms (including mingw) it returns "CON".

This function does not guarantee the device is actually available to the calling process.



12.23.3 File attribute utilities

Function: file-type path :key follow-link?
Function: file-perm path :key follow-link?
Function: file-mode path :key follow-link?
Function: file-ino path :key follow-link?
Function: file-dev path :key follow-link?
Function: file-rdev path :key follow-link?
Function: file-nlink path :key follow-link?
Function: file-uid path :key follow-link?
Function: file-gid path :key follow-link?
Function: file-size path :key follow-link?
Function: file-atime path :key follow-link?
Function: file-mtime path :key follow-link?
Function: file-ctime path :key follow-link?

These functions return the attribute of file/directory specified by path. The attribute name corresponds to the slot name of <sys-stat> class (see File stats). If the named path doesn’t exist, #f is returned.

If path is a symbolic link, these functions query the attributes of the file pointed to by the link, unless the keyword argument follow-link? is given and false.

MzScheme and Chicken have file-size. Chicken also has file-modification-time, which is file-mtime.
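
A minimal sketch of querying a few attributes. The file sample is a hypothetical scratch file written just for this example.

```scheme
(use file.util)

;; sample is a hypothetical scratch file.
(define sample
  (build-path (temporary-directory)
              (format "attr-demo-~a.txt" (sys-getpid))))
(with-output-to-file sample (lambda () (display "hello")))

(define type (file-type sample))     ; a symbol, regular for a plain file
(define size (file-size sample))     ; size in bytes
(define missing                      ; #f, since the path doesn't exist
  (file-size (string-append sample ".no-such-file")))

(print type " " size " " missing)

(remove-file sample)
```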

Function: file-is-readable? path
Function: file-is-writable? path
Function: file-is-executable? path

Returns #t if path exists and is readable/writable/executable by the current effective user, respectively. This API is taken from STk.

Function: file-is-symlink? path

Returns #t if path exists and is a symbolic link. See also file-is-regular? and file-is-directory? in File stats.

Function: file-eq? path1 path2
Function: file-eqv? path1 path2
Function: file-equal? path1 path2

Compares two files specified by path1 and path2. file-eq? and file-eqv? check whether path1 and path2 refer to the identical file, that is, whether they are on the same device and have the identical inode number. The only difference is that when the last component of path1 and/or path2 is a symbolic link, file-eq? doesn’t resolve the link (so it compares the links themselves) while file-eqv? resolves the link and compares the files referred to by the link(s).

file-equal? compares path1 and path2 considering their content, that is, when two are not the identical file in the sense of file-eqv?, file-equal? compares their content and returns #t if all the bytes match.

The behavior of file-equal? is undefined when path1 and path2 are both directories. Later, it may be extended to scan the directory contents.
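
The distinction can be sketched as follows. The names dir, a, b and link are hypothetical, and the use of sys-symlink assumes a platform with symbolic links (not Windows native).

```scheme
(use file.util)

;; dir, a, b and link are hypothetical names. sys-symlink assumes a
;; platform that supports symbolic links.
(define dir
  (build-path (temporary-directory)
              (format "eq-demo-~a" (sys-getpid))))
(make-directory* dir)
(define a (build-path dir "a.txt"))
(define b (build-path dir "b.txt"))
(define link (build-path dir "link-to-a"))

(with-output-to-file a (lambda () (display "same content")))
(with-output-to-file b (lambda () (display "same content")))
(sys-symlink a link)

(define results
  (list (file-eq? a link)      ; #f: the link itself is a different file
        (file-eqv? a link)     ; #t: the link resolves to a
        (file-eqv? a b)        ; #f: two distinct files
        (file-equal? a b)))    ; #t: contents match byte for byte

(print results)

(remove-directory* dir)
```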

Generic Function: file-mtime=? f1 f2
Generic Function: file-mtime<? f1 f2
Generic Function: file-mtime<=? f1 f2
Generic Function: file-mtime>? f1 f2
Generic Function: file-mtime>=? f1 f2

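Compares file modification time stamps. There are a bunch of methods defined, so each argument can be any one of the following.

A string pathname. The modification time of the named file is used.

A <sys-stat> object (see File stats). The modification time is taken from the stat structure.

A <time> object. The time is used as the modification time.

A real number. It is interpreted as the number of seconds since the Unix Epoch and used as the modification time.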

;; check whether "foo.c" is newer than "foo.o"
(file-mtime>? "foo.c" "foo.o")

;; see if "foo.log" has been updated within the last 24 hours
(file-mtime>? "foo.log" (- (sys-time) 86400))

Generic Function: file-ctime=? f1 f2
Generic Function: file-atime=? f1 f2

Same as file-mtime=?, except that these check the file’s change time and access time, respectively. All the variants of <, <=, >, >= are also defined.
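
A small sketch of the modification time comparisons with predictable results. The files older and newer are hypothetical, and their timestamps are arbitrary values set explicitly with touch-file.

```scheme
(use file.util)

;; older and newer are hypothetical scratch files; the timestamps are
;; arbitrary values chosen so that the outcome is predictable.
(define dir
  (build-path (temporary-directory)
              (format "mtime-demo-~a" (sys-getpid))))
(make-directory* dir)
(define older (build-path dir "older"))
(define newer (build-path dir "newer"))

(touch-file older :time 1000000000)
(touch-file newer :time 1000000100)

(define results
  (list (file-mtime<? older newer)        ; two pathnames
        (file-mtime>? newer 1000000050))) ; pathname and seconds since Epoch

(print results)

(remove-directory* dir)
```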



12.23.4 File operations

Function: touch-file path :key (time #f) (type #f) (create #t)
Function: touch-files paths :key (time #f) (type #f) (create #t)

Updates timestamp of path, or each path in the list paths, to the current time. If the specified path doesn’t exist, a new file with size zero is created, unless the keyword argument create is #f.

If the keyword argument time is given and not #f, it must be a nonnegative real number. It is used as the timestamp value instead of the current time.

The keyword argument type can be #f (default) or one of the symbols atime or mtime. If it is a symbol, only the access time or the modification time is updated, respectively.

Note: touch-files processes one file at a time, so the timestamp of each file may not be exactly the same.

These procedures are built on top of the system call sys-utime (see File stats).
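
A minimal sketch of the keyword arguments. The paths stamp and absent are hypothetical, and the timestamp is an arbitrary value.

```scheme
(use file.util)

;; stamp and absent are hypothetical scratch paths.
(define stamp
  (build-path (temporary-directory)
              (format "touch-demo-~a" (sys-getpid))))
(define absent (string-append stamp ".absent"))

(touch-file stamp)                       ; creates an empty file
(define size-after-create (file-size stamp))

(touch-file stamp :time 1000000000)      ; explicit timestamp (arbitrary)
(define mtime-after-set (file-mtime stamp))

(touch-file absent :create #f)           ; must not create the file
(define absent-created? (file-exists? absent))

(print size-after-create " " mtime-after-set " " absent-created?)

(remove-file stamp)
```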

Function: copy-file src dst :key if-exists backup-suffix safe keep-timestamp keep-mode follow-link?

Copies the file src to dst. The source file src must exist. The behavior when the destination dst exists varies by the keyword argument if-exists:

:error

(Default) Signals an error when dst exists.

:supersede

Replaces dst with a copy of src.

:backup

Keeps dst by renaming it.

:append

Appends the content of src to the end of dst.

#f

Doesn’t copy and returns #f when dst exists.

Copy-file returns #t after completion.

If src is a symbolic link, copy-file follows the symlink and copies the actual content by default. An error is raised if src is a dangling symlink.

Giving #f to the keyword argument follow-link? makes copy-file copy the link itself. In this case src may be a dangling symlink.

If if-exists is :backup, the keyword argument backup-suffix specifies the suffix attached to the dst to be renamed. The default value is ".orig".

By default, copy-file starts copying to dst directly. However, if the keyword argument safe is a true value, it copies the file to a temporary file in the same directory as dst, then renames it to dst when the copy is completed. (When safe is true and if-exists is :append, it first copies the content of dst to a temporary file if dst exists, appends the content of src, then renames the result to dst.) If the copy is interrupted for some reason, the filesystem is "rolled back" properly.

If the keyword argument keep-timestamp is true, copy-file sets the destination’s timestamp to the same as the source’s timestamp after copying.

If the keyword argument keep-mode is true, the destination file’s permission bits are set to the same as the source file’s. If it is false (default), the destination file’s permission remains the same if the destination already exists and the safe argument is false, otherwise it becomes #o666 masked by umask settings.
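
The if-exists variants can be sketched as follows. The paths dir, src and dst are hypothetical scratch paths, and the file contents are placeholders.

```scheme
(use file.util)

;; dir, src and dst are hypothetical scratch paths.
(define dir
  (build-path (temporary-directory)
              (format "copy-demo-~a" (sys-getpid))))
(make-directory* dir)
(define src (build-path dir "src.txt"))
(define dst (build-path dir "dst.txt"))

(with-output-to-file src (lambda () (display "version 1")))
(define first-copy (copy-file src dst))                 ; dst is new
(define skipped    (copy-file src dst :if-exists #f))   ; dst exists

;; Change the source, then copy again while keeping the old destination
;; under the default backup suffix ".orig".
(with-output-to-file src (lambda () (display "version 2")))
(copy-file src dst :if-exists :backup)

(define current  (file->string dst))
(define previous (file->string (string-append dst ".orig")))
(print first-copy " " skipped " " current " / " previous)

(remove-directory* dir)
```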

Function: move-file src dst :key if-exists backup-suffix

Moves file src to dst. The source src must exist. The behavior when dst exists varies by the keyword argument if-exists, as follows.

:error

(Default) Signals an error when dst exists.

:supersede

Replaces dst with src.

:backup

Keeps dst by renaming it.

#f

Doesn’t move and returns #f when dst exists.

Move-file returns #t after completion.

If if-exists is :backup, the keyword argument backup-suffix specifies the suffix attached to the dst to be renamed. The default value is ".orig".

The files src and dst can be on different filesystems. In such a case, move-file first copies src to a temporary file in the same directory as dst, then renames it to dst, then removes src.

Function: remove-file filename
Function: delete-file filename

[R7RS] Removes the named file. An error is signaled if filename does not exist, is a directory, or cannot be deleted for other reasons such as permissions. R7RS defines delete-file.

Compare with sys-unlink (see Directory manipulation), which doesn’t raise an error when the named file doesn’t exist.

Function: remove-files paths
Function: delete-files paths

Removes each path in a list paths. If the path is a file, it is unlinked. If it is a directory, its contents are recursively removed by remove-directory*. If the path doesn’t exist, it is simply ignored.

delete-files is just an alias of remove-files.

Function: file->string filename options …
Function: file->list reader filename options …
Function: file->string-list filename options …
Function: file->sexp-list filename options …

Convenience procedures to read from a file filename. They first open the named file, then call port->string, port->list, port->string-list and port->sexp-list on the opened file, respectively (see Input utility functions). The file is closed when all the content has been read or when an error is signaled during reading.

Those procedures take the same keyword arguments as call-with-input-file. When the named file doesn’t exist, the behavior depends on :if-does-not-exist keyword argument—an error is signaled if it is :error, and #f is returned if the argument is #f.
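
A short sketch of three of these readers on the same file. The paths data and absent are hypothetical, and the file content is a placeholder.

```scheme
(use file.util)

;; data and absent are hypothetical scratch paths.
(define data
  (build-path (temporary-directory)
              (format "read-demo-~a.txt" (sys-getpid))))
(define absent (string-append data ".absent"))

(with-output-to-file data
  (lambda () (display "(a b)\n42\n")))

(define as-string (file->string data))        ; whole content as one string
(define as-lines  (file->string-list data))   ; one string per line
(define as-sexps  (file->sexp-list data))     ; one datum per read
(define nothing                               ; #f instead of an error
  (file->string absent :if-does-not-exist #f))

(write as-lines) (newline)
(write as-sexps) (newline)

(remove-file data)
```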



12.23.5 Lock files

Exclusivity of creating files or directories is often used for inter-process locking. The following procedure provides a packaged interface for it.

Function: with-lock-file lock-name thunk :key type retry-interval retry-limit secondary-lock-name retry2-interval retry2-limit perms abandon-timeout

Exclusively creates a file or a directory (lock file) with lock-name, then executes thunk. After thunk returns, or an error is thrown in it, the lock file is removed. When thunk returns normally, its return values become the return values of with-lock-file.

If the lock file already exists, with-lock-file waits and retries getting the lock until the timeout is reached. This can be configured by the keyword arguments.

There’s a chance that with-lock-file leaves the lock file behind when it runs into a serious error and doesn’t have the opportunity to clean up. You can allow with-lock-file to steal the lock if its timestamp is too old; say, if you know that the application usually holds the lock for just seconds, and you find the lock file is 10 minutes old, then it’s likely that the previous process was terminated abruptly and couldn’t clean it up. You can also configure this behavior by the keyword arguments.

Internally, two lock files are used to implement this stealing behavior safely. The creation and removal of the primary lock file (named by the lock-name argument) are guarded by the secondary lock file (named by the secondary-lock-name argument, which defaults to lock-name with the suffix .2 attached). The secondary lock prevents more than one process from stealing the same primary lock file simultaneously.

The secondary lock is acquired for a very short period so there’s much less chance to be left behind by abnormal terminations. If it happens, however, we just give up; we don’t steal the secondary lock.

If with-lock-file couldn’t get a lock before timeout, a <lock-file-failure> condition is thrown.

Here’s a list of keyword arguments.

type

It can be either one of the symbols file or directory.

If it is file, we use a lock file, relying on the O_EXCL exclusive creation flag of open(2). This is the default value. It works on most platforms; however, some NFS implementations may not implement the exclusive semantics properly.

If it is directory, we use a lock directory, relying on the atomicity of mkdir(2). It should work on any platform, but it may be slower than file.

retry-interval
retry-limit

Accepts a nonnegative real number that specifies either the interval to attempt to acquire the primary lock, or the maximum time we should keep retrying, respectively, in seconds. The default value is 1 second interval and 10 second limit. To prevent retrying, give 0 to retry-limit.

secondary-lock-name

The name of the secondary lock file (or directory). If omitted, lock-name with the suffix .2 attached is used. Note: The secondary lock name must be agreed on by all programs that lock the same (primary) lock file. I recommend leaving this at the default unless there’s a good reason to do otherwise.

retry2-interval
retry2-limit

Like retry-interval and retry-limit, but these specify interval and timeout for the secondary lock file. The possibility of secondary lock file collision is usually pretty low, so you would hardly need to tweak these. The default values are 1 second interval and 10 second limit.

perms

Specifies the permission bitmask of the lock file or directory, as a nonnegative exact integer. The default is #o644 for a lock file and #o755 for a lock directory.

Note that to control who can acquire/release/steal the lock, what matters is the permission of the directory in which the lock file/directory resides, not the permission of the lock file/directory itself.

abandon-timeout

Specifies the period in seconds, as a nonnegative real number. If the primary lock file is older than that, with-lock-file steals the lock. To prevent stealing, give #f to this argument. The default value is 600 seconds.

Condition type: <lock-file-failure>

A condition indicating that with-lock-file couldn’t obtain the lock. Inherits <error>.

Instance Variable of <lock-file-failure>: lock-file-name

The primary lock file name.
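
The following sketch shows a successful lock and a failed attempt within one process. The lock name lock and the helper try-lock are hypothetical, and giving 0 to retry-limit makes the second attempt give up at once.

```scheme
(use file.util)

;; lock is a hypothetical lock file name.
(define lock
  (build-path (temporary-directory)
              (format "lock-demo-~a.lock" (sys-getpid))))

;; Try to take the lock once; give up immediately instead of retrying.
;; try-lock is a helper made up for this sketch.
(define (try-lock thunk)
  (guard (e [(condition-has-type? e <lock-file-failure>) 'busy])
    (with-lock-file lock thunk :retry-limit 0)))

;; The outer call holds the lock, so the nested attempt fails.
(define results
  (try-lock
   (lambda ()
     (list (file-exists? lock)                   ; lock file is present
           (try-lock (lambda () 'acquired))))))  ; second attempt gives up

(print results)
(define released? (not (file-exists? lock)))     ; removed afterwards
(print released?)
```

Catching <lock-file-failure> at the call site lets the caller decide whether to skip the work, report the contention, or try again later.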

Gauche also provides an OS-supported file locking feature, fcntl lock, via the gauche.fcntl module. Whether you want to use fcntl lock or with-lock-file will depend on your application.

These are the advantages of the fcntl lock:

In common situations, probably the handiest property is the first one; you don’t need to worry about leaving a lock behind after an unexpected process termination.

However, there are a couple of shortcomings in fcntl locks.

Especially because of the second point, it is very difficult to use fcntl lock unless you have total control over and knowledge of the entire application. It is OK for application code to use the fcntl lock to lock an application-specific file. It is difficult for library developers, however, to make sure that no potential user of the library will try to lock the same file as the library tries to lock (usually it’s impossible).

