file.util - Filesystem utilities
Provides convenient utility functions for handling files and directories. These functions are built on top of the primitive system procedures described in Filesystems.
Many procedures in this module take a keyword argument follow-link?, which specifies the behavior when the procedure sees a symbolic link. If a true value is given to follow-link? (which is the default), the procedure operates on the file referenced by the link; if a false value is given, it operates on the link itself.
Note on the naming convention: Some Scheme implementations "create" new directories and files, while the others "make" them. Some implementations "delete" them, while the others "remove" them. It seems that both conventions are equally popular. So Gauche provides both.
• Directory utilities
• Pathname utilities
• File attribute utilities
• File operations
• Temporary files and directories
• Lock files
{file.util
}
When called with no argument, this returns the pathname of the current
working directory. When called with a string argument new-directory,
this sets the current working directory of the process to it.
If the process can’t change directory to new-directory, an error is
signaled.
This function is in ChezScheme, MzScheme and some other Scheme implementations.
SRFI-170 defines current-directory without arguments, to return
the current working directory.
{file.util
}
Returns the home directory of the given user,
which may be a string user name or an integer user id.
If user is omitted, the current user is assumed.
If the given user cannot be found, or the home directory
of the user cannot be determined, #f is returned.
On Windows native platforms, this function is only supported to query the current user’s directory.
{file.util
}
Returns a list of entries in the directory path.
The result is sorted in dictionary order.
By default, only the basename (the last component) of each entry is
returned. If add-path? is given and true, path is prepended
to each entry. If children? is given and true, "." and ".."
are excluded from the result.
If filter is given, it must be a predicate that takes one argument. It is called on every entry of the directory, and only the entries on which filter returns true are included in the result. The argument passed to filter is the basename of the directory entry by default, but when filter-add-path? is true, path is prepended to the entry.
If path is not a directory, an error is signaled.
(directory-list "test")
⇒ ("." ".." "test.scm" "test.scm~")
(directory-list "test" :add-path? #t)
⇒ ("test/." "test/.." "test/test.scm" "test/test.scm~")
(directory-list "test" :children? #t)
⇒ ("test.scm" "test.scm~")
(directory-list "test" :children? #t :add-path? #t
                :filter (lambda (e) (not (string-suffix? "~" e))))
⇒ ("test/test.scm")
{file.util
}
Like directory-list, but returns two values; the first one is a list
of subdirectories, and the second one is a list of the rest.
The keyword arguments children?, add-path? and filter
are the same as in directory-list.
Giving a false value to follow-link? makes directory-list2
not follow symbolic links; if the path contains a
symlink to a directory,
it will be included in the first list if follow-link?
is omitted or true,
while it will be in the second list if follow-link? is false.
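As a minimal sketch of how the two return values might be consumed, the following uses receive to bind both lists. The helper name count-entries is hypothetical and is not part of file.util.

```scheme
(use file.util)

;; Hypothetical helper: returns a two-element list holding the number of
;; subdirectories and the number of other entries directly under PATH.
(define (count-entries path)
  (receive (dirs others) (directory-list2 path :children? #t)
    (list (length dirs) (length others))))
```

Passing :children? #t keeps "." and ".." out of the first list, so only real subdirectories are counted.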
{file.util
}
A fundamental directory traverser.
Conceptually it works as follows, in a recursive way.
• If path is not a directory, calls (proc path seed)
and returns the result.
• If path is a directory, calls (lister path seed).
The procedure lister
is expected to return two values: a list of pathnames, and the
next seed value. Then directory-fold is called on each returned pathname,
passing the returned seed value to the seed argument of the
next call of directory-fold.
Returns the seed value that results from the last call.
The default procedure of lister is just a call to directory-list,
as follows.
(lambda (path seed)
  (values (directory-list path :add-path? #t :children? #t)
          seed))
Note that lister shouldn’t return the given path itself (".")
nor the parent directory (".."), or the recursion wouldn’t
terminate. Also note that lister is expected to return paths accessible
from the current directory, i.e. if path is "/usr/lib/foo" and
it contains "libfoo.a" and "libfoo.so", lister should
return '("/usr/lib/foo/libfoo.a" "/usr/lib/foo/libfoo.so").
The keyword argument follow-link? is used to determine whether lister should be called on a symbolic link pointing to a directory. When follow-link? is true (default), lister is called with the symbolic link if it points to a directory. When follow-link? is false, the link is treated like any non-directory entry, so proc is called on it instead.
The following example returns a list of pathnames of the emacs backup files (whose name ends with "~") under the given path.
(use srfi.13) ;; for string-suffix?
(directory-fold path
                (lambda (entry result)
                  (if (string-suffix? "~" entry)
                    (cons entry result)
                    result))
                '())
The following example lists all the files and directories under the given pathname. Note the use of lister argument to include the directory path itself in the result.
(directory-fold path cons '()
  :lister (lambda (path seed)
            (values (directory-list path :add-path? #t :children? #t)
                    (cons path seed))))
{file.util
}
Creates a directory name. If the intermediate directories on the path
don’t exist, they are also created
(like the mkdir -p command on Unix). If the directory
name already exists, these procedures do nothing.
Perm specifies the integer flag for the permission bits of the
directory.
{file.util
}
Deletes directory name and its content recursively
(like the rm -r command on Unix). Symbolic links are not
followed.
The keyword argument if-does-not-exist must be either
:error (default) or #f. If it is :error,
an error is signaled if name does not exist.
If it is #f, the procedure just returns in such a case.
An error is thrown when name exists but is not a directory.
You can use remove-files
to remove both files and directories.
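The sketch below exercises the behaviors described for these two families of procedures: creating intermediate directories, doing nothing when the directory already exists, and tolerating a missing directory on removal. The procedure name demo-mkdir-rmdir and the directory names are made up for illustration, and call-with-temporary-directory (described later in this section) only serves to provide a scratch area.

```scheme
(use file.util)

;; Hypothetical demo: returns a list of two booleans, telling whether the
;; nested directory existed after creation and whether "a" still exists
;; after removal.
(define (demo-mkdir-rmdir)
  (call-with-temporary-directory
    (^[dir]
      (let ([top  (build-path dir "a")]
            [deep (build-path dir "a" "b" "c")])
        (make-directory* deep)             ; creates a, a/b and a/b/c
        (make-directory* deep)             ; already exists: does nothing
        (let1 created? (file-is-directory? deep)
          (remove-directory* top)          ; removes a and everything below
          ;; top is gone now; #f makes the second call return quietly
          (remove-directory* top :if-does-not-exist #f)
          (list created? (file-exists? top)))))))
```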
{file.util
}
If src is a regular file, copies its content to dst, just like
copy-file does. If src is a directory, recursively
descends it and copies the file tree to dst. Basically
it mimics the behavior of the cp -r command.
If there are any symbolic links under src, the link itself
is copied instead of the file pointed to by it, unless a true value
is given to the follow-link? keyword argument,
i.e. the default value of follow-link? is #f.
(Note that this is the opposite of copy-file, in which
follow-link? is true by default.)
The meanings of the other keyword arguments are the same as in
copy-file. See the entry of copy-file
for the details.
{file.util
}
Creates a directory tree under dir according to spec.
This procedure is useful to set up certain directory hierarchy at once.
The spec argument is an S-expression with the following structure:
<spec> : <name>                            ; empty file
       | (<name> <option> ...)             ; empty file
       | (<name> <option> ... <string>)    ; file with content
       | (<name> <option> ... <procedure>) ; file with generated content
       | (<name> <option> ... (<spec> ...)) ; directory
<name> : string or symbol
<option> ... : keyword-value alternating list
With the first and second form of spec, an empty file is created with the given name. With the third form of spec, the string becomes the content of the file.
With the fourth form of spec, the procedure is called with the pathname as an argument, and output to the current output port within the procedure is written to the created file. The pathname is relative to the dir argument. At the time the procedure is called, its parent directory is already created.
The last form of spec creates a named directory, then creates its children recursively according to the specs.
With options you can control attributes of created files/directories. Currently the following options are recognized.
:mode mode
Takes integer as permission mode bits.
:owner uid
:group gid
Takes integer uid/gid of the owner/group of the file/directory. Calling process may need special privilege to change the owner and/or group.
:symlink path
This is only valid for a file spec, and it causes
create-directory-tree to create a named symbolic link
whose content is path.
{file.util
}
Checks if a directory hierarchy according to spec exists
under dir. Returns #t if it exists, or #f otherwise.
The format of spec is the same
as for create-directory-tree described above.
If spec contains options, the attributes of existing files/directories are also checked if they match the given options.
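To make the spec grammar more concrete, here is a small sketch that builds a tree in a scratch directory and then checks it with the same spec. The names demo-spec and demo-tree-ok?, and the file names inside the spec, are hypothetical.

```scheme
(use file.util)

;; Hypothetical layout: a directory "project" holding a file with content,
;; an empty file, and a subdirectory with one more file.
(define demo-spec
  '("project" (("README" "Sample project\n")    ; file with content
               ("empty.txt")                     ; empty file
               ("src" (("main.scm" "(print 1)\n"))))))

;; Creates the tree under a temporary directory, then verifies it.
(define (demo-tree-ok?)
  (call-with-temporary-directory
    (^[dir]
      (create-directory-tree dir demo-spec)
      (and (check-directory-tree dir demo-spec)
           (file-is-regular? (build-path dir "project" "src" "main.scm"))))))
```

No options such as :mode are used here, so only the existence and kind of each entry matter for the check.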
{file.util
}
Appends pathname components component … to the base-path.
This takes care of platform-specific pathname separators.
Base-path can be a directory name, #f, or a symbol cwd or cld.
If it is #f, "." is used.
If it is cwd, the current working directory
is used. If it is cld, the directory of the current loading
file is used
(current-load-path, see Loading Scheme file).
If there’s no current loading file (e.g. calling
from REPL), the current working directory is used instead.
Note that the current loading directory is looked up at the time
build-path is called. It may not be the directory where the
actual source file is, if build-path is called after loading
is completed.
Component can be a string, #f, or
a symbol same or up. The latter three
are interpreted as ".", ".", and "..", respectively.
They are appended to base-path.
(The symbols same and up
are taken from MzScheme.)
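A couple of illustrative calls follow. The directory names are invented, the variable names are hypothetical, and the separator in the comment assumes a Unix platform.

```scheme
(use file.util)

;; Joining plain string components; on Unix this gives
;; "/usr/local/share/myapp".
(define myapp-dir (build-path "/usr/local" "share" "myapp"))

;; Using the symbol cwd as base-path, per the description above, anchors
;; the result at the current working directory.
(define input-file (build-path 'cwd "data" "input.txt"))
```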
{file.util
}
Returns #t if path is absolute or relative, respectively.
{file.util
}
Expands the tilde notation of path if it contains one.
Otherwise, path is returned. This function does not
check whether path exists and/or is readable.
{file.util
}
Expands path like expand-path,
then resolves symbolic links for every component
of the path. If path does not exist, or contains a dangling link,
or contains an unreadable directory, an error is signaled.
{file.util
}
Removes ’up’ ("..") components and ’same’ (".") components
from path as much as possible.
This function does not access the filesystem.
{file.util
}
Returns three values; the directory part of path,
the basename without extension of path, and
the extension of path. If the pathname doesn’t have an extension,
the third value is #f. If the pathname ends with a directory
separator, the second and third values are #f. (Note: This treatment
of the trailing directory separator differs from
sys-dirname/sys-basename; those follow the popular shell
convention, which ignores trailing slashes.)
(decompose-path "/foo/bar/baz.scm") ⇒ "/foo/bar", "baz", "scm"
(decompose-path "/foo/bar/baz") ⇒ "/foo/bar", "baz", #f
(decompose-path "baz.scm") ⇒ ".", "baz", "scm"
(decompose-path "/baz.scm") ⇒ "/", "baz", "scm"
;; Boundary cases
(decompose-path "/foo/bar/baz.") ⇒ "/foo/bar", "baz", ""
(decompose-path "/foo/bar/.baz") ⇒ "/foo/bar", ".baz", #f
(decompose-path "/foo/bar.baz/") ⇒ "/foo/bar.baz", #f, #f
{file.util
}
Returns the extension of path,
and the pathname of path without the extension, respectively.
If path doesn’t have an extension, #f and path
are returned, respectively.
(path-extension "/foo/bar.c") ⇒ "c"
(path-sans-extension "/foo/bar.c") ⇒ "/foo/bar"
(path-extension "/foo/bar") ⇒ #f
(path-sans-extension "/foo/bar") ⇒ "/foo/bar"
{file.util
}
Returns a pathname in which the extension of path is replaced
by newext. If path doesn’t have an extension,
"." and newext are appended to path.
If newext is #f, it returns path without the extension.
(path-swap-extension "/foo/bar.c" "o") ⇒ "/foo/bar.o"
(path-swap-extension "/foo/bar.c" "") ⇒ "/foo/bar."
(path-swap-extension "/foo/bar.c" #f) ⇒ "/foo/bar"
(path-swap-extension "/foo/bar" "o") ⇒ "/foo/bar.o"
(path-swap-extension "/foo/bar" "") ⇒ "/foo/bar."
(path-swap-extension "/foo/bar" #f) ⇒ "/foo/bar"
{file.util
}
Looks for a file named name in the given list of pathnames
paths that also satisfies a predicate pred. If found,
the absolute pathname of the file is returned. Otherwise, #f
is returned.
If name is an absolute path, only the existence of name and whether it satisfies pred are checked.
The default value of paths is taken from the environment variable
PATH, and the default value of pred is file-is-executable?
(see File attribute utilities). That is, find-file-in-paths
searches for the named executable file in the command search paths
by default.
(find-file-in-paths "ls")
⇒ "/bin/ls"
;; example of searching user preference file of my application
(find-file-in-paths "userpref"
  :paths `(,(expand-path "~/.myapp")
           "/usr/local/share/myapp"
           "/usr/share/myapp")
  :pred file-is-readable?)
The extensions keyword argument may list alternative extensions added to name. For example, the following example searches not only notepad, but also notepad.exe and notepad.com, in the PATH. If an alternate name is found, the returned pathname contains the extension.
(find-file-in-paths "notepad" :extensions '("exe" "com"))
For each path, the name and the alternative names are checked in order.
That is,
if there are /bin/b.com and /usr/bin/b.exe and paths
is ("/bin" "/usr/bin"), you’ll get /bin/b.com when you
search b with extensions ("exe" "com").
{file.util
}
Returns the name of the null device.
On unix platforms (including cygwin) it returns "/dev/null",
and on Windows native platforms (including mingw) it returns "NUL".
{file.util
}
Returns the name of the console device.
On unix platforms (including cygwin) it returns "/dev/tty",
and on Windows native platforms (including mingw) it returns "CON".
This function does not guarantee the device is actually available to the calling process.
{file.util
}
These functions return the attribute of the file/directory specified by
path. The attribute name corresponds to the slot name of the
<sys-stat> class (see File stats).
If the named path doesn’t exist, #f is returned.
If path is a symbolic link, these functions query the attributes of the file pointed to by the link, unless an optional argument follow-link? is given and false.
MzScheme and Chicken have file-size. Chicken also has
file-modification-time, which is file-mtime.
{file.util
}
Returns #t if path exists and is readable/writable/executable
by the current effective user, respectively.
This API is taken from STk.
{file.util
}
Returns #t if path exists and is a symbolic link.
See also file-is-regular? and file-is-directory? in
File stats.
{file.util
}
Compares two files specified by path1 and path2.
file-eq? and file-eqv? check if path1 and path2
refer to the identical file, that is, whether they are on the same
device and have the identical inode number. The only difference is that
when the last component of path1 and/or path2 is a symbolic
link, file-eq? doesn’t resolve the link (so it compares the links
themselves) while file-eqv? resolves the link and compares the
files referred to by the link(s).
file-equal? compares path1 and path2 considering their
content, that is, when the two are not the identical file in the sense of
file-eqv?, file-equal? compares their content and returns
#t if all the bytes match.
The behavior of file-equal? is undefined
when path1 and path2 are both directories.
Later, it may be extended to scan the directory contents.
{file.util
}
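Compares file modification time stamps. There are a bunch of methods defined,
so each argument can be any one of the following.
• A string pathname, as in the first example below. The mtime of the named file is used.
• A <sys-stat> object (see File stats).
The mtime is taken from the stat structure.
• A <time> object. The time is used as the mtime.
• A real number, as in the second example below. It is used as the mtime, in the same unit as the value of sys-time.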
;; see if "foo.c" is newer than "foo.o"
(file-mtime>? "foo.c" "foo.o")
;; see if "foo.log" is updated within last 24 hours
(file-mtime>? "foo.log" (- (sys-time) 86400))
{file.util
}
Same as file-mtime=?, except these check the file’s change time
and access time, respectively.
All the variants of <, <=, >, >= are also
defined.
{file.util
}
Updates the timestamp of path, or of each path in the list paths,
to the current time. If the specified path
doesn’t exist, a new file with size zero is created, unless
the keyword argument create is #f.
If the keyword argument time is given and not #f, it
must be a nonnegative real number. It is used as the timestamp value
instead of the current time.
The keyword argument type can be #f (default), or a symbol
atime or mtime. If it is a symbol, only the access time
or modification time is updated.
Note: touch-files processes one file at a time, so the timestamp
of each file may not be exactly the same.
These procedures are built on top of the system call
sys-utime (see File stats).
{file.util
}
Copies file from src to dst. The source file src must exist.
The behavior when the destination dst exists depends on the keyword
argument if-exists:
:error
(Default) Signals an error when dst exists.
:supersede
Replaces dst with a copy of src.
:backup
Keeps dst by renaming it.
:append
Appends the content of src to the end of dst.
#f
Doesn’t copy and returns #f when dst exists.
Copy-file returns #t after completion.
If src is a symbolic link, copy-file follows the
symlink and copies the actual content by default. An error
is raised if src is a dangling symlink.
Giving #f to the keyword argument follow-link?
makes copy-file copy the link itself.
In this case, src is allowed to be a dangling symlink.
If if-exists is :backup, the keyword argument backup-suffix
specifies the suffix attached to the name of dst when it is renamed.
The default value is ".orig".
By default, copy-file starts copying to dst directly.
However, if the keyword argument safe is a true value,
it copies the file to a temporary file in the same directory as dst,
then renames it to dst when the copy is completed.
(When safe is true and if-exists is :append,
it first copies the content of dst to a temporary file if dst
exists, appends the content of src, then renames the result to dst.)
If the copy is interrupted for some reason, the filesystem is "rolled back"
properly.
If the keyword argument keep-timestamp is true, copy-file
sets the destination’s timestamp to the same as the source’s timestamp
after copying.
If the keyword argument keep-mode is true, the destination file’s
permission bits are set to the same as the source file’s. If it is false
(default), the destination file’s permission remains the same if
the destination already exists and the safe argument is false;
otherwise it becomes #o666 masked by the umask setting.
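As an illustration of how the keyword arguments combine, the sketch below overwrites an existing file while keeping the old version under a custom suffix. The procedure name demo-copy-with-backup, the file names and the ".bak" suffix are hypothetical, and the scratch files are created with string->file, described later in this section.

```scheme
(use file.util)

;; Hypothetical demo: returns a list with the content of the destination
;; and of its backup after the copy.
(define (demo-copy-with-backup)
  (call-with-temporary-directory
    (^[dir]
      (let ([src (build-path dir "src.txt")]
            [dst (build-path dir "dst.txt")])
        (string->file src "new\n")
        (string->file dst "old\n")
        ;; dst exists, so it is renamed to dst.txt.bak before being replaced;
        ;; :safe #t copies through a temporary file in dst's directory.
        (copy-file src dst :if-exists :backup :backup-suffix ".bak" :safe #t)
        (list (file->string dst)
              (file->string (string-append dst ".bak")))))))
```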
{file.util
}
Moves file src to dst. The source src must exist.
The behavior when dst exists depends on the keyword argument
if-exists, as follows.
:error
(Default) Signals an error when dst exists.
:supersede
Replaces dst by src.
:backup
Keeps dst by renaming it.
#f
Doesn’t move and returns #f when dst exists.
Move-file returns #t after completion.
If if-exists is :backup, the keyword argument backup-suffix
specifies the suffix attached to the name of dst when it is renamed.
The default value is ".orig".
The files src and dst can be on different filesystems.
In such a case, move-file first copies src to a
temporary file in the same directory as dst, then renames
it to dst, then removes src.
[R7RS file]
{file.util
}
Removes the named file. An error is signaled if filename
does not exist, is a directory, or cannot be deleted for other
reasons such as permissions.
R7RS defines delete-file.
Compare with sys-unlink (see Directory manipulation),
which doesn’t raise an error when the named file doesn’t exist.
{file.util
}
Removes each path in a list paths. If the path is
a file, it is unlinked. If it is a directory,
its contents are recursively removed by remove-directory*.
If the path doesn’t exist, it is simply ignored.
delete-files is just an alias of remove-files.
{file.util
}
Convenience procedures to read from a file filename.
They first open the named file, then call port->string,
port->list, port->string-list and port->sexp-list
on the opened file, respectively (see Input utility functions).
The file is closed when all the content is read or an error is
signaled during reading.
These procedures take the same keyword arguments as
call-with-input-file.
When the named file doesn’t exist, the behavior depends on the
:if-does-not-exist keyword argument: an error is signaled
if it is :error, and #f is returned if the argument is
#f.
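For instance, the :if-does-not-exist behavior makes it easy to treat a missing file as empty. This is only a sketch; the helper name read-lines-or-empty is hypothetical.

```scheme
(use file.util)

;; Hypothetical helper: returns the lines of FILENAME as a list of strings,
;; or an empty list when the file does not exist.
(define (read-lines-or-empty filename)
  (or (file->string-list filename :if-does-not-exist #f)
      '()))
```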
{file.util
}
Opposite of file->string etc. They are convenient
for quickly writing things out into a file.
NB: The name string->file etc. might suggest they would take the
object to be written as the first argument. We decided to put filename
first, since in the situations where these procedures are used,
it is more likely that one wants to write literal data, which would be
bigger than the filename itself.
The options part is passed to call-with-output-file as is.
For example, the following code appends the text when foo.txt
already exists:
(string->file "foo.txt" "New text to append\n" :if-exists :append)
The list->file procedure takes a writer argument, which is a procedure
that receives two arguments, an element from the list lis, and an
output port. It should write out the element to the port in a suitable
way. The string-list->file and sexp-list->file procedures are
specialized versions of list->file, where string-list->file
uses (^[s p] (display s p) (newline p)) as writer,
and sexp-list->file uses
(^[s p] (write s p) (newline p)) as writer.
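The following sketch writes a list of S-expressions with sexp-list->file and reads it back with file->sexp-list. The helper name roundtrip-sexps is hypothetical, and call-with-temporary-file (described later in this section) is used only to obtain a scratch file name.

```scheme
(use file.util)

;; Hypothetical helper: writes DATA, one datum per line, to a scratch file
;; and returns what is read back from it.
(define (roundtrip-sexps data)
  (call-with-temporary-file
    (^[port name]
      (close-output-port port)        ; the file is written through its name
      (sexp-list->file name data)     ; filename first, then the list
      (file->sexp-list name))))
```

Since the elements are written with write, data made of symbols, numbers, strings and lists should read back as equal? to the original.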
{file.util
}
A parameter that keeps the name of the directory that can be used
to create temporary files. The default value is
the one returned from sys-tmpdir (see Pathnames).
The difference from sys-tmpdir is that, since this is a parameter,
it can be overridden by the application during execution.
Libraries are recommended to use this instead of sys-tmpdir
for greater flexibility.
{file.util
}
Creates a temporary file with a unique name and opens it for output,
then calls proc with the output port and the temporary file’s name.
The temporary file is removed after either proc returns
or raises an uncaught error.
Returns the value(s) proc returns.
The temporary file is created in the directory directory,
with the name prefix followed by several random alphanumeric characters.
When omitted, the value of (temporary-directory) is used
for directory, and "gtemp" for prefix.
The name passed to proc consists of directory and the file’s name. So whether the name is an absolute or a relative pathname depends on the value of directory.
(call-with-temporary-file (^[_ name] name))
⇒ Something like "/tmp/gtemp4dSpMh"
You can keep the output file by renaming it in proc. But if doing so, make sure to specify directory so that the temporary file is created in the same directory as the final output; rename may not work across filesystems. If you anticipate your code runs on Windows as well, make sure to close the output port before renaming. Windows does not allow you to rename an opened file.
Internally, it calls sys-mkstemp to create a unique file.
See Directory manipulation, for the details.
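The renaming technique mentioned above could look like the following sketch. The helper name write-file-atomically and the "atomic" prefix are hypothetical, and directory and prefix are assumed here to be passed as keyword arguments.

```scheme
(use file.util)

;; Hypothetical helper: writes CONTENT to FINAL-PATH by way of a temporary
;; file, so that readers never observe a partially written file.
(define (write-file-atomically final-path content)
  (call-with-temporary-file
    (^[port tmp-name]
      (display content port)
      (close-output-port port)          ; close before renaming (Windows)
      (sys-rename tmp-name final-path))
    ;; Create the temporary file next to the final one, so that the
    ;; rename stays within one filesystem.
    :directory (sys-dirname final-path)
    :prefix "atomic"))
```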
{file.util
}
Creates a temporary directory with a unique name,
then calls proc with the name.
The temporary directory and its contents are removed
after either proc returns
or raises an uncaught error.
Returns the value(s) proc returns.
The temporary directory is created in the directory directory,
with the name prefix followed by several random alphanumeric characters.
When omitted, the value of (temporary-directory) is used
for directory, and "gtemp" for prefix.
The name passed to proc consists of directory and the directory name. So whether the name is absolute or relative pathname depends on the value of directory.
Internally, it calls sys-mkdtemp to create a unique directory.
See Directory manipulation, for the details.
Exclusivity of creating files or directories is often used for inter-process locking. The following procedure provides a packaged interface for it.
{file.util
}
Exclusively creates a file or a directory (lock file)
with lock-name, then executes thunk.
After thunk returns, or an error is thrown in it,
the lock file is removed. When thunk returns normally,
its return values become the return values of with-lock-file.
If the lock file already exists, with-lock-file waits and retries
getting the lock until the timeout is reached. It can be configured by
the keyword arguments.
There’s a chance that with-lock-file leaves the lock file behind
when it gets into a serious error situation and doesn’t have the opportunity
to clean up. You can allow with-lock-file to steal
the lock if its timestamp is too old; say, if you know that the
applications usually lock just for seconds, and you find the lock
file is 10 minutes old, then it’s likely that the previous
process was terminated abruptly and couldn’t clean it up.
You can also configure this behavior by the keyword arguments.
Internally, two lock files are used to implement this
stealing behavior safely. The creation and removal of the primary
lock file (named by lock-name argument) are guarded by
the secondary lock file (named by secondary-lock-file argument,
defaulted to lock-name with a .2 suffix attached).
The secondary lock prevents more than one process from stealing
the same primary lock file simultaneously.
The secondary lock is acquired for a very short period so there’s much less chance to be left behind by abnormal terminations. If it happens, however, we just give up; we don’t steal the secondary lock.
If with-lock-file couldn’t get a lock before timeout,
a <lock-file-failure> condition is thrown.
Here’s a list of keyword arguments.
It can be either one of the symbols file or directory.
If it is file, we use a lock file, relying on the O_EXCL
exclusive creation flag of open(2).
This is the default value.
It works for most platforms;
however, some NFS implementations may not implement the exclusive
semantics properly.
If it is directory, we use a lock directory, relying on the
atomicity of mkdir(2). It should work on any platform,
but it may be slower than file.
Accepts a nonnegative real number that specifies either the interval to attempt to acquire the primary lock, or the maximum time we should keep retrying, respectively, in seconds. The default value is 1 second interval and 10 second limit. To prevent retrying, give 0 to retry-limit.
The name of the secondary lock file (or directory). If omitted,
lock-name with a suffix .2 attached is used.
Note: The secondary lock name must be agreed on by all programs that
lock the same (primary) lock file. I recommend leaving this
at the default unless there’s a good reason to do otherwise.
Like retry-interval and retry-limit, but these specify interval and timeout for the secondary lock file. The possibility of secondary lock file collision is usually pretty low, so you would hardly need to tweak these. The default values are 1 second interval and 10 second limit.
Specifies the permission bitmask of the lock file or directory,
as a nonnegative exact integer. The default is #o644 for
a lock file and #o755 for a lock directory.
Note that to control who can acquire/release/steal the lock, what matters is the permission of the directory in which the lock file/directory resides, not the permission of the lock file/directory itself.
Specifies the period in seconds, as a nonnegative real number.
If the primary lock file is
older than that, with-lock-file steals the lock.
To prevent stealing, give #f to this argument.
The default value is 600 seconds.
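As a sketch of typical usage, the helper below runs a thunk under a lock with a shortened retry schedule, and turns a <lock-file-failure> into a #f result instead of an error. The name try-with-lock is hypothetical, and only the retry-interval and retry-limit keyword arguments named above are used.

```scheme
(use file.util)

;; Hypothetical helper: calls THUNK while holding the lock LOCK-NAME and
;; returns its value, or returns #f if the lock can't be obtained in time.
(define (try-with-lock lock-name thunk)
  (guard (e [(condition-has-type? e <lock-file-failure>)
             (format (current-error-port) "could not lock ~a\n"
                     (condition-ref e 'lock-file-name))
             #f])
    (with-lock-file lock-name thunk
                    :retry-interval 0.2   ; seconds between attempts
                    :retry-limit 2)))     ; give up after about 2 seconds
```

Errors raised by thunk itself are not caught by this guard clause, so they propagate to the caller after the lock file is removed.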
{file.util
}
A condition indicating that with-lock-file couldn’t
obtain the lock. Inherits <error>.
<lock-file-failure>: lock-file-name
The primary lock file name.
Gauche also provides an OS-supported file locking feature, the
fcntl lock, via the gauche.fcntl module.
Whether you want to use the fcntl lock or with-lock-file
will depend on your application.
These are the advantages of the fcntl lock:
In common situations, probably the most handy property is the first one; you don’t need to worry about leaving a lock behind after unexpected process termination.
However, there are a couple of shortcomings in fcntl locks.
Especially because of the second point, it is very difficult
to use the fcntl lock unless you have total control over and knowledge
of the entire application.
It is ok for application code to use the fcntl lock to lock
an application-specific file.
Library developers, however, have difficulty making sure that no potential
user of the library will try to lock the same file as the library tries
to lock (usually it’s impossible).