You can use characters other than us-ascii not only in
literal strings and characters, but also in comments, symbol names,
literal regular expressions, and so on.
By default, Gauche assumes a Scheme program is written in its internal character encoding. That is fine as long as you write scripts only for your own environment, but it becomes a problem when somebody else tries to use your script and their environment uses a different character encoding from yours.
So, if Gauche finds a comment like the following within
the first two lines of the program source, it assumes the rest of
the source code is written in <encoding-name>, and performs
the appropriate character encoding conversion when reading the source code:
;; coding: <encoding-name>
More precisely, a comment on either the first or the second line that matches
the regular expression #/coding[:=]\s*([\w.-]+)/ is recognized,
and the first submatch is taken as the encoding name.
If there are multiple matches, only the first one takes effect.
The first two lines must not contain characters other than us-ascii
in order for this mechanism to work.
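As an illustration only, the matching rule can be tried out at the REPL with the regular expression quoted above. The helper name coding-name is hypothetical and not part of Gauche's API, and this sketch looks at a single line of text; it does not check that the match sits inside a comment or that the line is one of the first two, both of which the real reader takes care of.

```scheme
;; Hypothetical helper (not part of Gauche): returns the encoding name
;; declared in LINE, or #f if LINE does not match the magic comment regexp.
(define (coding-name line)
  (let ((m (rxmatch #/coding[:=]\s*([\w.-]+)/ line)))
    ;; The first submatch holds the encoding name.
    (and m (rxmatch-substring m 1))))

(coding-name ";; -*- coding: euc-jp -*-")  ; => "euc-jp"
(coding-name ";; coding=utf-8")            ; => "utf-8" ("=" works like ":")
(coding-name ";; an ordinary comment")     ; => #f
```

Since rxmatch returns the leftmost match, a line containing several declarations yields only the first encoding name, which agrees with the rule that only the first match is effective.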
The following example tells Gauche that the script is written
in EUC-JP encoding. Note that the "-*-" strings around the coding
declaration are recognized by Emacs, which uses them to select the buffer’s encoding.
#!/usr/bin/gosh
;; -*- coding: euc-jp -*-

... script written in euc-jp ...
Internally, the handling of this magic comment is done by a special type of port. See Coding-aware ports for the details. See also Loading Scheme file for how to disable this feature.