You can use characters other than us-ascii not only in
literal strings and characters, but also in comments, symbol names,
literal regular expressions, and so on.
By default, Gauche assumes a Scheme program is written in utf-8. If you need to write source code in an encoding other than utf-8, you can add a “magic comment” near the beginning of the source code.
When Gauche finds a comment like the following within
the first two lines of the program source, it assumes the rest of
the source code is written in <encoding-name>, and does
the appropriate character encoding conversion to read the source code:
;; coding: <encoding-name>
More precisely, a comment in either the first or the second line that matches
the regular expression #/coding[:=]\s*([\w.-]+)/ is recognized,
and the first submatch is taken as the encoding name.
If there are multiple matches, only the first one is effective.
The first two lines must not contain characters other than us-ascii
in order for this mechanism to work.
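The matching rule can be tried out at the REPL. The following is only a sketch that applies the documented regular expression to a single line of text; it does not check that the text is a comment, and the helper name coding-name is hypothetical, not something Gauche provides.

```scheme
;; Sketch only: applies the documented pattern to one line of text.
;; `coding-name` is a hypothetical helper, not part of Gauche.
(define (coding-name line)
  (cond [(rxmatch #/coding[:=]\s*([\w.-]+)/ line)
         ;; The first submatch is taken as the encoding name.
         => (lambda (m) (rxmatch-substring m 1))]
        [else #f]))

(coding-name ";; -*- coding: euc-jp -*-")       ; => "euc-jp"
(coding-name ";; coding=utf-8")                 ; => "utf-8"
(coding-name ";; coding: sjis coding: utf-8")   ; => "sjis" (leftmost match)
(coding-name ";; just a comment")               ; => #f
```

Both ":" and "=" are accepted after the word coding, and whitespace before the encoding name is optional.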
The following example tells Gauche that the script is written
in EUC-JP encoding. Note that the "-*-" strings around the coding
declaration are recognized by Emacs, which uses them to select
the buffer’s encoding appropriately.
#!/usr/bin/gosh
;; -*- coding: euc-jp -*-

... script written in euc-jp ...
Internally, the handling of this magic comment is done by a special type of port. See Coding-aware ports for the details. See also Loading Scheme file for how to disable this feature.
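As a rough illustration of that port type, the sketch below wraps an ordinary input port with open-coding-aware-port and reads one form through it. It assumes a Gauche build whose native encoding is utf-8, so no real conversion takes place; the variable names src, in and form are chosen for this example only.

```scheme
;; Sketch only: read a form through a coding-aware port.
;; The source text is us-ascii and declares utf-8, so the data
;; passes through unchanged on a utf-8 build of Gauche.
(define src ";; coding: utf-8\n(display \"hello\")\n")

;; Wrap a plain input port; the magic comment in the first two
;; lines is examined as data is read through the wrapper.
(define in (open-coding-aware-port (open-input-string src)))

;; `read` skips the comment line and returns the first datum.
(define form (read in))
```

With a different encoding in the magic comment, the same wrapper would be expected to convert the remaining input as it is read, provided the required conversion support is available.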