Gauche extends the R7RS Scheme lexical syntax in several ways. In addition, for historical reasons, a few of the default lexical conventions conflict with the R7RS specification. You can set a reader mode to make the reader R7RS compliant.
Tokens beginning with #! may have special meanings to the reader. R7RS defines two such directives, #!fold-case and #!no-fold-case, which switch whether symbols are read in case-folding or non-case-folding mode, respectively. See Hash-bang token below for all the directives Gauche supports.
Gauche adopts the R6RS syntax that treats [] the same as (). Both kinds of brackets are equivalent, but an opening bracket must be closed by a bracket of the same kind. Some seasoned Lispers may frown on them, but they help visually distinguish the different roles of parentheses. A general convention is to use [] for groupings other than function and macro application. If such groupings nest, however, use () for the outer groupings. Examples:
(cond [(test1 x) (y z)]
      [(test2 x) (s t)]
      [else (u v)])

(let ([x (foo a b)]
      [y (bar c d)])
  (baz x y))
This convention is purely optional; you don’t need to use [] if you don’t like it. R7RS doesn’t adopt this syntax and leaves [] for extensions, so it is safe to stick to () in portable R7RS programs. (If the reader is in strict-r7 mode, an error is signalled when [] is used. See Reader lexical mode, for the details.)
Scheme-specific modes of some editors (e.g. Quack on Emacs) allow you to type just ) and insert either ] or ) depending on which kind of bracket is being closed. We recommend using such a mode if you follow this convention.
Symbol names are case sensitive by default (see Case-sensitivity). A symbol name can begin with a digit, ’+’ or ’-’, as long as the entire token doesn’t constitute valid number syntax. Other unusual characters can be included in a symbol name by surrounding it with ’|’, e.g. ’|this is a symbol|’. See Symbols, for details.
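As an illustration of the symbol rules above, the following sketch checks a few tokens at the default reader settings. The particular tokens (1+, -, -1) are chosen only for illustration.

```scheme
;; A token that starts like a number but is not valid number syntax
;; is read as a symbol.
(print (symbol? '1+))   ; prints #t
(print (symbol? '-))    ; prints #t

;; A token that is valid number syntax is read as a number.
(print (number? '-1))   ; prints #t

;; Vertical bars allow spaces and other unusual characters in a name.
(print (symbol->string '|this is a symbol|))  ; prints this is a symbol
```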
Either the integral part or the fractional part of an inexact real number can be omitted if it is zero; for example, 30., .25, and -.4 are read as real numbers.
The number reader recognizes ’#’ as insignificant digits.
Complex numbers can be written both in the rectangular format (e.g. 1+0.3i) and in the polar format (e.g. 3.0@1.57).
Inexact real numbers include positive infinity, negative infinity, and NaN, which are represented as +inf.0, -inf.0, and +nan.0, respectively. (-nan.0 is also read as NaN.)
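The literal forms above can be checked with a short sketch like the following, which uses only standard numeric predicates.

```scheme
;; An omitted integral or fractional part is taken as zero.
(print (= 30. 30.0))         ; prints #t
(print (= .25 0.25))         ; prints #t
(print (= -.4 -0.4))         ; prints #t
(print (exact? 30.))         ; prints #f (the literal is inexact)

;; Rectangular complex literal.
(print (real-part 1+0.3i))   ; prints 1.0
(print (imag-part 1+0.3i))   ; prints 0.3

;; Infinities and NaN.
(print (infinite? +inf.0))   ; prints #t
(print (infinite? -inf.0))   ; prints #t
(print (nan? +nan.0))        ; prints #t
(print (nan? -nan.0))        ; prints #t
```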
Gauche supports SRFI-169 (underscores in numbers) notation: you can insert an underscore (_) between digits in numeric literals to improve readability, e.g. #b1100_1010_1111_1110. A valid underscore must be surrounded by digits; 1_2_3 is ok, but _123, 123_, and 12__3 are not.
Gauche also adopts Common Lisp-style radix-prefixed numeric literals, e.g. #3r120 (120 in base 3, which is 15 in decimal). Radixes between 2 and 36 are recognized; the alphabetic letters a-z and A-Z are used for digits beyond 9.
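A small sketch of the underscore and radix-prefix notations, assuming a Gauche version recent enough to support both. The specific literals are illustrative.

```scheme
;; SRFI-169 underscores are ignored when the number is read.
(print (= #b1100_1010_1111_1110 #xcafe))  ; prints #t
(print (= 1_000_000 1000000))             ; prints #t

;; Radix prefix #<n>r: 120 in base 3 is 1*9 + 2*3 + 0 = 15.
(print #3r120)                            ; prints 15

;; Letters serve as digits above 9; z is 35, so zz in base 36 is 35*36 + 35.
(print (= #36rzz 1295))                   ; prints #t
(print (= #16rFF #xff))                   ; prints #t
```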
For the polar notation of complex numbers, Gauche allows the suffix pi to denote the phase in multiples of pi. The standard Scheme syntax uses radians for the phase, but pi can only be approximated by a floating-point number, so the notation can’t produce round values except at zero angle.
gosh> 2@3.141592653589793
-2.0+2.4492935982947064e-16i
With the pi suffix, you can get round numbers.
gosh> 2@1pi
-2.0
gosh> 2@0.5pi
0.0+2.0i
gosh> 2@-0.5pi
0.0-2.0i
In some literals you can denote a character by the hexadecimal notation of its character code; specifically, in character literals, character set literals, string literals, symbols, and regular expression literals.
R7RS adopted the hex escape notation \xNNNN; for strings and symbols surrounded by vertical bars, and #\xNNNN for characters. The number of digits is variable, and the character code is a Unicode codepoint.
Gauche had been using two types of escapes: \u and \x. In general, u is for a Unicode codepoint, while x is for the character code in the internal encoding. Moreover, except in character literals, Gauche used a fixed number of digits instead of the terminator ; that R7RS uses.
Since 0.9.4, a \x-escape is interpreted as R7RS whenever it constitutes a valid R7RS hex-escape; if it does not, the reader tries to interpret it as a legacy Gauche hex-escape.
In rare cases, a literal can be interpreted under both the R7RS syntax and the legacy Gauche syntax, yielding different characters. Reading legacy files that contain such literals with the current Gauche may cause unexpected behavior. You can switch the reader mode so that it becomes backward-compatible. See Reader lexical mode, for the details.
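The escape forms described above can be tried with a sketch like this one, which assumes the default (permissive) reader mode and Gauche 0.9.4 or later. Codepoint 41 (hexadecimal) is the letter A.

```scheme
;; R7RS character literal with a hex codepoint.
(print (char=? #\x41 #\A))           ; prints #t

;; R7RS string escape: variable number of digits, terminated by a semicolon.
(print (string=? "\x41;BC" "ABC"))   ; prints #t

;; The same escape inside a symbol surrounded by vertical bars.
(print (eq? '|\x41;bc| 'Abc))        ; prints #t

;; Gauche's \u escape with a fixed four hex digits and no terminator.
(print (string=? "\u0041" "A"))      ; prints #t
```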
Many more special tokens beginning with ’#’ are defined. See the table below.
Sharp syntax
The table below lists sharp-syntaxes.
#! | [R6RS][R7RS][SRFI-22] Either the beginning of an interpreter line (shebang) of a script, or a special token that affects the mode of the reader. See the Hash-bang token section below. |
#" | Introduces an interpolated string. See String interpolation. |
##, #$, #%, #&, #' | Unused. |
#( | [R7RS] Introduces a vector. |
#) | Unused. |
#* | Bitvector or an incomplete string. See Strings. |
#+ | Unused. |
#, | [SRFI-10] Introduces reader constructor syntax. |
#-, #. | Unused. |
#/ | Introduces a literal regular expression. See Regular expressions. |
#0 … #9 | #n#, #n=: [SRFI-38] Shared substructure definition and reference. #nR, #nr: Radix-prefixed numeric literals. |
#: | Uninterned symbol. See Symbols. |
#; | [SRFI-62] S-expression comment. Reads the next S-expression and discards it. |
#< | Introduces an unreadable object. |
#=, #> | Unused. |
#? | Introduces debug macros. See Debugging. |
#@ | Unused. |
#a | Unused. |
#b | [R7RS] Binary number prefix. |
#c | Unused. |
#d | [R7RS] Decimal number prefix. |
#e | [R7RS] Exact number prefix. |
#f | [R7RS] Boolean false, or introduces an R7RS uniform vector. See Uniform vectors. R7RS defines both #f and #false as a boolean false value. |
#g, #h | Unused. |
#i | [R7RS] Inexact number prefix. |
#j, #k, #l, #m, #n | Unused. |
#o | [R7RS] Octal number prefix. |
#p, #q, #r | Unused. |
#s | [R7RS vector.@] Introduces an R7RS uniform vector. See Uniform vectors. |
#t | [R7RS] Boolean true. R7RS defines both #t and #true as a boolean true value. |
#u | [R7RS vector.@] Introduces an R7RS uniform vector. See Uniform vectors. R7RS uses the #u8 prefix for bytevectors, which are compatible with u8 uniform vectors. |
#v, #w | Unused. |
#x | [R7RS] Hexadecimal number prefix. |
#y, #z | Unused. |
#[ | Introduces a literal character set. See Character Sets. |
#\ | [R7RS] Introduces a literal character. See Characters. |
#], #^, #_ | Unused. |
#` | Legacy syntax for string interpolation, superseded by #". |
#{ | Unused. |
#| | [SRFI-30] Introduces a block comment. The comment ends at the matching ’|#’. |
#}, #~ | Unused. |
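To make a few rows of the table concrete, here is a minimal sketch that exercises some of the sharp syntaxes. The data values are made up for illustration, and gauche.uvector is loaded explicitly on the assumption that u8vector-ref may not be available otherwise.

```scheme
(use gauche.uvector)  ; assumed to be needed for u8vector-ref

;; #; discards the next S-expression.
(print (list 1 #;(skipped entirely) 2))           ; prints (1 2)

#| A block comment,
   which may span lines. |#

;; #( introduces a vector; #u8( introduces a u8 uniform vector.
(print (vector-ref #(a b c) 1))                   ; prints b
(print (u8vector-ref #u8(10 20 30) 2))            ; prints 30

;; #[ introduces a character set; #/ introduces a regular expression.
(print (char-set-contains? #[a-z] #\q))           ; prints #t
(print (rxmatch-substring (#/\d+/ "abc123def")))  ; prints 123

;; #f and #false denote the same value.
(print (eq? #f #false))                           ; prints #t
```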
Hash-bang token
The character sequence #! has two completely different meanings, depending on how and where it occurs.
If a file begins with #!/ or #! (hash, bang, and a space), the reader assumes it is the interpreter line (shebang) of a script and ignores the remaining characters up to the end of the line. (Actually, the source doesn’t need to be a file; the reader checks whether it is at the beginning of a port.)
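For instance, a script file might look like the following sketch. The interpreter path /usr/bin/env gosh is a common convention assumed here, not something the reader requires, and the message text is made up.

```scheme
#!/usr/bin/env gosh
;; The reader skips the first line because it starts with #!/ at the
;; beginning of the input.

(define (main args)
  (print "hello from a script")
  0)   ; exit status
```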
In any other case, #!identifier is read as a token with a special meaning. Such a token can be a directive for the reader rather than being read as a datum.
By default, the following tokens are recognized.
#!fold-case
#!no-fold-case
Switches the reader’s case sensitivity; #!fold-case makes the reader case insensitive, and #!no-fold-case makes it case sensitive. (Also see Case-sensitivity.)
#!r6rs
This token was introduced in R6RS and is used to indicate that the program strictly conforms to R6RS. Gauche doesn’t conform to R6RS; currently it just issues a warning when it sees the #!r6rs token and keeps reading.
#!r7rs
Puts the reader in strict-r7 mode, which complies with R7RS. See Reader lexical mode, for the details.
#!gauche-legacy
Puts the reader in legacy mode, which is compatible with Gauche 0.9.3 and earlier. See Reader lexical mode, for the details.
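As a sketch of how the case-folding directives affect what follows them in the same source, consider the fragment below. The variable names folded? and unfolded? are hypothetical.

```scheme
#!fold-case
;; From here on, symbols are case-folded as they are read,
;; so Hello and hello name the same symbol.
(define folded? (eq? 'Hello 'hello))

#!no-fold-case
;; Case sensitivity is restored; the two symbols are now distinct.
(define unfolded? (eq? 'Hello 'hello))

(print folded?)    ; prints #t
(print unfolded?)  ; prints #f
```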