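Generating C code (Gauche Users' Reference)

9.4 `gauche.cgen` - Generating C code

Significant parts of Gauche are written in Gauche itself or in S-expression based DSLs. During the build, that code is translated into C source and compiled by the C compiler. The gauche.cgen module and its submodules expose this machinery, which the Gauche build process relies on, for general use.

The features required of a C code generator vary widely from application to application, and a rigidly fixed framework would be hard to use. So, instead of a single solid framework, we provide a set of loosely coupled modules that you can combine as needed. In fact, some of the processes that run while building Gauche use only gauche.cgen.unit and gauche.cgen.literal (see src/builtin-syms.scm, for example).

Module: gauche.cgen ¶: This is a convenience module that extends gauche.cgen.unit, gauche.cgen.literal, gauche.cgen.type and gauche.cgen.cise.

In most cases it is enough to use gauche.cgen, without thinking about the individual submodules. The following sections are organized by submodule mainly for ease of explanation.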

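9.4.1 Generating C source files

One nuisance of generating C source is that implementing a single feature often requires placing code fragments in separate parts of the file. A declaration must appear before the code that refers to it, and any required initialization code must go inside the initialization routine. The <cgen-unit> class takes care of this code placement.

Creating a frame

Class: <cgen-unit> ¶

{gauche.cgen} A cgen-unit is a unit of C source generation. One instance corresponds to one .c file and, optionally, one .h file. During processing, the "current unit" is kept in the parameter cgen-current-unit, and most cgen APIs refer to it implicitly.

The following slots are public and can be used to customize the output. Usually you set them at initialization time. The behavior is undefined if you change their values in the middle of code generation.

Instance Variable of <cgen-unit>: name ¶

A string that names this unit. It is used for the default names of the generated files (name.c and name.h) and for the name of the initialization function. Other cgen modules may also derive names from this value, so use only characters that are valid in C identifiers.

The default names can be overridden individually through the other slots.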

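Instance Variable of <cgen-unit>: c-file ¶

Instance Variable of <cgen-unit>: h-file ¶

Strings that specify the names of the C source file and the header file. The default is #f, in which case the value of the name slot with the extension .c or .h is used.

To obtain the names of the generated files, call cgen-unit-c-file and cgen-unit-h-file rather than reading these slots.

Instance Variable of <cgen-unit>: preamble ¶: A list of strings inserted at the top of the generated sources. The default is ("/* Generated by gauche.cgen */"). Each string is emitted on its own line.

Instance Variable of <cgen-unit>: init-prologue ¶

Instance Variable of <cgen-unit>: init-epilogue ¶

Strings placed at the beginning and the end of the initialization function. The default of init-prologue is "void Scm_Init_NAME(void) {" and the default of init-epilogue is "}", where NAME is the value of the name slot. Each string is emitted on its own line.

To get the default name of the initialization function, call cgen-unit-init-name.

To customize the name or the arguments of the initialization function, change init-prologue.

The body of the initialization function is generated from the code fragments registered with cgen-init.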
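As a minimal sketch of slot customization, the following creates a unit with its own preamble and initialization function header. The unit name mylib and the function name mylib_setup are hypothetical, and the sketch assumes that each public slot accepts an init keyword of the same name.

```scheme
(use gauche.cgen)

;; "mylib" and "mylib_setup" are made-up names used only for illustration.
;; Assumption: the public slots can be set through same-named init keywords.
(define *unit*
  (make <cgen-unit>
    :name "mylib"
    :preamble '("/* Generated file; do not edit. */")
    :init-prologue "void mylib_setup(void) {"))

(parameterize ([cgen-current-unit *unit*])
  (cgen-decl "#include <stdio.h>")
  ;; This fragment ends up inside the initialization function.
  (cgen-init "puts(\"mylib initialized\");"))

;; Writes mylib.c in the current directory.
(cgen-emit-c *unit*)
```

Because init-epilogue is left at its default, the generated function is still closed by a single "}" line.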

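Parameter: cgen-current-unit ¶: A parameter that holds the current cgen unit.

A typical flow for generating C code is as follows:

Create a <cgen-unit> instance and make it the current unit.
Call the APIs that insert code fragments. The fragments are accumulated in the current unit.
Call the emit methods (cgen-emit-c and cgen-emit-h) on the unit. This generates the C file and, if needed, the header file.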

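Generic Function: cgen-emit-c cgen-unit ¶

Generic Function: cgen-emit-h cgen-unit ¶

{gauche.cgen} Write the code fragments accumulated in cgen-unit to a C source file and a C header file, respectively. The file names are the values returned by cgen-unit-c-file and cgen-unit-h-file. If a file already exists, it is overwritten (you cannot append code to an existing file). For that reason, these are normally called as the last step of code generation.

See "Filling the content" below for how each file is organized.

Generic Function: cgen-unit-c-file cgen-unit ¶
Generic Function: cgen-unit-h-file cgen-unit ¶: {gauche.cgen} Return the names of the C source file and the header file handled by cgen-unit. The default method first looks at the c-file or h-file slot of cgen-unit and returns its value if it is not #f; otherwise it returns the value of the name slot with the extension .c or .h appended.

Generic Function: cgen-unit-init-name cgen-unit ¶: {gauche.cgen} Returns the name of the initialization function generated in C. This name is used in the default value of init-prologue.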
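The file name lookup described above can be tried without generating any file. This is a small sketch; the path out/foo-gen.c is a made-up value, and the sketch assumes the c-file slot accepts the :c-file init keyword.

```scheme
(use gauche.cgen)

;; A unit that relies on the defaults derived from the name slot.
(define default-unit (make <cgen-unit> :name "foo"))

;; A unit that overrides only the C file name (hypothetical path).
(define custom-unit
  (make <cgen-unit> :name "foo" :c-file "out/foo-gen.c"))

(print (cgen-unit-c-file default-unit))   ; prints foo.c
(print (cgen-unit-h-file default-unit))   ; prints foo.h
(print (cgen-unit-c-file custom-unit))    ; prints out/foo-gen.c
(print (cgen-unit-h-file custom-unit))    ; prints foo.h

;; The initialization function name is derived from the name slot.
(print (cgen-unit-init-name default-unit))
```

Going through these generic functions instead of reading the slots directly means the code keeps working whether or not a slot was left at #f.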

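Filling the content

There are four parts to which you can add C code fragments. Within each part, fragments are emitted in the order they were added.

extern: This part is emitted to the header file.
decl: Emitted near the top of the C source file, after the standard leading elements.
body: Emitted in the C source file, after the decl part.
init: Emitted inside the initialization function, which is placed at the end of the C source file.

The following procedures add code to each part.

Function: cgen-extern code … ¶
Function: cgen-decl code … ¶
Function: cgen-body code … ¶
Function: cgen-init code … ¶: {gauche.cgen} Add the code fragments code … to the corresponding part. Each fragment must be a string.

Below is a simple example of typical usage. Running this code creates my-cfile.c and my-cfile.h in the current directory.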

(use gauche.cgen)

(define *unit* (make <cgen-unit> :name "my-cfile"))

(parameterize ([cgen-current-unit *unit*])
  (cgen-decl "#include <stdio.h>")
  (cgen-init "fprintf(stderr, \"initialization function\\n\");")
  (cgen-body "void foo(int n) { fprintf(stderr, \"got %d\\n\", n); }")
  (cgen-extern "void foo(int n);")
  )

(cgen-emit-c *unit*)
(cgen-emit-h *unit*)

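The following procedures escape a string so that it is safe to use in C code. They are handy when you generate C code without using the other cgen modules.

Function: cgen-safe-name string ¶

Function: cgen-safe-name-friendly string ¶

Function: cgen-safe-string string ¶

Function: cgen-safe-comment string ¶

{gauche.cgen} Escape character sequences that cannot be used in a C identifier, a string literal, or a comment.

cgen-safe-name converts every character other than ASCII letters and digits into _XX, where XX is the hexadecimal representation of the character's code (_ itself is converted too). The returned string can be used directly as a C identifier. The mapping is injective: different source strings always produce different results.

cgen-safe-name-friendly is slightly different: it produces a more readable C identifier. -> becomes _TO (e.g. char->integer becomes char_TOinteger), and other occurrences of - and _ map to _. ? becomes P (e.g. char? becomes charP), ! becomes X (e.g. set! becomes setX), and < and > become _LT and _GT, respectively. Any other special character becomes _XX as in cgen-safe-name. This mapping is not injective; for example, both read-line and read_line become read_line. Use it only when you want the generated C code to be readable by humans (although having humans read generated C code is not recommended).

To turn a Scheme string into a C string literal, use cgen-safe-string. It escapes control characters and non-ASCII characters. Characters outside the ASCII range are emitted in Gauche's internal encoding. (It also converts ?, so that you do not accidentally emit a C trigraph.)

cgen-safe-comment is much simpler: it only converts /* and */ into / * and * /, respectively (inserting a space between the two characters). This keeps a Scheme string written inside a C comment from terminating the comment by accident. (Strictly speaking, escaping */ would suffice, but some simple-minded C parsers might be confused by a stray /*, so /* is escaped as well.) This mapping is not injective, either.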

(cgen-safe-name "char-alphabetic?")
  ⇒ "char_2dalphabetic_3f"
(cgen-safe-name-friendly "char-alphabetic?")
  ⇒ "char_alphabeticP"
(cgen-safe-string "char-alphabetic?")
  ⇒ "\"char-alphabetic\\077\""

(cgen-safe-comment "*/*")
  ⇒ "* / *"

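To include a code fragment conditionally under a C preprocessor #ifdef, use the following macro.

Macro: cgen-with-cpp-condition cpp-expr body … ¶

{gauche.cgen} Code fragments submitted within body … are emitted between #if cpp-expr and #endif.

If cpp-expr is a string, it is emitted verbatim as the condition.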

(cgen-with-cpp-condition "defined(FOO)"
  (cgen-init "foo();"))

;; will generate:
#if defined(FOO)
foo();
#endif /* defined(FOO) */

You can also write cpp-expr as an S-expression.

<cpp-expr> : <string>
           | (defined <cpp-expr>)
           | (not <cpp-expr>)
           | (<n-ary-op> <cpp-expr> <cpp-expr> ...)
           | (<binary-op> <cpp-expr> <cpp-expr>)

<n-ary-op> : and | or | + | * | - | /

<binary-op> : > | >= | == | < | <= | !=
            | logand | logior | lognot | >> | <<

Example:

(cgen-with-cpp-condition '(and (defined FOO)
                               (defined BAR))
  (cgen-init "foo();"))

;; will generate:
#if ((defined FOO)&&(defined BAR))
foo();
#endif /* ((defined FOO)&&(defined BAR)) */

cgen-with-cpp-condition can be nested.
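A minimal sketch of nesting follows. The unit name nested-cond and the C functions foo and foobar are hypothetical. The exact layout of the emitted #if and #endif lines is not shown here; inspect the generated file to see it.

```scheme
(use gauche.cgen)

;; "nested-cond", foo() and foobar() are made-up names for illustration.
(define *unit* (make <cgen-unit> :name "nested-cond"))

(parameterize ([cgen-current-unit *unit*])
  (cgen-with-cpp-condition "defined(FOO)"
    ;; Guarded by FOO only.
    (cgen-init "foo();")
    (cgen-with-cpp-condition "defined(BAR)"
      ;; Guarded by both FOO and BAR.
      (cgen-init "foobar();"))))

;; Writes nested-cond.c in the current directory.
(cgen-emit-c *unit*)
```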

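Emitting code fragments to two or more parts

As you abstract code generation, a single high-level construct often needs to emit code to several parts. Calling cgen-body, cgen-init and so on separately each time is inconvenient. Instead, you can define a custom class that takes care of emitting code to the appropriate parts.

Class: <cgen-node> ¶

{gauche.cgen} The base class that represents a set of related code fragments.

When an instance of a subclass of this class is created, the C preprocessor conditions in effect through cgen-with-cpp-condition are captured, and the necessary #if and #endif are emitted at code generation time.

Define a subclass of <cgen-node>, and define one or more of the following methods on it.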

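Generic Function: cgen-emit-xtrn cgen-node ¶

Generic Function: cgen-emit-decl cgen-node ¶

Generic Function: cgen-emit-body cgen-node ¶

Generic Function: cgen-emit-init cgen-node ¶

{gauche.cgen} These generic functions are called while cgen-emit-c and cgen-emit-h write out the C code. Whatever these methods write to the current output port goes into the output file.

While cgen-emit-h is generating the .h file, the cgen-emit-xtrn method of each registered node is called, in the order the nodes were registered.

While cgen-emit-c is generating the .c file, the cgen-emit-decl method is called first on the registered nodes in the order of registration, then the cgen-emit-body method in the same order, and finally the cgen-emit-init method in the same order.

If you do not specialize one of these methods, no code is generated for the corresponding part.

After defining a subclass and creating an instance, you can register the node to the current cgen unit with the following procedure.

Function: cgen-add! cgen-node ¶: {gauche.cgen} Register cgen-node to the current cgen unit. If no current unit is set, cgen-node is simply ignored.

In fact, the procedures cgen-extern, cgen-decl, cgen-body and cgen-init all internally use subclasses of <cgen-node> that define only the method for the corresponding part.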
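The following sketch shows one way a custom node might emit to two parts at once. The class <c-int-constant>, its slots var-name and var-value, and the unit name consts are all hypothetical names introduced for this illustration.

```scheme
(use gauche.cgen)

;; Hypothetical node: a C integer constant that needs an extern
;; declaration in the header and a definition in the C file.
(define-class <c-int-constant> (<cgen-node>)
  ((var-name  :init-keyword :var-name)
   (var-value :init-keyword :var-value)))

;; Called while the .h file is generated.
(define-method cgen-emit-xtrn ((node <c-int-constant>))
  (format #t "extern const int ~a;\n" (slot-ref node 'var-name)))

;; Called while the body part of the .c file is generated.
(define-method cgen-emit-body ((node <c-int-constant>))
  (format #t "const int ~a = ~a;\n"
          (slot-ref node 'var-name)
          (slot-ref node 'var-value)))

(define *unit* (make <cgen-unit> :name "consts"))

(parameterize ([cgen-current-unit *unit*])
  (cgen-add! (make <c-int-constant> :var-name "answer" :var-value 42)))

(cgen-emit-c *unit*)   ; writes consts.c
(cgen-emit-h *unit*)   ; writes consts.h
```

Since cgen-emit-decl and cgen-emit-init are not specialized, this node contributes nothing to the decl and init parts.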

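9.4.2 Generating Scheme literals

You often want to embed Scheme values in C code as constants. Simple values, such as Scheme booleans (SCM_TRUE/SCM_FALSE), characters (SCM_MAKE_CHAR(code)) and small integers (SCM_MAKE_INT(value)), can be written directly in C code. Once you step outside of such simple values, however, it quickly gets tedious: you need a static data declaration here, runtime initialization there, and so on.

For example, to embed the list of symbols (a b c) in C, you need to (1) create the name of each symbol as a ScmString, (2) pass it to Scm_Intern to create the Scheme symbol, and (3) build the list with Scm_Cons (or the convenience macro SCM_LIST3).

With gauche.cgen, this code can be generated automatically.

Note: When you use cgen-literal, call (cgen-decl "#include <gauche.h>") before the first call to it. The code generated by cgen-literal needs the declarations in gauche.h.

Function: cgen-literal obj ¶

{gauche.cgen} Returns a <cgen-literal> object that represents the Scheme object obj. The necessary declarations and initialization code are also registered to the current cgen unit.

The details of <cgen-literal> objects are not public, but you can pass one to cgen-cexpr to obtain a C code fragment that refers to the Scheme value from C at runtime. See the example in the cgen-cexpr entry below.

Generic Function: cgen-cexpr cgen-literal ¶

{gauche.cgen} Returns a C code fragment that yields the Scheme literal value represented by cgen-literal. Its C type is ScmObj.

If the Scheme object is a simple immediate object such as a boolean or a fixnum, an expression that produces the value, such as SCM_FALSE or SCM_MAKE_INT(1234), is returned. If the Scheme object has to be placed in memory, cgen-literal generates the necessary memory allocation and initialization code, and cgen-cexpr returns an expression that yields a pointer to the object.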
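To see the difference between immediate and heap-allocated literals, you can print the C expressions that cgen-cexpr returns for a few values. This is only a sketch: the unit name lits is hypothetical, and the exact strings printed depend on the Gauche version, so none are shown.

```scheme
(use gauche.cgen)

;; "lits" is a made-up unit name.
(define *unit* (make <cgen-unit> :name "lits"))

;; Pair each sample object with the C expression that refers to it.
(define exprs
  (parameterize ([cgen-current-unit *unit*])
    (cgen-decl "#include <gauche.h>")
    (map (lambda (obj) (cons obj (cgen-cexpr (cgen-literal obj))))
         (list #f 1234 'a '(a b c)))))

;; Immediate values print as constant expressions; the symbol and the
;; list print as references to statically allocated data.
(for-each (lambda (p) (format #t "~s => ~a\n" (car p) (cdr p))) exprs)

;; The declarations and initialization code end up in lits.c.
(cgen-emit-c *unit*)
```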

The following example generates a C function printabc that prints the literal value (a b c) created by cgen-literal.

(define *unit* (make <cgen-unit> :name "foo"))
(parameterize ((cgen-current-unit *unit*))
  (let1 lit (cgen-literal '(a b c))
    (cgen-body
     (format "void printabc() { Scm_Printf(SCM_CUROUT, \"%S\", ~a); }"
             (cgen-cexpr lit)))))
(cgen-emit-c *unit*)

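Look at the generated foo.c to see how the literal value is handled.

One advantage of using cgen-literal is that common literals are shared. If you call (cgen-literal '(a b c)) twice within one unit, both calls refer to the same entity. If you then call (cgen-literal '(b c)), that literal shares the tail of the (a b c) literal. So you can call cgen-literal whenever you need a literal, without worrying about duplication.

(Treat literals registered with cgen-literal as immutable, just as literals are immutable in the Scheme world.)

Some Scheme objects cannot be made into literals. For example, an open port cannot, because it carries information specific to the running process.

(There is also a mechanism that lets programmers extend cgen-literal for their own types, but the API is not fixed yet.)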

9.4.3 Conversions between Scheme and C

In the C world, Scheme objects are uniformly represented as an opaque tagged pointer, ScmObj. To access the actual object, you need to check its runtime type information and retrieve the actual C value from it.

Stub types are the objects that bridge Scheme runtime types and C types. Since mappings between Scheme types and C types are not one-to-one, there are more stub types than either Scheme types or C types; for example, Scheme <integer> type may be bridged to C int type by the stub type <int>, but it may also be bridged to C short type by the stub type <short>.

Do not confuse stub types and Gauche’s runtime types; stub types are meta information associated with runtime types. You can look up a stub type by name with cgen-type-from-name. The session below shows the difference between runtime types and stub types:

gosh> <int>
#<native-type <int>>
gosh> ,d
#<native-type <int>> is an instance of class <native-type>
slots:
  name      : <int>
  super     : #<class <integer>>
  c-type-name: "int"
  size      : 4
  alignment : 4

gosh> (cgen-type-from-name '<int>)
#<cgen-type <int>>
gosh> ,d
#<cgen-type <int>> is an instance of class <cgen-type>
slots:
  name      : <int>
  scheme-type: #<native-type <int>>
  c-type    : "int"
  description: "int"
  cclass    : #f
  %c-predicate: "SCM_INTEGERP"
  %unboxer  : "Scm_GetInteger"
  %boxer    : "Scm_MakeInteger"
  %maybe    : #f

gosh> <integer>
#<class <integer>>
gosh> ,d
#<class <integer>> is an instance of class <integer-meta>
slots:
  name      : <integer>
  cpl       : (#<class <integer>> #<class <rational>> #<class <real>> #<cl
  direct-supers: (#<class <rational>>)
  accessors : ()
  slots     : ()
  direct-slots: ()
  num-instance-slots: 0
  direct-subclasses: ()
  direct-methods: ()
  initargs  : ()
  defined-modules: (#<module gauche>)
  redefined : #f
  category  : builtin
  core-size : 0

gosh> (cgen-type-from-name '<integer>)
#<cgen-type <integer>>
gosh> ,d
#<cgen-type <integer>> is an instance of class <cgen-type>
slots:
  name      : <integer>
  scheme-type: #<class <integer>>
  c-type    : "ScmObj"
  description: "exact integer"
  cclass    : #f
  %c-predicate: "SCM_INTEGERP"
  %unboxer  : ""
  %boxer    : "SCM_OBJ_SAFE"
  %maybe    : #f

Each stub type has a C-predicate, a boxer and an unboxer; each of them is a Scheme string that names a C function or C macro. A C-predicate takes a ScmObj and returns a C boolean value that tells whether the given object has a valid type and range for the stub type. A boxer takes a C object and converts it to a Scheme object; it usually involves wrapping or boxing the C value in a tagged pointer or object, hence the name. An unboxer does the opposite: it takes a Scheme object and converts it to a C value. The Scheme object must be checked by the C-predicate before being passed to the unboxer.

We have a few categories of stub types.

Stub types that correspond to native types (see Native types).
Stub types that correspond to C-class types. These are Scheme objects whose structure is defined in C. They can be treated as ScmObj or can be cast to the specific C type; e.g. <symbol> can be cast to ScmSymbol*. The unboxer is ScmObj -> C-TYPE*, and the boxer is C-TYPE* -> ScmObj.
Pass-through types. These are Scheme objects that are also handled as ScmObj at the C level. The stub type only performs the type check, and its boxer and unboxer are just the identity. Such a type can be either a purely Scheme-defined object or an object that can take multiple representations (e.g. <integer> can be a fixnum or a ScmBignum*, so the stub generator passes it through and the C routine handles the internals).
Maybe stub types. These are denoted by a question mark suffix. In the stub context, we are only concerned with maybe types that can be unboxed into a C pointer type. In addition to the objects of the original type, a maybe type maps Scheme’s #f to C’s NULL and vice versa. For example, <port>? maps Scheme’s <port> instance to C’s ScmPort*, and Scheme’s #f to C’s NULL.

Class: <cgen-type> ¶: {gauche.cgen} An instance of this class represents a stub type. It can be looked up by name such as <const-cstring> by cgen-type-from-name.

Function: cgen-type-from-name name ¶: {gauche.cgen} Returns an instance of <cgen-type> that has name. If the name is unknown, #f is returned.

Function: cgen-box-expr cgen-type c-expr ¶

Function: cgen-unbox-expr cgen-type c-expr ¶

Function: cgen-pred-expr cgen-type c-expr ¶

{gauche.cgen} c-expr is a string that denotes a C expression. Returns a string containing a C expression that boxes, unboxes, or typechecks c-expr according to cgen-type.

;; suppose foo() returns char*
(cgen-box-expr
 (cgen-type-from-name '<const-cstring>)
 "foo()")
 ⇒ "SCM_MAKE_STR_COPYING(foo())"
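The other two procedures can be explored the same way. The sketch below prints the generated expressions for the <int> stub type instead of asserting their exact text, since that may vary between Gauche versions. The C expressions obj and n and the type name <no-such-type> are hypothetical.

```scheme
(use gauche.cgen)

(define int-type (cgen-type-from-name '<int>))

;; C expression that checks whether the ScmObj "obj" fits the <int> stub type.
(print (cgen-pred-expr int-type "obj"))

;; C expression that extracts a C int from the ScmObj "obj".
(print (cgen-unbox-expr int-type "obj"))

;; C expression that wraps the C int "n" into a ScmObj.
(print (cgen-box-expr int-type "n"))

;; An unknown name yields #f rather than an error.
(print (cgen-type-from-name '<no-such-type>))
```

In generated code, the predicate expression should guard the unboxing expression, as noted above for C-predicates and unboxers.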

9.4.4 CiSE - C in S-expression

Some low-level routines in Gauche are implemented in C, but they’re written in S-expressions. We call it “C in S expression”, or CiSE.

The advantage of using S-expressions is readability, obviously. Another advantage is that it allows us to write macros as S-expr to S-expr translation, just like the legacy Scheme macros. That’s a powerful feature: effectively, you can extend the C language to suit your needs.

The gauche.cgen.cise module provides a set of tools to convert CiSE code into C code to be passed to the C compiler. It also has some support to overcome C quirks, such as preparing forward declarations.

Currently, we don’t rigorously check CiSE; you can pass a CiSE expression to the translator that yields invalid C code, which will cause the C compiler to emit errors. The translator inserts line directives by default so the C compiler error message points to the location of original (CiSE) source instead of generated code; however, sometimes you need to look at the generated code to figure out what went wrong. We hope this will be improved in the future.

In Gauche source code, CiSE is extensively used in precompiled Scheme files and recognized by the precompiler. However, gauche.cgen.cise is an independent module that relies only on the basic features of gauche.cgen, so you can plug it into your own C code generating programs.

9.4.4.1 CiSE overview

Before diving into the details, it’s easier to grasp some basic concepts.

A CiSE fragment is an S-expression that follows CiSE syntax (see CiSE syntax). A CiSE fragment can be translated to a C code fragment by cise-render. Note that some translation may not be local, e.g. it may want to emit forward declarations before other C code fragments. So, the full translation requires buffering—you process all the CiSE fragments and save output, emit forward declarations, then emit the saved C code fragments. We have a wrapper procedure, cise-translate, to take care of it, but for your purpose you may want to roll your own wrapper.

A CiSE macro is Scheme code that translates a CiSE fragment to another CiSE fragment. There are a number of predefined CiSE macros. You can add your own CiSE macros with utilities such as define-cise-stmt and define-cise-expr.
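As a rough sketch of a user-defined CiSE macro, the following defines a hypothetical expression macro square with define-cise-expr and renders a statement that uses it. It assumes that each clause is a pattern followed by a body, matched against the whole form in the style of util.match; the rendered string is printed rather than shown, since its exact spacing and parenthesization are not specified here.

```scheme
(use gauche.cgen)
(use util.match)   ; imported defensively; the clauses are match-style patterns

;; Hypothetical macro: (square x) expands to the CiSE fragment (* x x).
;; Note that x appears twice, so an argument with side effects would be
;; evaluated twice in the generated C.
(define-cise-expr square
  [(_ x) `(* ,x ,x)])

(define rendered
  (cise-render-to-string '(return (square n))))

(print rendered)
```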

A CiSE ambient is a bundle of information that affects fragment translation. It contains CiSE macro definitions, and also it keeps track of forward declarations.

If you’re not sure how a cise fragment corresponds to C code, you can interactively try it:

gosh> (cise-render-to-string
        '(.struct foo (i::int c::(const char*))))
"struct foo { int i; const char* c; } "
gosh> (cise-render-to-string
        '(define-cfn foo (x::int) (return (+ x 3)))
        'toplevel)
" ScmObj foo(int x){{return ((x)+(3));}}"

(The second argument of cise-render-to-string specifies the context; see CiSE procedures, for the details.)

9.4.4.2 CiSE syntax

In this section, we list the basic CiSE syntax. CiSE forms are just data from the viewpoint of Gauche, so you can build and manipulate them like any S-expression (quasiquote comes in pretty handy).

CiSE types

C types can be written either as a symbol (e.g. int) or a list (e.g. (const char *)). When used in a definition, the type is preceded by ::. The following example shows how types are used in local variable definitions:

(let* ([a :: int 0]
       [b :: (const char *) "abc"])
  ...)

For convenience and readability, you can write the variable name, the double colon and the type name concatenated together. You can also concatenate pointer suffixes (char* instead of char * in the following example):

(let* ([a::int 0]
       [b::(const char*) "abc"])
  ...)

The CiSE translator first breaks up these concatenated forms, then deals with the types.

At this moment, CiSE does not check whether a type is a valid C type. It just passes along whatever is given.

There are a few special type notations for more complex types. These can appear in the middle of a type; for example, you can write (const .struct x (a::int b::double) *) to produce const struct x {int a; double b;} *.

CiSE Type: .array elt-type (dim …) ¶

Expands to C array type, whose element type is of elt-type and dimensions are dim ….

(cise-render-to-string '(.array char (3)))
  ⇒ "char [3]"

(cise-render-to-string '(.array int (2 5)))
  ⇒ "int [2][5]"

The last element of dim can be *, which corresponds to a C array type with no size specified:

(cise-render-to-string '(.array char (*)))
  ⇒ "char []"

(cise-render-to-string '(.array int (10 *)))
  ⇒ "int [10][]"

Here’s an example of global C variable definition of array type:

(cise-render-to-string '(define-cvar params ::(.array int (PARAM_SIZE)))
                       'toplevel)
  ⇒ " int params[PARAM_SIZE];"

CiSE Type: .struct [tag] [(field-spec …)] ¶
CiSE Type: .union [tag] [(field-spec …)] ¶

CiSE Type: .function (arg-spec …) ret-type ¶

CiSE statements

CiSE Statement: begin stmt … ¶: Code grouping with { and }

CiSE Statement: let* (binding …) stmt … ¶

Define local variables and optionally assign initial values to them. Each binding takes one of the following forms:

(name [:: type] [init-expr]): Define a C variable name of type type. type should be a CiSE type. If type is omitted, the default type is ScmObj. Optional init-expr is a CiSE expression to compute the initial value of name. Note that array initialization is not supported yet.
(_ cise-form): This is to allow arbitrary CiSE statement or expression cise-form between local variable definitions. See the example below.

The (_ cise-form) binding is useful when you want to do some check between other bindings, without having nested let*:

(let* ([len::ScmSmallInt (Scm_Length lis)]
       [_ (when (< len 1) (Scm_Error "Lis is too short: %S" lis))]
       [first (SCM_CAR lis)])
  ...)

CiSE Statement: if test-expr then-stmt [else-stmt] ¶
CiSE Statement: when test-expr stmt … ¶
CiSE Statement: unless test-expr stmt … ¶
CiSE Statement: cond (cond1 stmt1 …) … [ (else else-stmt …) ] ¶: Conditional statements.

CiSE Statement: case expr ((val1 …) stmt1 …) … [ (else else-stmt …) ] ¶
CiSE Statement: case/fallthrough expr ((val1 …) stmt1 …) … [ (else else-stmt …) ] ¶: Switch-case statement. case does not fall through between ’case’ blocks while case/fallthrough does.

CiSE Statement: for (start-expr test-expr update-expr) stmt … ¶
CiSE Statement: for () stmt … ¶
CiSE Statement: loop stmt … ¶
CiSE Statement: while test-expr body … ¶: Loop statements.

CiSE Statement: for-each (lambda (var) stmt …) expr ¶
CiSE Statement: dolist [var expr] stmt … ¶: expr must yield a list. Traverse the list, binding each element to var and executing stmt …. The lambda form is a fake; you don’t really create a closure.

CiSE Statement: pair-for-each (lambda (var) stmt …) expr ¶: Like for-each, but var is bound to each ’spine’ cell instead of each element of the list.

CiSE Statement: dopairs [var expr] stmt … ¶

CiSE Statement: dotimes (var expr) stmt … ¶: expr must yield an integer, n. Repeat stmt … by binding var from 0 to (n-1).
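To see how these statements combine, the sketch below renders a small C function that uses let*, dotimes and return. The function name sum_below is hypothetical, and the generated C text is printed rather than shown, because its exact formatting is not specified here.

```scheme
(use gauche.cgen)

;; Hypothetical function: returns 0 + 1 + ... + (n-1).
(define c-code
  (cise-render-to-string
   '(define-cfn sum_below (n::int) ::int
      (let* ([s::int 0])
        ;; i runs from 0 to n-1
        (dotimes [i n]
          (+= s i))
        (return s)))
   'toplevel))

(print c-code)
```

The 'toplevel context is needed because define-cfn is a toplevel form, as in the cise-render-to-string session in the overview.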

CiSE Statement: return [expr] ¶
CiSE Statement: break ¶
CiSE Statement: continue ¶: Return, break and continue statements.

CiSE Statement: label name ¶
CiSE Statement: goto name ¶: Label and goto statements. We always add a null statement after the label so that we can place (label name) at the end of a compound statement.

CiSE Statement: .if expr stmt [stmt] ¶

CiSE Statement: .when expr stmt … ¶

CiSE Statement: .unless expr stmt … ¶

CiSE Statement: .cond clause … ¶

CiSE Statement: .define name[(arg …)] [expr] ¶

CiSE Statement: .undef name ¶

CiSE Statement: .include path ¶

Preprocessor directives.

expr could be a string, a symbol, a number or one of the following forms:

(defined c)
(not c)
(and c)
(or c)
(op c …) where op is either + or *.
(op c c …) where op is either - or /.
(op c c) where op is either >, >=, ==, <, <=, !=, logand, logior, lognot, << or >>.

Note that defining a macro function without value

#define foo(abc)

is not supported because it’s ambiguous with

#define foo abc()

when written in CiSE syntax. (.define foo (abc)) always generates the latter.

.include could take a symbol. This is used for including system header files, e.g. (.include <stdint.h>).

CiSE Statement: define-cfn name (arg [:: type] …) [ret-type [qualifier …]] stmt … ¶

Defines a C function.

If type or ret-type is omitted, the default type is ScmObj.

Supported qualifiers are :static and :inline, corresponding to C’s static and inline keywords. If :static is specified, forward declaration is automatically generated.

CiSE Statement: define-cvar name [:: type] [qualifier …] [<init-expr>] ¶: Defines a global C variable. Supported qualifier is :static. Note that array initialization is not supported yet.

CiSE Statement: define-ctype name [:: type] ¶: Defines a new type using typedef

CiSE Statement: declare-cfn name (arg [:: type] …) [ret-type] ¶
CiSE Statement: declare-cvar name [:: type] ¶: Declares an external C function or variable.

CiSE Statement: .static-decls ¶: Produce declarations of static functions before function bodies.

CiSE Statement: .raw-c-code body … ¶

CiSE expressions

CiSE Expression: + expr … ¶
CiSE Expression: - expr … ¶
CiSE Expression: * expr … ¶
CiSE Expression: / expr … ¶
CiSE Expression: % expr1 expr2 ¶: Arithmetic operations.

CiSE Expression: and expr … ¶
CiSE Expression: or expr … ¶
CiSE Expression: not expr ¶: Boolean operations.

CiSE Expression: logand expr1 expr2 … ¶
CiSE Expression: logior expr1 expr2 … ¶
CiSE Expression: logxor expr1 expr2 … ¶
CiSE Expression: lognot expr ¶
CiSE Expression: << expr1 expr2 ¶
CiSE Expression: >> expr1 expr2 ¶: Bitwise operations.

CiSE Expression: * expr ¶
CiSE Expression: -> expr1 expr2 … ¶
CiSE Expression: ref expr1 expr2 … ¶
CiSE Expression: aref expr1 expr2 … ¶
CiSE Expression: & expr ¶: Dereference, reference and address operations. ref is C’s . (member access) operator. aref is array reference.

CiSE Expression: pre++ expr ¶
CiSE Expression: post++ expr ¶
CiSE Expression: pre-- expr ¶
CiSE Expression: post-- expr ¶: Pre/Post increment or decrement.

CiSE Expression: < expr1 expr2 ¶
CiSE Expression: <= expr1 expr2 ¶
CiSE Expression: > expr1 expr2 ¶
CiSE Expression: >= expr1 expr2 ¶
CiSE Expression: == expr1 expr2 ¶
CiSE Expression: != expr1 expr2 ¶: Comparison.

CiSE Expression: set! lvalue1 expr1 lvalue2 expr2 … ¶
CiSE Expression: = lvalue1 expr1 lvalue2 expr2 … ¶
CiSE Expression: += lvalue expr ¶
CiSE Expression: -= lvalue expr ¶
CiSE Expression: *= lvalue expr ¶
CiSE Expression: /= lvalue expr ¶
CiSE Expression: %= lvalue expr ¶
CiSE Expression: <<= lvalue expr ¶
CiSE Expression: >>= lvalue expr ¶
CiSE Expression: logand= lvalue expr ¶
CiSE Expression: logior= lvalue expr ¶
CiSE Expression: logxor= lvalue expr ¶: Assignment expressions.

CiSE Expression: cast type expr ¶: Type casting.

CiSE Expression: ?: test-expr then-expr else-expr ¶: Conditional expression.

CiSE Expression: .type type ¶: Useful to place a type name, e.g. an argument of sizeof operator.

CiSE Expression: new type ¶

CiSE Expression: new (type expr ...) ¶

CiSE Expression: new type (dim ...) ¶

CiSE Expression: new (type expr ...) (dim ...) ¶

C++ new operator. The first argument can be just a type name, or a constructor call. The optional second argument specifies array dimensions.

(new MyClass)              ⇒ new MyClass;
(new (MyClass a b) (1 2))  ⇒ new MyClass(a, b)[1,2];

CiSE Expression: delete expr ¶

CiSE Expression: delete () expr ¶

C++ delete operator. The second form is to delete an array.

(delete (* foo))  ⇒ delete *foo;
(delete () foo)   ⇒ delete[] foo;

9.4.4.3 CiSE procedures

Parameter: cise-ambient ¶: {gauche.cgen}

Function: cise-default-ambient ¶: {gauche.cgen}

Function: cise-ambient-copy ambient ¶: {gauche.cgen}

Function: cise-ambient-decl-strings ambient ¶: {gauche.cgen}

Parameter: cise-emit-source-line ¶: {gauche.cgen}

Function: cise-render cise-fragment :optional port context ¶: {gauche.cgen}

Function: cise-render-to-string cise-fragment :optional context ¶: {gauche.cgen}

Function: cise-render-rec cise-fragment stmt/expr env ¶: {gauche.cgen}

Function: cise-translate inp outp :key environment ¶: {gauche.cgen}

Function: cise-register-macro! name expander :optional ambient ¶: {gauche.cgen}

Function: cise-lookup-macro name :optional ambient ¶: {gauche.cgen}

Macro: define-cise-stmt name [env] clause … [:where definition …] ¶
Macro: define-cise-expr name [env] clause … [:where definition …] ¶
Macro: define-cise-toplevel name [env] clause … [:where definition …] ¶: {gauche.cgen}

Macro: define-cise-macro (name form env) body … ¶
Macro: define-cise-macro name name2 ¶: {gauche.cgen}

9.4.5 Stub generation

Stub Form: define-stub-type NAME C-TYPE [DESC C-PREDICATE UNBOXER BOXER] ¶: Register a new type to be recognized. This is rather a declaration than definition; no C code will be generated directly by this form.

Stub Form: define-cproc name (args …) [ret-type] [flag …] [qualifier …] stmt … ¶

Create Scheme procedure.

args specifies arguments:

arg … [:rest var] : Each arg is a variable name or var::type, and specifies a required argument. If :rest is given, a list of the excess arguments is passed to var.
arg … :optional spec … [:rest rest-var] : Optional arguments. spec is var or (var default). If no default is given, var receives SCM_UNBOUND; if var isn’t of type ScmObj, it will raise an error.
arg … :key spec … [:allow-other-keys [:rest rest-var]] : Keyword arguments. spec is var or (var default). If no default is given, var receives SCM_UNBOUND; if var isn’t of type ScmObj, it will raise an error.
arg … :optarray (var cnt max) [:rest rest-var] : A special syntax to receive optional arguments as a C array. var is a C variable of type ScmObj*. cnt is a C variable of type int, which receives the number of optional arguments in the ScmObj array. max specifies the maximum number of optional arguments that can be passed in the array form. If more than max args are given, a list of the excess arguments is passed to rest-var if it is specified.

ret-type specifies the return type of function. It could be either :: typespec or ::typespec where typespec is a valid stub type, or (type …) when multiple values are returned. When omitted, the procedure is assumed to return <top>.

flag is a keyword to modify some aspects of the procedure. Supported flags are as follows:

:fast-flonum - indicates that the procedure accepts flonum arguments and it won’t retain the reference to them. The VM can pass flonums on VM registers to the procedure with this flag. (This improves floating-point number handling, but its behavior is highly VM-specific; ordinary stub writers shouldn’t need to care about this flag at all.)
:constant - indicates that this procedure returns a constant value if all args are compile-time constants. The compiler may replace the call to this proc with the value, if it determines all arguments are known at the compile time. The resulting value should be serializable to the precompiled file.
NB: Since this procedure may be called at compile time, a subr that may return a different value for batch/cross compilation shouldn’t have this flag.

qualifier is a list that adds auxiliary information to the procedure. Currently the following qualifiers are officially supported.

(setter setter-name) : specify the setter. setter-name should be a cproc name defined in the same stub file.
(setter (args …) body …) : specify the setter anonymously.
(catch (decl c-stmt …) …) : when writing a stub for a C++ function that may throw an exception, use this spec to ensure the exception will be caught and converted to a Gauche error condition.
(inliner insn-name) : only used in Gauche core procedures that can be inlined into a VM instruction.

stmt is a cise expression. Inside the expression, a cise macro (result expr …) can be used to assign the value(s) to return from the cproc. As a special case, if stmt is a single symbol, it names a C function to be called with the same argument (mod unboxing) as the cproc.

Stub Form: define-cgeneric name c-name property-clause … ¶

Defines a generic function. c-name specifies a C variable name that keeps the generic function structure. One or more of the following clauses can appear in property-clause …:

(extern) : makes c-name visible from other files (i.e. do not define the structure as static).
(fallback "fallback") : specifies the fallback function.
(setter . setter-spec) : specifies the setter.

Stub Form: define-cmethod name (arg …) body … ¶

Stub Form: define-cclass scheme-name [qualifier …] c-type-name c-class-name cpa (slot-spec …) property … ¶

Generates C stub for static class definition, slot accessors and initialization. Corresponding C struct has to be defined elsewhere.

The following qualifiers are supported:

:base generates a base class definition (inheritable from Scheme code).
:built-in generates a built-in class definition (not inheritable from Scheme code). This is the default if neither :base nor :built-in are specified.
:private - the class declaration and standard macro definitions are also generated (which needs to be in the separate header file if you want the C-level structure to be used from other C code. If the extension is small enough to be contained in one C file, this option is convenient.)

cpa lists ancestor classes in precedence order. They need to be C identifiers of Scheme class Scm_*Class, for the time being. Scm_TopClass is added at the end automatically.

slot-spec is defined as (slot-name [qualifier …]) or slot-name. The following qualifiers are supported:

:type cgen-type
:c-name c-name specifies the C field name if the autogenerated name from slot-name is not accurate.
:c-spec c-spec
:getter proc-spec specifies how to create the slot getter. proc-spec can be one of the following:
- #f omits the getter.
- #t generates a default getter, with type conversion according to type.
- A string is interpreted as the C code that implements the getter.
- (c c-name) specifies the name of a C function that implements the getter; the function itself is defined elsewhere.
:setter proc-spec specifies how to create the slot setter. The syntax is the same as :getter.

The following properties are supported:

(allocator proc-spec)
(printer proc-spec)
(comparer proc-spec)
(direct-supers string …)
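
The pieces above fit together roughly as in the following sketch. Everything named here is hypothetical: a C struct ScmPoint with fields x, y_coord and id, the class variable Scm_PointClass, and the C function point_print are all assumed to be defined or declared elsewhere.

```scheme
;; Hypothetical class <point> backed by the C struct ScmPoint.
(define-cclass <point>
  "ScmPoint*" "Scm_PointClass"
  ()                                   ; cpa: no ancestors besides Scm_TopClass
  ((x  :type <double>)                 ; default getter and setter
   (y  :type <double> :c-name "y_coord") ; C field name differs from slot name
   (id :type <int> :setter #f))        ; read-only slot: no setter
  (printer (c "point_print")))         ; printer implemented in C elsewhere
```

Since no qualifier is given, this generates a built-in class, so it cannot be inherited from Scheme code.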

Stub Form: define-cptr scheme-name [qualifier …] c-type c-name c-pred c-boxer c-unboxer [(flags flag …)] [(print print-proc)] [(cleanup cleanup-proc)] ¶

Defines a new foreign pointer class based on <foreign-pointer>. It is suitable when the C structure is mostly passed around using pointers; most typically, when the foreign library allocates the structure and returns the pointer to the Scheme world.

scheme-name is a Scheme variable name. This will be bound to a newly-created subclass of <foreign-pointer> to represent this C-ptr type.

c-type is the type of the C pointer we wrap.

c-name is the C variable name (of type ScmClass *). In the initialization code, an instance of a class (the same one bound to scheme-name in the Scheme world) will be stored in this C variable.

c-pred is a macro name to determine if a ScmObj is of this type. c-boxer is a macro name to wrap a C pointer and return a ScmObj. c-unboxer is a macro name to extract the C pointer from a ScmObj.

The only supported qualifier is :private, which will generate c-pred, c-boxer and c-unboxer definitions automatically. Otherwise those definitions must be provided elsewhere.

The two supported flags are:

:keep-identity (SCM_FOREIGN_POINTER_KEEP_IDENTITY in the C world) keeps a weak hash table that maps each wrapped C pointer to the wrapping ScmObj, so Scm_MakeForeignPointer (i.e. c-boxer when :private is used) returns an eq? object if the same C pointer is given. This incurs some overhead, but the cleanup procedure can safely free the foreign object without worrying whether another ScmObj points to the same C pointer. Do not use this flag if the C pointer is also allocated by GC_malloc: the hash table is weak only for its values, so the C pointer would never be GCed.
:map-null (SCM_FOREIGN_POINTER_MAP_NULL in the C world) makes Scm_MakeForeignPointer (i.e. c-boxer when :private is used) return SCM_FALSE when the C pointer is NULL.
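
A possible use of define-cptr is sketched below for an imaginary C library that hands out widget_t pointers. All names are hypothetical, and writing the C-side arguments as strings is an assumption made to match the style of the other stub forms.

```scheme
;; Hypothetical foreign pointer class wrapping `widget_t *'.
(define-cptr <widget> :private
  "widget_t*"       ; c-type: the C pointer type being wrapped
  "WidgetClass"     ; c-name: ScmClass* variable set in the init code
  "WIDGET_P"        ; c-pred
  "WIDGET_BOX"      ; c-boxer
  "WIDGET_UNBOX"    ; c-unboxer
  (flags :keep-identity :map-null))
```

With these flags, boxing the same widget_t pointer twice would yield eq? objects, and a NULL pointer would be boxed as #f.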

Stub Form: define-symbol scheme-name [c-name] ¶: Defines a Scheme symbol. No Scheme binding is created. When c-name is given, the named C variable points to the created ScmSymbol.

Stub Form: define-variable scheme-name initializer ¶: Defines a Scheme variable.

Stub Form: define-constant scheme-name initializer ¶: Defines a Scheme constant.

Stub Form: define-enum name ¶: A define-constant specialized for enum values. This is useful for exporting C enums to Scheme.

Stub Form: define-enum-conditionally name ¶: Abbreviation of (if "defined(name)" (define-enum name))
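
The definition forms above might be used as in the following sketch. The Scheme names and the C variable sym_read_only are hypothetical, and it is assumed that plain Scheme literals are accepted as initializers; SEEK_SET and O_NOFOLLOW are ordinary C constants expected to come from the included system headers.

```scheme
;; Symbol only; the C variable sym_read_only points to the ScmSymbol.
(define-symbol read-only "sym_read_only")

;; Scheme bindings initialized from literals (assumed form).
(define-variable *default-timeout* 30)
(define-constant *max-items* 256)

;; Export C enum or macro values under the same name.
(define-enum SEEK_SET)
;; Only defined when the C preprocessor sees O_NOFOLLOW.
(define-enum-conditionally O_NOFOLLOW)
```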

Stub Form: define-cise-stmt name clause … ¶
Stub Form: define-cise-expr name clause … ¶
Stub Form: define-cfn … ¶
Stub Form: declare-cfn … ¶
Stub Form: define-cvar … ¶
Stub Form: declare-cvar … ¶
Stub Form: define-ctype … ¶
Stub Form: .define … ¶
Stub Form: .if … ¶
Stub Form: .include … ¶
Stub Form: .undef … ¶
Stub Form: .unless … ¶
Stub Form: .when … ¶: Cise macro definitions (see CiSE - C in S-expression).

Stub Form: initcode c-code ¶: Inserts c-code literally into the initialization function.

Stub Form: declcode stmt … ¶: Inserts declaration code. stmt is usually .include or another preprocessor statement, but it can also be a string, which is treated as a C fragment.
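
As an illustration, declcode and initcode can be paired to declare a file-local C variable and set it up when the module is initialized. The variable my_table_size is hypothetical, and both forms are given plain strings here.

```scheme
;; Emitted in the declaration part of the generated C file.
(declcode "#include <stdlib.h>"
          "static int my_table_size;")

;; Emitted verbatim inside the initialization function.
(initcode "my_table_size = 64;")
```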

Stub Form: begin form … ¶: Treats each form as if it were a toplevel stub form.

Stub Form: if test then-stmt [else-stmt] ¶
Stub Form: when test stmt ¶: Deprecated. Please use .if and .when instead.
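
For comparison, here is a sketch of the same conditional definition written both ways. The cproc name my-fsync and the feature macro HAVE_FSYNC are hypothetical, and the (defined …) test in the .when form is assumed to follow the usual cise syntax; only one of the two forms would appear in a real stub file.

```scheme
;; Deprecated style: the test is a string passed to the preprocessor.
(when "defined(HAVE_FSYNC)"
  (define-cproc my-fsync (fd::<int>) ::<int> fsync))

;; Preferred style, using the cise preprocessor macro.
(.when (defined HAVE_FSYNC)
  (define-cproc my-fsync (fd::<int>) ::<int> fsync))
```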

Stub Form: include file ¶: Include and evaluate another stub file.
