Bilingual Programming

Rob Chapman

1. Abstract

This paper documents a system which cleanly integrates the two languages C and Forth, to allow intermixing of the languages for optimal problem solving. From the C perspective, the Forth side provides an interactive, extensible command shell which may be used for testing, debugging, prototyping or peer application interfacing. From the Forth perspective, C is the new portable assembler with a whole schlock of reusable parts. By implementing the virtual Forth machine (VFM) objects in C and adding a concise language to govern their interaction, we set Forth a sail upon the C.

2. System Overview

Using the translator engine from last year's conference, I've written a set of rules for translating a text file containing Forth and/or C code into C code. This code may then be linked with the botForth library to create an application.

Debugging code can take advantage of the debuggers available in the C environment as well as the interactive Forth command line.

3. Examples

Simple Example

Let's use a simple C example:

void foo()
{
bar();
}

If we wish to generate a header for this function in the Forth dictionary, we precede the function with a colon, which switches to Forth and creates a header. The word C then switches back to C:

: foo  C
{
bar();
}

Once the program is running, foo may be typed at the command line and executed.

Initially, foo might have been prototyped at the command line or in a file sent to the command line as:

: foo  bar ;

Timely Example

This example uses the time function available in C to implement a simple timeout mechanism and code execution measurement words. The Forth source code is:

( Time services  Rob Chapman  Jun 8, 1993 )
: TIMEOUT ( "name" -- ) ( for timeout declarations )
CREATE 0 , ;

: TIME ( -- seconds ) ( get the current time in seconds )
C
*--sp = (cell)time(0);
Forth ;

: TIMEOUT? ( a -- f ) ( check to see if a timeout is expired )
@ TIME U< ;

: SET-TIMEOUT ( n \ a -- ) ( set a timeout time in seconds )
SWAP TIME + SWAP ! ;

( ==== Event timing ==== )
TIMEOUT start-time

: START ( -- ) TIME start-time ! ;
: END ( -- ) TIME start-time @ - .D ." seconds " ;

After we pass it through the translator, Timbre, we get two files. The bodies of the definitions are:

/* Source file generated by Timbre */
#include "botforth.h" /* interface to VFM */
#include "time.h" /* headers */

void TIMEOUT() /* "name" -- */
{ /* for timeout declarations */
CREATE();
*--sp=0; /* LITERAL */
COMMA();
}

void TIME() /* -- seconds */
{ /* get the current time in seconds */
*--sp = (cell)time(0);
}

void TIMEOUT_QUERY() /* a -- f */
{ /* check to see if a timeout is expired */
FETCH();
TIME();
U_LESS_THAN();
}

void SET_TIMEOUT() /* n \ a -- */
{ /* set a timeout time in seconds */
SWAP();
TIME();
PLUS();
SWAP();
STORE();
}

/* ==== Event timing ==== */
struct
{
Cell name1;
}start_time_={0,};

void start_time()
{
*--sp=(Cell)&start_time_;
}


void START() /* -- */
{
TIME();
start_time();
STORE();
}

void END() /* -- */
{
TIME();
start_time();
FETCH();
MINUS();
DOT_D();
*(char**)--sp="\010seconds ";
COUNT();
TYPE();
}

The headers of the definitions are:

/* Header file for dictionary entries generated by Timbre */
extern void *_time;

extern void TIMEOUT();
struct{void *link; Byte name[8]; void (*tick)();}
_TIMEOUT={&_time,0x80|7,'T','I','M','E','O','U','T',TIMEOUT};

extern void TIME();
struct{void *link; Byte name[5]; void (*tick)();}
_TIME={&_TIMEOUT,0x80|4,'T','I','M','E',TIME};

extern void TIMEOUT_QUERY();
struct{void *link; Byte name[9]; void (*tick)();}
_TIMEOUT_QUERY={&_TIME,0x80|8,'T','I','M','E','O','U','T','?',TIMEOUT_QUERY};

extern void SET_TIMEOUT();
struct{void *link; Byte name[12]; void (*tick)();}
_SET_TIMEOUT={&_TIMEOUT_QUERY,0x80|11,'S','E','T','-','T','I','M','E','O','U','T',SET_TIMEOUT};

extern void START();
struct{void *link; Byte name[6]; void (*tick)();}
_START={&_SET_TIMEOUT,0x80|5,'S','T','A','R','T',START};

extern void END();
struct{void *link; Byte name[4]; void (*tick)();}
_END={&_START,0x80|3,'E','N','D',END};
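
To see what the generated timeout words compute, they can be run against a tiny stand-in for the VFM. This is only a sketch under stated assumptions: the Cell type, the 16-cell stack, the true flag of all bits set, and the fake_now test clock are hypothetical and are not taken from botforth.h. Only TIME, SET_TIMEOUT and TIMEOUT_QUERY follow the bodies shown above.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

typedef uintptr_t Cell;                     /* assumption: unsigned, pointer-sized cell */

static Cell stack[16], *sp = stack + 16;    /* tiny stand-in for the VFM data stack */
static Cell fake_now;                       /* hypothetical test clock; 0 means use time() */

/* Minimal stand-ins for the kernel primitives the generated code calls */
static void SWAP(void)  { Cell n = sp[0]; sp[0] = sp[1]; sp[1] = n; }
static void PLUS(void)  { sp[1] += sp[0]; sp++; }
static void FETCH(void) { sp[0] = *(Cell *)sp[0]; }
static void STORE(void) { *(Cell *)sp[0] = sp[1]; sp += 2; }
static void U_LESS_THAN(void)               /* assumption: true is all bits set */
{
    sp[1] = sp[1] < sp[0] ? (Cell)-1 : 0;
    sp++;
}

static void TIME(void)                      /* -- seconds */
{
    assert(sp > stack);                     /* guard against stack overflow */
    *--sp = fake_now ? fake_now : (Cell)time(0);
}

static void TIMEOUT_QUERY(void)             /* a -- f */
{
    FETCH();
    TIME();
    U_LESS_THAN();
}

static void SET_TIMEOUT(void)               /* n \ a -- */
{
    SWAP();
    TIME();
    PLUS();
    SWAP();
    STORE();
}
```

With the clock pinned at 1000, pushing 5 and the address of a cell and calling SET_TIMEOUT() stores 1005 in that cell. Because the comparison is a strict unsigned less-than, TIMEOUT_QUERY() reports the timeout as expired only once the clock has moved past 1005.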

4. Details

Memory Model

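The Forth machine allocates memory at startup and lays it out as follows: the user dictionary occupies about 32k starting at dp_, the return stack grows down from the top of that region, and the data stack grows down from a base 64 cells below the return stack base.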

The Stacks

The data stack and the return stack are implemented in C as arrays. The stack pointer is predecremented when pushing a value and postincremented when popping a value. The stack elements may be indexed as if they were in an array.

Cell *sp,*rp;		/* Stack pointers */

The pointers are initialized in main():

rp0_ = rp = (Cell *)(dp_ + 32000);	/* user dictionary is about 32k */
sp0_ = sp = rp0_ - 64; /* return stack has 64 entries */

Some basic stack primitives are:

void SWAP()  /* m \ n -- n \ m */
{
--sp;
sp[0] = sp[2];
sp[2] = sp[1];
sp[1] = sp[0];
sp++;
}

void DUP() /* m -- m \ m */
{
--sp,sp[0]=sp[1];
}

void DROP() /* m -- */
{
sp++;
}

/* ==== Return stack primitives ==== */

void TO_R() /* m -- */
{
*--rp=*sp++;
}

void R() /* -- m */
{
*--sp=*rp;
}

void R_FROM() /* -- m */
{
*--sp=*rp++;
}
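
The stack discipline can be tried outside the kernel. The following is a minimal sketch, assuming a pointer-sized Cell and a small static array in place of the memory the kernel allocates at startup. The names stack_mem, STACK_CELLS, reset_stack, push and pop are hypothetical helpers for illustration, and SWAP here uses a local temporary rather than the scratch cell below the stack pointer that the kernel version uses.

```c
#include <assert.h>
#include <stdint.h>

typedef intptr_t Cell;          /* assumption: a cell is wide enough to hold a pointer */

#define STACK_CELLS 64          /* hypothetical size, chosen for the sketch */
static Cell stack_mem[STACK_CELLS];
static Cell *sp;                /* data stack pointer; the stack grows downward */

static void reset_stack(void)   /* an empty stack points just past the array */
{
    sp = stack_mem + STACK_CELLS;
}

static void push(Cell n)        /* predecrement, then store */
{
    assert(sp > stack_mem);     /* guard against overflow */
    *--sp = n;
}

static Cell pop(void)           /* fetch, then postincrement */
{
    assert(sp < stack_mem + STACK_CELLS);   /* guard against underflow */
    return *sp++;
}

static void SWAP(void)          /* m \ n -- n \ m */
{
    Cell n = sp[0];             /* sp[0] is the top item, sp[1] the one beneath */
    sp[0] = sp[1];
    sp[1] = n;
}

static void DUP(void)           /* m -- m \ m */
{
    --sp, sp[0] = sp[1];
}

static void DROP(void)          /* m -- */
{
    sp++;
}
```

After reset_stack(); push(1); push(2); SWAP(); the top of the stack, sp[0], holds 1 and sp[1] holds 2, which is the array-style indexing described above.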

Inner Interpreters

The kernel uses the indirect threaded model to create definitions when compiling Forth source code. This mixes well with the kernel since all the kernel words have been compiled to machine code and are pointed to by the headers. This is the way it has to be since C doesn't let you mix code and data spaces. There are four inner interpreters:

void (***ip)(), (**tick)(); /* Indirect-threaded code pointers */

void vii() /* : vii ( -- a ) tick @ CELL + ; */
{
*--sp=(Cell)(tick+1);
}

void cii() /* : cii ( -- n ) vii @ ; */
{
*--sp=(Cell)*(tick+1);
}

/*
: EXECUTE ( tick -- ) DUP tick ! @ JSR ;
: ITC ( -- ) BEGIN ip @ @+ ip ! ?DUP WHILE EXECUTE REPEAT ;
: :II ( -- ) ip @ >R tick @ CELL + ip ! ITC R> ip ! ;
*/
void colon_ii()
{
*--rp=(Cell)ip;
ip=(void (***)())(tick+1);
while((tick=*ip++) != 0)
(**tick)();
ip=(void (***)())*rp++;
}

/*
: doesii ( -- ) ip @ >R tick @ CELL + @+ SWAP ip ! ITC R> ip ! ;
*/
void does_ii()
{
*--rp=(Cell)ip;
ip=(void (***)())(*(tick+1));
*--sp=(Cell)(tick+2);
while((tick=*ip) != 0)
{
ip++;
if ( **(Cell **)tick & 1 )
end_program("does_ii with an odd tick");
/* something is wrong! */
(**tick)();
}
ip=(void (***)())*rp++;
}

The instruction pointer, ip, is used to point to the body of the Forth definition being interpreted. The other pointer, tick, is used to point to the inner interpreter cell of the header being interpreted.

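Because tick points at the inner interpreter cell, the word's data starts in the cell right after it, which makes it simple to get a constant value or a variable address. The end of a Forth definition is marked by a zero. To make this complete, here is the definition of EXECUTE: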

void EXECUTE()  /* a -- */
{
tick=(void (**)())(*sp++);
TestTick;(**tick)(); /* EXECUTE */
}
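
To watch the threading at work, here is a small self-contained sketch of the colon and constant inner interpreters. It is an illustration under assumptions, not the kernel code: it uses a union cell (Slot) instead of the pointer casts above so that code pointers, threaded references and data can share one array, and the words plus_, two_, three_, five_ and ten_, along with the execute helper, are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef intptr_t Cell;

typedef union Slot Slot;
union Slot {                        /* one cell of a word (assumed layout) */
    void (*code)(void);             /* inner interpreter or C primitive */
    Slot *tick;                     /* threaded reference to another word */
    Cell  value;                    /* literal data, e.g. a constant */
};

static Slot *ip, *tick;             /* instruction pointer and current word */
static Cell  stack[32],  *sp = stack + 32;      /* data stack */
static Slot *rstack[32], **rp = rstack + 32;    /* return stack */

static void cii(void)               /* constant: push the cell after the code cell */
{
    *--sp = tick[1].value;
}

static void colon_ii(void)          /* colon definition: run ticks until the zero */
{
    assert(rp > rstack);            /* guard against return stack overflow */
    *--rp = ip;
    ip = tick + 1;
    while ((tick = (ip++)->tick) != 0)
        tick->code();
    ip = *rp++;
}

static void PLUS(void)              /* m \ n -- m+n */
{
    sp[1] += sp[0];
    sp++;
}

static void execute(Slot *t)        /* plays the role of EXECUTE */
{
    tick = t;
    tick->code();
}

/* Hypothetical words: 2 CONSTANT two, 3 CONSTANT three,
   : five two three + ;   : ten five five + ;           */
static Slot plus_[]  = { { .code = PLUS } };
static Slot two_[]   = { { .code = cii }, { .value = 2 } };
static Slot three_[] = { { .code = cii }, { .value = 3 } };
static Slot five_[]  = { { .code = colon_ii }, { .tick = two_ }, { .tick = three_ },
                         { .tick = plus_ }, { .tick = 0 } };
static Slot ten_[]   = { { .code = colon_ii }, { .tick = five_ }, { .tick = five_ },
                         { .tick = plus_ }, { .tick = 0 } };
```

Calling execute(ten_) nests two levels of colon_ii, saving and restoring ip on the return stack each time, and leaves the sum on top of the data stack.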

Dictionary Entries

Each word in the dictionary has a header composed as a structure in C:

extern void TIMEOUT();
struct{void *link; Byte name[8]; void (*tick)();}
_TIMEOUT={&_time,0x80|7,'T','I','M','E','O','U','T',TIMEOUT};

For all words defined in C, the inner interpreter is the address of the C definition. Each word is linked to the previous word in the file and all the header files are linked together by one master link file.
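
A dictionary search over such linked headers might look like the sketch below. It makes several simplifying assumptions that are not in the generated code: every header uses the same fixed-size name field so that one struct type can describe them all, the low five bits of the count byte hold the name length, and the chain ends with a null link. The names Header, NAME_LENGTH_MASK, latest, find, timeout_header and time_header are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef unsigned char Byte;

#define NAME_LENGTH_MASK 0x1F       /* assumption: low five bits are the length */

typedef struct Header {
    struct Header *link;            /* previous word, or NULL at the end of the chain */
    Byte name[16];                  /* count byte (0x80 | length), then the characters */
    void (*tick)(void);             /* for words defined in C, the C function itself */
} Header;

static void TIMEOUT(void) {}        /* empty stand-ins for the real definitions */
static void TIME(void) {}

static Header timeout_header = { NULL, { 0x80|7, 'T','I','M','E','O','U','T' }, TIMEOUT };
static Header time_header    = { &timeout_header, { 0x80|4, 'T','I','M','E' }, TIME };

static Header *latest = &time_header;   /* most recently defined word */

/* Walk the links from the newest word back, comparing length and characters. */
static Header *find(const char *word)
{
    size_t len;
    Header *h;

    assert(word != NULL);
    len = strlen(word);
    for (h = latest; h != NULL; h = h->link) {
        size_t count = h->name[0] & NAME_LENGTH_MASK;
        if (count == len && memcmp(h->name + 1, word, len) == 0)
            return h;
    }
    return NULL;
}
```

Once find("TIME") returns a header, its tick field can be called directly, since for a word defined in C that field is the address of the C function.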

Nonblocking I/O

All the code in the kernel is portable across different platforms thanks to the C standard, with the exception of keyboard input. Each platform has its own way of defining nonblocking I/O. Nonblocking I/O is required since the keyboard input is polled.
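
As one illustration of what the platform-specific piece can look like, here is a sketch for POSIX systems that polls a file descriptor with select() and a zero timeout. It is not the code used in the kernel, and the function names input_ready and poll_key are hypothetical. Other platforms, such as the Macintosh or the PC, would need their own equivalent.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Return nonzero if at least one byte can be read from fd without blocking. */
static int input_ready(int fd)
{
    fd_set readable;
    struct timeval no_wait = { 0, 0 };  /* zero timeout: poll and return at once */

    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&readable);
    FD_SET(fd, &readable);
    return select(fd + 1, &readable, NULL, NULL, &no_wait) > 0;
}

/* Fetch one character if available; return -1 when nothing is waiting. */
static int poll_key(int fd)
{
    unsigned char c;

    if (!input_ready(fd))
        return -1;
    return read(fd, &c, 1) == 1 ? c : -1;
}
```

A polling loop would call poll_key(0) for standard input and carry on with other work whenever it returns -1. On a terminal left in its default line-buffered mode, characters only become readable once a full line has been entered.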

5. Summary

This is a follow-up paper to the one I presented last year ("C Without C"). At this point I have the kernel up and running on three different platforms with three different compilers:

Macintosh with Think C

HP RISC 700 workstations with the native C compiler

HP CISC 400 workstations with the GNU compiler

I've just compiled the kernel on the PC with Turbo C but it isn't running just yet. I've run a sieve benchmark on the three platforms in Forth and in C with different sets of optimization rules. All the times are in seconds:

 platform   SIEVE.F   sieve.c   sieveo.c   sieveopt.c
 Mac        17        7         4.1        1.2
 700        2.5       0.6       0.4        0.14
 400        3.7       1.5       0.7        0.29

At this point I've achieved the goal of having my portable toolkit run on the different platforms on which I work. Now it gets interesting.

