( C parser framework from "FUNRULES.F" Rob Chapman Apr 24, 1998 )

\ This framework parses C code and discards it. The different states
\ of parsing are represented by rule sets. These rule sets can be
\ augmented to capture, translate or manipulate any part of the code.

\ Tools
: WITHIN  ( n \ first \ last -- f )  >R NUP < SWAP R> > OR 0= ;
: SPACE?  ( -- f )  INPUT C@ DUP BL = SWAP 9 = OR ;
: VISIBLE?  ( c -- f )  BL 1 + ` ~ WITHIN ;
: NAME?  ( s -- f )
   1 + C@ DUP ` a ` z WITHIN  OVER ` A ` Z WITHIN OR
   OVER ` 0 ` 9 WITHIN OR  SWAP ` _ = OR ;

\ Character and keyword parsing
CREATE form  0 C, 256 ALLOT  ( for caching letters, digits, and underscores )
CREATE vgc  2 ALLOT  ( for caching a visible graphic character as a string )

: IN-RULESQ?  ( s -- s' \ f )
   BEGIN  RULE-SETS?
   WHILE  DUP ruleq Q FIND ?DUP
      IF  RESET-RULES NIP YES EXIT  ENDIF
      NEXT-RULE-SET
   REPEAT  RESET-RULES NO ;

: DROP-INVISIBLE  ( n -- )  DROP ;

: ADD-TO-TOKEN  ( -- )  form vgc COUNT +$ ;

: CHECK-KEYWORD  ( -- )
   form C@ IF  form MAKE-WORD inputq PUSH  0 form C!  ENDIF ;

: MORE-INPUT  ( -- )
   INPUT-LINE 0= IF  SHELL-END  ENDIF
   line-no @ 2 > IF  10 1 form C!+ C!  CHECK-KEYWORD  ENDIF ;

: PUT-CHARACTER  ( -- )  INPUT C@+ +IN  1 form C!+ C!  CHECK-KEYWORD ;

: KEY-CHARACTER  ( -- )  ADD-TO-TOKEN CHECK-KEYWORD ;

: CHECK-NAME  ( s -- )
   NAME? IF  ADD-TO-TOKEN  ELSE  CHECK-KEYWORD KEY-CHARACTER  ENDIF ;

: CHECK-PUNCTUATION  ( s -- )
   IN-RULESQ? IF  CHECK-KEYWORD inputq PUSH  ELSE  CHECK-NAME  ENDIF ;

: CHECK-VISIBILITY  ( c -- )
   DUP VISIBLE?
   IF  1 vgc C!+ C!-  CHECK-PUNCTUATION
   ELSE  DROP-INVISIBLE CHECK-KEYWORD
   ENDIF ;

: CHECK-CHARACTER  ( -- )  INPUT C@+ +IN  CHECK-VISIBILITY ;

: CHECK-SPACE  ( -- )
   SPACE?
   IF  form C@ IF  CHECK-KEYWORD  ELSE  PUT-CHARACTER  ENDIF
   ELSE  CHECK-CHARACTER
   ENDIF ;

: CPARSE  ( -- )
   INPUT C@
   IF  CHECK-SPACE
   ELSE  form C@ IF  CHECK-KEYWORD  ELSE  MORE-INPUT  ENDIF
   ENDIF ;

\ Rule sets as parsing states
RULE-SET comment  ( used as a comment state )
  [ CPARSE ]
  { }[ inputq PULL DROP ]
  { * / }[ CHANGE-RULES ]  ( go back to calling rule set )

RULE-SET quote  ( used as a quote state )
  [ CPARSE ]
  { }[ inputq PULL DROP ]
  { " }[ CHANGE-RULES ]  ( go back to calling rule set )
  { \ " }[ ]

RULE-SET tick  ( used as a tick state )
  [ CPARSE ]
  { }[ inputq PULL DROP ]
  { ' }[ CHANGE-RULES ]  ( go back to calling rule set )
  { \ ' }[ ]

RULE-SET block  ( used as a block state )
  [ CPARSE ]
  { }[ inputq PULL DROP ]
  | } |[ CHANGE-RULES ]  ( go back to calling rule set )
  | { |[ RULES> block >RULES ]
  { " }[ RULES> quote >RULES ]
  { ' }[ RULES> tick >RULES ]
  { / * }[ RULES> comment >RULES ]

RULE-SET paren  ( used as a parenthesis state )
  [ CPARSE ]
  { }[ inputq PULL DROP ]
  { ; }[ CHANGE-RULES ]
  | { |[ block CHANGE-RULES ]
  { / * }[ RULES> comment >RULES ]

RULE-SET parameters  ( used as a parameters state )
  [ CPARSE ]
  { }[ inputq PULL DROP ]
  | { |[ RULES> block >RULES ]
  { ; }[ CHANGE-RULES ]{ ; }
  { , }[ CHANGE-RULES ]{ , }
  { / * }[ RULES> comment >RULES ]

RULE-SET declaration  ( used as a defining state )
  [ CPARSE ]
  { }[ inputq PULL DROP ]
  { ; }[ CHANGE-RULES ]  ( go back to calling rule set )
  { ( }[ paren CHANGE-RULES ]
  { [ }{ = }[ RULES> parameters >RULES ]
  { / * }[ RULES> comment >RULES ]

RULE-SET typedef
  [ CPARSE ]
  { }[ inputq PULL DROP ]
  { / * }[ RULES> comment >RULES ]
  | { |[ RULES> block >RULES ]
  { ; }[ CHANGE-RULES NO rule-echo ! ]

RULE-SET c  ( framework for parsing C source )
  [ CPARSE ]
  { }[ inputq PULL DROP ]
  { signed }{ unsigned }{ }
  { void }{ char }{ short }{ int }{ long }{ float }{ double }{ declare a new word }
  { declare a new word }[ RULES> declaration >RULES ]
  { const }{ register }{ extern }{ volatile }{ static }{ }
  { / * }[ RULES> comment >RULES ]
  { # }[ 0 SCAN ]  ( warning: won't do multiline; should define { \ } )
\ { # typedef }[ YES rule-echo ! RULES> typedef >RULES ]

\ Always start off a translation in the c state
: TRANSLATOR-INIT  TRANSLATOR-INIT c RULES ;