This note introduces the workflow of `parse.pl`, part of the ecpg precompiler. Run it as follows:

    perl parse.pl . ../../../backend/parser/gram.y

workflow
- Load `ecpg.addons` into an in-memory hash table. The key is composed of the string literals from a production rule, concatenated with no delimiter. The value is itself a hash table with two keys, `type` and `lines`. The `type` value is one of `block`, `rule`, or `addon`. The `lines` value is the code that follows the addon definition. For more detail, see `src/interfaces/ecpg/preproc/README.parser`.
  - For the `block` type, the attached code is written out in full as the new semantic action.
  - For the `rule` type, the attached new rules are appended directly to the original rule.
  - For the `addon` type, the attached code is prepended to the original semantic action.
- Read the `gram.y` file line by line until end of file:
  - Split the line on spaces into an array.
  - Load the contents of the `ecpg.tokens` file into the memory buffer tagged `tokens`, if not yet loaded.
  - Load the contents of the `ecpg.header` file into the memory buffer tagged `header`, if not yet loaded.
  - Load the contents of the `ecpg.types` file into the memory buffer tagged `ecpgtype`, if not yet loaded.
  - For each token line in `gram.y` that has no token type specified (this includes `%token`, `%nonassoc`, `%left`, etc.), add each word to the `tokens` set and rejoin the words with single spaces. For the token line `%nonassoc IDENT`, add one more token line, `%nonassoc CSTRING`. Finally, add the rejoined token line to the memory buffer tagged `orig_tokens`.
  - Skip other lines until the bison grammar rules section is reached.
  - Read each rule up to the rule delimiter `;`. Along the way, skip the semantic actions and look only at the rule symbols.
    - If the rule symbol is in the `replace_token` hash table, update the rule symbol.
    - If the rule symbol is a non-terminal symbol,
      - and is not defined in the `replace_types` hash table, then set the rule symbol type to `str` and mark this rule as 'copymode'.
      - and is marked to be ignored, then move on to the next line.
      - populate the memory buffer tagged 'rules' with this non-terminal symbol.
      - if the non-terminal symbol is `stmt`, remember the state.
      - define the type of the non-terminal symbol, such as `%type <str> stmt`, and then populate the memory buffer tagged 'types'.
      - remember that we are in a rule and are going to process the remaining fields.
    - If the rule symbol is `%prec`, mark this state.
    - If we are in 'copymode', no `%prec` has been found, and we are processing the remaining fields,
      - if either of the following two conditions is met:
        - the symbol is not 'Op', and it is in the `tokens` set or it is a single-quoted string
        - we are in the `stmt` rule

        then take the target string from the `replace_string` hash table if the symbol is in it, otherwise use the symbol string itself as the target. Push the target string, lowercased if we are not in the `stmt` rule, onto the `fields` array.
      - else, push `$n` onto the `fields` array, where `n` is one plus the length of the `fields` array.
- Dump the memory buffers in the following order:
  - header
  - tokens
  - types
  - ecpgtype
  - orig_tokens
  - rules
  - trailer

  The `trailer` buffer is loaded with the contents of the `ecpg.trailer` file.
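
The addon handling in the first workflow step can be illustrated with a small sketch. It is written in Python rather than Perl and is not the code of `parse.pl`; the function names `addon_key` and `apply_addon`, the sample entries, and the choice to prefix the key with the non-terminal name are all assumptions made for illustration.

```python
# Hypothetical in-memory form of ecpg.addons:
# key   = rule literals joined with no delimiter
# value = {"type": "block" | "rule" | "addon", "lines": [code lines]}
addons = {
    "stmtFooStmt": {"type": "block", "lines": ["/* replacement action */"]},
    "bar_argsNEXT": {"type": "addon", "lines": ["/* extra code */"]},
    "baz_valueNum": {"type": "rule", "lines": ["| extra_alternative { /* action */ }"]},
}


def addon_key(nonterminal, symbols):
    """Build a lookup key by joining the names with no delimiter (assumed layout)."""
    return nonterminal + "".join(symbols)


def apply_addon(addons, key, default_action):
    """Return (action_lines, extra_rule_lines) for one production."""
    entry = addons.get(key)
    if entry is None:
        # No addon registered: keep the generated action unchanged.
        return list(default_action), []
    kind, lines = entry["type"], entry["lines"]
    if kind == "block":
        # The attached code becomes the whole semantic action.
        return list(lines), []
    if kind == "addon":
        # The attached code is prepended to the original action.
        return list(lines) + list(default_action), []
    if kind == "rule":
        # The attached rules are appended after the original rule.
        return list(default_action), list(lines)
    raise ValueError("unknown addon type: " + kind)
```

For example, `apply_addon(addons, addon_key("bar_args", ["NEXT"]), ["$$ = $1;"])` yields the extra code followed by the default action, with no extra rule lines.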
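
The token-line step can be sketched in a similar way. This is a minimal Python illustration, not the Perl implementation: the function name `process_token_line` and the directive list are hypothetical, and it assumes that a typed token line is recognizable by a `<type>` tag in the second word and that only the words after the directive go into the `tokens` set.

```python
# Directives treated as token lines in this sketch (assumed list).
TOKEN_DIRECTIVES = ("%token", "%nonassoc", "%left", "%right")


def process_token_line(line, tokens, orig_tokens):
    """Record an untyped token line; return True if the line was handled."""
    words = line.split()
    if not words or words[0] not in TOKEN_DIRECTIVES:
        return False
    if len(words) > 1 and words[1].startswith("<"):
        # A token type such as <str> is specified: not handled here.
        return False
    # Add each token name to the set.
    tokens.update(words[1:])
    # Rejoin the words with single spaces.
    orig_tokens.append(" ".join(words))
    if words[0] == "%nonassoc" and "IDENT" in words[1:]:
        # IDENT gets a companion token line for CSTRING.
        orig_tokens.append("%nonassoc CSTRING")
    return True
```

Calling it on `"%nonassoc   IDENT"` with an empty set and an empty list leaves `IDENT` in the set and two lines in the list: the normalized original and `%nonassoc CSTRING`.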
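
The 'copymode' field handling is the most intricate step, so a short sketch may help. It is a Python approximation with a hypothetical function name, `build_fields`, and made-up sample symbols. It assumes the two listed conditions are alternatives (either one suffices), because lowercasing the target only outside the `stmt` rule would otherwise never apply.

```python
def build_fields(symbols, tokens, replace_string, in_stmt):
    """Turn the symbols of one rule into the fields used for its action."""
    fields = []
    for sym in symbols:
        # A single-quoted string such as '(' counts as a literal.
        is_literal = len(sym) >= 3 and sym[0] == "'" and sym[-1] == "'"
        if (sym != "Op" and (sym in tokens or is_literal)) or in_stmt:
            # Use the replacement string when one is defined.
            target = replace_string.get(sym, sym)
            # Keep the case inside the stmt rule, lowercase elsewhere.
            fields.append(target if in_stmt else target.lower())
        else:
            # Positional reference: $n, where n is len(fields) + 1.
            fields.append("$" + str(len(fields) + 1))
    return fields


# Illustrative inputs only; the real tables are built by the script.
example = build_fields(
    ["CREATE", "ColId", "TYPECAST", "Op"],
    tokens={"CREATE", "TYPECAST", "Op"},
    replace_string={"TYPECAST": "::"},
    in_stmt=False,
)
print(example)  # ['create', '$2', '::', '$4']
```

Keywords and literals are copied into the output as text, while every other symbol becomes a positional reference to its own value.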
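
The final dump step amounts to concatenating the tagged buffers in a fixed order. A minimal Python sketch, assuming each buffer is a list of lines stored in a dictionary keyed by tag (the name `dump_buffers` is hypothetical):

```python
# Output order of the tagged buffers, as listed in the workflow.
DUMP_ORDER = ["header", "tokens", "types", "ecpgtype",
              "orig_tokens", "rules", "trailer"]


def dump_buffers(buffers):
    """Concatenate the buffers in DUMP_ORDER; missing tags are skipped."""
    return "".join(
        line for tag in DUMP_ORDER for line in buffers.get(tag, [])
    )
```

The insertion order of the dictionary does not matter, since the output order comes from `DUMP_ORDER` alone.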