首页 > 其他分享 >Bison

Bison

时间:2024-06-12 20:21:21浏览次数:23  
标签:优先级 precedence expr rule token Bison

Bison

The Yacc-compatible Parser Generator

10 September 2021, Bison Version 3.8.1

by Charles Donnelly and Richard Stallman

目录

1.1 Languages and Context-Free Grammars

Here is a simple C function subdivided into tokens:

int /* keyword 'int' */ square (int x) 
/* identifier, open-paren, keyword 'int', identifier, close-paren */
{ /* open-brace */
    return x * x;
    /* keyword 'return', identifier, asterisk, identifier, semicolon */
} /* close-brace */

The syntactic groupings of C include the expression, the statement, the
declaration, and the function definition. These are represented in the
grammar of C by nonterminal symbols 'expression', 'statement', 'declaration'
and 'function definition'. The full grammar uses dozens of additional
language constructs, each with its own nonterminal symbol, in order to
express the meanings of these four. The example above is a function
definition; it contains one declaration, and one statement. In the statement,
each 'x' is an expression and so is 'x * x'.

One nonterminal symbol must be distinguished as the special one which defines
a complete utterance in the language. It is called the start symbol. In a
compiler, this means a complete input program. In the C language, the
nonterminal symbol 'sequence of definitions and declarations' plays this
role.

5 The Bison Parser Algorithm

As Bison reads tokens, it pushes them onto a stack along with their
semantic values. The stack is called the parser stack. Pushing a token is
traditionally called shifting.

For example, suppose the infix calculator has read '1 + 5 *', with a '3' to
come. The stack will have four elements, one for each token that was shifted.

But the stack does not always have an element for each token read. When the
last n tokens and groupings shifted match the components of a grammar
rule, they can be combined according to that rule. This is called
reduction. Those tokens and groupings are replaced on the stack by a
single grouping whose symbol is the result (left hand side) of that rule.
Running the rule's action is part of the process of reduction, because this
is what computes the semantic value of the resulting grouping.

The parser tries, by shifts and reductions, to reduce the entire input down
to a single grouping whose symbol is the grammar's start-symbol (see 1.1
[Languages and Context-Free Grammars]).

This kind of parser is known in the literature as a bottom-up parser.

5.1 Lookahead Tokens

The Bison parser does not always reduce immediately as soon as the last n
tokens and groupings match a rule.

Here is a simple case where lookahead is needed. These three rules define
expressions which contain binary addition operators and postfix unary
factorial operators ('!'), and allow parentheses for grouping.

expr:
  term '+' expr
| term
;
term:
  '(' expr ')'
| term '!'
| "number"
;

Suppose that the tokens '1 + 2' have been read and shifted; what should be
done?

  • If the following token is '!', then it must be shifted immediately so
    that '2 !' can be reduced to make a term. If instead the parser were to
    reduce before shifting, '1 + 2' would become an expr. It would then be
    impossible to shift the '!' because doing so would produce on the stack the
    sequence of symbols expr '!'. No rule allows that sequence.
  • If the following token is ')', then the first three tokens must be reduced
    to form an expr. This is the only valid course, because shifting the ')'
    would produce a sequence of symbols term ')', and no rule allows this.

阅读理解 5.1

这个例子也是一个关于 优先级(和结合性) 的很好的例子。在规则的定义中,优先级(和结
合性)低的操作数(符号)作为规则的子规则先被计算。换句话说,优先级(和结合性)的实
现,就是将高优先级和低优先级的规则分开,而将高优先级(结合性)的规则(符号)作为低
优先级的规则(符号)的子规则(符号)。

它在这个例子中的体现,包括了:

  • 加法(左结合性)中左侧的操作数,在规则递归中先被算出,即term在语法树中低于expr。
  • 阶乘(优先级高于加法),在规则递归中先被算出,即由term构成expr,而不是反过来。

同时,look ahead是为了保证当前的Recude不影响后面的Reduce。而后面的Reduction可能正好
就作用于优先级(和结合性)更高(在语法树中更底层)的规则。换句话说,look ahead也就是为
了使优先级(和结合性)更高的规则有机会先于更低的规则被执行;在这种情况下,高优先级的规
则作为一个单独且唯一可行的规则,在look ahead中保证了它的操作数的完整性(即不会被先出现
在文本中但是优先级(和结合性)更低的规则"划走")。

5.2 Shift/Reduce Conflicts

Suppose we are parsing a language which has if-then and if-then-else
statements, with a pair of rules like this:

if_stmt:
  "if" expr "then" stmt
| "if" expr "then" stmt "else" stmt
;

When the "else" token is read and becomes the lookahead token, the contents
of the stack (assuming the input is valid) are just right for reduction by
the first rule. But it is also legitimate to shift the "else", because that
would lead to eventual reduction by the second rule.

This situation, where either a shift or a reduction would be valid, is
called a shift/reduce conflict. Bison is designed to resolve these
conflicts by choosing to shift, unless otherwise directed by operator
precedence declarations. (It would ideally be cleaner to write an unambiguous
grammar, but that is very hard to do in this case.) This particular ambiguity
was first encountered in the specifications of Algol 60 and is called the
"dangling else" ambiguity.

  if e1 then  if e2 then s1  • else s2

5.3.1 When Precedence is Needed

Consider the following ambiguous grammar fragment:

expr:
  expr '-' expr
| expr '*' expr
| expr '<' expr
| '(' expr ')'
;...

Suppose the parser has seen the tokens '1', '-' and '2', should it reduce
them via the rule for the subtraction operator?

To decide which one Bison should do, we must consider the results. If the
next operator token op is shifted, then it must be reduced first in
order to permit another opportunity to reduce the difference. The result is
(in effect) '1 - (2 op 3)'. On the other hand, if the subtraction is reduced
before shifting op, the result is '(1 - 2) op 3'. Clearly, then, the
choice of shift or reduce should depend on the relative precedence of the
operators '-' and op: '*' should be shifted first, but not '<'.

What about input such as '1 - 2 - 5', should this be '(1 - 2) - 5' or should
it be '1 - (2 - 5)'? For most operators we prefer the former, which is called
left association. The latter alternative, right association, is desirable for
assignment operators. The choice of left or right association is a matter of
whether the parser chooses to shift or reduce when the stack contains '1 - 2'
and the lookahead token is '-': shifting makes right-associativity.

阅读理解 5.3.1

我们使用的数据结构是栈,所以,规约是从右往左进行的,也就是从栈顶开始的。这决定了:

  • 优先级(和结合性)更高的规则的子规则可以先(于栈内低优先级规则的规约)入栈;
    因为在出栈的时候,它会先于处于栈内更底层的低优先级规则被规约。即为shift。
  • 而在遇到优先级更低的规则时,则需要先将栈内高优先级的规则先做规约。即为reduce。

或者,反过来,这其实决定了:为了可以这样做,并将Bison的规则简化为仅以一次look ahead
决定shift or reduce,需要将高优先级的规则设计为独立且唯一的子规则。
于是,在look ahead的时候:

  • 遇到独立且唯一的(子)规则:做移进
  • 遇到在语法树中更高层的(父)规则:做规约,再做移进

5.3.2 Specifying Operator Precedence

  • %left left-associative
  • %right right-associative
  • %nonassoc, declares that it is an error to find the same operator
    twice in a/an row/expression.
  • %precedence without associativity

The relative precedence of different operators is controlled by the order in
which they are declared. The first precedence/associativity declaration in
the file declares the operators whose precedence is lowest, the next such
declaration declares the operators whose precedence is a little higher, and
so on.

阅读理解 5.3.2
  • 结合性标签同时指定了优先级。结合性相当于是在处理同一优先级内的符号之间的优先级。
  • Bison规定了将低优先级的符号写在相对的上面,高优先级的符号写在相对下面,与这些符号
    在语法树中对应节点的层级是相对应的。

5.3.3 Specifying Precedence Only

Since POSIX Yacc defines only %left, %right, and %nonassoc, which all defines
precedence and associativity, little attention is paid to the fact that
precedence cannot be defined without defining associativity. Yet, sometimes,
when trying to solve a conflict, precedence suffices. In such a case, using
%left, %right, or %nonassoc might hide future (associativity related)
conflicts that would remain hidden.

The dangling else ambiguity if e1 then if e2 then s1 • else s2 can be
solved explicitly.

The conflict involves the reduction of the rule 'IF expr THEN stmt', which
precedence is by default that of its last token (THEN), and the shifting
of the token ELSE. The usual disambiguation (attach the else to the closest
if), shifting must be preferred, i.e., the precedence of ELSE must be
higher than that of THEN. But neither is expected to be involved in an
associativity related conflict, which can be specified as follows.

%precedence THEN
%precedence ELSE

The unary-minus is another typical example where associativity is usually
over-specified.

阅读理解 5.3.3

结合性(和优先级)标签解决移进/规约冲突的token识别机制:

  • 规约:token为当前栈内的已经被移进过的最后一个token
  • 移进:token为look ahead的那个token,移进意味着,后续出栈将优先被弹出栈并规约

5.3.5 How Precedence Works

The first effect of the precedence declarations is to assign precedence
levels to the terminal symbols declared. The second effect is to assign
precedence levels to certain rules: each rule gets its precedence from the
last terminal symbol mentioned in the components. (You can also specify
explicitly the precedence of a rule. See 5.4 [Context-Dependent Precedence])

Finally, the resolution of conflicts works by comparing the precedence of the
rule being considered with that of the lookahead token. If the
token's precedence is higher, the choice is to shift. If the rule's
precedence is higher, the choice is to reduce. If they have equal precedence,
the choice is made based on the associativity of that precedence level. The
verbose output file made by -v (see 9 [Invoking Bison]) says how each
conflict was resolved.

Not all rules and not all tokens have precedence. If either the rule or the
lookahead token has no precedence, then the default is to shift.

5.3.6 Using Precedence For Non Operators

Alternatively, you may give both tokens the same precedence, in which case
associativity is used to solve the conflict. To preserve the shift action,
use right associativity:

%right "then" "else"

Neither solution is perfect however. Since Bison does not provide, so far,
scoped precedence, both force you to declare the precedence of these keywords
with respect to the other operators in your grammar. Therefore, instead of
being warned about new conflicts you would be unaware of (e.g., a shift/
reduce conflict due to 'if test then 1 else 2 + 3' being ambiguous: 'if test
then 1 else (2 + 3)' or '(if test then 1 else 2) + 3'?), the conflict will be
already “fixed”.

标签:优先级,precedence,expr,rule,token,Bison
From: https://www.cnblogs.com/qqiwei/p/18244643

相关文章

  • Flex&Bison
    Flex与Bison《Flex与Bison》阅读笔记Flex和Bison简介第一个flex程序%{intchars=0;intwords=0;intlines=0;%}%%[^\t\n\r\f\v]+{words++;chars+=strlen(yytext);}\n{chars++;lines++;}.{chars++;}%%intmain(int......
  • UBUNTU 18.04.6 在编译linux内核的时候执行make ARCH=arm socfpga_defconfig设置默认
    在编译linux内核的时候执行makeARCH=armsocfpga_defconfig设置默认配置时报错bisonflexnotfound缺少文件:/bin/sh:1:bison:notfound 输入命令sudoapt-getinstallbison进行安装: /bin/sh:1:flex:notfound 输入命令 sudoapt-getinstallflex进行安......
  • flex and bison usage in mysql
    queryparsinginmysqlmysqlsourcecodeversion:8.0.34(fromMYSQL_VERSIONfile)Thisanarticlefromquestionstounderstandings.whichfiledoesmysqlusetodefinesqlgrammar?sql/sql_yacc.yywhatisthenameyyparsereplacedwithinmysql?Sear......
  • linux系统安装bison,解决 These critical programs are missing or too old bison comp
    1、编译 glibc过程中报错../configure--prefix=/opt/glibc-2.272、首先查看bison版本  bison--versionbison-V貌似就没有安装bison。3、使用yum安装bison yuminstallbison 安装成功。 4、查看版本:bison--version 居然这么简单就完成了。5、继续编译 glibc......
  • 1、编译 glibc 过程中报错 ../configure --prefix=/opt/glibc-2.27       2、首
    64位安装包,查看系统位数,安装对应的mysqlLinux系统安装MySQL时,将MySQL-5.6.17-1.el6.x86_64.rpm-bundle.tar包打开,有7个rpm文件,如下:MySQL-client-5.6.17-1.el6.x86_64.rpmMySQL-devel-5.6.17-1.el6.x86_64.rpmMySQL-embedded-5.6.17-1.el6.x86_64.rpmMySQL-server-5.6.17-1.el6.......
  • flex and bison usage in PostgreSQL
    flex/bisonusageinpgsqlInregularbisonusage,wecallyyparse()togetanAST.So,IsearchedforyyparseinPostgreSQLsourcecode,whicheventuallyledmetothebase_yyparse()function.Whatisthat?Ingram.y:%name-prefix="base_yy"%par......
  • Linux 编译报错 /bin/sh: 1: flex: not found 和 /bin/sh: 1: bison: not found 解决
      配置内核菜单报错1、报错(1):/bin/sh:1:flex:notfound解决方案(1) sudoapt-getinstallflex2、报错(2):/bin/sh:1:bison:not......
  • 2022-10-04 语法分析器bison说明
    ​​https://www.gnu.org/software/bison/manual/bison.html​​参考: ​​https://zhuanlan.zhihu.com/p/52326306​​​​https://zhuanlan.zhihu.com/p/120812270​​​......