bison/src/bison/2.4.1/bison-2.4.1-src/doc/bison.info - Issue 10807020: Add native Windows binary for bison.

Issue 10807020: Add native Windows binary for bison. (Closed) Base URL: svn://chrome-svn/chrome/trunk/deps/third_party/

Patch Set: Created 8 years, 5 months ago

OLD	NEW
(Empty)
	1 This is ../../bison-2.4.1-src/doc/bison.info, produced by makeinfo

	2 version 4.8 from ../../bison-2.4.1-src/doc/bison.texinfo.

	3

	4 This manual (19 November 2008) is for GNU Bison (version 2.4.1), the

	5 GNU parser generator.

	6

	7 Copyright (C) 1988, 1989, 1990, 1991, 1992, 1993, 1995, 1998, 1999,

	8 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008 Free Software

	9 Foundation, Inc.

	10

	11 Permission is granted to copy, distribute and/or modify this

	12 document under the terms of the GNU Free Documentation License,

	13 Version 1.2 or any later version published by the Free Software

	14 Foundation; with no Invariant Sections, with the Front-Cover texts

	15 being "A GNU Manual," and with the Back-Cover Texts as in (a)

	16 below. A copy of the license is included in the section entitled

	17 "GNU Free Documentation License."

	18

	19 (a) The FSF's Back-Cover Text is: "You have the freedom to copy and

	20 modify this GNU manual. Buying copies from the FSF supports it in

	21 developing GNU and promoting software freedom."

	22

	23 INFO-DIR-SECTION Software development

	24 START-INFO-DIR-ENTRY

	25 * bison: (bison). GNU parser generator (Yacc replacement).

	26 END-INFO-DIR-ENTRY

	27

	28

	29 File: bison.info, Node: Top, Next: Introduction, Up: (dir)

	30

	31 Bison

	32 *****

	33

	34 This manual (19 November 2008) is for GNU Bison (version 2.4.1), the

	35 GNU parser generator.

	36

	37 Copyright (C) 1988, 1989, 1990, 1991, 1992, 1993, 1995, 1998, 1999,

	38 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008 Free Software

	39 Foundation, Inc.

	40

	41 Permission is granted to copy, distribute and/or modify this

	42 document under the terms of the GNU Free Documentation License,

	43 Version 1.2 or any later version published by the Free Software

	44 Foundation; with no Invariant Sections, with the Front-Cover texts

	45 being "A GNU Manual," and with the Back-Cover Texts as in (a)

	46 below. A copy of the license is included in the section entitled

	47 "GNU Free Documentation License."

	48

	49 (a) The FSF's Back-Cover Text is: "You have the freedom to copy and

	50 modify this GNU manual. Buying copies from the FSF supports it in

	51 developing GNU and promoting software freedom."

	52

	53 * Menu:

	54

	55 * Introduction::

	56 * Conditions::

	57 * Copying:: The GNU General Public License says

	58 how you can copy and share Bison.

	59

	60 Tutorial sections:

	61 * Concepts:: Basic concepts for understanding Bison.

	62 * Examples:: Three simple explained examples of using Bison.

	63

	64 Reference sections:

	65 * Grammar File:: Writing Bison declarations and rules.

	66 * Interface:: C-language interface to the parser function `yyparse'.

	67 * Algorithm:: How the Bison parser works at run-time.

	68 * Error Recovery:: Writing rules for error recovery.

	69 * Context Dependency:: What to do if your language syntax is too

	70 messy for Bison to handle straightforwardly.

	71 * Debugging:: Understanding or debugging Bison parsers.

	72 * Invocation:: How to run Bison (to produce the parser source file).

	73 * Other Languages:: Creating C++ and Java parsers.

	74 * FAQ:: Frequently Asked Questions

	75 * Table of Symbols:: All the keywords of the Bison language are explained.

	76 * Glossary:: Basic concepts are explained.

	77 * Copying This Manual:: License for copying this manual.

	78 * Index:: Cross-references to the text.

	79

	80 --- The Detailed Node Listing ---

	81

	82 The Concepts of Bison

	83

	84 * Language and Grammar:: Languages and context-free grammars,

	85 as mathematical ideas.

	86 * Grammar in Bison:: How we represent grammars for Bison's sake.

	87 * Semantic Values:: Each token or syntactic grouping can have

	88 a semantic value (the value of an integer,

	89 the name of an identifier, etc.).

	90 * Semantic Actions:: Each rule can have an action containing C code.

	91 * GLR Parsers:: Writing parsers for general context-free languages.

	92 * Locations Overview:: Tracking Locations.

	93 * Bison Parser:: What are Bison's input and output,

	94 how is the output used?

	95 * Stages:: Stages in writing and running Bison grammars.

	96 * Grammar Layout:: Overall structure of a Bison grammar file.

	97

	98 Writing GLR Parsers

	99

	100 * Simple GLR Parsers:: Using GLR parsers on unambiguous grammars.

	101 * Merging GLR Parses:: Using GLR parsers to resolve ambiguities.

	102 * GLR Semantic Actions:: Deferred semantic actions have special concerns.

	103 * Compiler Requirements:: GLR parsers require a modern C compiler.

	104

	105 Examples

	106

	107 * RPN Calc:: Reverse polish notation calculator;

	108 a first example with no operator precedence.

	109 * Infix Calc:: Infix (algebraic) notation calculator.

	110 Operator precedence is introduced.

	111 * Simple Error Recovery:: Continuing after syntax errors.

	112 * Location Tracking Calc:: Demonstrating the use of @N and @$.

	113 * Multi-function Calc:: Calculator with memory and trig functions.

	114 It uses multiple data-types for semantic values.

	115 * Exercises:: Ideas for improving the multi-function calculator.

	116

	117 Reverse Polish Notation Calculator

	118

	119 * Rpcalc Declarations:: Prologue (declarations) for rpcalc.

	120 * Rpcalc Rules:: Grammar Rules for rpcalc, with explanation.

	121 * Rpcalc Lexer:: The lexical analyzer.

	122 * Rpcalc Main:: The controlling function.

	123 * Rpcalc Error:: The error reporting function.

	124 * Rpcalc Generate:: Running Bison on the grammar file.

	125 * Rpcalc Compile:: Run the C compiler on the output code.

	126

	127 Grammar Rules for `rpcalc'

	128

	129 * Rpcalc Input::

	130 * Rpcalc Line::

	131 * Rpcalc Expr::

	132

	133 Location Tracking Calculator: `ltcalc'

	134

	135 * Ltcalc Declarations:: Bison and C declarations for ltcalc.

	136 * Ltcalc Rules:: Grammar rules for ltcalc, with explanations.

	137 * Ltcalc Lexer:: The lexical analyzer.

	138

	139 Multi-Function Calculator: `mfcalc'

	140

	141 * Mfcalc Declarations:: Bison declarations for multi-function calculator.

	142 * Mfcalc Rules:: Grammar rules for the calculator.

	143 * Mfcalc Symbol Table:: Symbol table management subroutines.

	144

	145 Bison Grammar Files

	146

	147 * Grammar Outline:: Overall layout of the grammar file.

	148 * Symbols:: Terminal and nonterminal symbols.

	149 * Rules:: How to write grammar rules.

	150 * Recursion:: Writing recursive rules.

	151 * Semantics:: Semantic values and actions.

	152 * Locations:: Locations and actions.

	153 * Declarations:: All kinds of Bison declarations are described here.

	154 * Multiple Parsers:: Putting more than one Bison parser in one program.

	155

	156 Outline of a Bison Grammar

	157

	158 * Prologue:: Syntax and usage of the prologue.

	159 * Prologue Alternatives:: Syntax and usage of alternatives to the prologue.

	160 * Bison Declarations:: Syntax and usage of the Bison declarations section.

	161 * Grammar Rules:: Syntax and usage of the grammar rules section.

	162 * Epilogue:: Syntax and usage of the epilogue.

	163

	164 Defining Language Semantics

	165

	166 * Value Type:: Specifying one data type for all semantic values.

	167 * Multiple Types:: Specifying several alternative data types.

	168 * Actions:: An action is the semantic definition of a grammar rule.

	169 * Action Types:: Specifying data types for actions to operate on.

	170 * Mid-Rule Actions:: Most actions go at the end of a rule.

	171 This says when, why and how to use the exceptional

	172 action in the middle of a rule.

	173

	174 Tracking Locations

	175

	176 * Location Type:: Specifying a data type for locations.

	177 * Actions and Locations:: Using locations in actions.

	178 * Location Default Action:: Defining a general way to compute locations.

	179

	180 Bison Declarations

	181

	182 * Require Decl:: Requiring a Bison version.

	183 * Token Decl:: Declaring terminal symbols.

	184 * Precedence Decl:: Declaring terminals with precedence and associativity.

	185 * Union Decl:: Declaring the set of all semantic value types.

	186 * Type Decl:: Declaring the choice of type for a nonterminal symbol.

	187 * Initial Action Decl:: Code run before parsing starts.

	188 * Destructor Decl:: Declaring how symbols are freed.

	189 * Expect Decl:: Suppressing warnings about parsing conflicts.

	190 * Start Decl:: Specifying the start symbol.

	191 * Pure Decl:: Requesting a reentrant parser.

	192 * Push Decl:: Requesting a push parser.

	193 * Decl Summary:: Table of all Bison declarations.

	194

	195 Parser C-Language Interface

	196

	197 * Parser Function:: How to call `yyparse' and what it returns.

	198 * Push Parser Function:: How to call `yypush_parse' and what it returns.

	199 * Pull Parser Function:: How to call `yypull_parse' and what it returns.

	200 * Parser Create Function:: How to call `yypstate_new' and what it returns.

	201 * Parser Delete Function:: How to call `yypstate_delete' and what it returns.

	202 * Lexical:: You must supply a function `yylex'

	203 which reads tokens.

	204 * Error Reporting:: You must supply a function `yyerror'.

	205 * Action Features:: Special features for use in actions.

	206 * Internationalization:: How to let the parser speak in the user's

	207 native language.

	208

	209 The Lexical Analyzer Function `yylex'

	210

	211 * Calling Convention:: How `yyparse' calls `yylex'.

	212 * Token Values:: How `yylex' must return the semantic value

	213 of the token it has read.

	214 * Token Locations:: How `yylex' must return the text location

	215 (line number, etc.) of the token, if the

	216 actions want that.

	217 * Pure Calling:: How the calling convention differs in a pure parser

	218 (*note A Pure (Reentrant) Parser: Pure Decl.).

	219

	220 The Bison Parser Algorithm

	221

	222 * Lookahead:: Parser looks one token ahead when deciding what to do.

	223 * Shift/Reduce:: Conflicts: when either shifting or reduction is valid.

	224 * Precedence:: Operator precedence works by resolving conflicts.

	225 * Contextual Precedence:: When an operator's precedence depends on context.

	226 * Parser States:: The parser is a finite-state-machine with stack.

	227 * Reduce/Reduce:: When two rules are applicable in the same situation.

	228 * Mystery Conflicts:: Reduce/reduce conflicts that look unjustified.

	229 * Generalized LR Parsing:: Parsing arbitrary context-free grammars.

	230 * Memory Management:: What happens when memory is exhausted. How to avoid it.

	231

	232 Operator Precedence

	233

	234 * Why Precedence:: An example showing why precedence is needed.

	235 * Using Precedence:: How to specify precedence in Bison grammars.

	236 * Precedence Examples:: How these features are used in the previous example.

	237 * How Precedence:: How they work.

	238

	239 Handling Context Dependencies

	240

	241 * Semantic Tokens:: Token parsing can depend on the semantic context.

	242 * Lexical Tie-ins:: Token parsing can depend on the syntactic context.

	243 * Tie-in Recovery:: Lexical tie-ins have implications for how

	244 error recovery rules must be written.

	245

	246 Debugging Your Parser

	247

	248 * Understanding:: Understanding the structure of your parser.

	249 * Tracing:: Tracing the execution of your parser.

	250

	251 Invoking Bison

	252

	253 * Bison Options:: All the options described in detail,

	254 in alphabetical order by short options.

	255 * Option Cross Key:: Alphabetical list of long options.

	256 * Yacc Library:: Yacc-compatible `yylex' and `main'.

	257

	258 Parsers Written In Other Languages

	259

	260 * C++ Parsers:: The interface to generate C++ parser classes

	261 * Java Parsers:: The interface to generate Java parser classes

	262

	263 C++ Parsers

	264

	265 * C++ Bison Interface:: Asking for C++ parser generation

	266 * C++ Semantic Values:: %union vs. C++

	267 * C++ Location Values:: The position and location classes

	268 * C++ Parser Interface:: Instantiating and running the parser

	269 * C++ Scanner Interface:: Exchanges between yylex and parse

	270 * A Complete C++ Example:: Demonstrating their use

	271

	272 A Complete C++ Example

	273

	274 * Calc++ --- C++ Calculator:: The specifications

	275 * Calc++ Parsing Driver:: An active parsing context

	276 * Calc++ Parser:: A parser class

	277 * Calc++ Scanner:: A pure C++ Flex scanner

	278 * Calc++ Top Level:: Conducting the band

	279

	280 Java Parsers

	281

	282 * Java Bison Interface:: Asking for Java parser generation

	283 * Java Semantic Values:: %type and %token vs. Java

	284 * Java Location Values:: The position and location classes

	285 * Java Parser Interface:: Instantiating and running the parser

	286 * Java Scanner Interface:: Specifying the scanner for the parser

	287 * Java Action Features:: Special features for use in actions

	288 * Java Differences:: Differences between C/C++ and Java Grammars

	289 * Java Declarations Summary:: List of Bison declarations used with Java

	290

	291 Frequently Asked Questions

	292

	293 * Memory Exhausted:: Breaking the Stack Limits

	294 * How Can I Reset the Parser:: `yyparse' Keeps some State

	295 * Strings are Destroyed:: `yylval' Loses Track of Strings

	296 * Implementing Gotos/Loops:: Control Flow in the Calculator

	297 * Multiple start-symbols:: Factoring closely related grammars

	298 * Secure? Conform?:: Is Bison POSIX safe?

	299 * I can't build Bison:: Troubleshooting

	300 * Where can I find help?:: Troubleshouting

	301 * Bug Reports:: Troublereporting

	302 * More Languages:: Parsers in C++, Java, and so on

	303 * Beta Testing:: Experimenting development versions

	304 * Mailing Lists:: Meeting other Bison users

	305

	306 Copying This Manual

	307

	308 * Copying This Manual:: License for copying this manual.

	309

	310

	311 File: bison.info, Node: Introduction, Next: Conditions, Prev: Top, Up: Top

	312

	313 Introduction

	314 ************

	315

	316 "Bison" is a general-purpose parser generator that converts an

	317 annotated context-free grammar into an LALR(1) or GLR parser for that

	318 grammar. Once you are proficient with Bison, you can use it to develop

	319 a wide range of language parsers, from those used in simple desk

	320 calculators to complex programming languages.

	321

	322 Bison is upward compatible with Yacc: all properly-written Yacc

	323 grammars ought to work with Bison with no change. Anyone familiar with

	324 Yacc should be able to use Bison with little trouble. You need to be

	325 fluent in C or C++ programming in order to use Bison or to understand

	326 this manual.

	327

	328 We begin with tutorial chapters that explain the basic concepts of

	329 using Bison and show three explained examples, each building on the

	330 last. If you don't know Bison or Yacc, start by reading these

	331 chapters. Reference chapters follow which describe specific aspects of

	332 Bison in detail.

	333

	334 Bison was written primarily by Robert Corbett; Richard Stallman made

	335 it Yacc-compatible. Wilfred Hansen of Carnegie Mellon University added

	336 multi-character string literals and other features.

	337

	338 This edition corresponds to version 2.4.1 of Bison.

	339

	340

	341 File: bison.info, Node: Conditions, Next: Copying, Prev: Introduction, Up: Top

	342

	343 Conditions for Using Bison

	344 **************************

	345

	346 The distribution terms for Bison-generated parsers permit using the

	347 parsers in nonfree programs. Before Bison version 2.2, these extra

	348 permissions applied only when Bison was generating LALR(1) parsers in

	349 C. And before Bison version 1.24, Bison-generated parsers could be

	350 used only in programs that were free software.

	351

	352 The other GNU programming tools, such as the GNU C compiler, have

	353 never had such a requirement. They could always be used for nonfree

	354 software. The reason Bison was different was not due to a special

	355 policy decision; it resulted from applying the usual General Public

	356 License to all of the Bison source code.

	357

	358 The output of the Bison utility--the Bison parser file--contains a

	359 verbatim copy of a sizable piece of Bison, which is the code for the

	360 parser's implementation. (The actions from your grammar are inserted

	361 into this implementation at one point, but most of the rest of the

	362 implementation is not changed.) When we applied the GPL terms to the

	363 skeleton code for the parser's implementation, the effect was to

	364 restrict the use of Bison output to free software.

	365

	366 We didn't change the terms because of sympathy for people who want to

	367 make software proprietary. Software should be free. But we

	368 concluded that limiting Bison's use to free software was doing little to

	369 encourage people to make other software free. So we decided to make the

	370 practical conditions for using Bison match the practical conditions for

	371 using the other GNU tools.

	372

	373 This exception applies when Bison is generating code for a parser.

	374 You can tell whether the exception applies to a Bison output file by

	375 inspecting the file for text beginning with "As a special

	376 exception...". The text spells out the exact terms of the exception.

	377

	378

	379 File: bison.info, Node: Copying, Next: Concepts, Prev: Conditions, Up: Top

	380

	381 GNU GENERAL PUBLIC LICENSE

	382 **************************

	383

	384 Version 3, 29 June 2007

	385

	386 Copyright (C) 2007 Free Software Foundation, Inc. `http://fsf.org/'

	387

	388 Everyone is permitted to copy and distribute verbatim copies of this

	389 license document, but changing it is not allowed.

	390

	391 Preamble

	392 ========

	393

	394 The GNU General Public License is a free, copyleft license for software

	395 and other kinds of works.

	396

	397 The licenses for most software and other practical works are designed

	398 to take away your freedom to share and change the works. By contrast,

	399 the GNU General Public License is intended to guarantee your freedom to

	400 share and change all versions of a program--to make sure it remains

	401 free software for all its users. We, the Free Software Foundation, use

	402 the GNU General Public License for most of our software; it applies

	403 also to any other work released this way by its authors. You can apply

	404 it to your programs, too.

	405

	406 When we speak of free software, we are referring to freedom, not

	407 price. Our General Public Licenses are designed to make sure that you

	408 have the freedom to distribute copies of free software (and charge for

	409 them if you wish), that you receive source code or can get it if you

	410 want it, that you can change the software or use pieces of it in new

	411 free programs, and that you know you can do these things.

	412

	413 To protect your rights, we need to prevent others from denying you

	414 these rights or asking you to surrender the rights. Therefore, you

	415 have certain responsibilities if you distribute copies of the software,

	416 or if you modify it: responsibilities to respect the freedom of others.

	417

	418 For example, if you distribute copies of such a program, whether

	419 gratis or for a fee, you must pass on to the recipients the same

	420 freedoms that you received. You must make sure that they, too, receive

	421 or can get the source code. And you must show them these terms so they

	422 know their rights.

	423

	424 Developers that use the GNU GPL protect your rights with two steps:

	425 (1) assert copyright on the software, and (2) offer you this License

	426 giving you legal permission to copy, distribute and/or modify it.

	427

	428 For the developers' and authors' protection, the GPL clearly explains

	429 that there is no warranty for this free software. For both users' and

	430 authors' sake, the GPL requires that modified versions be marked as

	431 changed, so that their problems will not be attributed erroneously to

	432 authors of previous versions.

	433

	434 Some devices are designed to deny users access to install or run

	435 modified versions of the software inside them, although the

	436 manufacturer can do so. This is fundamentally incompatible with the

	437 aim of protecting users' freedom to change the software. The

	438 systematic pattern of such abuse occurs in the area of products for

	439 individuals to use, which is precisely where it is most unacceptable.

	440 Therefore, we have designed this version of the GPL to prohibit the

	441 practice for those products. If such problems arise substantially in

	442 other domains, we stand ready to extend this provision to those domains

	443 in future versions of the GPL, as needed to protect the freedom of

	444 users.

	445

	446 Finally, every program is threatened constantly by software patents.

	447 States should not allow patents to restrict development and use of

	448 software on general-purpose computers, but in those that do, we wish to

	449 avoid the special danger that patents applied to a free program could

	450 make it effectively proprietary. To prevent this, the GPL assures that

	451 patents cannot be used to render the program non-free.

	452

	453 The precise terms and conditions for copying, distribution and

	454 modification follow.

	455

	456 TERMS AND CONDITIONS

	457 ====================

	458

	459 0. Definitions.

	460

	461 "This License" refers to version 3 of the GNU General Public

	462 License.

	463

	464 "Copyright" also means copyright-like laws that apply to other

	465 kinds of works, such as semiconductor masks.

	466

	467 "The Program" refers to any copyrightable work licensed under this

	468 License. Each licensee is addressed as "you". "Licensees" and

	469 "recipients" may be individuals or organizations.

	470

	471 To "modify" a work means to copy from or adapt all or part of the

	472 work in a fashion requiring copyright permission, other than the

	473 making of an exact copy. The resulting work is called a "modified

	474 version" of the earlier work or a work "based on" the earlier work.

	475

	476 A "covered work" means either the unmodified Program or a work

	477 based on the Program.

	478

	479 To "propagate" a work means to do anything with it that, without

	480 permission, would make you directly or secondarily liable for

	481 infringement under applicable copyright law, except executing it

	482 on a computer or modifying a private copy. Propagation includes

	483 copying, distribution (with or without modification), making

	484 available to the public, and in some countries other activities as

	485 well.

	486

	487 To "convey" a work means any kind of propagation that enables other

	488 parties to make or receive copies. Mere interaction with a user

	489 through a computer network, with no transfer of a copy, is not

	490 conveying.

	491

	492 An interactive user interface displays "Appropriate Legal Notices"

	493 to the extent that it includes a convenient and prominently visible

	494 feature that (1) displays an appropriate copyright notice, and (2)

	495 tells the user that there is no warranty for the work (except to

	496 the extent that warranties are provided), that licensees may

	497 convey the work under this License, and how to view a copy of this

	498 License. If the interface presents a list of user commands or

	499 options, such as a menu, a prominent item in the list meets this

	500 criterion.

	501

	502 1. Source Code.

	503

	504 The "source code" for a work means the preferred form of the work

	505 for making modifications to it. "Object code" means any

	506 non-source form of a work.

	507

	508 A "Standard Interface" means an interface that either is an

	509 official standard defined by a recognized standards body, or, in

	510 the case of interfaces specified for a particular programming

	511 language, one that is widely used among developers working in that

	512 language.

	513

	514 The "System Libraries" of an executable work include anything,

	515 other than the work as a whole, that (a) is included in the normal

	516 form of packaging a Major Component, but which is not part of that

	517 Major Component, and (b) serves only to enable use of the work

	518 with that Major Component, or to implement a Standard Interface

	519 for which an implementation is available to the public in source

	520 code form. A "Major Component", in this context, means a major

	521 essential component (kernel, window system, and so on) of the

	522 specific operating system (if any) on which the executable work

	523 runs, or a compiler used to produce the work, or an object code

	524 interpreter used to run it.

	525

	526 The "Corresponding Source" for a work in object code form means all

	527 the source code needed to generate, install, and (for an executable

	528 work) run the object code and to modify the work, including

	529 scripts to control those activities. However, it does not include

	530 the work's System Libraries, or general-purpose tools or generally

	531 available free programs which are used unmodified in performing

	532 those activities but which are not part of the work. For example,

	533 Corresponding Source includes interface definition files

	534 associated with source files for the work, and the source code for

	535 shared libraries and dynamically linked subprograms that the work

	536 is specifically designed to require, such as by intimate data

	537 communication or control flow between those subprograms and other

	538 parts of the work.

	539

	540 The Corresponding Source need not include anything that users can

	541 regenerate automatically from other parts of the Corresponding

	542 Source.

	543

	544 The Corresponding Source for a work in source code form is that

	545 same work.

	546

	547 2. Basic Permissions.

	548

	549 All rights granted under this License are granted for the term of

	550 copyright on the Program, and are irrevocable provided the stated

	551 conditions are met. This License explicitly affirms your unlimited

	552 permission to run the unmodified Program. The output from running

	553 a covered work is covered by this License only if the output,

	554 given its content, constitutes a covered work. This License

	555 acknowledges your rights of fair use or other equivalent, as

	556 provided by copyright law.

	557

	558 You may make, run and propagate covered works that you do not

	559 convey, without conditions so long as your license otherwise

	560 remains in force. You may convey covered works to others for the

	561 sole purpose of having them make modifications exclusively for

	562 you, or provide you with facilities for running those works,

	563 provided that you comply with the terms of this License in

	564 conveying all material for which you do not control copyright.

	565 Those thus making or running the covered works for you must do so

	566 exclusively on your behalf, under your direction and control, on

	567 terms that prohibit them from making any copies of your

	568 copyrighted material outside their relationship with you.

	569

	570 Conveying under any other circumstances is permitted solely under

	571 the conditions stated below. Sublicensing is not allowed; section

	572 10 makes it unnecessary.

	573

	574 3. Protecting Users' Legal Rights From Anti-Circumvention Law.

	575

	576 No covered work shall be deemed part of an effective technological

	577 measure under any applicable law fulfilling obligations under

	578 article 11 of the WIPO copyright treaty adopted on 20 December

	579 1996, or similar laws prohibiting or restricting circumvention of

	580 such measures.

	581

	582 When you convey a covered work, you waive any legal power to forbid

	583 circumvention of technological measures to the extent such

	584 circumvention is effected by exercising rights under this License

	585 with respect to the covered work, and you disclaim any intention

	586 to limit operation or modification of the work as a means of

	587 enforcing, against the work's users, your or third parties' legal

	588 rights to forbid circumvention of technological measures.

	589

	590 4. Conveying Verbatim Copies.

	591

	592 You may convey verbatim copies of the Program's source code as you

	593 receive it, in any medium, provided that you conspicuously and

	594 appropriately publish on each copy an appropriate copyright notice;

	595 keep intact all notices stating that this License and any

	596 non-permissive terms added in accord with section 7 apply to the

	597 code; keep intact all notices of the absence of any warranty; and

	598 give all recipients a copy of this License along with the Program.

	599

	600 You may charge any price or no price for each copy that you convey,

	601 and you may offer support or warranty protection for a fee.

	602

	603 5. Conveying Modified Source Versions.

	604

	605 You may convey a work based on the Program, or the modifications to

	606 produce it from the Program, in the form of source code under the

	607 terms of section 4, provided that you also meet all of these

	608 conditions:

	609

	610 a. The work must carry prominent notices stating that you

	611 modified it, and giving a relevant date.

	612

	613 b. The work must carry prominent notices stating that it is

	614 released under this License and any conditions added under

	615 section 7. This requirement modifies the requirement in

	616 section 4 to "keep intact all notices".

	617

	618 c. You must license the entire work, as a whole, under this

	619 License to anyone who comes into possession of a copy. This

	620 License will therefore apply, along with any applicable

	621 section 7 additional terms, to the whole of the work, and all

	622 its parts, regardless of how they are packaged. This License

	623 gives no permission to license the work in any other way, but

	624 it does not invalidate such permission if you have separately

	625 received it.

	626

	627 d. If the work has interactive user interfaces, each must display

	628 Appropriate Legal Notices; however, if the Program has

	629 interactive interfaces that do not display Appropriate Legal

	630 Notices, your work need not make them do so.

	631

	632 A compilation of a covered work with other separate and independent

	633 works, which are not by their nature extensions of the covered

	634 work, and which are not combined with it such as to form a larger

	635 program, in or on a volume of a storage or distribution medium, is

	636 called an "aggregate" if the compilation and its resulting

	637 copyright are not used to limit the access or legal rights of the

	638 compilation's users beyond what the individual works permit.

	639 Inclusion of a covered work in an aggregate does not cause this

	640 License to apply to the other parts of the aggregate.

	641

	642 6. Conveying Non-Source Forms.

	643

	644 You may convey a covered work in object code form under the terms

	645 of sections 4 and 5, provided that you also convey the

	646 machine-readable Corresponding Source under the terms of this

	647 License, in one of these ways:

	648

	649 a. Convey the object code in, or embodied in, a physical product

	650 (including a physical distribution medium), accompanied by the

	651 Corresponding Source fixed on a durable physical medium

	652 customarily used for software interchange.

	653

	654 b. Convey the object code in, or embodied in, a physical product

	655 (including a physical distribution medium), accompanied by a

	656 written offer, valid for at least three years and valid for

	657 as long as you offer spare parts or customer support for that

	658 product model, to give anyone who possesses the object code

	659 either (1) a copy of the Corresponding Source for all the

	660 software in the product that is covered by this License, on a

	661 durable physical medium customarily used for software

	662 interchange, for a price no more than your reasonable cost of

	663 physically performing this conveying of source, or (2) access

	664 to copy the Corresponding Source from a network server at no

	665 charge.

	666

	667 c. Convey individual copies of the object code with a copy of

	668 the written offer to provide the Corresponding Source. This

	669 alternative is allowed only occasionally and noncommercially,

	670 and only if you received the object code with such an offer,

	671 in accord with subsection 6b.

	672

	673 d. Convey the object code by offering access from a designated

	674 place (gratis or for a charge), and offer equivalent access

	675 to the Corresponding Source in the same way through the same

	676 place at no further charge. You need not require recipients

	677 to copy the Corresponding Source along with the object code.

	678 If the place to copy the object code is a network server, the

	679 Corresponding Source may be on a different server (operated

	680 by you or a third party) that supports equivalent copying

	681 facilities, provided you maintain clear directions next to

	682 the object code saying where to find the Corresponding Source.

	683 Regardless of what server hosts the Corresponding Source, you

	684 remain obligated to ensure that it is available for as long

	685 as needed to satisfy these requirements.

	686

	687 e. Convey the object code using peer-to-peer transmission,

	688 provided you inform other peers where the object code and

	689 Corresponding Source of the work are being offered to the

	690 general public at no charge under subsection 6d.

	691

	692

	693 A separable portion of the object code, whose source code is

	694 excluded from the Corresponding Source as a System Library, need

	695 not be included in conveying the object code work.

	696

	697 A "User Product" is either (1) a "consumer product", which means

	698 any tangible personal property which is normally used for personal,

	699 family, or household purposes, or (2) anything designed or sold for

	700 incorporation into a dwelling. In determining whether a product

	701 is a consumer product, doubtful cases shall be resolved in favor of

	702 coverage. For a particular product received by a particular user,

	703 "normally used" refers to a typical or common use of that class of

	704 product, regardless of the status of the particular user or of the

	705 way in which the particular user actually uses, or expects or is

	706 expected to use, the product. A product is a consumer product

	707 regardless of whether the product has substantial commercial,

	708 industrial or non-consumer uses, unless such uses represent the

	709 only significant mode of use of the product.

	710

	711 "Installation Information" for a User Product means any methods,

	712 procedures, authorization keys, or other information required to

	713 install and execute modified versions of a covered work in that

	714 User Product from a modified version of its Corresponding Source.

	715 The information must suffice to ensure that the continued

	716 functioning of the modified object code is in no case prevented or

	717 interfered with solely because modification has been made.

	718

	719 If you convey an object code work under this section in, or with,

	720 or specifically for use in, a User Product, and the conveying

	721 occurs as part of a transaction in which the right of possession

	722 and use of the User Product is transferred to the recipient in

	723 perpetuity or for a fixed term (regardless of how the transaction

	724 is characterized), the Corresponding Source conveyed under this

	725 section must be accompanied by the Installation Information. But

	726 this requirement does not apply if neither you nor any third party

	727 retains the ability to install modified object code on the User

	728 Product (for example, the work has been installed in ROM).

	729

	730 The requirement to provide Installation Information does not

	731 include a requirement to continue to provide support service,

	732 warranty, or updates for a work that has been modified or

	733 installed by the recipient, or for the User Product in which it

	734 has been modified or installed. Access to a network may be denied

	735 when the modification itself materially and adversely affects the

	736 operation of the network or violates the rules and protocols for

	737 communication across the network.

	738

	739 Corresponding Source conveyed, and Installation Information

	740 provided, in accord with this section must be in a format that is

	741 publicly documented (and with an implementation available to the

	742 public in source code form), and must require no special password

	743 or key for unpacking, reading or copying.

	744

	745 7. Additional Terms.

	746

	747 "Additional permissions" are terms that supplement the terms of

	748 this License by making exceptions from one or more of its

	749 conditions. Additional permissions that are applicable to the

	750 entire Program shall be treated as though they were included in

	751 this License, to the extent that they are valid under applicable

	752 law. If additional permissions apply only to part of the Program,

	753 that part may be used separately under those permissions, but the

	754 entire Program remains governed by this License without regard to

	755 the additional permissions.

	756

	757 When you convey a copy of a covered work, you may at your option

	758 remove any additional permissions from that copy, or from any part

	759 of it. (Additional permissions may be written to require their own

	760 removal in certain cases when you modify the work.) You may place

	761 additional permissions on material, added by you to a covered work,

	762 for which you have or can give appropriate copyright permission.

	763

	764 Notwithstanding any other provision of this License, for material

	765 you add to a covered work, you may (if authorized by the copyright

	766 holders of that material) supplement the terms of this License

	767 with terms:

	768

	769 a. Disclaiming warranty or limiting liability differently from

	770 the terms of sections 15 and 16 of this License; or

	771

	772 b. Requiring preservation of specified reasonable legal notices

	773 or author attributions in that material or in the Appropriate

	774 Legal Notices displayed by works containing it; or

	775

	776 c. Prohibiting misrepresentation of the origin of that material,

	777 or requiring that modified versions of such material be

	778 marked in reasonable ways as different from the original

	779 version; or

	780

	781 d. Limiting the use for publicity purposes of names of licensors

	782 or authors of the material; or

	783

	784 e. Declining to grant rights under trademark law for use of some

	785 trade names, trademarks, or service marks; or

	786

	787 f. Requiring indemnification of licensors and authors of that

	788 material by anyone who conveys the material (or modified

	789 versions of it) with contractual assumptions of liability to

	790 the recipient, for any liability that these contractual

	791 assumptions directly impose on those licensors and authors.

	792

	793 All other non-permissive additional terms are considered "further

	794 restrictions" within the meaning of section 10. If the Program as

	795 you received it, or any part of it, contains a notice stating that

	796 it is governed by this License along with a term that is a further

	797 restriction, you may remove that term. If a license document

	798 contains a further restriction but permits relicensing or

	799 conveying under this License, you may add to a covered work

	800 material governed by the terms of that license document, provided

	801 that the further restriction does not survive such relicensing or

	802 conveying.

	803

	804 If you add terms to a covered work in accord with this section, you

	805 must place, in the relevant source files, a statement of the

	806 additional terms that apply to those files, or a notice indicating

	807 where to find the applicable terms.

	808

	809 Additional terms, permissive or non-permissive, may be stated in

	810 the form of a separately written license, or stated as exceptions;

	811 the above requirements apply either way.

	812

	813 8. Termination.

	814

	815 You may not propagate or modify a covered work except as expressly

	816 provided under this License. Any attempt otherwise to propagate or

	817 modify it is void, and will automatically terminate your rights

	818 under this License (including any patent licenses granted under

	819 the third paragraph of section 11).

	820

	821 However, if you cease all violation of this License, then your

	822 license from a particular copyright holder is reinstated (a)

	823 provisionally, unless and until the copyright holder explicitly

	824 and finally terminates your license, and (b) permanently, if the

	825 copyright holder fails to notify you of the violation by some

	826 reasonable means prior to 60 days after the cessation.

	827

	828 Moreover, your license from a particular copyright holder is

	829 reinstated permanently if the copyright holder notifies you of the

	830 violation by some reasonable means, this is the first time you have

	831 received notice of violation of this License (for any work) from

	832 that copyright holder, and you cure the violation prior to 30 days

	833 after your receipt of the notice.

	834

	835 Termination of your rights under this section does not terminate

	836 the licenses of parties who have received copies or rights from

	837 you under this License. If your rights have been terminated and

	838 not permanently reinstated, you do not qualify to receive new

	839 licenses for the same material under section 10.

	840

	841 9. Acceptance Not Required for Having Copies.

	842

	843 You are not required to accept this License in order to receive or

	844 run a copy of the Program. Ancillary propagation of a covered work

	845 occurring solely as a consequence of using peer-to-peer

	846 transmission to receive a copy likewise does not require

	847 acceptance. However, nothing other than this License grants you

	848 permission to propagate or modify any covered work. These actions

	849 infringe copyright if you do not accept this License. Therefore,

	850 by modifying or propagating a covered work, you indicate your

	851 acceptance of this License to do so.

	852

	853 10. Automatic Licensing of Downstream Recipients.

	854

	855 Each time you convey a covered work, the recipient automatically

	856 receives a license from the original licensors, to run, modify and

	857 propagate that work, subject to this License. You are not

	858 responsible for enforcing compliance by third parties with this

	859 License.

	860

	861 An "entity transaction" is a transaction transferring control of an

	862 organization, or substantially all assets of one, or subdividing an

	863 organization, or merging organizations. If propagation of a

	864 covered work results from an entity transaction, each party to that

	865 transaction who receives a copy of the work also receives whatever

	866 licenses to the work the party's predecessor in interest had or

	867 could give under the previous paragraph, plus a right to

	868 possession of the Corresponding Source of the work from the

	869 predecessor in interest, if the predecessor has it or can get it

	870 with reasonable efforts.

	871

	872 You may not impose any further restrictions on the exercise of the

	873 rights granted or affirmed under this License. For example, you

	874 may not impose a license fee, royalty, or other charge for

	875 exercise of rights granted under this License, and you may not

	876 initiate litigation (including a cross-claim or counterclaim in a

	877 lawsuit) alleging that any patent claim is infringed by making,

	878 using, selling, offering for sale, or importing the Program or any

	879 portion of it.

	880

	881 11. Patents.

	882

	883 A "contributor" is a copyright holder who authorizes use under this

	884 License of the Program or a work on which the Program is based.

	885 The work thus licensed is called the contributor's "contributor

	886 version".

	887

	888 A contributor's "essential patent claims" are all patent claims

	889 owned or controlled by the contributor, whether already acquired or

	890 hereafter acquired, that would be infringed by some manner,

	891 permitted by this License, of making, using, or selling its

	892 contributor version, but do not include claims that would be

	893 infringed only as a consequence of further modification of the

	894 contributor version. For purposes of this definition, "control"

	895 includes the right to grant patent sublicenses in a manner

	896 consistent with the requirements of this License.

	897

	898 Each contributor grants you a non-exclusive, worldwide,

	899 royalty-free patent license under the contributor's essential

	900 patent claims, to make, use, sell, offer for sale, import and

	901 otherwise run, modify and propagate the contents of its

	902 contributor version.

	903

	904 In the following three paragraphs, a "patent license" is any

	905 express agreement or commitment, however denominated, not to

	906 enforce a patent (such as an express permission to practice a

	907 patent or covenant not to sue for patent infringement). To

	908 "grant" such a patent license to a party means to make such an

	909 agreement or commitment not to enforce a patent against the party.

	910

	911 If you convey a covered work, knowingly relying on a patent

	912 license, and the Corresponding Source of the work is not available

	913 for anyone to copy, free of charge and under the terms of this

	914 License, through a publicly available network server or other

	915 readily accessible means, then you must either (1) cause the

	916 Corresponding Source to be so available, or (2) arrange to deprive

	917 yourself of the benefit of the patent license for this particular

	918 work, or (3) arrange, in a manner consistent with the requirements

	919 of this License, to extend the patent license to downstream

	920 recipients. "Knowingly relying" means you have actual knowledge

	921 that, but for the patent license, your conveying the covered work

	922 in a country, or your recipient's use of the covered work in a

	923 country, would infringe one or more identifiable patents in that

	924 country that you have reason to believe are valid.

	925

	926 If, pursuant to or in connection with a single transaction or

	927 arrangement, you convey, or propagate by procuring conveyance of, a

	928 covered work, and grant a patent license to some of the parties

	929 receiving the covered work authorizing them to use, propagate,

	930 modify or convey a specific copy of the covered work, then the

	931 patent license you grant is automatically extended to all

	932 recipients of the covered work and works based on it.

	933

	934 A patent license is "discriminatory" if it does not include within

	935 the scope of its coverage, prohibits the exercise of, or is

	936 conditioned on the non-exercise of one or more of the rights that

	937 are specifically granted under this License. You may not convey a

	938 covered work if you are a party to an arrangement with a third

	939 party that is in the business of distributing software, under

	940 which you make payment to the third party based on the extent of

	941 your activity of conveying the work, and under which the third

	942 party grants, to any of the parties who would receive the covered

	943 work from you, a discriminatory patent license (a) in connection

	944 with copies of the covered work conveyed by you (or copies made

	945 from those copies), or (b) primarily for and in connection with

	946 specific products or compilations that contain the covered work,

	947 unless you entered into that arrangement, or that patent license

	948 was granted, prior to 28 March 2007.

	949

	950 Nothing in this License shall be construed as excluding or limiting

	951 any implied license or other defenses to infringement that may

	952 otherwise be available to you under applicable patent law.

	953

	954 12. No Surrender of Others' Freedom.

	955

	956 If conditions are imposed on you (whether by court order,

	957 agreement or otherwise) that contradict the conditions of this

	958 License, they do not excuse you from the conditions of this

	959 License. If you cannot convey a covered work so as to satisfy

	960 simultaneously your obligations under this License and any other

	961 pertinent obligations, then as a consequence you may not convey it

	962 at all. For example, if you agree to terms that obligate you to

	963 collect a royalty for further conveying from those to whom you

	964 convey the Program, the only way you could satisfy both those

	965 terms and this License would be to refrain entirely from conveying

	966 the Program.

	967

	968 13. Use with the GNU Affero General Public License.

	969

	970 Notwithstanding any other provision of this License, you have

	971 permission to link or combine any covered work with a work licensed

	972 under version 3 of the GNU Affero General Public License into a

	973 single combined work, and to convey the resulting work. The terms

	974 of this License will continue to apply to the part which is the

	975 covered work, but the special requirements of the GNU Affero

	976 General Public License, section 13, concerning interaction through

	977 a network will apply to the combination as such.

	978

	979 14. Revised Versions of this License.

	980

	981 The Free Software Foundation may publish revised and/or new

	982 versions of the GNU General Public License from time to time.

	983 Such new versions will be similar in spirit to the present

	984 version, but may differ in detail to address new problems or

	985 concerns.

	986

	987 Each version is given a distinguishing version number. If the

	988 Program specifies that a certain numbered version of the GNU

	989 General Public License "or any later version" applies to it, you

	990 have the option of following the terms and conditions either of

	991 that numbered version or of any later version published by the

	992 Free Software Foundation. If the Program does not specify a

	993 version number of the GNU General Public License, you may choose

	994 any version ever published by the Free Software Foundation.

	995

	996 If the Program specifies that a proxy can decide which future

	997 versions of the GNU General Public License can be used, that

	998 proxy's public statement of acceptance of a version permanently

	999 authorizes you to choose that version for the Program.

	1000

	1001 Later license versions may give you additional or different

	1002 permissions. However, no additional obligations are imposed on any

	1003 author or copyright holder as a result of your choosing to follow a

	1004 later version.

	1005

	1006 15. Disclaimer of Warranty.

	1007

	1008 THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY

	1009 APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE

	1010 COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS"

	1011 WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED,

	1012 INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF

	1013 MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE

	1014 RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU.

	1015 SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL

	1016 NECESSARY SERVICING, REPAIR OR CORRECTION.

	1017

	1018 16. Limitation of Liability.

	1019

	1020 IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN

	1021 WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES

	1022 AND/OR CONVEYS THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU

	1023 FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR

	1024 CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE

	1025 THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA

	1026 BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD

	1027 PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER

	1028 PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF

	1029 THE POSSIBILITY OF SUCH DAMAGES.

	1030

	1031 17. Interpretation of Sections 15 and 16.

	1032

	1033 If the disclaimer of warranty and limitation of liability provided

	1034 above cannot be given local legal effect according to their terms,

	1035 reviewing courts shall apply local law that most closely

	1036 approximates an absolute waiver of all civil liability in

	1037 connection with the Program, unless a warranty or assumption of

	1038 liability accompanies a copy of the Program in return for a fee.

	1039

	1040

	1041 END OF TERMS AND CONDITIONS

	1042 ===========================

	1043

	1044 How to Apply These Terms to Your New Programs

	1045 =============================================

	1046

	1047 If you develop a new program, and you want it to be of the greatest

	1048 possible use to the public, the best way to achieve this is to make it

	1049 free software which everyone can redistribute and change under these

	1050 terms.

	1051

	1052 To do so, attach the following notices to the program. It is safest

	1053 to attach them to the start of each source file to most effectively

	1054 state the exclusion of warranty; and each file should have at least the

	1055 "copyright" line and a pointer to where the full notice is found.

	1056

	1057 ONE LINE TO GIVE THE PROGRAM'S NAME AND A BRIEF IDEA OF WHAT IT DOES.

	1058 Copyright (C) YEAR NAME OF AUTHOR

	1059

	1060 This program is free software: you can redistribute it and/or modify

	1061 it under the terms of the GNU General Public License as published by

	1062 the Free Software Foundation, either version 3 of the License, or (at

	1063 your option) any later version.

	1064

	1065 This program is distributed in the hope that it will be useful, but

	1066 WITHOUT ANY WARRANTY; without even the implied warranty of

	1067 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU

	1068 General Public License for more details.

	1069

	1070 You should have received a copy of the GNU General Public License

	1071 along with this program. If not, see `http://www.gnu.org/licenses/'.

	1072

	1073 Also add information on how to contact you by electronic and paper

	1074 mail.

	1075

	1076 If the program does terminal interaction, make it output a short

	1077 notice like this when it starts in an interactive mode:

	1078

	1079 PROGRAM Copyright (C) YEAR NAME OF AUTHOR

	1080 This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'.

	1081 This is free software, and you are welcome to redistribute it

	1082 under certain conditions; type `show c' for details.

	1083

	1084 The hypothetical commands `show w' and `show c' should show the

	1085 appropriate parts of the General Public License. Of course, your

	1086 program's commands might be different; for a GUI interface, you would

	1087 use an "about box".

	1088

	1089 You should also get your employer (if you work as a programmer) or

	1090 school, if any, to sign a "copyright disclaimer" for the program, if

	1091 necessary. For more information on this, and how to apply and follow

	1092 the GNU GPL, see `http://www.gnu.org/licenses/'.

	1093

	1094 The GNU General Public License does not permit incorporating your

	1095 program into proprietary programs. If your program is a subroutine

	1096 library, you may consider it more useful to permit linking proprietary

	1097 applications with the library. If this is what you want to do, use the

	1098 GNU Lesser General Public License instead of this License. But first,

	1099 please read `http://www.gnu.org/philosophy/why-not-lgpl.html'.

	1100

	1101

	1102 File: bison.info, Node: Concepts, Next: Examples, Prev: Copying, Up: Top

	1103

	1104 1 The Concepts of Bison

	1105 ***********************

	1106

	1107 This chapter introduces many of the basic concepts without which the

	1108 details of Bison will not make sense. If you do not already know how to

	1109 use Bison or Yacc, we suggest you start by reading this chapter

	1110 carefully.

	1111

	1112 * Menu:

	1113

	1114 * Language and Grammar:: Languages and context-free grammars,

	1115 as mathematical ideas.

	1116 * Grammar in Bison:: How we represent grammars for Bison's sake.

	1117 * Semantic Values:: Each token or syntactic grouping can have

	1118 a semantic value (the value of an integer,

	1119 the name of an identifier, etc.).

	1120 * Semantic Actions:: Each rule can have an action containing C code.

	1121 * GLR Parsers:: Writing parsers for general context-free languages.

	1122 * Locations Overview:: Tracking Locations.

	1123 * Bison Parser:: What are Bison's input and output,

	1124 how is the output used?

	1125 * Stages:: Stages in writing and running Bison grammars.

	1126 * Grammar Layout:: Overall structure of a Bison grammar file.

	1127

	1128

	1129 File: bison.info, Node: Language and Grammar, Next: Grammar in Bison, Up: Concepts

	1130

	1131 1.1 Languages and Context-Free Grammars

	1132 =======================================

	1133

	1134 In order for Bison to parse a language, it must be described by a

	1135 "context-free grammar". This means that you specify one or more

	1136 "syntactic groupings" and give rules for constructing them from their

	1137 parts. For example, in the C language, one kind of grouping is called

	1138 an `expression'. One rule for making an expression might be, "An

	1139 expression can be made of a minus sign and another expression".

	1140 Another would be, "An expression can be an integer". As you can see,

	1141 rules are often recursive, but there must be at least one rule which

	1142 leads out of the recursion.
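
The two informal rules above can be turned into a small recursive recognizer. What follows is only a sketch in plain C, not code that Bison generates; the function names `match_expression' and `is_expression' are made up for this illustration. The minus-sign rule calls itself, and the integer rule is the one that leads out of the recursion.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: try to match one `expression' at the start of s.
   Grammar, taken from the two informal rules:
       expression: '-' expression
                 | INTEGER
   Returns a pointer just past the matched text, or NULL on failure. */
static const char *match_expression(const char *s)
{
    while (*s == ' ')
        s++;                              /* skip blanks between tokens */

    if (*s == '-')                        /* recursive rule */
        return match_expression(s + 1);

    if (isdigit((unsigned char)*s)) {     /* rule that ends the recursion */
        while (isdigit((unsigned char)*s))
            s++;
        return s;
    }
    return NULL;                          /* neither rule applies */
}

/* Return 1 if the whole string is exactly one expression, else 0. */
int is_expression(const char *s)
{
    const char *end = match_expression(s);
    if (end == NULL)
        return 0;
    while (*end == ' ')
        end++;
    return *end == '\0';
}
```

With these definitions, `is_expression("- - 42")' succeeds because two applications of the recursive rule end in the integer rule, while `is_expression("-")' fails because the recursion never reaches an integer.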

	1143

	1144 The most common formal system for presenting such rules for humans

	1145 to read is "Backus-Naur Form" or "BNF", which was developed in order to

	1146 specify the language Algol 60. Any grammar expressed in BNF is a

	1147 context-free grammar. The input to Bison is essentially

	1148 machine-readable BNF.

	1149

	1150 There are various important subclasses of context-free grammar.

	1151 Although it can handle almost all context-free grammars, Bison is

	1152 optimized for what are called LALR(1) grammars. In brief, in these

	1153 grammars, it must be possible to tell how to parse any portion of an

	1154 input string with just a single token of lookahead. Strictly speaking,

	1155 that is a description of an LR(1) grammar, and LALR(1) involves

	1156 additional restrictions that are hard to explain simply; but it is rare

	1157 in actual practice to find an LR(1) grammar that fails to be LALR(1).

	1158 *Note Mysterious Reduce/Reduce Conflicts: Mystery Conflicts, for more

	1159 information on this.

	1160

	1161 Parsers for LALR(1) grammars are "deterministic", meaning roughly

	1162 that the next grammar rule to apply at any point in the input is

	1163 uniquely determined by the preceding input and a fixed, finite portion

	1164 (called a "lookahead") of the remaining input. A context-free grammar

	1165 can be "ambiguous", meaning that there are multiple ways to apply the

	1166 grammar rules to get the same inputs. Even unambiguous grammars can be

	1167 "nondeterministic", meaning that no fixed lookahead always suffices to

	1168 determine the next grammar rule to apply. With the proper

	1169 declarations, Bison is also able to parse these more general

	1170 context-free grammars, using a technique known as GLR parsing (for

	1171 Generalized LR). Bison's GLR parsers are able to handle any

	1172 context-free grammar for which the number of possible parses of any

	1173 given string is finite.
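
To make ambiguity concrete, consider a hypothetical grammar that is not taken from this manual: an expression is either an integer, or two expressions joined by a minus sign. An input such as `1 - 2 - 3' can then be grouped as `(1 - 2) - 3' or as `1 - (2 - 3)'. The sketch below, whose function name `count_parses' is an assumption of this example, counts the distinct groupings for an input with a given number of operands by trying every position for the outermost minus sign.

```c
/* Count the parse trees that the ambiguous toy grammar
       expr: expr '-' expr
           | INTEGER
   allows for an input containing the given number of operands. */
unsigned long count_parses(int operands)
{
    if (operands < 1)
        return 0;                 /* empty input is not an expression */
    if (operands == 1)
        return 1;                 /* a lone integer has one parse */

    unsigned long total = 0;
    /* Choose how many operands fall to the left of the outermost '-'. */
    for (int left = 1; left < operands; left++)
        total += count_parses(left) * count_parses(operands - left);
    return total;
}
```

The count is finite for every input length, so under the condition stated above a GLR parser could handle this toy grammar; a deterministic parser would need the ambiguity resolved first, for example with precedence declarations.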

	1174

	1175 In the formal grammatical rules for a language, each kind of

	1176 syntactic unit or grouping is named by a "symbol". Those which are

	1177 built by grouping smaller constructs according to grammatical rules are

	1178 called "nonterminal symbols"; those which can't be subdivided are called

	1179 "terminal symbols" or "token types". We call a piece of input

	1180 corresponding to a single terminal symbol a "token", and a piece

	1181 corresponding to a single nonterminal symbol a "grouping".

	1182

	1183 We can use the C language as an example of what symbols, terminal and

	1184 nonterminal, mean. The tokens of C are identifiers, constants (numeric

	1185 and string), and the various keywords, arithmetic operators and

	1186 punctuation marks. So the terminal symbols of a grammar for C include

	1187 `identifier', `number', `string', plus one symbol for each keyword,

	1188 operator or punctuation mark: `if', `return', `const', `static', `int',

	1189 `char', `plus-sign', `open-brace', `close-brace', `comma' and many more.

	1190 (These tokens can be subdivided into characters, but that is a matter of

	1191 lexicography, not grammar.)

	1192

	1193 Here is a simple C function subdivided into tokens:

	1194

	1195 int /* keyword `int' */

	1196 square (int x) /* identifier, open-paren, keyword `int',

	1197 identifier, close-paren */

	1198 { /* open-brace */

	1199 return x * x; /* keyword `return', identifier, asterisk,

	1200 identifier, semicolon */

	1201 } /* close-brace */
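
The subdivision shown above could be produced by a hand-written scanner along the following lines. This is a minimal sketch under stated assumptions: the `token_kind' names, the short keyword list, and the functions `next_token' and `count_tokens' are all invented for this illustration, and comments, string constants and multi-character operators are not handled.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical token kinds; a real grammar for C has many more. */
enum token_kind { TOK_END, TOK_KEYWORD, TOK_IDENTIFIER, TOK_NUMBER, TOK_PUNCT };

/* Read one token starting at *pp, copy its text into buf (which must
   hold at least 2 bytes; longer tokens are truncated), advance *pp past
   the token, and return its kind. */
enum token_kind next_token(const char **pp, char *buf, size_t size)
{
    static const char *const keywords[] = { "int", "char", "if", "return" };
    const char *p = *pp;
    size_t n = 0;

    while (isspace((unsigned char)*p))
        p++;

    if (*p == '\0') {                       /* end of input */
        buf[0] = '\0';
        *pp = p;
        return TOK_END;
    }

    if (isalpha((unsigned char)*p) || *p == '_') {
        /* Keyword or identifier: letters, digits, underscores. */
        while (isalnum((unsigned char)*p) || *p == '_') {
            if (n + 1 < size)
                buf[n++] = *p;
            p++;
        }
        buf[n] = '\0';
        *pp = p;
        for (size_t i = 0; i < sizeof keywords / sizeof keywords[0]; i++)
            if (strcmp(buf, keywords[i]) == 0)
                return TOK_KEYWORD;
        return TOK_IDENTIFIER;
    }

    if (isdigit((unsigned char)*p)) {       /* numeric constant */
        while (isdigit((unsigned char)*p)) {
            if (n + 1 < size)
                buf[n++] = *p;
            p++;
        }
        buf[n] = '\0';
        *pp = p;
        return TOK_NUMBER;
    }

    buf[0] = *p;                            /* single-character token */
    buf[1] = '\0';
    *pp = p + 1;
    return TOK_PUNCT;
}

/* Count the tokens in a string by scanning until TOK_END. */
int count_tokens(const char *s)
{
    char buf[64];
    int n = 0;
    while (next_token(&s, buf, sizeof buf) != TOK_END)
        n++;
    return n;
}
```

Applied to the text of `square' with its comments removed, the scanner yields the same sequence as the listing: keyword, identifier, open-paren, keyword, identifier, close-paren, and so on.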

	1202

	1203 The syntactic groupings of C include the expression, the statement,

	1204 the declaration, and the function definition. These are represented in

	1205 the grammar of C by nonterminal symbols `expression', `statement',

	1206 `declaration' and `function definition'. The full grammar uses dozens

	1207 of additional language constructs, each with its own nonterminal

	1208 symbol, in order to express the meanings of these four. The example

	1209 above is a function definition; it contains one declaration, and one

	1210 statement. In the statement, each `x' is an expression and so is `x *

	1211 x'.

	1212

	1213 Each nonterminal symbol must have grammatical rules showing how it

	1214 is made out of simpler constructs. For example, one kind of C

	1215 statement is the `return' statement; this would be described with a

	1216 grammar rule which reads informally as follows:

	1217

	1218 A `statement' can be made of a `return' keyword, an `expression'

	1219 and a `semicolon'.

	1220

	1221 There would be many other rules for `statement', one for each kind of

	1222 statement in C.

	1223

	1224 One nonterminal symbol must be distinguished as the special one which

	1225 defines a complete utterance in the language. It is called the "start

	1226 symbol". In a compiler, this means a complete input program. In the C

	1227 language, the nonterminal symbol `sequence of definitions and

	1228 declarations' plays this role.

	1229

	1230 For example, `1 + 2' is a valid C expression--a valid part of a C

	1231 program--but it is not valid as an _entire_ C program. In the

	1232 context-free grammar of C, this follows from the fact that `expression'

	1233 is not the start symbol.

	1234

	1235 The Bison parser reads a sequence of tokens as its input, and groups

	1236 the tokens using the grammar rules. If the input is valid, the end

	1237 result is that the entire token sequence reduces to a single grouping

	1238 whose symbol is the grammar's start symbol. If we use a grammar for C,

	1239 the entire input must be a `sequence of definitions and declarations'.

	1240 If not, the parser reports a syntax error.

	1241

	1242

	1243 File: bison.info, Node: Grammar in Bison, Next: Semantic Values, Prev: Language and Grammar, Up: Concepts

	1244

	1245 1.2 From Formal Rules to Bison Input

	1246 ====================================

	1247

	1248 A formal grammar is a mathematical construct. To define the language

	1249 for Bison, you must write a file expressing the grammar in Bison syntax:

	1250 a "Bison grammar" file. *Note Bison Grammar Files: Grammar File.

	1251

	1252 A nonterminal symbol in the formal grammar is represented in Bison

	1253 input as an identifier, like an identifier in C. By convention, it

	1254 should be in lower case, such as `expr', `stmt' or `declaration'.

	1255

	1256 The Bison representation for a terminal symbol is also called a

	1257 "token type". Token types can also be represented as C-like

	1258 identifiers. By convention, these identifiers should be upper case to

	1259 distinguish them from nonterminals: for example, `INTEGER',

	1260 `IDENTIFIER', `IF' or `RETURN'. A terminal symbol that stands for a

	1261 particular keyword in the language should be named after that keyword

	1262 converted to upper case. The terminal symbol `error' is reserved for

	1263 error recovery. *Note Symbols::.

	1264

	1265 A terminal symbol can also be represented as a character literal,

	1266 just like a C character constant. You should do this whenever a token

	1267 is just a single character (parenthesis, plus-sign, etc.): use that

	1268 same character in a literal as the terminal symbol for that token.

	1269

	1270 A third way to represent a terminal symbol is with a C string

	1271 constant containing several characters. *Note Symbols::, for more

	1272 information.

	1273

	1274 The grammar rules are also expressed in Bison syntax. For

	1275 example, here is the Bison rule for a C `return' statement. The

	1276 semicolon in quotes is a literal character token, representing part of

	1277 the C syntax for the statement; the naked semicolon, and the colon, are

	1278 Bison punctuation used in every rule.

	1279

	1280 stmt: RETURN expr ';'

	1281 ;

	1282

	1283 *Note Syntax of Grammar Rules: Rules.

	1284

	1285

	1286 File: bison.info, Node: Semantic Values, Next: Semantic Actions, Prev: Grammar in Bison, Up: Concepts

	1287

	1288 1.3 Semantic Values

	1289 ===================

	1290

	1291 A formal grammar selects tokens only by their classifications: for

	1292 example, if a rule mentions the terminal symbol `integer constant', it

	1293 means that _any_ integer constant is grammatically valid in that

	1294 position. The precise value of the constant is irrelevant to how to

	1295 parse the input: if `x+4' is grammatical then `x+1' or `x+3989' is

	1296 equally grammatical.

	1297

	1298 But the precise value is very important for what the input means

	1299 once it is parsed. A compiler is useless if it fails to distinguish

	1300 between 4, 1 and 3989 as constants in the program! Therefore, each

	1301 token in a Bison grammar has both a token type and a "semantic value".

	1302 *Note Defining Language Semantics: Semantics, for details.

	1303

	1304 The token type is a terminal symbol defined in the grammar, such as

	1305 `INTEGER', `IDENTIFIER' or `',''. It tells everything you need to know

	1306 to decide where the token may validly appear and how to group it with

	1307 other tokens. The grammar rules know nothing about tokens except their

	1308 types.

	1309

	1310 The semantic value has all the rest of the information about the

	1311 meaning of the token, such as the value of an integer, or the name of an

	1312 identifier. (A token such as `','' which is just punctuation doesn't

	1313 need to have any semantic value.)

	1314

	1315 For example, an input token might be classified as token type

	1316 `INTEGER' and have the semantic value 4. Another input token might

	1317 have the same token type `INTEGER' but value 3989. When a grammar rule

	1318 says that `INTEGER' is allowed, either of these tokens is acceptable

	1319 because each is an `INTEGER'. When the parser accepts the token, it

	1320 keeps track of the token's semantic value.

	1321

	1322 Each grouping can have a semantic value as well as its

	1323 nonterminal symbol. For example, in a calculator, an expression

	1324 typically has a semantic value that is a number. In a compiler for a

	1325 programming language, an expression typically has a semantic value that

	1326 is a tree structure describing the meaning of the expression.
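
The split between token type and semantic value can be sketched in plain C. This is only an illustration: `struct token', `classify' and the numeric codes given to `INTEGER' and `IDENTIFIER' are hypothetical names chosen for the sketch, not part of Bison's interface.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical token type codes; a real Bison parser defines its own. */
enum token_type { INTEGER = 258, IDENTIFIER = 259 };

/* A token as the parser sees it: a type plus a semantic value. */
struct token
{
  enum token_type type;  /* the only part the grammar rules look at */
  long value;            /* the part the actions use (INTEGER only) */
};

/* Classify a lexeme: a digit string becomes an INTEGER token carrying
   its numeric value; anything else is treated as an IDENTIFIER. */
static struct token
classify (char const *text)
{
  struct token tok;
  assert (text != NULL && text[0] != '\0');
  if (isdigit ((unsigned char) text[0]))
    {
      tok.type = INTEGER;
      tok.value = strtol (text, NULL, 10);
    }
  else
    {
      tok.type = IDENTIFIER;
      tok.value = 0;
    }
  return tok;
}
```

Here `classify ("4")' and `classify ("3989")' yield the same token type but different semantic values, which is the distinction the grammar relies on.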

	1327

	1328

	1329 File: bison.info, Node: Semantic Actions, Next: GLR Parsers, Prev: Semantic Values, Up: Concepts

	1330

	1331 1.4 Semantic Actions

	1332 ====================

	1333

	1334 In order to be useful, a program must do more than parse input; it must

	1335 also produce some output based on the input. In a Bison grammar, a

	1336 grammar rule can have an "action" made up of C statements. Each time

	1337 the parser recognizes a match for that rule, the action is executed.

	1338 *Note Actions::.

	1339

	1340 Most of the time, the purpose of an action is to compute the

	1341 semantic value of the whole construct from the semantic values of its

	1342 parts. For example, suppose we have a rule which says an expression

	1343 can be the sum of two expressions. When the parser recognizes such a

	1344 sum, each of the subexpressions has a semantic value which describes

	1345 how it was built up. The action for this rule should create a similar

	1346 sort of value for the newly recognized larger expression.

	1347

	1348 For example, here is a rule that says an expression can be the sum of

	1349 two subexpressions:

	1350

	1351 expr: expr '+' expr { $$ = $1 + $3; }

	1352 ;

	1353

	1354 The action says how to produce the semantic value of the sum expression

	1355 from the values of the two subexpressions.
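
One way to picture what the action does is as an operation on a stack of semantic values. The following C sketch is a conceptual model only, not the code Bison generates: `value_stack', `push_value' and `reduce_sum' are hypothetical names, and the semantic values are assumed to be plain `int's.

```c
#include <assert.h>

/* A toy value stack; Bison's real stack is internal to yyparse. */
enum { STACK_MAX = 16 };
static int value_stack[STACK_MAX];
static int depth = 0;

static void
push_value (int v)
{
  assert (depth < STACK_MAX);
  value_stack[depth++] = v;
}

/* Model of reducing by  expr: expr '+' expr { $$ = $1 + $3; }.
   The top three slots hold the values of the three right-hand-side
   symbols ($1, $2, $3); they are replaced by one slot holding $$. */
static int
reduce_sum (void)
{
  int v1, v3;
  assert (depth >= 3);
  v1 = value_stack[depth - 3];  /* $1 */
  v3 = value_stack[depth - 1];  /* $3 */
  depth -= 3;                   /* pop the right-hand side */
  push_value (v1 + v3);         /* push $$ */
  return value_stack[depth - 1];
}
```

The slot for `$2' (the `+' token) is popped but never read, mirroring the fact that the action above does not use it.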

	1356

	1357

	1358 File: bison.info, Node: GLR Parsers, Next: Locations Overview, Prev: Semantic Actions, Up: Concepts

	1359

	1360 1.5 Writing GLR Parsers

	1361 =======================

	1362

	1363 In some grammars, Bison's standard LALR(1) parsing algorithm cannot

	1364 decide whether to apply a certain grammar rule at a given point. That

	1365 is, it may not be able to decide (on the basis of the input read so

	1366 far) which of two possible reductions (applications of a grammar rule)

	1367 applies, or whether to apply a reduction or read more of the input and

	1368 apply a reduction later in the input. These are known respectively as

	1369 "reduce/reduce" conflicts (*note Reduce/Reduce::), and "shift/reduce"

	1370 conflicts (*note Shift/Reduce::).

	1371

	1372 To use a grammar that is not easily modified to be LALR(1), a more

	1373 general parsing algorithm is sometimes necessary. If you include

	1374 `%glr-parser' among the Bison declarations in your file (*note Grammar

	1375 Outline::), the result is a Generalized LR (GLR) parser. These parsers

	1376 handle Bison grammars that contain no unresolved conflicts (i.e., after

	1377 applying precedence declarations) identically to LALR(1) parsers.

	1378 However, when faced with unresolved shift/reduce and reduce/reduce

	1379 conflicts, GLR parsers use the simple expedient of doing both,

	1380 effectively cloning the parser to follow both possibilities. Each of

	1381 the resulting parsers can again split, so that at any given time, there

	1382 can be any number of possible parses being explored. The parsers

	1383 proceed in lockstep; that is, all of them consume (shift) a given input

	1384 symbol before any of them proceed to the next. Each of the cloned

	1385 parsers eventually meets one of two possible fates: either it runs into

	1386 a parsing error, in which case it simply vanishes, or it merges with

	1387 another parser, because the two of them have reduced the input to an

	1388 identical set of symbols.

	1389

	1390 During the time that there are multiple parsers, semantic actions are

	1391 recorded, but not performed. When a parser disappears, its recorded

	1392 semantic actions disappear as well, and are never performed. When a

	1393 reduction makes two parsers identical, causing them to merge, Bison

	1394 records both sets of semantic actions. Whenever the last two parsers

	1395 merge, reverting to the single-parser case, Bison resolves all the

	1396 outstanding actions either by precedences given to the grammar rules

	1397 involved, or by performing both actions, and then calling a designated

	1398 user-defined function on the resulting values to produce an arbitrary

	1399 merged result.

	1400

	1401 * Menu:

	1402

	1403 * Simple GLR Parsers:: Using GLR parsers on unambiguous grammars.

	1404 * Merging GLR Parses:: Using GLR parsers to resolve ambiguities.

	1405 * GLR Semantic Actions:: Deferred semantic actions have special concerns.

	1406 * Compiler Requirements:: GLR parsers require a modern C compiler.

	1407

	1408

	1409 File: bison.info, Node: Simple GLR Parsers, Next: Merging GLR Parses, Up: GLR Parsers

	1410

	1411 1.5.1 Using GLR on Unambiguous Grammars

	1412 ---------------------------------------

	1413

	1414 In the simplest cases, you can use the GLR algorithm to parse grammars

	1415 that are unambiguous, but fail to be LALR(1). Such grammars typically

	1416 require more than one symbol of lookahead, or (in rare cases) fall into

	1417 the category of grammars in which the LALR(1) algorithm throws away too

	1418 much information (they are in LR(1), but not LALR(1), *Note Mystery

	1419 Conflicts::).

	1420

	1421 Consider a problem that arises in the declaration of enumerated and

	1422 subrange types in the programming language Pascal. Here are some

	1423 examples:

	1424

	1425 type subrange = lo .. hi;

	1426 type enum = (a, b, c);

	1427

	1428 The original language standard allows only numeric literals and

	1429 constant identifiers for the subrange bounds (`lo' and `hi'), but

	1430 Extended Pascal (ISO/IEC 10206) and many other Pascal implementations

	1431 allow arbitrary expressions there. This gives rise to the following

	1432 situation, containing a superfluous pair of parentheses:

	1433

	1434 type subrange = (a) .. b;

	1435

	1436 Compare this to the following declaration of an enumerated type with

	1437 only one value:

	1438

	1439 type enum = (a);

	1440

	1441 (These declarations are contrived, but they are syntactically valid,

	1442 and more-complicated cases can come up in practical programs.)

	1443

	1444 These two declarations look identical until the `..' token. With

	1445 normal LALR(1) one-token lookahead it is not possible to decide between

	1446 the two forms when the identifier `a' is parsed. It is, however,

	1447 desirable for a parser to decide this, since in the latter case `a'

	1448 must become a new identifier to represent the enumeration value, while

	1449 in the former case `a' must be evaluated with its current meaning,

	1450 which may be a constant or even a function call.

	1451

	1452 You could parse `(a)' as an "unspecified identifier in parentheses",

	1453 to be resolved later, but this typically requires substantial

	1454 contortions in both semantic actions and large parts of the grammar,

	1455 where the parentheses are nested in the recursive rules for expressions.

	1456

	1457 You might think of using the lexer to distinguish between the two

	1458 forms by returning different tokens for currently defined and undefined

	1459 identifiers. But if these declarations occur in a local scope, and `a'

	1460 is defined in an outer scope, then both forms are possible--either

	1461 locally redefining `a', or using the value of `a' from the outer scope.

	1462 So this approach cannot work.

	1463

	1464 A simple solution to this problem is to declare the parser to use

	1465 the GLR algorithm. When the GLR parser reaches the critical state, it

	1466 merely splits into two branches and pursues both syntax rules

	1467 simultaneously. Sooner or later, one of them runs into a parsing

	1468 error. If there is a `..' token before the next `;', the rule for

	1469 enumerated types fails since it cannot accept `..' anywhere; otherwise,

	1470 the subrange type rule fails since it requires a `..' token. So one of

	1471 the branches fails silently, and the other one continues normally,

	1472 performing all the intermediate actions that were postponed during the

	1473 split.

	1474

	1475 If the input is syntactically incorrect, both branches fail and the

	1476 parser reports a syntax error as usual.
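
The outcome of the split can be imitated with two hand-written recognizers, one per branch. This C sketch is a deliberately simplified model and makes several assumptions: it tries the branches one after the other rather than in lockstep, it only handles the tokens after `=', its expressions have no operators, and all names (`enum tok', `match_enum', `match_subrange', `surviving_branches') are hypothetical.

```c
#include <assert.h>

/* Token codes for the sketch; END terminates every token array. */
enum tok { END, ID, DOTDOT, LPAREN, RPAREN, COMMA, SEMI };
enum { ENUM_BRANCH = 1, SUBRANGE_BRANCH = 2 };

/* Branch 1, enumerated type: '(' ID { ',' ID } ')' ';' */
static int
match_enum (enum tok const *t)
{
  if (*t++ != LPAREN) return 0;
  if (*t++ != ID) return 0;
  while (*t == COMMA)
    {
      t++;
      if (*t++ != ID) return 0;
    }
  if (*t++ != RPAREN) return 0;
  return *t == SEMI;
}

/* expr: ID | '(' expr ')'.  Advances *p past the expression on success. */
static int
match_expr (enum tok const **p)
{
  if (**p == ID) { (*p)++; return 1; }
  if (**p != LPAREN) return 0;
  (*p)++;
  if (!match_expr (p)) return 0;
  if (**p != RPAREN) return 0;
  (*p)++;
  return 1;
}

/* Branch 2, subrange type: expr DOTDOT expr ';' */
static int
match_subrange (enum tok const *t)
{
  if (!match_expr (&t)) return 0;
  if (*t++ != DOTDOT) return 0;
  if (!match_expr (&t)) return 0;
  return *t == SEMI;
}

/* Pursue both branches and report which ones did not fail. */
static int
surviving_branches (enum tok const *t)
{
  int alive = 0;
  assert (t != 0);
  if (match_enum (t)) alive |= ENUM_BRANCH;
  if (match_subrange (t)) alive |= SUBRANGE_BRANCH;
  return alive;
}
```

For `(a) .. b;' only the subrange branch survives, for `(a);' only the enumeration branch, and for malformed input neither does, which corresponds to the syntax error case.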

	1477

	1478 The effect of all this is that the parser seems to "guess" the

	1479 correct branch to take, or in other words, it seems to use more

	1480 lookahead than the underlying LALR(1) algorithm actually allows for.

	1481 In this example, LALR(2) would suffice, but some cases that are

	1482 not LALR(k) for any k can also be handled this way.

	1483

	1484 In general, a GLR parser can take quadratic or cubic worst-case time,

	1485 and the current Bison parser even takes exponential time and space for

	1486 some grammars. In practice, this rarely happens, and for many grammars

	1487 it is possible to prove that it cannot happen. The present example

	1488 contains only one conflict between two rules, and the type-declaration

	1489 context containing the conflict cannot be nested. So the number of

	1490 branches that can exist at any time is limited by the constant 2, and

	1491 the parsing time is still linear.

	1492

	1493 Here is a Bison grammar corresponding to the example above. It

	1494 parses a vastly simplified form of Pascal type declarations.

	1495

	1496 %token TYPE DOTDOT ID

	1497

	1498 %left '+' '-'

	1499 %left '*' '/'

	1500

	1501 %%

	1502

	1503 type_decl : TYPE ID '=' type ';'

	1504 ;

	1505

	1506 type : '(' id_list ')'

	1507 | expr DOTDOT expr

	1508 ;

	1509

	1510 id_list : ID

	1511 | id_list ',' ID

	1512 ;

	1513

	1514 expr : '(' expr ')'

	1515 | expr '+' expr

	1516 | expr '-' expr

	1517 | expr '*' expr

	1518 | expr '/' expr

	1519 | ID

	1520 ;

	1521

	1522 When used as a normal LALR(1) grammar, Bison correctly complains

	1523 about one reduce/reduce conflict. In the conflicting situation the

	1524 parser chooses one of the alternatives, arbitrarily the one declared

	1525 first. Therefore the following correct input is not recognized:

	1526

	1527 type t = (a) .. b;

	1528

	1529 The parser can be turned into a GLR parser, while also telling Bison

	1530 to be silent about the one known reduce/reduce conflict, by adding

	1531 these two declarations to the Bison input file (before the first `%%'):

	1532

	1533 %glr-parser

	1534 %expect-rr 1

	1535

	1536 No change in the grammar itself is required. Now the parser recognizes

	1537 all valid declarations, according to the limited syntax above,

	1538 transparently. In fact, the user does not even notice when the parser

	1539 splits.

	1540

	1541 So here we have a case where we can use the benefits of GLR, almost

	1542 without disadvantages. Even in simple cases like this, however, there

	1543 are at least two potential problems to beware. First, always analyze

	1544 the conflicts reported by Bison to make sure that GLR splitting is only

	1545 done where it is intended. A GLR parser splitting inadvertently may

	1546 cause problems less obvious than an LALR parser statically choosing the

	1547 wrong alternative in a conflict. Second, consider interactions with

	1548 the lexer (*note Semantic Tokens::) with great care. Since a split

	1549 parser consumes tokens without performing any actions during the split,

	1550 the lexer cannot obtain information via parser actions. Some cases of

	1551 lexer interactions can be eliminated by using GLR to shift the

	1552 complications from the lexer to the parser. You must check the

	1553 remaining cases for correctness.

	1554

	1555 In our example, it would be safe for the lexer to return tokens

	1556 based on their current meanings in some symbol table, because no new

	1557 symbols are defined in the middle of a type declaration. Though it is

	1558 possible for a parser to define the enumeration constants as they are

	1559 parsed, before the type declaration is completed, it actually makes no

	1560 difference since they cannot be used within the same enumerated type

	1561 declaration.

	1562

	1563

	1564 File: bison.info, Node: Merging GLR Parses, Next: GLR Semantic Actions, Prev: Simple GLR Parsers, Up: GLR Parsers

	1565

	1566 1.5.2 Using GLR to Resolve Ambiguities

	1567 --------------------------------------

	1568

	1569 Let's consider an example, vastly simplified from a C++ grammar.

	1570

	1571 %{

	1572 #include <stdio.h>

	1573 #define YYSTYPE char const *

	1574 int yylex (void);

	1575 void yyerror (char const *);

	1576 %}

	1577

	1578 %token TYPENAME ID

	1579

	1580 %right '='

	1581 %left '+'

	1582

	1583 %glr-parser

	1584

	1585 %%

	1586

	1587 prog :

	1588 | prog stmt { printf ("\n"); }

	1589 ;

	1590

	1591 stmt : expr ';' %dprec 1

	1592 | decl %dprec 2

	1593 ;

	1594

	1595 expr : ID { printf ("%s ", $$); }

	1596 | TYPENAME '(' expr ')'

	1597 { printf ("%s <cast> ", $1); }

	1598 | expr '+' expr { printf ("+ "); }

	1599 | expr '=' expr { printf ("= "); }

	1600 ;

	1601

	1602 decl : TYPENAME declarator ';'

	1603 { printf ("%s <declare> ", $1); }

	1604 | TYPENAME declarator '=' expr ';'

	1605 { printf ("%s <init-declare> ", $1); }

	1606 ;

	1607

	1608 declarator : ID { printf ("\"%s\" ", $1); }

	1609 | '(' declarator ')'

	1610 ;

	1611

	1612 This models a problematic part of the C++ grammar--the ambiguity between

	1613 certain declarations and statements. For example,

	1614

	1615 T (x) = y+z;

	1616

	1617 parses as either an `expr' or a `decl' (assuming that `T' is recognized

	1618 as a `TYPENAME' and `x' as an `ID'). Bison detects this as a

	1619 reduce/reduce conflict between the rules `expr : ID' and `declarator :

	1620 ID', which it cannot resolve at the time it encounters `x' in the

	1621 example above. Since this is a GLR parser, it therefore splits the

	1622 problem into two parses, one for each choice of resolving the

	1623 reduce/reduce conflict. Unlike the example from the previous section

	1624 (*note Simple GLR Parsers::), however, neither of these parses "dies,"

	1625 because the grammar as it stands is ambiguous. One of the parsers

	1626 eventually reduces `stmt : expr ';'' and the other reduces `stmt :

	1627 decl', after which both parsers are in an identical state: they've seen

	1628 `prog stmt' and have the same unprocessed input remaining. We say that

	1629 these parses have "merged."

	1630

	1631 At this point, the GLR parser requires a specification in the

	1632 grammar of how to choose between the competing parses. In the example

	1633 above, the two `%dprec' declarations specify that Bison is to give

	1634 precedence to the parse that interprets the example as a `decl', which

	1635 implies that `x' is a declarator. The parser therefore prints

	1636

	1637 "x" y z + T <init-declare>

	1638

	1639 The `%dprec' declarations only come into play when more than one

	1640 parse survives. Consider a different input string for this parser:

	1641

	1642 T (x) + y;

	1643

	1644 This is another example of using GLR to parse an unambiguous construct,

	1645 as shown in the previous section (*note Simple GLR Parsers::). Here,

	1646 there is no ambiguity (this cannot be parsed as a declaration).

	1647 However, at the time the Bison parser encounters `x', it does not have

	1648 enough information to resolve the reduce/reduce conflict (again,

	1649 between `x' as an `expr' or a `declarator'). In this case, no

	1650 precedence declaration is used. Again, the parser splits into two, one

	1651 assuming that `x' is an `expr', and the other assuming `x' is a

	1652 `declarator'. The second of these parsers then vanishes when it sees

	1653 `+', and the parser prints

	1654

	1655 x T <cast> y +

	1656

	1657 Suppose that instead of resolving the ambiguity, you wanted to see

	1658 all the possibilities. For this purpose, you must merge the semantic

	1659 actions of the two possible parsers, rather than choosing one over the

	1660 other. To do so, you could change the declaration of `stmt' as follows:

	1661

	1662 stmt : expr ';' %merge <stmtMerge>

	1663 | decl %merge <stmtMerge>

	1664 ;

	1665

	1666 and define the `stmtMerge' function as:

	1667

	1668 static YYSTYPE

	1669 stmtMerge (YYSTYPE x0, YYSTYPE x1)

	1670 {

	1671 printf ("<OR> ");

	1672 return "";

	1673 }

	1674

	1675 with an accompanying forward declaration in the C declarations at the

	1676 beginning of the file:

	1677

	1678 %{

	1679 #define YYSTYPE char const *

	1680 static YYSTYPE stmtMerge (YYSTYPE x0, YYSTYPE x1);

	1681 %}

	1682

	1683 With these declarations, the resulting parser parses the first example

	1684 as both an `expr' and a `decl', and prints

	1685

	1686 "x" y z + T <init-declare> x T <cast> y z + = <OR>

	1687

	1688 Bison requires that all of the productions that participate in any

	1689 particular merge have identical `%merge' clauses. Otherwise, the

	1690 ambiguity is unresolvable, and the parser reports an error

	1691 during any parse that results in the offending merge.
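
The `stmtMerge' function shown earlier discards both of its arguments. A merge function could instead combine them. The sketch below assumes, hypothetically, that each `stmt' action has stored a description string in `$$'; the grammar above does not do this. The name `stmtMergeBoth' and the choice to return a heap-allocated string are likewise assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define YYSTYPE char const *

/* Variant of stmtMerge that keeps both alternatives in its result.
   Returns a newly allocated string that the caller must free, or
   NULL if memory is exhausted. */
static YYSTYPE
stmtMergeBoth (YYSTYPE x0, YYSTYPE x1)
{
  static char const sep[] = " <OR> ";
  size_t len;
  char *merged;
  assert (x0 != NULL && x1 != NULL);
  len = strlen (x0) + strlen (sep) + strlen (x1) + 1;
  merged = malloc (len);
  if (merged == NULL)
    return NULL;
  snprintf (merged, len, "%s%s%s", x0, sep, x1);
  return merged;
}
```

Whether the merged value should own its memory, and who frees it, is a design decision for the surrounding program; the sketch only shows that both inputs are available to the merge function.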

	1692

	1693

	1694 File: bison.info, Node: GLR Semantic Actions, Next: Compiler Requirements, Prev: Merging GLR Parses, Up: GLR Parsers

	1695

	1696 1.5.3 GLR Semantic Actions

	1697 --------------------------

	1698

	1699 By definition, a deferred semantic action is not performed at the same

	1700 time as the associated reduction. This raises caveats for several

	1701 Bison features you might use in a semantic action in a GLR parser.

	1702

	1703 In any semantic action, you can examine `yychar' to determine the

	1704 type of the lookahead token present at the time of the associated

	1705 reduction. After checking that `yychar' is not set to `YYEMPTY' or

	1706 `YYEOF', you can then examine `yylval' and `yylloc' to determine the

	1707 lookahead token's semantic value and location, if any. In a

	1708 nondeferred semantic action, you can also modify any of these variables

	1709 to influence syntax analysis. *Note Lookahead Tokens: Lookahead.

	1710

	1711 In a deferred semantic action, it's too late to influence syntax

	1712 analysis. In this case, `yychar', `yylval', and `yylloc' are set to

	1713 shallow copies of the values they had at the time of the associated

	1714 reduction. For this reason alone, modifying them is dangerous.

	1715 Moreover, the result of modifying them is undefined and subject to

	1716 change with future versions of Bison. For example, if a semantic

	1717 action might be deferred, you should never write it to invoke

	1718 `yyclearin' (*note Action Features::) or to attempt to free memory

	1719 referenced by `yylval'.

	1720

	1721 Another Bison feature requiring special consideration is `YYERROR'

	1722 (*note Action Features::), which you can invoke in a semantic action to

	1723 initiate error recovery. During deterministic GLR operation, the

	1724 effect of `YYERROR' is the same as its effect in an LALR(1) parser. In

	1725 a deferred semantic action, its effect is undefined.

	1726

	1727 Also, see *Note Default Action for Locations: Location Default

	1728 Action, which describes a special usage of `YYLLOC_DEFAULT' in GLR

	1729 parsers.

	1730

	1731

	1732 File: bison.info, Node: Compiler Requirements, Prev: GLR Semantic Actions, Up: GLR Parsers

	1733

	1734 1.5.4 Considerations when Compiling GLR Parsers

	1735 -----------------------------------------------

	1736

	1737 The GLR parsers require a compiler for ISO C89 or later. In addition,

	1738 they use the `inline' keyword, which is not C89, but is C99 and is a

	1739 common extension in pre-C99 compilers. It is up to the user of these

	1740 parsers to handle portability issues. For instance, if using Autoconf

	1741 and the Autoconf macro `AC_C_INLINE', a mere

	1742

	1743 %{

	1744 #include <config.h>

	1745 %}

	1746

	1747 will suffice. Otherwise, we suggest

	1748

	1749 %{

	1750 #if __STDC_VERSION__ < 199901 && ! defined __GNUC__ && ! defined inline

	1751 #define inline

	1752 #endif

	1753 %}

	1754

	1755

	1756 File: bison.info, Node: Locations Overview, Next: Bison Parser, Prev: GLR Parsers, Up: Concepts

	1757

	1758 1.6 Locations

	1759 =============

	1760

	1761 Many applications, like interpreters or compilers, have to produce

	1762 verbose and useful error messages. To achieve this, one must be able

	1763 to keep track of the "textual location", or "location", of each

	1764 syntactic construct. Bison provides a mechanism for handling these

	1765 locations.

	1766

	1767 Each token has a semantic value. In a similar fashion, each token

	1768 has an associated location, but the type of locations is the same for

	1769 all tokens and groupings. Moreover, the output parser is equipped with

	1770 a default data structure for storing locations (*note Locations::, for

	1771 more details).

	1772

	1773 Like semantic values, locations can be reached in actions using a

	1774 dedicated set of constructs. In the sum rule shown earlier (*note Semantic Actions::), the location of the

	1775 whole grouping is `@$', while the locations of the subexpressions are

	1776 `@1' and `@3'.

	1777

	1778 When a rule is matched, a default action is used to compute the

	1779 semantic value of its left hand side (*note Actions::). In the same

	1780 way, another default action is used for locations. However, the action

	1781 for locations is general enough for most cases, meaning there is

	1782 usually no need to describe for each rule how `@$' should be formed.

	1783 When building a new location for a given grouping, the default behavior

	1784 of the output parser is to take the beginning of the first symbol, and

	1785 the end of the last symbol.
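
The default behavior described above can be modeled in a few lines of C. This is a sketch rather than the parser's actual code: `struct location' and `span_locations' are hypothetical names, the four fields are assumed to match the default location structure (first and last line and column), and rules with an empty right-hand side are not covered.

```c
#include <assert.h>

/* Four fields assumed to match the default location type; named
   `location' here to avoid clashing with a real YYLTYPE. */
struct location
{
  int first_line;
  int first_column;
  int last_line;
  int last_column;
};

/* Model of the default location action for a rule with n >= 1
   right-hand-side symbols: @$ starts where @1 starts and ends where
   @n ends.  rhs[0] is @1 and rhs[n - 1] is @n. */
static struct location
span_locations (struct location const *rhs, int n)
{
  struct location result;
  assert (rhs != 0 && n >= 1);
  result.first_line = rhs[0].first_line;
  result.first_column = rhs[0].first_column;
  result.last_line = rhs[n - 1].last_line;
  result.last_column = rhs[n - 1].last_column;
  return result;
}
```

For a three-symbol rule such as the sum, the result spans from the start of `@1' to the end of `@3', even when the symbols sit on different lines.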

	1786

	1787

	1788 File: bison.info, Node: Bison Parser, Next: Stages, Prev: Locations Overview, Up: Concepts

	1789

	1790 1.7 Bison Output: the Parser File

	1791 =================================

	1792

	1793 When you run Bison, you give it a Bison grammar file as input. The

	1794 output is a C source file that parses the language described by the

	1795 grammar. This file is called a "Bison parser". Keep in mind that the

	1796 Bison utility and the Bison parser are two distinct programs: the Bison

	1797 utility is a program whose output is the Bison parser that becomes part

	1798 of your program.

	1799

	1800 The job of the Bison parser is to group tokens into groupings

	1801 according to the grammar rules--for example, to build identifiers and

	1802 operators into expressions. As it does this, it runs the actions for

	1803 the grammar rules it uses.

	1804

	1805 The tokens come from a function called the "lexical analyzer" that

	1806 you must supply in some fashion (such as by writing it in C). The Bison

	1807 parser calls the lexical analyzer each time it wants a new token. It

	1808 doesn't know what is "inside" the tokens (though their semantic values

	1809 may reflect this). Typically the lexical analyzer makes the tokens by

	1810 parsing characters of text, but Bison does not depend on this. *Note

	1811 The Lexical Analyzer Function `yylex': Lexical.
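
A hand-written lexical analyzer might look roughly like the following. Treat it as a sketch under stated assumptions: the token code 258 for `INTEGER', the `int' type of `yylval', and the `input' pointer are stand-ins chosen so the example is self-contained; in a real program the parser file supplies the token codes and the declaration of `yylval'.

```c
#include <assert.h>
#include <ctype.h>

/* Hypothetical token code; Bison assigns the real ones. */
#define INTEGER 258

static int yylval;              /* semantic value of the last token */
static char const *input = "";  /* text still to be scanned */

/* Return the next token type, or 0 at end of input.  Single-character
   tokens are returned as their own character code. */
static int
yylex (void)
{
  assert (input != 0);
  while (*input == ' ' || *input == '\t')
    input++;                    /* skip blanks between tokens */
  if (*input == '\0')
    return 0;
  if (isdigit ((unsigned char) *input))
    {
      yylval = 0;
      while (isdigit ((unsigned char) *input))
        yylval = yylval * 10 + (*input++ - '0');
      return INTEGER;
    }
  return (unsigned char) *input++;
}
```

Each call hands the parser one token type and, for `INTEGER', leaves the semantic value in `yylval'; the parser never sees the characters themselves.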

	1812

	1813 The Bison parser file is C code which defines a function named

	1814 `yyparse' which implements that grammar. This function does not make a

	1815 complete C program: you must supply some additional functions. One is

	1816 the lexical analyzer. Another is an error-reporting function which the

	1817 parser calls to report an error. In addition, a complete C program must

	1818 start with a function called `main'; you have to provide this, and

	1819 arrange for it to call `yyparse' or the parser will never run. *Note

	1820 Parser C-Language Interface: Interface.

	1821

	1822 Aside from the token type names and the symbols in the actions you

	1823 write, all symbols defined in the Bison parser file itself begin with

	1824 `yy' or `YY'. This includes interface functions such as the lexical

	1825 analyzer function `yylex', the error reporting function `yyerror' and

	1826 the parser function `yyparse' itself. This also includes numerous

	1827 identifiers used for internal purposes. Therefore, you should avoid

	1828 using C identifiers starting with `yy' or `YY' in the Bison grammar

	1829 file except for the ones defined in this manual. Also, you should

	1830 avoid using the C identifiers `malloc' and `free' for anything other

	1831 than their usual meanings.

	1832

	1833 In some cases the Bison parser file includes system headers, and in

	1834 those cases your code should respect the identifiers reserved by those

	1835 headers. On some non-GNU hosts, `<alloca.h>', `<malloc.h>',

	1836 `<stddef.h>', and `<stdlib.h>' are included as needed to declare memory

	1837 allocators and related types. `<libintl.h>' is included if message

	1838 translation is in use (*note Internationalization::). Other system

	1839 headers may be included if you define `YYDEBUG' to a nonzero value

	1840 (*note Tracing Your Parser: Tracing.).

	1841

	1842

	1843 File: bison.info, Node: Stages, Next: Grammar Layout, Prev: Bison Parser, Up: Concepts

	1844

	1845 1.8 Stages in Using Bison

	1846 =========================

	1847

	1848 The actual language-design process using Bison, from grammar

	1849 specification to a working compiler or interpreter, has these parts:

	1850

	1851 1. Formally specify the grammar in a form recognized by Bison (*note

	1852 Bison Grammar Files: Grammar File.). For each grammatical rule in

	1853 the language, describe the action that is to be taken when an

	1854 instance of that rule is recognized. The action is described by a

	1855 sequence of C statements.

	1856

	1857 2. Write a lexical analyzer to process input and pass tokens to the

	1858 parser. The lexical analyzer may be written by hand in C (*note

	1859 The Lexical Analyzer Function `yylex': Lexical.). It could also

	1860 be produced using Lex, but the use of Lex is not discussed in this

	1861 manual.

	1862

	1863 3. Write a controlling function that calls the Bison-produced parser.

	1864

	1865 4. Write error-reporting routines.

	1866

	1867 To turn this source code as written into a runnable program, you

	1868 must follow these steps:

	1869

	1870 1. Run Bison on the grammar to produce the parser.

	1871

	1872 2. Compile the code output by Bison, as well as any other source

	1873 files.

	1874

	1875 3. Link the object files to produce the finished product.

	1876

	1877

	1878 File: bison.info, Node: Grammar Layout, Prev: Stages, Up: Concepts

	1879

	1880 1.9 The Overall Layout of a Bison Grammar

	1881 =========================================

	1882

	1883 The input file for the Bison utility is a "Bison grammar file". The

	1884 general form of a Bison grammar file is as follows:

	1885

	1886 %{

	1887 PROLOGUE

	1888 %}

	1889

	1890 BISON DECLARATIONS

	1891

	1892 %%

	1893 GRAMMAR RULES

	1894 %%

	1895 EPILOGUE

	1896

	1897 The `%%', `%{' and `%}' are punctuation that appears in every Bison

	1898 grammar file to separate the sections.

	1899

	1900 The prologue may define types and variables used in the actions.

	1901 You can also use preprocessor commands to define macros used there, and

	1902 use `#include' to include header files that do any of these things.

	1903 You need to declare the lexical analyzer `yylex' and the error printer

	1904 `yyerror' here, along with any other global identifiers used by the

	1905 actions in the grammar rules.

	1906

	1907 The Bison declarations declare the names of the terminal and

	1908 nonterminal symbols, and may also describe operator precedence and the

	1909 data types of semantic values of various symbols.

	1910

	1911 The grammar rules define how to construct each nonterminal symbol

	1912 from its parts.

	1913

	1914 The epilogue can contain any code you want to use. Often the

	1915 definitions of functions declared in the prologue go here. In a simple

	1916 program, all the rest of the program can go here.

	1917

	1918

	1919 File: bison.info, Node: Examples, Next: Grammar File, Prev: Concepts, Up: Top

	1920

	1921 2 Examples

	1922 **********

	1923

	1924 Now we show and explain three sample programs written using Bison: a

	1925 reverse polish notation calculator, an algebraic (infix) notation

	1926 calculator, and a multi-function calculator. All three have been tested

	1927 under BSD Unix 4.3; each produces a usable, though limited, interactive

	1928 desk-top calculator.

	1929

	1930 These examples are simple, but Bison grammars for real programming

	1931 languages are written the same way. You can copy these examples into a

	1932 source file to try them.

	1933

	1934 * Menu:

	1935

	1936 * RPN Calc:: Reverse polish notation calculator;

	1937 a first example with no operator precedence.

	1938 * Infix Calc:: Infix (algebraic) notation calculator.

	1939 Operator precedence is introduced.

	1940 * Simple Error Recovery:: Continuing after syntax errors.

	1941 * Location Tracking Calc:: Demonstrating the use of @N and @$.

	1942 * Multi-function Calc:: Calculator with memory and trig functions.

	1943 It uses multiple data-types for semantic values.

	1944 * Exercises:: Ideas for improving the multi-function calculator.

	1945

	1946

	1947 File: bison.info, Node: RPN Calc, Next: Infix Calc, Up: Examples

	1948

	1949 2.1 Reverse Polish Notation Calculator

	1950 ======================================

	1951

	1952 The first example is that of a simple double-precision "reverse polish

	1953 notation" calculator (a calculator using postfix operators). This

	1954 example provides a good starting point, since operator precedence is

	1955 not an issue. The second example will illustrate how operator

	1956 precedence is handled.

	1957

	1958 The source code for this calculator is named `rpcalc.y'. The `.y'

	1959 extension is a convention used for Bison input files.

	1960

	1961 * Menu:

	1962

	1963 * Rpcalc Declarations:: Prologue (declarations) for rpcalc.

	1964 * Rpcalc Rules:: Grammar Rules for rpcalc, with explanation.

	1965 * Rpcalc Lexer:: The lexical analyzer.

	1966 * Rpcalc Main:: The controlling function.

	1967 * Rpcalc Error:: The error reporting function.

	1968 * Rpcalc Generate:: Running Bison on the grammar file.

	1969 * Rpcalc Compile:: Run the C compiler on the output code.

	1970

	1971

	1972 File: bison.info, Node: Rpcalc Declarations, Next: Rpcalc Rules, Up: RPN Calc

	1973

	1974 2.1.1 Declarations for `rpcalc'

	1975 -------------------------------

	1976

	1977 Here are the C and Bison declarations for the reverse polish notation

	1978 calculator. As in C, comments are placed between `/*...*/'.

	1979

	1980 /* Reverse polish notation calculator. */

	1981

	1982 %{

	1983 #define YYSTYPE double

	1984 #include <math.h>

	1985 int yylex (void);

	1986 void yyerror (char const *);

	1987 %}

	1988

	1989 %token NUM

	1990

	1991 %% /* Grammar rules and actions follow. */

	1992

	1993 The declarations section (*note The prologue: Prologue.) contains two

	1994 preprocessor directives and two forward declarations.

	1995

	1996 The `#define' directive defines the macro `YYSTYPE', thus specifying

	1997 the C data type for semantic values of both tokens and groupings (*note

	1998 Data Types of Semantic Values: Value Type.). The Bison parser will use

	1999 whatever type `YYSTYPE' is defined as; if you don't define it, `int' is

	2000 the default. Because we specify `double', each token and each

	2001 expression has an associated value, which is a floating point number.

	2002

	2003 The `#include' directive is used to declare the exponentiation

	2004 function `pow'.

	2005

	2006 The forward declarations for `yylex' and `yyerror' are needed

	2007 because the C language requires that functions be declared before they

	2008 are used. These functions will be defined in the epilogue, but the

	2009 parser calls them so they must be declared in the prologue.

	2010

	2011 The second section, Bison declarations, provides information to Bison

	2012 about the token types (*note The Bison Declarations Section: Bison

	2013 Declarations.). Each terminal symbol that is not a single-character

	2014 literal must be declared here. (Single-character literals normally

	2015 don't need to be declared.) In this example, all the arithmetic

	2016 operators are designated by single-character literals, so the only

	2017 terminal symbol that needs to be declared is `NUM', the token type for

	2018 numeric constants.

	2019

	2020

	2021 File: bison.info, Node: Rpcalc Rules, Next: Rpcalc Lexer, Prev: Rpcalc Declarations, Up: RPN Calc

	2022

	2023 2.1.2 Grammar Rules for `rpcalc'

	2024 --------------------------------

	2025

	2026 Here are the grammar rules for the reverse polish notation calculator.

	2027

	2028 input: /* empty */

	2029 | input line

	2030 ;

	2031

	2032 line: '\n'

	2033 | exp '\n' { printf ("\t%.10g\n", $1); }

	2034 ;

	2035

	2036 exp: NUM { $$ = $1; }

	2037 | exp exp '+' { $$ = $1 + $2; }

	2038 | exp exp '-' { $$ = $1 - $2; }

	2039 | exp exp '*' { $$ = $1 * $2; }

	2040 | exp exp '/' { $$ = $1 / $2; }

	2041 /* Exponentiation */

	2042 | exp exp '^' { $$ = pow ($1, $2); }

	2043 /* Unary minus */

	2044 | exp 'n' { $$ = -$1; }

	2045 ;

	2046 %%

	2047

	2048 The groupings of the rpcalc "language" defined here are the

	2049 expression (given the name `exp'), the line of input (`line'), and the

	2050 complete input transcript (`input'). Each of these nonterminal symbols

	2051 has several alternate rules, joined by the vertical bar `|' which is

	2052 read as "or". The following sections explain what these rules mean.

	2053

	2054 The semantics of the language is determined by the actions taken

	2055 when a grouping is recognized. The actions are the C code that appears

	2056 inside braces. *Note Actions::.

	2057

	2058 You must specify these actions in C, but Bison provides the means for

	2059 passing semantic values between the rules. In each action, the

	2060 pseudo-variable `$$' stands for the semantic value for the grouping

	2061 that the rule is going to construct. Assigning a value to `$$' is the

	2062 main job of most actions. The semantic values of the components of the

	2063 rule are referred to as `$1', `$2', and so on.

	2064

	2065 * Menu:

	2066

	2067 * Rpcalc Input::

	2068 * Rpcalc Line::

	2069 * Rpcalc Expr::

	2070

	2071

	2072 File: bison.info, Node: Rpcalc Input, Next: Rpcalc Line, Up: Rpcalc Rules

	2073

	2074 2.1.2.1 Explanation of `input'

	2075 ..............................

	2076

	2077 Consider the definition of `input':

	2078

	2079 input: /* empty */

	2080 | input line

	2081 ;

	2082

	2083 This definition reads as follows: "A complete input is either an

	2084 empty string, or a complete input followed by an input line". Notice

	2085 that "complete input" is defined in terms of itself. This definition

	2086 is said to be "left recursive" since `input' appears always as the

	2087 leftmost symbol in the sequence. *Note Recursive Rules: Recursion.

	2088

	2089 The first alternative is empty because there are no symbols between

	2090 the colon and the first `|'; this means that `input' can match an empty

	2091 string of input (no tokens). We write the rules this way because it is

	2092 legitimate to type `Ctrl-d' right after you start the calculator. It's

	2093 conventional to put an empty alternative first and write the comment

	2094 `/* empty */' in it.

	2095

	2096 The second alternate rule (`input line') handles all nontrivial

	2097 input. It means, "After reading any number of lines, read one more

	2098 line if possible." The left recursion makes this rule into a loop.

	2099 Since the first alternative matches empty input, the loop can be

	2100 executed zero or more times.

	2101

	2102 The parser function `yyparse' continues to process input until a

	2103 grammatical error is seen or the lexical analyzer says there are no more

	2104 input tokens; we will arrange for the latter to happen at end-of-input.

	2105

	2106

	2107 File: bison.info, Node: Rpcalc Line, Next: Rpcalc Expr, Prev: Rpcalc Input, Up: Rpcalc Rules

	2108

	2109 2.1.2.2 Explanation of `line'

	2110 .............................

	2111

	2112 Now consider the definition of `line':

	2113

	2114 line: '\n'

	2115 | exp '\n' { printf ("\t%.10g\n", $1); }

	2116 ;

	2117

	2118 The first alternative is a token which is a newline character; this

	2119 means that rpcalc accepts a blank line (and ignores it, since there is

	2120 no action). The second alternative is an expression followed by a

	2121 newline. This is the alternative that makes rpcalc useful. The

	2122 semantic value of the `exp' grouping is the value of `$1' because the

	2123 `exp' in question is the first symbol in the alternative. The action

	2124 prints this value, which is the result of the computation the user

	2125 asked for.

	2126

	2127 This action is unusual because it does not assign a value to `$$'.

	2128 As a consequence, the semantic value associated with the `line' is

	2129 uninitialized (its value will be unpredictable). This would be a bug if

	2130 that value were ever used, but we don't use it: once rpcalc has printed

	2131 the value of the user's input line, that value is no longer needed.

	2132

	2133

	2134 File: bison.info, Node: Rpcalc Expr, Prev: Rpcalc Line, Up: Rpcalc Rules

	2135

	2136 2.1.2.3 Explanation of `exp'

	2137 ............................

	2138

	2139 The `exp' grouping has several rules, one for each kind of expression.

	2140 The first rule handles the simplest expressions: those that are just

	2141 numbers. The second handles an addition-expression, which looks like

	2142 two expressions followed by a plus-sign. The third handles

	2143 subtraction, and so on.

	2144

	2145 exp: NUM

	2146 | exp exp '+' { $$ = $1 + $2; }

	2147 | exp exp '-' { $$ = $1 - $2; }

	2148 ...

	2149 ;

	2150

	2151 We have used `|' to join all the rules for `exp', but we could

	2152 equally well have written them separately:

	2153

	2154 exp: NUM ;

	2155 exp: exp exp '+' { $$ = $1 + $2; } ;

	2156 exp: exp exp '-' { $$ = $1 - $2; } ;

	2157 ...

	2158

	2159 Most of the rules have actions that compute the value of the

	2160 expression in terms of the value of its parts. For example, in the

	2161 rule for addition, `$1' refers to the first component `exp' and `$2'

	2162 refers to the second one. The third component, `'+'', has no meaningful

	2163 associated semantic value, but if it had one you could refer to it as

	2164 `$3'. When `yyparse' recognizes a sum expression using this rule, the

	2165 sum of the two subexpressions' values is produced as the value of the

	2166 entire expression. *Note Actions::.

	2167

	2168 You don't have to give an action for every rule. When a rule has no

	2169 action, Bison by default copies the value of `$1' into `$$'. This is

	2170 what happens in the first rule (the one that uses `NUM').
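
As a rough illustration of what these actions compute, here is a hand-written C sketch of the same postfix evaluation using an explicit value stack. It is not the code Bison generates. The names `rpn_eval' and `STACK_MAX' are invented for this sketch, and exponentiation is left out so that the block does not depend on the math library. The two popped operands play the roles of `$1' and `$2', and the pushed result plays the role of `$$'.

```c
#include <ctype.h>
#include <stdlib.h>

#define STACK_MAX 64   /* Arbitrary limit chosen for this sketch.  */

/* Evaluate a postfix expression such as "4 9 +".
   Returns 0 and stores the result in *result on success,
   or -1 if the expression is malformed.  */
int
rpn_eval (char const *s, double *result)
{
  double stack[STACK_MAX];
  int depth = 0;

  while (*s)
    {
      if (*s == ' ' || *s == '\t')
        {
          ++s;
        }
      else if (*s == '.' || isdigit ((unsigned char) *s))
        {
          char *end;
          double v = strtod (s, &end);
          if (end == s || depth == STACK_MAX)
            return -1;
          stack[depth++] = v;                     /* exp: NUM */
          s = end;
        }
      else if (*s == 'n')
        {
          if (depth < 1)
            return -1;
          stack[depth - 1] = -stack[depth - 1];   /* exp: exp 'n' */
          ++s;
        }
      else
        {
          double a, b;                            /* a is $1, b is $2 */
          if (depth < 2)
            return -1;
          b = stack[--depth];
          a = stack[--depth];
          switch (*s++)
            {
            case '+': a = a + b; break;
            case '-': a = a - b; break;
            case '*': a = a * b; break;
            case '/': a = a / b; break;
            default: return -1;
            }
          stack[depth++] = a;                     /* $$ */
        }
    }
  if (depth != 1)
    return -1;
  *result = stack[0];
  return 0;
}
```

The final check that exactly one value remains on the stack corresponds to a line containing a single complete `exp'.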

	2171

	2172 The formatting shown here is the recommended convention, but Bison

	2173 does not require it. You can add or change white space as much as you

	2174 wish. For example, this:

	2175

	2176 exp : NUM | exp exp '+' {$$ = $1 + $2; } | ... ;

	2177

	2178 means the same thing as this:

	2179

	2180 exp: NUM

	2181 | exp exp '+' { $$ = $1 + $2; }

	2182 | ...

	2183 ;

	2184

	2185 The latter, however, is much more readable.

	2186

	2187

	2188 File: bison.info, Node: Rpcalc Lexer, Next: Rpcalc Main, Prev: Rpcalc Rules, Up: RPN Calc

	2189

	2190 2.1.3 The `rpcalc' Lexical Analyzer

	2191 -----------------------------------

	2192

	2193 The lexical analyzer's job is low-level parsing: converting characters

	2194 or sequences of characters into tokens. The Bison parser gets its

	2195 tokens by calling the lexical analyzer. *Note The Lexical Analyzer

	2196 Function `yylex': Lexical.

	2197

	2198 Only a simple lexical analyzer is needed for the RPN calculator.

	2199 This lexical analyzer skips blanks and tabs, then reads in numbers as

	2200 `double' and returns them as `NUM' tokens. Any other character that

	2201 isn't part of a number is a separate token. Note that the token-code

	2202 for such a single-character token is the character itself.

	2203

	2204 The return value of the lexical analyzer function is a numeric code

	2205 which represents a token type. The same text used in Bison rules to

	2206 stand for this token type is also a C expression for the numeric code

	2207 for the type. This works in two ways. If the token type is a

	2208 character literal, then its numeric code is that of the character; you

	2209 can use the same character literal in the lexical analyzer to express

	2210 the number. If the token type is an identifier, that identifier is

	2211 defined by Bison as a C macro whose definition is the appropriate

	2212 number. In this example, therefore, `NUM' becomes a macro for `yylex'

	2213 to use.

	2214

	2215 The semantic value of the token (if it has one) is stored into the

	2216 global variable `yylval', which is where the Bison parser will look for

	2217 it. (The C data type of `yylval' is `YYSTYPE', which was defined at

	2218 the beginning of the grammar; *note Declarations for `rpcalc': Rpcalc

	2219 Declarations.)

	2220

	2221 A token type code of zero is returned if the end-of-input is

	2222 encountered. (Bison recognizes any nonpositive value as indicating

	2223 end-of-input.)

	2224

	2225 Here is the code for the lexical analyzer:

	2226

	2227 /* The lexical analyzer returns a double floating point

	2228 number on the stack and the token NUM, or the numeric code

	2229 of the character read if not a number. It skips all blanks

	2230 and tabs, and returns 0 for end-of-input. */

	2231

	2232 #include <ctype.h>

	2233

	2234 int

	2235 yylex (void)

	2236 {

	2237 int c;

	2238

	2239 /* Skip white space. */

	2240 while ((c = getchar ()) == ' ' || c == '\t')

	2241 ;

	2242 /* Process numbers. */

	2243 if (c == '.' || isdigit (c))

	2244 {

	2245 ungetc (c, stdin);

	2246 scanf ("%lf", &yylval);

	2247 return NUM;

	2248 }

	2249 /* Return end-of-input. */

	2250 if (c == EOF)

	2251 return 0;

	2252 /* Return a single char. */

	2253 return c;

	2254 }
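
For experimenting with the tokenization rules without a terminal, a variant that reads from a string can be handy. The following is only a sketch under stated assumptions: `lex_from_string' and `TOK_NUM' are hypothetical names, `TOK_NUM' stands in for the `NUM' macro that Bison would define, and its value is chosen only to lie outside the range of character codes. The value is passed back through a pointer rather than through `yylval'.

```c
#include <ctype.h>
#include <stdlib.h>

/* Stand-in for the NUM macro that Bison would define; the value is
   an assumption made for this sketch.  */
enum { TOK_NUM = 258 };

/* Read one token from *cursor, advancing it past the token.
   Stores a number's value in *value.  Returns TOK_NUM, the character
   code of a single-character token, or 0 at end of input.  */
int
lex_from_string (char const **cursor, double *value)
{
  char const *p = *cursor;
  int c;

  /* Skip white space.  */
  while (*p == ' ' || *p == '\t')
    ++p;

  /* Return end-of-input.  */
  if (*p == '\0')
    {
      *cursor = p;
      return 0;
    }

  /* Process numbers.  */
  if (*p == '.' || isdigit ((unsigned char) *p))
    {
      char *end;
      *value = strtod (p, &end);
      if (end != p)
        {
          *cursor = end;
          return TOK_NUM;
        }
    }

  /* Return a single char; its token code is the character itself.  */
  c = (unsigned char) *p;
  *cursor = p + 1;
  return c;
}
```

As in `yylex', a newline is not skipped: it comes back as the single-character token `'\n'', which the grammar uses to end a `line'.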

	2255

	2256

	2257 File: bison.info, Node: Rpcalc Main, Next: Rpcalc Error, Prev: Rpcalc Lexer, Up: RPN Calc

	2258

	2259 2.1.4 The Controlling Function

	2260 ------------------------------

	2261

	2262 In keeping with the spirit of this example, the controlling function is

	2263 kept to the bare minimum. The only requirement is that it call

	2264 `yyparse' to start the process of parsing.

	2265

	2266 int

	2267 main (void)

	2268 {

	2269 return yyparse ();

	2270 }

	2271

	2272

	2273 File: bison.info, Node: Rpcalc Error, Next: Rpcalc Generate, Prev: Rpcalc Main, Up: RPN Calc

	2274

	2275 2.1.5 The Error Reporting Routine

	2276 ---------------------------------

	2277

	2278 When `yyparse' detects a syntax error, it calls the error reporting

	2279 function `yyerror' to print an error message (usually but not always

	2280 `"syntax error"'). It is up to the programmer to supply `yyerror'

	2281 (*note Parser C-Language Interface: Interface.), so here is the

	2282 definition we will use:

	2283

	2284 #include <stdio.h>

	2285

	2286 /* Called by yyparse on error. */

	2287 void

	2288 yyerror (char const *s)

	2289 {

	2290 fprintf (stderr, "%s\n", s);

	2291 }

	2292

	2293 After `yyerror' returns, the Bison parser may recover from the error

	2294 and continue parsing if the grammar contains a suitable error rule

	2295 (*note Error Recovery::). Otherwise, `yyparse' returns nonzero. We

	2296 have not written any error rules in this example, so any invalid input

	2297 will cause the calculator program to exit. This is not clean behavior

	2298 for a real calculator, but it is adequate for the first example.

	2299

	2300

	2301 File: bison.info, Node: Rpcalc Generate, Next: Rpcalc Compile, Prev: Rpcalc Error, Up: RPN Calc

	2302

	2303 2.1.6 Running Bison to Make the Parser

	2304 --------------------------------------

	2305

	2306 Before running Bison to produce a parser, we need to decide how to

	2307 arrange all the source code in one or more source files. For such a

	2308 simple example, the easiest thing is to put everything in one file. The

	2309 definitions of `yylex', `yyerror' and `main' go at the end, in the

	2310 epilogue of the file (*note The Overall Layout of a Bison Grammar:

	2311 Grammar Layout.).

	2312

	2313 For a large project, you would probably have several source files,

	2314 and use `make' to arrange to recompile them.

	2315

	2316 With all the source in a single file, you use the following command

	2317 to convert it into a parser file:

	2318

	2319 bison FILE.y

	2320

	2321 In this example the file was called `rpcalc.y' (for "Reverse Polish

	2322 CALCulator"). Bison produces a file named `FILE.tab.c', removing the

	2323 `.y' from the original file name. The file output by Bison contains

	2324 the source code for `yyparse'. The additional functions in the input

	2325 file (`yylex', `yyerror' and `main') are copied verbatim to the output.

	2326

	2327

	2328 File: bison.info, Node: Rpcalc Compile, Prev: Rpcalc Generate, Up: RPN Calc

	2329

	2330 2.1.7 Compiling the Parser File

	2331 -------------------------------

	2332

	2333 Here is how to compile and run the parser file:

	2334

	2335 # List files in current directory.

	2336 $ ls

	2337 rpcalc.tab.c rpcalc.y

	2338

	2339 # Compile the Bison parser.

	2340 # `-lm' tells compiler to search math library for `pow'.

	2341 $ cc rpcalc.tab.c -lm -o rpcalc

	2342

	2343 # List files again.

	2344 $ ls

	2345 rpcalc rpcalc.tab.c rpcalc.y

	2346

	2347 The file `rpcalc' now contains the executable code. Here is an

	2348 example session using `rpcalc'.

	2349

	2350 $ rpcalc

	2351 4 9 +

	2352 13

	2353 3 7 + 3 4 5 *+-

	2354 -13

	2355 3 7 + 3 4 5 * + - n Note the unary minus, `n'

	2356 13

	2357 5 6 / 4 n +

	2358 -3.166666667

	2359 3 4 ^ Exponentiation

	2360 81

	2361 ^D End-of-file indicator

	2362 $

	2363

	2364

	2365 File: bison.info, Node: Infix Calc, Next: Simple Error Recovery, Prev: RPN Calc, Up: Examples

	2366

	2367 2.2 Infix Notation Calculator: `calc'

	2368 =====================================

	2369

	2370 We now modify rpcalc to handle infix operators instead of postfix.

	2371 Infix notation involves the concept of operator precedence and the need

	2372 for parentheses nested to arbitrary depth. Here is the Bison code for

	2373 `calc.y', an infix desk-top calculator.

	2374

	2375 /* Infix notation calculator. */

	2376

	2377 %{

	2378 #define YYSTYPE double

	2379 #include <math.h>

	2380 #include <stdio.h>

	2381 int yylex (void);

	2382 void yyerror (char const *);

	2383 %}

	2384

	2385 /* Bison declarations. */

	2386 %token NUM

	2387 %left '-' '+'

	2388 %left '*' '/'

	2389 %left NEG /* negation--unary minus */

	2390 %right '^' /* exponentiation */

	2391

	2392 %% /* The grammar follows. */

	2393 input: /* empty */

	2394 | input line

	2395 ;

	2396

	2397 line: '\n'

	2398 | exp '\n' { printf ("\t%.10g\n", $1); }

	2399 ;

	2400

	2401 exp: NUM { $$ = $1; }

	2402 | exp '+' exp { $$ = $1 + $3; }

	2403 | exp '-' exp { $$ = $1 - $3; }

	2404 | exp '*' exp { $$ = $1 * $3; }

	2405 | exp '/' exp { $$ = $1 / $3; }

	2406 | '-' exp %prec NEG { $$ = -$2; }

	2407 | exp '^' exp { $$ = pow ($1, $3); }

	2408 | '(' exp ')' { $$ = $2; }

	2409 ;

	2410 %%

	2411

	2412 The functions `yylex', `yyerror' and `main' can be the same as before.

	2413

	2414 There are two important new features shown in this code.

	2415

	2416 In the second section (Bison declarations), `%left' declares token

	2417 types and says they are left-associative operators. The declarations

	2418 `%left' and `%right' (right associativity) take the place of `%token'

	2419 which is used to declare a token type name without associativity.

	2420 (These tokens are single-character literals, which ordinarily don't

	2421 need to be declared. We declare them here to specify the

	2422 associativity.)

	2423

	2424 Operator precedence is determined by the line ordering of the

	2425 declarations; the higher the line number of the declaration (lower on

	2426 the page or screen), the higher the precedence. Hence, exponentiation

	2427 has the highest precedence, unary minus (`NEG') is next, followed by

	2428 `*' and `/', and so on. *Note Operator Precedence: Precedence.

	2429

	2430 The other important new feature is the `%prec' in the grammar

	2431 section for the unary minus operator. The `%prec' simply instructs

	2432 Bison that the rule `| '-' exp' has the same precedence as `NEG'--in

	2433 this case the next-to-highest. *Note Context-Dependent Precedence:

	2434 Contextual Precedence.
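
To make the effect of these declarations concrete, the following C sketch evaluates the same infix expressions with a hand-written precedence-climbing routine. This is a different technique from the parser Bison generates; it is shown only because its precedence levels mirror the order of the `%left' and `%right' lines above. All names (`calc_eval', `parse_exp', `binary_prec', `int_power' and the `PREC_' constants) are invented for the sketch, error handling is omitted, and `int_power' handles only integer exponents so that the block does not depend on the math library.

```c
#include <stdlib.h>

/* Precedence levels, lowest first, in declaration order.  */
enum { PREC_ADD = 1, PREC_MUL, PREC_NEG, PREC_POW };

static char const *cur;   /* Next unread character.  */

static double parse_exp (int min_prec);

static void
skip_blanks (void)
{
  while (*cur == ' ' || *cur == '\t')
    ++cur;
}

/* Precedence of binary operator C, or 0 if C is not one.  */
static int
binary_prec (int c)
{
  if (c == '+' || c == '-') return PREC_ADD;
  if (c == '*' || c == '/') return PREC_MUL;
  return c == '^' ? PREC_POW : 0;
}

/* Simplified stand-in for pow: integer exponents only.  */
static double
int_power (double base, double exponent)
{
  long n = (long) exponent, i;
  double r = 1.0;
  for (i = 0; i < (n < 0 ? -n : n); ++i)
    r *= base;
  return n < 0 ? 1.0 / r : r;
}

/* NUM, '(' exp ')', or '-' exp %prec NEG.  */
static double
parse_primary (void)
{
  double v;
  char *end;

  skip_blanks ();
  if (*cur == '-')
    {
      ++cur;
      return -parse_exp (PREC_NEG);   /* Operand binds at NEG level.  */
    }
  if (*cur == '(')
    {
      ++cur;
      v = parse_exp (PREC_ADD);
      skip_blanks ();
      if (*cur == ')')
        ++cur;
      return v;
    }
  v = strtod (cur, &end);
  cur = end;
  return v;
}

/* Parse operators whose precedence is at least MIN_PREC.  */
static double
parse_exp (int min_prec)
{
  double lhs = parse_primary ();

  for (;;)
    {
      double rhs;
      int op, prec;

      skip_blanks ();
      op = *cur;
      prec = binary_prec (op);
      if (prec == 0 || prec < min_prec)
        return lhs;
      ++cur;
      /* %left: the right operand must bind tighter (prec + 1).
         %right: it may contain the same operator again (prec).  */
      rhs = parse_exp (op == '^' ? prec : prec + 1);
      if (op == '+') lhs += rhs;
      else if (op == '-') lhs -= rhs;
      else if (op == '*') lhs *= rhs;
      else if (op == '/') lhs /= rhs;
      else lhs = int_power (lhs, rhs);
    }
}

/* Evaluate one line of infix input, for example "3 ^ 2".  */
double
calc_eval (char const *text)
{
  cur = text;
  return parse_exp (PREC_ADD);
}
```

With this table, `2 - 3 - 4' groups as `(2 - 3) - 4' because `-' is left-associative, `2 ^ 3 ^ 2' groups as `2 ^ (3 ^ 2)' because `^' is right-associative, and `-2 ^ 2' groups as `-(2 ^ 2)' because `NEG' ranks below `^'.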

	2435

	2436 Here is a sample run of `calc.y':

	2437

	2438 $ calc

	2439 4 + 4.5 - (34/(8*3+-3))

	2440 6.880952381

	2441 -56 + 2

	2442 -54

	2443 3 ^ 2

	2444 9

	2445

	2446

	2447 File: bison.info, Node: Simple Error Recovery, Next: Location Tracking Calc, Prev: Infix Calc, Up: Examples

	2448

	2449 2.3 Simple Error Recovery

	2450 =========================

	2451

	2452 Up to this point, this manual has not addressed the issue of "error

	2453 recovery"--how to continue parsing after the parser detects a syntax

	2454 error. All we have handled is error reporting with `yyerror'. Recall

	2455 that by default `yyparse' returns after calling `yyerror'. This means

	2456 that an erroneous input line causes the calculator program to exit.

	2457 Now we show how to rectify this deficiency.

	2458

	2459 The Bison language itself includes the reserved word `error', which

	2460 may be included in the grammar rules. In the example below it has been

	2461 added to one of the alternatives for `line':

	2462

	2463 line: '\n'

	2464 | exp '\n' { printf ("\t%.10g\n", $1); }

	2465 | error '\n' { yyerrok; }

	2466 ;

	2467

	2468 This addition to the grammar allows for simple error recovery in the

	2469 event of a syntax error. If an expression that cannot be evaluated is

	2470 read, the error will be recognized by the third rule for `line', and

	2471 parsing will continue. (The `yyerror' function is still called upon to

	2472 print its message as well.) The action executes the statement

	2473 `yyerrok', a macro defined automatically by Bison; its meaning is that

	2474 error recovery is complete (*note Error Recovery::). Note the

	2475 difference between `yyerrok' and `yyerror'; neither one is a misprint.

	2476

	2477 This form of error recovery deals with syntax errors. There are

	2478 other kinds of errors; for example, division by zero, which raises an

	2479 exception signal that is normally fatal. A real calculator program

	2480 must handle this signal and use `longjmp' to return to `main' and

	2481 resume parsing input lines; it would also have to discard the rest of

	2482 the current line of input. We won't discuss this issue further because

	2483 it is not specific to Bison programs.
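
For readers unfamiliar with `longjmp', here is a minimal C sketch of the control flow involved. It is an illustration only and makes a simplifying assumption: instead of installing a signal handler, it tests the divisor explicitly and jumps from there. The names `recover_point', `checked_div' and `try_div' are hypothetical and do not come from Bison.

```c
#include <setjmp.h>

static jmp_buf recover_point;   /* Where to resume after an error.  */

/* Divide A by B, jumping back to the recovery point instead of
   performing a division by zero.  */
static int
checked_div (int a, int b)
{
  if (b == 0)
    longjmp (recover_point, 1);
  return a / b;
}

/* Returns 0 and stores the quotient in *quotient, or returns -1 if
   control came back through longjmp.  */
int
try_div (int a, int b, int *quotient)
{
  if (setjmp (recover_point) != 0)
    return -1;                  /* Reached via longjmp.  */
  *quotient = checked_div (a, b);
  return 0;
}
```

In a calculator, the `setjmp' call would sit in the loop that reads input lines, so that a failed computation abandons the current line and moves on to the next one.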

	2484

	2485

	2486 File: bison.info, Node: Location Tracking Calc, Next: Multi-function Calc, Prev: Simple Error Recovery, Up: Examples

	2487

	2488 2.4 Location Tracking Calculator: `ltcalc'

	2489 ==========================================

	2490

	2491 This example extends the infix notation calculator with location

	2492 tracking. This feature will be used to improve the error messages. For

	2493 the sake of clarity, this example is a simple integer calculator, since

	2494 most of the work needed to use locations will be done in the lexical

	2495 analyzer.

	2496

	2497 * Menu:

	2498

	2499 * Ltcalc Declarations:: Bison and C declarations for ltcalc.

	2500 * Ltcalc Rules:: Grammar rules for ltcalc, with explanations.

	2501 * Ltcalc Lexer:: The lexical analyzer.

	2502

	2503

	2504 File: bison.info, Node: Ltcalc Declarations, Next: Ltcalc Rules, Up: Location Tracking Calc

	2505

	2506 2.4.1 Declarations for `ltcalc'

	2507 -------------------------------

	2508

	2509 The C and Bison declarations for the location tracking calculator are

	2510 nearly the same as those for the infix notation calculator; most notably, `YYSTYPE' is now `int'.

	2511

	2512 /* Location tracking calculator. */

	2513

	2514 %{

	2515 #define YYSTYPE int

	2516 #include <math.h>

	2517 int yylex (void);

	2518 void yyerror (char const *);

	2519 %}

	2520

	2521 /* Bison declarations. */

	2522 %token NUM

	2523

	2524 %left '-' '+'

	2525 %left '*' '/'

	2526 %left NEG

	2527 %right '^'

	2528

	2529 %% /* The grammar follows. */

	2530

	2531 Note there are no declarations specific to locations. Defining a data

	2532 type for storing locations is not needed: we will use the type provided

	2533 by default (*note Data Types of Locations: Location Type.), which is a

	2534 four member structure with the following integer fields: `first_line',

	2535 `first_column', `last_line' and `last_column'. By convention, and in

	2536 accordance with the GNU Coding Standards and common practice, the line

	2537 and column count both start at 1.

	2538

	2539

	2540 File: bison.info, Node: Ltcalc Rules, Next: Ltcalc Lexer, Prev: Ltcalc Declarations, Up: Location Tracking Calc

	2541

	2542 2.4.2 Grammar Rules for `ltcalc'

	2543 --------------------------------

	2544

	2545 Whether or not you handle locations has no effect on the syntax of your

	2546 language. Therefore, grammar rules for this example will be very close

	2547 to those of the previous example: we will only modify them to benefit

	2548 from the new information.

	2549

	2550 Here, we will use locations to report divisions by zero, and locate

	2551 the wrong expressions or subexpressions.

	2552

	2553 input : /* empty */

	2554 | input line

	2555 ;

	2556

	2557 line : '\n'

	2558 | exp '\n' { printf ("%d\n", $1); }

	2559 ;

	2560

	2561 exp : NUM { $$ = $1; }

	2562 | exp '+' exp { $$ = $1 + $3; }

	2563 | exp '-' exp { $$ = $1 - $3; }

	2564 | exp '*' exp { $$ = $1 * $3; }

	2565 | exp '/' exp

	2566 {

	2567 if ($3)

	2568 $$ = $1 / $3;

	2569 else

	2570 {

	2571 $$ = 1;

	2572 fprintf (stderr, "%d.%d-%d.%d: division by zero",

	2573 @3.first_line, @3.first_column,

	2574 @3.last_line, @3.last_column);

	2575 }

	2576 }

	2577 | '-' exp %prec NEG { $$ = -$2; }

	2578 | exp '^' exp { $$ = pow ($1, $3); }

	2579 | '(' exp ')' { $$ = $2; }

	2580

	2581 This code shows how to reach locations inside of semantic actions, by

	2582 using the pseudo-variables `@N' for rule components, and the

	2583 pseudo-variable `@$' for groupings.

	2584

	2585 We don't need to assign a value to `@$': the output parser does it

	2586 automatically. By default, before executing the C code of each action,

	2587 `@$' is set to range from the beginning of `@1' to the end of `@N', for

	2588 a rule with N components. This behavior can be redefined (*note

	2589 Default Action for Locations: Location Default Action.), and for very

	2590 specific rules, `@$' can be computed by hand.
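
The default computation described above can be pictured with a small C sketch. The type `loc_span' and the function `span_locations' are hypothetical names made up for this illustration; only the four field names come from the text. The real parser performs this step internally.

```c
/* Hypothetical stand-in with the same four integer fields as the
   default location type.  */
typedef struct
{
  int first_line;
  int first_column;
  int last_line;
  int last_column;
} loc_span;

/* Location of a grouping whose first component is at FIRST (the role
   of @1) and whose last component is at LAST (the role of @N).  */
loc_span
span_locations (loc_span first, loc_span last)
{
  loc_span result;
  result.first_line = first.first_line;      /* Beginning of @1.  */
  result.first_column = first.first_column;
  result.last_line = last.last_line;         /* End of @N.  */
  result.last_column = last.last_column;
  return result;
}
```

For a rule such as `exp '/' exp', the result would start where the left operand starts and end where the right operand ends, which is why `@3' alone is used above to point at the offending divisor.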

	2591

	2592

	2593 File: bison.info, Node: Ltcalc Lexer, Prev: Ltcalc Rules, Up: Location Tracking Calc

	2594

	2595 2.4.3 The `ltcalc' Lexical Analyzer.

	2596 ------------------------------------

	2597

	2598 Until now, we relied on Bison's defaults to enable location tracking.

	2599 The next step is to rewrite the lexical analyzer, and make it able to

	2600 feed the parser with the token locations, as it already does for

	2601 semantic values.

	2602

	2603 To this end, we must take into account every single character of the

	2604 input text, to keep the computed locations from being fuzzy or wrong:

	2605

	2606 int

	2607 yylex (void)

	2608 {

	2609 int c;

	2610

	2611 /* Skip white space. */

	2612 while ((c = getchar ()) == ' ' || c == '\t')

	2613 ++yylloc.last_column;

	2614

	2615 /* Step. */

	2616 yylloc.first_line = yylloc.last_line;

	2617 yylloc.first_column = yylloc.last_column;

	2618

	2619 /* Process numbers. */

	2620 if (isdigit (c))

	2621 {

	2622 yylval = c - '0';

	2623 ++yylloc.last_column;

	2624 while (isdigit (c = getchar ()))

	2625 {

	2626 ++yylloc.last_column;

	2627 yylval = yylval * 10 + c - '0';

	2628 }

	2629 ungetc (c, stdin);

	2630 return NUM;

	2631 }

	2632

	2633 /* Return end-of-input. */

	2634 if (c == EOF)

	2635 return 0;

	2636

	2637 /* Return a single char, and update location. */

	2638 if (c == '\n')

	2639 {

	2640 ++yylloc.last_line;

	2641 yylloc.last_column = 0;

	2642 }

	2643 else

	2644 ++yylloc.last_column;

	2645 return c;

	2646 }

	2647

	2648 Basically, the lexical analyzer performs the same processing as

	2649 before: it skips blanks and tabs, and reads numbers or single-character

	2650 tokens. In addition, it updates `yylloc', the global variable (of type

	2651 `YYLTYPE') containing the token's location.

	2652

	2653 Now, each time this function returns a token, the parser has its

	2654 number as well as its semantic value, and its location in the text.

	2655 The last needed change is to initialize `yylloc', for example in the

	2656 controlling function:

	2657

	2658 int

	2659 main (void)

	2660 {

	2661 yylloc.first_line = yylloc.last_line = 1;

	2662 yylloc.first_column = yylloc.last_column = 0;

	2663 return yyparse ();

	2664 }

	2665

	2666 Remember that computing locations is not a matter of syntax. Every

	2667 character must be associated with a location update, whether it is in

	2668 valid input, in comments, in literal strings, and so on.

	2669

	2670

	2671 File: bison.info, Node: Multi-function Calc, Next: Exercises, Prev: Location Tracking Calc, Up: Examples

	2672

	2673 2.5 Multi-Function Calculator: `mfcalc'

	2674 =======================================

	2675

	2676 Now that the basics of Bison have been discussed, it is time to move on

	2677 to a more advanced problem. The above calculators provided only five

	2678 functions, `+', `-', `*', `/' and `^'. It would be nice to have a

	2679 calculator that provides other mathematical functions such as `sin',

	2680 `cos', etc.

	2681

	2682 It is easy to add new operators to the infix calculator as long as

	2683 they are only single-character literals. The lexical analyzer `yylex'

	2684 passes back all nonnumeric characters as tokens, so new grammar rules

	2685 suffice for adding a new operator. But we want something more

	2686 flexible: built-in functions whose syntax has this form:

	2687

	2688 FUNCTION_NAME (ARGUMENT)

	2689

	2690 At the same time, we will add memory to the calculator, by allowing you

	2691 to create named variables, store values in them, and use them later.

	2692 Here is a sample session with the multi-function calculator:

	2693

	2694 $ mfcalc

	2695 pi = 3.141592653589

	2696 3.1415926536

	2697 sin(pi)

	2698 0.0000000000

	2699 alpha = beta1 = 2.3

	2700 2.3000000000

	2701 alpha

	2702 2.3000000000

	2703 ln(alpha)

	2704 0.8329091229

	2705 exp(ln(beta1))

	2706 2.3000000000

	2707 $

	2708

	2709 Note that multiple assignment and nested function calls are

	2710 permitted.

	2711

	2712 * Menu:

	2713

	2714 * Mfcalc Declarations:: Bison declarations for multi-function calculator.

	2715 * Mfcalc Rules:: Grammar rules for the calculator.

	2716 * Mfcalc Symbol Table:: Symbol table management subroutines.

	2717

	2718

	2719 File: bison.info, Node: Mfcalc Declarations, Next: Mfcalc Rules, Up: Multi-function Calc

	2720

	2721 2.5.1 Declarations for `mfcalc'

	2722 -------------------------------

	2723

	2724 Here are the C and Bison declarations for the multi-function calculator.

	2725

	2726 %{

	2727 #include <math.h> /* For math functions, cos(), sin(), etc. */

	2728 #include "calc.h" /* Contains definition of `symrec'. */

	2729 int yylex (void);

	2730 void yyerror (char const *);

	2731 %}

	2732 %union {

	2733 double val; /* For returning numbers. */

	2734 symrec *tptr; /* For returning symbol-table pointers. */

	2735 }

	2736 %token <val> NUM /* Simple double precision number. */

	2737 %token <tptr> VAR FNCT /* Variable and Function. */

	2738 %type <val> exp

	2739

	2740 %right '='

	2741 %left '-' '+'

	2742 %left '*' '/'

	2743 %left NEG /* negation--unary minus */

	2744 %right '^' /* exponentiation */

	2745 %% /* The grammar follows. */

	2746

	2747 The above grammar introduces only two new features of the Bison

	2748 language. These features allow semantic values to have various data

	2749 types (*note More Than One Value Type: Multiple Types.).

	2750

	2751 The `%union' declaration specifies the entire list of possible types;

	2752 this is instead of defining `YYSTYPE'. The allowable types are now

	2753 double-floats (for `exp' and `NUM') and pointers to entries in the

	2754 symbol table. *Note The Collection of Value Types: Union Decl.

	2755

	2756 Since values can now have various types, it is necessary to

	2757 associate a type with each grammar symbol whose semantic value is used.

	2758 These symbols are `NUM', `VAR', `FNCT', and `exp'. Their declarations

	2759 are augmented with information about their data type (placed between

	2760 angle brackets).

	2761

	2762 The Bison construct `%type' is used for declaring nonterminal

	2763 symbols, just as `%token' is used for declaring token types. We have

	2764 not used `%type' before because nonterminal symbols are normally

	2765 declared implicitly by the rules that define them. But `exp' must be

	2766 declared explicitly so we can specify its value type. *Note

	2767 Nonterminal Symbols: Type Decl.
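As a rough illustration of what the `%union' declaration above amounts to in C, the sketch below spells out a union with the same two members. The type name `semantic_value', the reduced `symrec', and the helper functions are assumptions made for this example; in a real parser Bison generates the union as `YYSTYPE' and picks the member from the `<val>' or `<tptr>' annotation.

```c
#include <stdio.h>

/* Reduced stand-in for the `symrec' defined in calc.h (assumption). */
typedef struct symrec
{
  char const *name;
  double var;
} symrec;

/* Hand-written equivalent of the %union above; Bison would call it YYSTYPE. */
typedef union
{
  double val;    /* For returning numbers. */
  symrec *tptr;  /* For returning symbol-table pointers. */
} semantic_value;

/* What a `<val>' annotation selects: the numeric member. */
static double
number_of (semantic_value v)
{
  return v.val;
}

/* What a `<tptr>' annotation selects: the symbol-table pointer. */
static double
variable_of (semantic_value v)
{
  return v.tptr->var;
}

/* Build one value of each kind and print both. */
static void
show_values (void)
{
  symrec alpha = { "alpha", 2.3 };
  semantic_value num, var;
  num.val = 1.5;
  var.tptr = &alpha;
  printf ("%g %g\n", number_of (num), variable_of (var));
}
```

Only one member of the union is meaningful at a time, which is why each grammar symbol whose value is used must be tied to exactly one member.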

	2768

	2769

	2770 File: bison.info, Node: Mfcalc Rules, Next: Mfcalc Symbol Table, Prev: Mfcalc Declarations, Up: Multi-function Calc

	2771

	2772 2.5.2 Grammar Rules for `mfcalc'

	2773 --------------------------------

	2774

	2775 Here are the grammar rules for the multi-function calculator. Most of

	2776 them are copied directly from `calc'; three rules, those which mention

	2777 `VAR' or `FNCT', are new.

	2778

	2779 input: /* empty */

	2780 | input line

	2781 ;

	2782

	2783 line:

	2784 '\n'

	2785 | exp '\n' { printf ("\t%.10g\n", $1); }

	2786 | error '\n' { yyerrok; }

	2787 ;

	2788

	2789 exp: NUM { $$ = $1; }

	2790 | VAR { $$ = $1->value.var; }

	2791 | VAR '=' exp { $$ = $3; $1->value.var = $3; }

	2792 | FNCT '(' exp ')' { $$ = (*($1->value.fnctptr))($3); }

	2793 | exp '+' exp { $$ = $1 + $3; }

	2794 | exp '-' exp { $$ = $1 - $3; }

	2795 | exp '*' exp { $$ = $1 * $3; }

	2796 | exp '/' exp { $$ = $1 / $3; }

	2797 | '-' exp %prec NEG { $$ = -$2; }

	2798 | exp '^' exp { $$ = pow ($1, $3); }

	2799 | '(' exp ')' { $$ = $2; }

	2800 ;

	2801 /* End of grammar. */

	2802 %%
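The action for the `FNCT' rule calls a C function through the pointer stored in the symbol record. A minimal sketch of that call in plain C follows; the reduced `symrec' and the functions `twice' and `apply_function' are hypothetical names introduced for illustration and are not part of `mfcalc'.

```c
#include <stdio.h>

/* Function type, as in calc.h. */
typedef double (*func_t) (double);

/* Reduced symbol record holding a variable or a function (assumption). */
typedef struct symrec
{
  char const *name;
  union
  {
    double var;
    func_t fnctptr;
  } value;
} symrec;

/* Hypothetical built-in, used here instead of a math.h function. */
static double
twice (double x)
{
  return 2 * x;
}

/* Same expression as the grammar action, with $1 and $3 spelled out. */
static double
apply_function (symrec *fn, double arg)
{
  return (*(fn->value.fnctptr)) (arg);
}

/* Store a function in a record, then call it through the record. */
static void
show_call (void)
{
  symrec rec;
  rec.name = "twice";
  rec.value.fnctptr = twice;
  printf ("%s(21) = %.10g\n", rec.name, apply_function (&rec, 21.0));
}
```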

	2803

	2804

	2805 File: bison.info, Node: Mfcalc Symbol Table, Prev: Mfcalc Rules, Up: Multi-function Calc

	2806

	2807 2.5.3 The `mfcalc' Symbol Table

	2808 -------------------------------

	2809

	2810 The multi-function calculator requires a symbol table to keep track of

	2811 the names and meanings of variables and functions. This doesn't affect

	2812 the grammar rules (except for the actions) or the Bison declarations,

	2813 but it requires some additional C functions for support.

	2814

	2815 The symbol table itself consists of a linked list of records. Its

	2816 definition, which is kept in the header `calc.h', is as follows. It

	2817 provides for either functions or variables to be placed in the table.

	2818

	2819 /* Function type. */

	2820 typedef double (*func_t) (double);

	2821

	2822 /* Data type for links in the chain of symbols. */

	2823 struct symrec

	2824 {

	2825 char *name; /* name of symbol */

	2826 int type; /* type of symbol: either VAR or FNCT */

	2827 union

	2828 {

	2829 double var; /* value of a VAR */

	2830 func_t fnctptr; /* value of a FNCT */

	2831 } value;

	2832 struct symrec *next; /* link field */

	2833 };

	2834

	2835 typedef struct symrec symrec;

	2836

	2837 /* The symbol table: a chain of `struct symrec'. */

	2838 extern symrec *sym_table;

	2839

	2840 symrec *putsym (char const *, int);

	2841 symrec *getsym (char const *);

	2842

	2843 The new version of `main' includes a call to `init_table', a

	2844 function that initializes the symbol table. Here it is, and

	2845 `init_table' as well:

	2846

	2847 #include <stdio.h>

	2848

	2849 /* Called by yyparse on error. */

	2850 void

	2851 yyerror (char const *s)

	2852 {

	2853 printf ("%s\n", s);

	2854 }

	2855

	2856 struct init

	2857 {

	2858 char const *fname;

	2859 double (*fnct) (double);

	2860 };

	2861

	2862 struct init const arith_fncts[] =

	2863 {

	2864 "sin", sin,

	2865 "cos", cos,

	2866 "atan", atan,

	2867 "ln", log,

	2868 "exp", exp,

	2869 "sqrt", sqrt,

	2870 0, 0

	2871 };

	2872

	2873 /* The symbol table: a chain of `struct symrec'. */

	2874 symrec *sym_table;

	2875

	2876 /* Put arithmetic functions in table. */

	2877 void

	2878 init_table (void)

	2879 {

	2880 int i;

	2881 symrec *ptr;

	2882 for (i = 0; arith_fncts[i].fname != 0; i++)

	2883 {

	2884 ptr = putsym (arith_fncts[i].fname, FNCT);

	2885 ptr->value.fnctptr = arith_fncts[i].fnct;

	2886 }

	2887 }

	2888

	2889 int

	2890 main (void)

	2891 {

	2892 init_table ();

	2893 return yyparse ();

	2894 }

	2895

	2896 By simply editing the initialization list and adding the necessary

	2897 include files, you can add further functions to the calculator.

	2898

	2899 Two important functions allow look-up and installation of symbols in

	2900 the symbol table. The function `putsym' is passed a name and the type

	2901 (`VAR' or `FNCT') of the object to be installed. The object is linked

	2902 to the front of the list, and a pointer to the object is returned. The

	2903 function `getsym' is passed the name of the symbol to look up. If

	2904 found, a pointer to that symbol is returned; otherwise zero is returned.

	2905

	2906 symrec *

	2907 putsym (char const *sym_name, int sym_type)

	2908 {

	2909 symrec *ptr;

	2910 ptr = (symrec *) malloc (sizeof (symrec));

	2911 ptr->name = (char *) malloc (strlen (sym_name) + 1);

	2912 strcpy (ptr->name,sym_name);

	2913 ptr->type = sym_type;

	2914 ptr->value.var = 0; /* Set value to 0 even if fctn. */

	2915 ptr->next = (struct symrec *)sym_table;

	2916 sym_table = ptr;

	2917 return ptr;

	2918 }

	2919

	2920 symrec *

	2921 getsym (char const *sym_name)

	2922 {

	2923 symrec *ptr;

	2924 for (ptr = sym_table; ptr != (symrec *) 0;

	2925 ptr = (symrec *)ptr->next)

	2926 if (strcmp (ptr->name,sym_name) == 0)

	2927 return ptr;

	2928 return 0;

	2929 }
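The listing above calls `malloc', `strlen', `strcpy' and `strcmp', so it needs `<stdlib.h>' and `<string.h>' in addition to the headers shown. The following self-contained sketch repeats the two functions with those headers and with a check on each allocation. The constants `VAR' and `FNCT' are defined by hand only because no Bison-generated header is available here (their values are assumptions), and returning a null pointer on allocation failure is a choice of this sketch, not something the listing above does.

```c
#include <stdlib.h>
#include <string.h>

/* Stand-ins for the token codes Bison would generate (assumed values). */
enum { VAR = 258, FNCT = 259 };

typedef double (*func_t) (double);

typedef struct symrec
{
  char *name;
  int type;
  union
  {
    double var;
    func_t fnctptr;
  } value;
  struct symrec *next;
} symrec;

/* The symbol table: a chain of `struct symrec'. */
static symrec *sym_table = 0;

/* Install a symbol at the front of the list; return 0 if out of memory. */
static symrec *
putsym (char const *sym_name, int sym_type)
{
  symrec *ptr = (symrec *) malloc (sizeof *ptr);
  if (ptr == 0)
    return 0;
  ptr->name = (char *) malloc (strlen (sym_name) + 1);
  if (ptr->name == 0)
    {
      free (ptr);
      return 0;
    }
  strcpy (ptr->name, sym_name);
  ptr->type = sym_type;
  ptr->value.var = 0;
  ptr->next = sym_table;
  sym_table = ptr;
  return ptr;
}

/* Look a name up; return 0 if it is not in the table. */
static symrec *
getsym (char const *sym_name)
{
  symrec *ptr;
  for (ptr = sym_table; ptr != 0; ptr = ptr->next)
    if (strcmp (ptr->name, sym_name) == 0)
      return ptr;
  return 0;
}
```

With these definitions, installing a name with `putsym' and then passing the same name to `getsym' yields the same record, while `getsym' on a name that was never installed yields a null pointer.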

	2930

	2931 The function `yylex' must now recognize variables, numeric values,

	2932 and the single-character arithmetic operators. Strings of alphanumeric

	2933 characters with a leading letter are recognized as either variables or

	2934 functions depending on what the symbol table says about them.

	2935

	2936 The string is passed to `getsym' for look up in the symbol table. If

	2937 the name appears in the table, a pointer to its location and its type

	2938 (`VAR' or `FNCT') are returned to `yyparse'. If it is not already in

	2939 the table, then it is installed as a `VAR' using `putsym'. Again, a

	2940 pointer and its type (which must be `VAR') are returned to `yyparse'.

	2941

	2942 No change is needed in the handling of numeric values and arithmetic

	2943 operators in `yylex'.

	2944

	2945 #include <ctype.h>

	2946

	2947 int

	2948 yylex (void)

	2949 {

	2950 int c;

	2951

	2952 /* Ignore white space, get first nonwhite character. */

	2953 while ((c = getchar ()) == ' ' || c == '\t');

	2954

	2955 if (c == EOF)

	2956 return 0;

	2957

	2958 /* Char starts a number => parse the number. */

	2959 if (c == '.' || isdigit (c))

	2960 {

	2961 ungetc (c, stdin);

	2962 scanf ("%lf", &yylval.val);

	2963 return NUM;

	2964 }

	2965

	2966 /* Char starts an identifier => read the name. */

	2967 if (isalpha (c))

	2968 {

	2969 symrec *s;

	2970 static char *symbuf = 0;

	2971 static int length = 0;

	2972 int i;

	2973

	2974 /* Initially make the buffer long enough

	2975 for a 40-character symbol name. */

	2976 if (length == 0)

	2977 length = 40, symbuf = (char *)malloc (length + 1);

	2978

	2979 i = 0;

	2980 do

	2981 {

	2982 /* If buffer is full, make it bigger. */

	2983 if (i == length)

	2984 {

	2985 length *= 2;

	2986 symbuf = (char *) realloc (symbuf, length + 1);

	2987 }

	2988 /* Add this character to the buffer. */

	2989 symbuf[i++] = c;

	2990 /* Get another character. */

	2991 c = getchar ();

	2992 }

	2993 while (isalnum (c));

	2994

	2995 ungetc (c, stdin);

	2996 symbuf[i] = '\0';

	2997

	2998 s = getsym (symbuf);

	2999 if (s == 0)

	3000 s = putsym (symbuf, VAR);

	3001 yylval.tptr = s;

	3002 return s->type;

	3003 }

	3004

	3005 /* Any other character is a token by itself. */

	3006 return c;

	3007 }
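The identifier loop in `yylex' grows its buffer by doubling whenever it fills up. The sketch below isolates that technique so it can be tried without a parser: it reads from a string instead of `stdin', starts with a deliberately small buffer, and returns a freshly allocated name. The function name `read_identifier', its parameters, and the decision to return a null pointer on allocation failure are assumptions of this example.

```c
#include <ctype.h>
#include <stdlib.h>

/* Copy the identifier at the start of TEXT into a malloc'd string.
   Store the number of characters consumed in *CONSUMED.
   Return 0 if TEXT does not start with a letter or memory runs out.  */
static char *
read_identifier (char const *text, size_t *consumed)
{
  size_t length = 4;            /* Small on purpose, to exercise growth. */
  size_t i = 0;
  char *symbuf;

  if (!isalpha ((unsigned char) text[0]))
    return 0;

  symbuf = (char *) malloc (length + 1);
  if (symbuf == 0)
    return 0;

  do
    {
      /* If buffer is full, make it bigger. */
      if (i == length)
        {
          char *bigger;
          length *= 2;
          bigger = (char *) realloc (symbuf, length + 1);
          if (bigger == 0)
            {
              free (symbuf);
              return 0;
            }
          symbuf = bigger;
        }
      /* Add this character to the buffer. */
      symbuf[i] = text[i];
      i++;
    }
  while (isalnum ((unsigned char) text[i]));

  symbuf[i] = '\0';
  *consumed = i;
  return symbuf;
}
```

The buffer is always allocated one byte larger than `length', so there is room for the terminating null character even when the name fills the buffer exactly. The casts to `unsigned char' are needed here because the characters come from a `char' array rather than from `getchar'.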

	3008

	3009 This program is both powerful and flexible. You may easily add new

	3010 functions, and it is a simple job to modify this code to install

	3011 predefined variables such as `pi' or `e' as well.

	3012

	3013

	3014 File: bison.info, Node: Exercises, Prev: Multi-function Calc, Up: Examples

	3015

	3016 2.6 Exercises

	3017 =============

	3018

	3019 1. Add some new functions from `math.h' to the initialization list.

	3020

	3021 2. Add another array that contains constants and their values. Then

	3022 modify `init_table' to add these constants to the symbol table.

	3023 It will be easiest to give the constants type `VAR'.

	3024

	3025 3. Make the program report an error if the user refers to an

	3026 uninitialized variable in any way except to store a value in it.

	3027

	3028

	3029 File: bison.info, Node: Grammar File, Next: Interface, Prev: Examples, Up: Top

	3030

	3031 3 Bison Grammar Files

	3032 *********************

	3033

	3034 Bison takes as input a context-free grammar specification and produces a

	3035 C-language function that recognizes correct instances of the grammar.

	3036

	3037 The Bison grammar input file conventionally has a name ending in

	3038 `.y'. *Note Invoking Bison: Invocation.

	3039

	3040 * Menu:

	3041

	3042 * Grammar Outline:: Overall layout of the grammar file.

	3043 * Symbols:: Terminal and nonterminal symbols.

	3044 * Rules:: How to write grammar rules.

	3045 * Recursion:: Writing recursive rules.

	3046 * Semantics:: Semantic values and actions.

	3047 * Locations:: Locations and actions.

	3048 * Declarations:: All kinds of Bison declarations are described here.

	3049 * Multiple Parsers:: Putting more than one Bison parser in one program.

	3050

	3051

	3052 File: bison.info, Node: Grammar Outline, Next: Symbols, Up: Grammar File

	3053

	3054 3.1 Outline of a Bison Grammar

	3055 ==============================

	3056

	3057 A Bison grammar file has four main sections, shown here with the

	3058 appropriate delimiters:

	3059

	3060 %{

	3061 PROLOGUE

	3062 %}

	3063

	3064 BISON DECLARATIONS

	3065

	3066 %%

	3067 GRAMMAR RULES

	3068 %%

	3069

	3070 EPILOGUE

	3071

	3072 Comments enclosed in `/* ... */' may appear in any of the sections.

	3073 As a GNU extension, `//' introduces a comment that continues until end

	3074 of line.

	3075

	3076 * Menu:

	3077

	3078 * Prologue:: Syntax and usage of the prologue.

	3079 * Prologue Alternatives:: Syntax and usage of alternatives to the prologue.

	3080 * Bison Declarations:: Syntax and usage of the Bison declarations section.

	3081 * Grammar Rules:: Syntax and usage of the grammar rules section.

	3082 * Epilogue:: Syntax and usage of the epilogue.

	3083

	3084

	3085 File: bison.info, Node: Prologue, Next: Prologue Alternatives, Up: Grammar Outline

	3086

	3087 3.1.1 The prologue

	3088 ------------------

	3089

	3090 The PROLOGUE section contains macro definitions and declarations of

	3091 functions and variables that are used in the actions in the grammar

	3092 rules. These are copied to the beginning of the parser file so that

	3093 they precede the definition of `yyparse'. You can use `#include' to

	3094 get the declarations from a header file. If you don't need any C

	3095 declarations, you may omit the `%{' and `%}' delimiters that bracket

	3096 this section.

	3097

	3098 The PROLOGUE section is terminated by the first occurrence of `%}'

	3099 that is outside a comment, a string literal, or a character constant.

	3100

	3101 You may have more than one PROLOGUE section, intermixed with the

	3102 BISON DECLARATIONS. This allows you to have C and Bison declarations

	3103 that refer to each other. For example, the `%union' declaration may

	3104 use types defined in a header file, and you may wish to prototype

	3105 functions that take arguments of type `YYSTYPE'. This can be done with

	3106 two PROLOGUE blocks, one before and one after the `%union' declaration.

	3107

	3108 %{

	3109 #define _GNU_SOURCE

	3110 #include <stdio.h>

	3111 #include "ptypes.h"

	3112 %}

	3113

	3114 %union {

	3115 long int n;

	3116 tree t; /* `tree' is defined in `ptypes.h'. */

	3117 }

	3118

	3119 %{

	3120 static void print_token_value (FILE *, int, YYSTYPE);

	3121 #define YYPRINT(F, N, L) print_token_value (F, N, L)

	3122 %}

	3123

	3124 ...

	3125

	3126 When in doubt, it is usually safer to put prologue code before all

	3127 Bison declarations, rather than after. For example, any definitions of

	3128 feature test macros like `_GNU_SOURCE' or `_POSIX_C_SOURCE' should

	3129 appear before all Bison declarations, as feature test macros can affect

	3130 the behavior of Bison-generated `#include' directives.

	3131

	3132

	3133 File: bison.info, Node: Prologue Alternatives, Next: Bison Declarations, Prev: Prologue, Up: Grammar Outline

	3134

	3135 3.1.2 Prologue Alternatives

	3136 ---------------------------

	3137

	3138 (The prologue alternatives described here are experimental. More user

	3139 feedback will help to determine whether they should become permanent

	3140 features.)

	3141

	3142 The functionality of PROLOGUE sections can often be subtle and

	3143 inflexible. As an alternative, Bison provides a %code directive with

	3144 an explicit qualifier field, which identifies the purpose of the code

	3145 and thus the location(s) where Bison should generate it. For C/C++,

	3146 the qualifier can be omitted for the default location, or it can be one

	3147 of `requires', `provides', `top'. *Note %code: Decl Summary.

	3148

	3149 Look again at the example of the previous section:

	3150

	3151 %{

	3152 #define _GNU_SOURCE

	3153 #include <stdio.h>

	3154 #include "ptypes.h"

	3155 %}

	3156

	3157 %union {

	3158 long int n;

	3159 tree t; /* `tree' is defined in `ptypes.h'. */

	3160 }

	3161

	3162 %{

	3163 static void print_token_value (FILE *, int, YYSTYPE);

	3164 #define YYPRINT(F, N, L) print_token_value (F, N, L)

	3165 %}

	3166

	3167 ...

	3168

	3169 Notice that there are two PROLOGUE sections here, but there's a subtle

	3170 distinction between their functionality. For example, if you decide to

	3171 override Bison's default definition for `YYLTYPE', in which PROLOGUE

	3172 section should you write your new definition? You should write it in

	3173 the first since Bison will insert that code into the parser source code

	3174 file _before_ the default `YYLTYPE' definition. In which PROLOGUE

	3175 section should you prototype an internal function, `trace_token', that

	3176 accepts `YYLTYPE' and `yytokentype' as arguments? You should prototype

	3177 it in the second since Bison will insert that code _after_ the

	3178 `YYLTYPE' and `yytokentype' definitions.

	3179

	3180 This distinction in functionality between the two PROLOGUE sections

	3181 is established by the appearance of the `%union' between them. This

	3182 behavior raises a few questions. First, why should the position of a

	3183 `%union' affect definitions related to `YYLTYPE' and `yytokentype'?

	3184 Second, what if there is no `%union'? In that case, the second kind of

	3185 PROLOGUE section is not available. This behavior is not intuitive.

	3186

	3187 To avoid this subtle `%union' dependency, rewrite the example using a

	3188 `%code top' and an unqualified `%code'. Let's go ahead and add the new

	3189 `YYLTYPE' definition and the `trace_token' prototype at the same time:

	3190

	3191 %code top {

	3192 #define _GNU_SOURCE

	3193 #include <stdio.h>

	3194

	3195 /* WARNING: The following code really belongs

	3196 * in a `%code requires'; see below. */

	3197

	3198 #include "ptypes.h"

	3199 #define YYLTYPE YYLTYPE

	3200 typedef struct YYLTYPE

	3201 {

	3202 int first_line;

	3203 int first_column;

	3204 int last_line;

	3205 int last_column;

	3206 char *filename;

	3207 } YYLTYPE;

	3208 }

	3209

	3210 %union {

	3211 long int n;

	3212 tree t; /* `tree' is defined in `ptypes.h'. */

	3213 }

	3214

	3215 %code {

	3216 static void print_token_value (FILE *, int, YYSTYPE);

	3217 #define YYPRINT(F, N, L) print_token_value (F, N, L)

	3218 static void trace_token (enum yytokentype token, YYLTYPE loc);

	3219 }

	3220

	3221 ...

	3222

	3223 In this way, `%code top' and the unqualified `%code' achieve the same

	3224 functionality as the two kinds of PROLOGUE sections, but it's always

	3225 explicit which kind you intend. Moreover, both kinds are always

	3226 available even in the absence of `%union'.

	3227

	3228 The `%code top' block above logically contains two parts. The first

	3229 two lines before the warning need to appear near the top of the parser

	3230 source code file. The first line after the warning is required by

	3231 `YYSTYPE' and thus also needs to appear in the parser source code file.

	3232 However, if you've instructed Bison to generate a parser header file

	3233 (*note %defines: Decl Summary.), you probably want that line to appear

	3234 before the `YYSTYPE' definition in that header file as well. The

	3235 `YYLTYPE' definition should also appear in the parser header file to

	3236 override the default `YYLTYPE' definition there.

	3237

	3238 In other words, in the `%code top' block above, all but the first two

	3239 lines are dependency code required by the `YYSTYPE' and `YYLTYPE'

	3240 definitions. Thus, they belong in one or more `%code requires':

	3241

	3242 %code top {

	3243 #define _GNU_SOURCE

	3244 #include <stdio.h>

	3245 }

	3246

	3247 %code requires {

	3248 #include "ptypes.h"

	3249 }

	3250 %union {

	3251 long int n;

	3252 tree t; /* `tree' is defined in `ptypes.h'. */

	3253 }

	3254

	3255 %code requires {

	3256 #define YYLTYPE YYLTYPE

	3257 typedef struct YYLTYPE

	3258 {

	3259 int first_line;

	3260 int first_column;

	3261 int last_line;

	3262 int last_column;

	3263 char *filename;

	3264 } YYLTYPE;

	3265 }

	3266

	3267 %code {

	3268 static void print_token_value (FILE *, int, YYSTYPE);

	3269 #define YYPRINT(F, N, L) print_token_value (F, N, L)

	3270 static void trace_token (enum yytokentype token, YYLTYPE loc);

	3271 }

	3272

	3273 ...

	3274

	3275 Now Bison will insert `#include "ptypes.h"' and the new `YYLTYPE'

	3276 definition before the Bison-generated `YYSTYPE' and `YYLTYPE'

	3277 definitions in both the parser source code file and the parser header

	3278 file. (By the same reasoning, `%code requires' would also be the

	3279 appropriate place to write your own definition for `YYSTYPE'.)

	3280

	3281 When you are writing dependency code for `YYSTYPE' and `YYLTYPE', you

	3282 should prefer `%code requires' over `%code top' regardless of whether

	3283 you instruct Bison to generate a parser header file. When you are

	3284 writing code that you need Bison to insert only into the parser source

	3285 code file and that has no special need to appear at the top of that

	3286 file, you should prefer the unqualified `%code' over `%code top'.

	3287 These practices will make the purpose of each block of your code

	3288 explicit to Bison and to other developers reading your grammar file.

	3289 Following these practices, we expect the unqualified `%code' and `%code

	3290 requires' to be the most important of the four PROLOGUE alternatives.

	3291

	3292 At some point while developing your parser, you might decide to

	3293 provide `trace_token' to modules that are external to your parser.

	3294 Thus, you might wish for Bison to insert the prototype into both the

	3295 parser header file and the parser source code file. Since this

	3296 function is not a dependency required by `YYSTYPE' or `YYLTYPE', it

	3297 doesn't make sense to move its prototype to a `%code requires'. More

	3298 importantly, since it depends upon `YYLTYPE' and `yytokentype', `%code

	3299 requires' is not sufficient. Instead, move its prototype from the

	3300 unqualified `%code' to a `%code provides':

	3301

	3302 %code top {

	3303 #define _GNU_SOURCE

	3304 #include <stdio.h>

	3305 }

	3306

	3307 %code requires {

	3308 #include "ptypes.h"

	3309 }

	3310 %union {

	3311 long int n;

	3312 tree t; /* `tree' is defined in `ptypes.h'. */

	3313 }

	3314

	3315 %code requires {

	3316 #define YYLTYPE YYLTYPE

	3317 typedef struct YYLTYPE

	3318 {

	3319 int first_line;

	3320 int first_column;

	3321 int last_line;

	3322 int last_column;

	3323 char *filename;

	3324 } YYLTYPE;

	3325 }

	3326

	3327 %code provides {

	3328 void trace_token (enum yytokentype token, YYLTYPE loc);

	3329 }

	3330

	3331 %code {

	3332 static void print_token_value (FILE *, int, YYSTYPE);

	3333 #define YYPRINT(F, N, L) print_token_value (F, N, L)

	3334 }

	3335

	3336 ...

	3337

	3338 Bison will insert the `trace_token' prototype into both the parser

	3339 header file and the parser source code file after the definitions for

	3340 `yytokentype', `YYLTYPE', and `YYSTYPE'.

	3341

	3342 The above examples are careful to write directives in an order that

	3343 reflects the layout of the generated parser source code and header

	3344 files: `%code top', `%code requires', `%code provides', and then

	3345 `%code'. While your grammar files may generally be easier to read if

	3346 you also follow this order, Bison does not require it. Instead, Bison

	3347 lets you choose an organization that makes sense to you.

	3348

	3349 You may declare any of these directives multiple times in the

	3350 grammar file. In that case, Bison concatenates the contained code in

	3351 declaration order. This is the only way in which the position of one

	3352 of these directives within the grammar file affects its functionality.

	3353

	3354 The result of the previous two properties is greater flexibility in

	3355 how you may organize your grammar file. For example, you may organize

	3356 semantic-type-related directives by semantic type:

	3357

	3358 %code requires { #include "type1.h" }

	3359 %union { type1 field1; }

	3360 %destructor { type1_free ($$); } <field1>

	3361 %printer { type1_print ($$); } <field1>

	3362

	3363 %code requires { #include "type2.h" }

	3364 %union { type2 field2; }

	3365 %destructor { type2_free ($$); } <field2>

	3366 %printer { type2_print ($$); } <field2>

	3367

	3368 You could even place each of the above directive groups in the rules

	3369 section of the grammar file next to the set of rules that uses the

	3370 associated semantic type. (In the rules section, you must terminate

	3371 each of those directives with a semicolon.) And you don't have to

	3372 worry that some directive (like a `%union') in the definitions section

	3373 is going to adversely affect their functionality in some

	3374 counter-intuitive manner just because it comes first. Such an

	3375 organization is not possible using PROLOGUE sections.

	3376

	3377 This section has been concerned with explaining the advantages of

	3378 the four PROLOGUE alternatives over the original Yacc PROLOGUE.

	3379 However, in most cases when using these directives, you shouldn't need

	3380 to think about all the low-level ordering issues discussed here.

	3381 Instead, you should simply use these directives to label each block of

	3382 your code according to its purpose and let Bison handle the ordering.

	3383 `%code' is the most generic label. Move code to `%code requires',

	3384 `%code provides', or `%code top' as needed.
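The examples above only prototype `trace_token'. To illustrate why that prototype has to follow the `YYLTYPE' and `yytokentype' definitions, here is one hypothetical way the function could be written in plain C. The token codes in `yytokentype', the helper `format_location', and the output format are all assumptions made for this sketch; a real parser would take `yytokentype' from the Bison-generated code.

```c
#include <stdio.h>

/* Custom location type, as in the `%code requires' block above. */
typedef struct YYLTYPE
{
  int first_line;
  int first_column;
  int last_line;
  int last_column;
  char *filename;
} YYLTYPE;

/* Stand-in for the enum Bison generates (assumed token codes). */
enum yytokentype { NUM = 258, VAR = 259 };

/* Write "FILE:LINE.COLUMN" for the start of LOC into BUF. */
static int
format_location (char *buf, size_t size, YYLTYPE loc)
{
  return snprintf (buf, size, "%s:%d.%d",
                   loc.filename ? loc.filename : "<unknown>",
                   loc.first_line, loc.first_column);
}

/* Hypothetical body for the prototype shown in `%code provides'. */
void
trace_token (enum yytokentype token, YYLTYPE loc)
{
  char where[256];
  format_location (where, sizeof where, loc);
  fprintf (stderr, "%s: token %d\n", where, (int) token);
}
```

Both parameter types must be complete before the prototype is seen, which is what placing it in `%code provides' or the unqualified `%code' arranges.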

	3385

	3386

	3387 File: bison.info, Node: Bison Declarations, Next: Grammar Rules, Prev: Prologue Alternatives, Up: Grammar Outline

	3388

	3389 3.1.3 The Bison Declarations Section

	3390 ------------------------------------

	3391

	3392 The BISON DECLARATIONS section contains declarations that define

	3393 terminal and nonterminal symbols, specify precedence, and so on. In

	3394 some simple grammars you may not need any declarations. *Note Bison

	3395 Declarations: Declarations.

	3396

	3397

	3398 File: bison.info, Node: Grammar Rules, Next: Epilogue, Prev: Bison Declarations, Up: Grammar Outline

	3399

	3400 3.1.4 The Grammar Rules Section

	3401 -------------------------------

	3402

	3403 The "grammar rules" section contains one or more Bison grammar rules,

	3404 and nothing else. *Note Syntax of Grammar Rules: Rules.

	3405

	3406 There must always be at least one grammar rule, and the first `%%'

	3407 (which precedes the grammar rules) may never be omitted even if it is

	3408 the first thing in the file.

	3409

	3410

	3411 File: bison.info, Node: Epilogue, Prev: Grammar Rules, Up: Grammar Outline

	3412

	3413 3.1.5 The epilogue

	3414 ------------------

	3415

	3416 The EPILOGUE is copied verbatim to the end of the parser file, just as

	3417 the PROLOGUE is copied to the beginning. This is the most convenient

	3418 place to put anything that you want to have in the parser file but

	3419 which need not come before the definition of `yyparse'. For example,

	3420 the definitions of `yylex' and `yyerror' often go here. Because C

	3421 requires functions to be declared before being used, you often need to

	3422 declare functions like `yylex' and `yyerror' in the Prologue, even if

	3423 you define them in the Epilogue. *Note Parser C-Language Interface:

	3424 Interface.

	3425

	3426 If the last section is empty, you may omit the `%%' that separates it

	3427 from the grammar rules.

	3428

	3429 The Bison parser itself contains many macros and identifiers whose

	3430 names start with `yy' or `YY', so it is a good idea to avoid using any

	3431 such names (except those documented in this manual) in the epilogue of

	3432 the grammar file.

	3433

	3434

	3435 File: bison.info, Node: Symbols, Next: Rules, Prev: Grammar Outline, Up: Grammar File

	3436

	3437 3.2 Symbols, Terminal and Nonterminal

	3438 =====================================

	3439

	3440 "Symbols" in Bison grammars represent the grammatical classifications

	3441 of the language.

	3442

	3443 A "terminal symbol" (also known as a "token type") represents a

	3444 class of syntactically equivalent tokens. You use the symbol in grammar

	3445 rules to mean that a token in that class is allowed. The symbol is

	3446 represented in the Bison parser by a numeric code, and the `yylex'

	3447 function returns a token type code to indicate what kind of token has

	3448 been read. You don't need to know what the code value is; you can use

	3449 the symbol to stand for it.

	3450

	3451 A "nonterminal symbol" stands for a class of syntactically

	3452 equivalent groupings. The symbol name is used in writing grammar rules.

	3453 By convention, it should be all lower case.

	3454

	3455 Symbol names can contain letters, digits (not at the beginning),

	3456 underscores and periods. Periods make sense only in nonterminals.

	3457

	3458 There are three ways of writing terminal symbols in the grammar:

	3459

	3460 * A "named token type" is written with an identifier, like an

	3461 identifier in C. By convention, it should be all upper case. Each

	3462 such name must be defined with a Bison declaration such as

	3463 `%token'. *Note Token Type Names: Token Decl.

	3464

	3465 * A "character token type" (or "literal character token") is written

	3466 in the grammar using the same syntax used in C for character

	3467 constants; for example, `'+'' is a character token type. A

	3468 character token type doesn't need to be declared unless you need to

	3469 specify its semantic value data type (*note Data Types of Semantic

	3470 Values: Value Type.), associativity, or precedence (*note Operator

	3471 Precedence: Precedence.).

	3472

	3473 By convention, a character token type is used only to represent a

	3474 token that consists of that particular character. Thus, the token

	3475 type `'+'' is used to represent the character `+' as a token.

	3476 Nothing enforces this convention, but if you depart from it, your

	3477 program will confuse other readers.

	3478

	3479 All the usual escape sequences used in character literals in C can

	3480 be used in Bison as well, but you must not use the null character

	3481 as a character literal because its numeric code, zero, signifies

	3482 end-of-input (*note Calling Convention for `yylex': Calling

	3483 Convention.). Also, unlike standard C, trigraphs have no special

	3484 meaning in Bison character literals, nor is backslash-newline

	3485 allowed.

	3486

	3487 * A "literal string token" is written like a C string constant; for

	3488 example, `"<="' is a literal string token. A literal string token

	3489 doesn't need to be declared unless you need to specify its semantic

	3490 value data type (*note Value Type::), associativity, or precedence

	3491 (*note Precedence::).

	3492

	3493 You can associate the literal string token with a symbolic name as

	3494 an alias, using the `%token' declaration (*note Token

	3495 Declarations: Token Decl.). If you don't do that, the lexical

	3496 analyzer has to retrieve the token number for the literal string

	3497 token from the `yytname' table (*note Calling Convention::).

	3498

	3499 Warning: literal string tokens do not work in Yacc.

	3500

	3501 By convention, a literal string token is used only to represent a

	3502 token that consists of that particular string. Thus, you should

	3503 use the token type `"<="' to represent the string `<=' as a token.

	3504 Bison does not enforce this convention, but if you depart from

	3505 it, people who read your program will be confused.

	3506

	3507 All the escape sequences used in string literals in C can be used

	3508 in Bison as well, except that you must not use a null character

	3509 within a string literal. Also, unlike Standard C, trigraphs have

	3510 no special meaning in Bison string literals, nor is

	3511 backslash-newline allowed. A literal string token must contain

	3512 two or more characters; for a token containing just one character,

	3513 use a character token (see above).

	3514

	3515 How you choose to write a terminal symbol has no effect on its

	3516 grammatical meaning. That depends only on where it appears in rules and

	3517 on when the parser function returns that symbol.

	3518

	3519 The value returned by `yylex' is always one of the terminal symbols,

	3520 except that a zero or negative value signifies end-of-input. Whichever

	3521 way you write the token type in the grammar rules, you write it the

	3522 same way in the definition of `yylex'. The numeric code for a

	3523 character token type is simply the positive numeric code of the

	3524 character, so `yylex' can use the identical value to generate the

	3525 requisite code, though you may need to convert it to `unsigned char' to

	3526 avoid sign-extension on hosts where `char' is signed. Each named token

	3527 type becomes a C macro in the parser file, so `yylex' can use the name

	3528 to stand for the code. (This is why periods don't make sense in

	3529 terminal symbols.) *Note Calling Convention for `yylex': Calling

	3530 Convention.
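
The conversion to `unsigned char' described above can be sketched in C as follows. This is only an illustration: `next_char_token' and its cursor argument are hypothetical names, not part of Bison, and a real `yylex' would also set `yylval' and recognize multi-character tokens.

```c
/* Hypothetical helper: return the token code for the next character
   of a NUL-terminated input and advance the cursor.  Returns 0 at
   end-of-input, which the parser treats as the end marker.  */
int
next_char_token (const char **cursor)
{
  char c = **cursor;

  if (c == '\0')
    return 0;
  ++*cursor;
  /* Without this conversion, a byte such as 0xE9 would become a
     negative int on hosts where `char' is signed.  */
  return (unsigned char) c;
}
```

A grammar that uses `'+'' as a character token would then receive the same positive code from this helper that Bison assumed when it built its tables, provided both use the same character set.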

	3531

	3532 If `yylex' is defined in a separate file, you need to arrange for the

	3533 token-type macro definitions to be available there. Use the `-d'

	3534 option when you run Bison, so that it will write these macro definitions

	3535 into a separate header file `NAME.tab.h' which you can include in the

	3536 other source files that need it. *Note Invoking Bison: Invocation.

	3537

	3538 If you want to write a grammar that is portable to any Standard C

	3539 host, you must use only nonnull character tokens taken from the basic

	3540 execution character set of Standard C. This set consists of the ten

	3541 digits, the 52 lower- and upper-case English letters, and the

	3542 characters in the following C-language string:

	3543

	3544 "\a\b\t\n\v\f\r !\"#%&'()*+,-./:;<=>?[\\]^_{|}~"

	3545

	3546 The `yylex' function and Bison must use a consistent character set

	3547 and encoding for character tokens. For example, if you run Bison in an

	3548 ASCII environment, but then compile and run the resulting program in an

	3549 environment that uses an incompatible character set like EBCDIC, the

	3550 resulting program may not work because the tables generated by Bison

	3551 will assume ASCII numeric values for character tokens. It is standard

	3552 practice for software distributions to contain C source files that were

	3553 generated by Bison in an ASCII environment, so installers on platforms

	3554 that are incompatible with ASCII must rebuild those files before

	3555 compiling them.

	3556

	3557 The symbol `error' is a terminal symbol reserved for error recovery

	3558 (*note Error Recovery::); you shouldn't use it for any other purpose.

	3559 In particular, `yylex' should never return this value. The default

	3560 value of the error token is 256, unless you explicitly assigned 256 to

	3561 one of your tokens with a `%token' declaration.

	3562

	3563

	3564 File: bison.info, Node: Rules, Next: Recursion, Prev: Symbols, Up: Grammar File

	3565

	3566 3.3 Syntax of Grammar Rules

	3567 ===========================

	3568

	3569 A Bison grammar rule has the following general form:

	3570

	3571 RESULT: COMPONENTS...

	3572 ;

	3573

	3574 where RESULT is the nonterminal symbol that this rule describes, and

	3575 COMPONENTS are various terminal and nonterminal symbols that are put

	3576 together by this rule (*note Symbols::).

	3577

	3578 For example,

	3579

	3580 exp: exp '+' exp

	3581 ;

	3582

	3583 says that two groupings of type `exp', with a `+' token in between, can

	3584 be combined into a larger grouping of type `exp'.

	3585

	3586 White space in rules is significant only to separate symbols. You

	3587 can add extra white space as you wish.

	3588

	3589 Scattered among the components can be ACTIONS that determine the

	3590 semantics of the rule. An action looks like this:

	3591

	3592 {C STATEMENTS}

	3593

	3594 This is an example of "braced code", that is, C code surrounded by

	3595 braces, much like a compound statement in C. Braced code can contain

	3596 any sequence of C tokens, so long as its braces are balanced. Bison

	3597 does not check the braced code for correctness directly; it merely

	3598 copies the code to the output file, where the C compiler can check it.

	3599

	3600 Within braced code, the balanced-brace count is not affected by

	3601 braces within comments, string literals, or character constants, but it

	3602 is affected by the C digraphs `<%' and `%>' that represent braces. At

	3603 the top level braced code must be terminated by `}' and not by a

	3604 digraph. Bison does not look for trigraphs, so if braced code uses

	3605 trigraphs you should ensure that they do not affect the nesting of

	3606 braces or the boundaries of comments, string literals, or character

	3607 constants.

	3608

	3609 Usually there is only one action and it follows the components.

	3610 *Note Actions::.

	3611

	3612 Multiple rules for the same RESULT can be written separately or can

	3613 be joined with the vertical-bar character `|' as follows:

	3614

	3615 RESULT: RULE1-COMPONENTS...

	3616 | RULE2-COMPONENTS...

	3617 ...

	3618 ;

	3619

	3620 They are still considered distinct rules even when joined in this way.

	3621

	3622 If COMPONENTS in a rule is empty, it means that RESULT can match the

	3623 empty string. For example, here is how to define a comma-separated

	3624 sequence of zero or more `exp' groupings:

	3625

	3626 expseq: /* empty */

	3627 | expseq1

	3628 ;

	3629

	3630 expseq1: exp

	3631 | expseq1 ',' exp

	3632 ;

	3633

	3634 It is customary to write a comment `/* empty */' in each rule with no

	3635 components.

	3636

	3637

	3638 File: bison.info, Node: Recursion, Next: Semantics, Prev: Rules, Up: Grammar File

	3639

	3640 3.4 Recursive Rules

	3641 ===================

	3642

	3643 A rule is called "recursive" when its RESULT nonterminal appears also

	3644 on its right hand side. Nearly all Bison grammars need to use

	3645 recursion, because that is the only way to define a sequence of any

	3646 number of a particular thing. Consider this recursive definition of a

	3647 comma-separated sequence of one or more expressions:

	3648

	3649 expseq1: exp

	3650 | expseq1 ',' exp

	3651 ;

	3652

	3653 Since the recursive use of `expseq1' is the leftmost symbol in the

	3654 right hand side, we call this "left recursion". By contrast, here the

	3655 same construct is defined using "right recursion":

	3656

	3657 expseq1: exp

	3658 | exp ',' expseq1

	3659 ;

	3660

	3661 Any kind of sequence can be defined using either left recursion or right

	3662 recursion, but you should always use left recursion, because it can

	3663 parse a sequence of any number of elements with bounded stack space.

	3664 Right recursion uses up space on the Bison stack in proportion to the

	3665 number of elements in the sequence, because all the elements must be

	3666 shifted onto the stack before the rule can be applied even once. *Note

	3667 The Bison Parser Algorithm: Algorithm, for further explanation of this.
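
The difference in stack use can be made concrete with a small C sketch that counts grammar symbols on the stack while a sequence of N `exp' items is parsed. It is a simplified model, not Bison's actual parser: the function names are hypothetical, each `exp' is treated as a single already-reduced symbol, and parser states and the lookahead token are ignored.

```c
/* Peak number of symbols on the stack when N comma-separated `exp'
   items are parsed with the left-recursive rules.  */
int
peak_depth_left (int n)
{
  int depth = 0, peak = 0, i;

  for (i = 0; i < n; i++)
    {
      /* Shift the first exp, or shift ',' and the next exp.  */
      depth += (i == 0) ? 1 : 2;
      if (depth > peak)
        peak = depth;
      /* A reduction applies at once, leaving a single expseq1.  */
      depth = 1;
    }
  return peak;
}

/* The same count for the right-recursive rules: no reduction is
   possible until the last exp has been shifted.  */
int
peak_depth_right (int n)
{
  int depth = 0, peak = 0, i;

  for (i = 0; i < n; i++)
    {
      depth += (i == 0) ? 1 : 2;
      if (depth > peak)
        peak = depth;
    }
  /* From here on reductions only shrink the stack.  */
  return peak;
}
```

In this model the left-recursive form never holds more than three symbols, while the right-recursive form holds 2N - 1 symbols before the first reduction.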

	3668

	3669 "Indirect" or "mutual" recursion occurs when the result of the rule

	3670 does not appear directly on its right hand side, but does appear in

	3671 rules for other nonterminals which do appear on its right hand side.

	3672

	3673 For example:

	3674

	3675 expr: primary

	3676 | primary '+' primary

	3677 ;

	3678

	3679 primary: constant

	3680 | '(' expr ')'

	3681 ;

	3682

	3683 defines two mutually-recursive nonterminals, since each refers to the

	3684 other.

	3685

	3686

	3687 File: bison.info, Node: Semantics, Next: Locations, Prev: Recursion, Up: Grammar File

	3688

	3689 3.5 Defining Language Semantics

	3690 ===============================

	3691

	3692 The grammar rules for a language determine only the syntax. The

	3693 semantics are determined by the semantic values associated with various

	3694 tokens and groupings, and by the actions taken when various groupings

	3695 are recognized.

	3696

	3697 For example, the calculator calculates properly because the value

	3698 associated with each expression is the proper number; it adds properly

	3699 because the action for the grouping `X + Y' is to add the numbers

	3700 associated with X and Y.

	3701

	3702 * Menu:

	3703

	3704 * Value Type:: Specifying one data type for all semantic values.

	3705 * Multiple Types:: Specifying several alternative data types.

	3706 * Actions:: An action is the semantic definition of a grammar rule.

	3707 * Action Types:: Specifying data types for actions to operate on.

	3708 * Mid-Rule Actions:: Most actions go at the end of a rule.

	3709 This says when, why and how to use the exceptional

	3710 action in the middle of a rule.

	3711

	3712

	3713 File: bison.info, Node: Value Type, Next: Multiple Types, Up: Semantics

	3714

	3715 3.5.1 Data Types of Semantic Values

	3716 -----------------------------------

	3717

	3718 In a simple program it may be sufficient to use the same data type for

	3719 the semantic values of all language constructs. This was true in the

	3720 RPN and infix calculator examples (*note Reverse Polish Notation

	3721 Calculator: RPN Calc.).

	3722

	3723 Bison normally uses the type `int' for semantic values if your

	3724 program uses the same data type for all language constructs. To

	3725 specify some other type, define `YYSTYPE' as a macro, like this:

	3726

	3727 #define YYSTYPE double

	3728

	3729 `YYSTYPE''s replacement list should be a type name that does not

	3730 contain parentheses or square brackets. This macro definition must go

	3731 in the prologue of the grammar file (*note Outline of a Bison Grammar:

	3732 Grammar Outline.).

	3733

	3734

	3735 File: bison.info, Node: Multiple Types, Next: Actions, Prev: Value Type, Up: Semantics

	3736

	3737 3.5.2 More Than One Value Type

	3738 ------------------------------

	3739

	3740 In most programs, you will need different data types for different kinds

	3741 of tokens and groupings. For example, a numeric constant may need type

	3742 `int' or `long int', while a string constant needs type `char *', and

	3743 an identifier might need a pointer to an entry in the symbol table.

	3744

	3745 To use more than one data type for semantic values in one parser,

	3746 Bison requires you to do two things:

	3747

	3748 * Specify the entire collection of possible data types, either by

	3749 using the `%union' Bison declaration (*note The Collection of

	3750 Value Types: Union Decl.), or by using a `typedef' or a `#define'

	3751 to define `YYSTYPE' to be a union type whose member names are the

	3752 type tags.

	3753

	3754 * Choose one of those types for each symbol (terminal or

	3755 nonterminal) for which semantic values are used. This is done for

	3756 tokens with the `%token' Bison declaration (*note Token Type

	3757 Names: Token Decl.) and for groupings with the `%type' Bison

	3758 declaration (*note Nonterminal Symbols: Type Decl.).

	3759

	3760

	3761 File: bison.info, Node: Actions, Next: Action Types, Prev: Multiple Types, Up: Semantics

	3762

	3763 3.5.3 Actions

	3764 -------------

	3765

	3766 An action accompanies a syntactic rule and contains C code to be

	3767 executed each time an instance of that rule is recognized. The task of

	3768 most actions is to compute a semantic value for the grouping built by

	3769 the rule from the semantic values associated with tokens or smaller

	3770 groupings.

	3771

	3772 An action consists of braced code containing C statements, and can be

	3773 placed at any position in the rule; it is executed at that position.

	3774 Most rules have just one action at the end of the rule, following all

	3775 the components. Actions in the middle of a rule are tricky and used

	3776 only for special purposes (*note Actions in Mid-Rule: Mid-Rule

	3777 Actions.).

	3778

	3779 The C code in an action can refer to the semantic values of the

	3780 components matched by the rule with the construct `$N', which stands for

	3781 the value of the Nth component. The semantic value for the grouping

	3782 being constructed is `$$'. Bison translates both of these constructs

	3783 into expressions of the appropriate type when it copies the actions

	3784 into the parser file. `$$' is translated to a modifiable lvalue, so it

	3785 can be assigned to.

	3786

	3787 Here is a typical example:

	3788

	3789 exp: ...

	3790 | exp '+' exp

	3791 { $$ = $1 + $3; }

	3792

	3793 This rule constructs an `exp' from two smaller `exp' groupings

	3794 connected by a plus-sign token. In the action, `$1' and `$3' refer to

	3795 the semantic values of the two component `exp' groupings, which are the

	3796 first and third symbols on the right hand side of the rule. The sum is

	3797 stored into `$$' so that it becomes the semantic value of the

	3798 addition-expression just recognized by the rule. If there were a

	3799 useful semantic value associated with the `+' token, it could be

	3800 referred to as `$2'.

	3801

	3802 Note that the vertical-bar character `|' is really a rule separator,

	3803 and actions are attached to a single rule. This differs from

	3804 tools like Flex, for which `|' stands for either "or", or "the same

	3805 action as that of the next rule". In the following example, the action

	3806 is triggered only when `b' is found:

	3807

	3808 a-or-b: 'a'|'b' { a_or_b_found = 1; };

	3809

	3810 If you don't specify an action for a rule, Bison supplies a default:

	3811 `$$ = $1'. Thus, the value of the first symbol in the rule becomes the

	3812 value of the whole rule. Of course, the default action is valid only

	3813 if the two data types match. There is no meaningful default action for

	3814 an empty rule; every empty rule must have an explicit action unless the

	3815 rule's value does not matter.

	3816

	3817 `$N' with N zero or negative is allowed for reference to tokens and

	3818 groupings on the stack _before_ those that match the current rule.

	3819 This is a very risky practice, and to use it reliably you must be

	3820 certain of the context in which the rule is applied. Here is a case in

	3821 which you can use this reliably:

	3822

	3823 foo: expr bar '+' expr { ... }

	3824 \| expr bar '-' expr { ... }

	3825 ;

	3826

	3827 bar: /* empty */

	3828 { previous_expr = $0; }

	3829 ;

	3830

	3831 As long as `bar' is used only in the fashion shown here, `$0' always

	3832 refers to the `expr' which precedes `bar' in the definition of `foo'.

	3833

	3834 It is also possible to access the semantic value of the lookahead

	3835 token, if any, from a semantic action. This semantic value is stored

	3836 in `yylval'. *Note Special Features for Use in Actions: Action

	3837 Features.

	3838

	3839

	3840 File: bison.info, Node: Action Types, Next: Mid-Rule Actions, Prev: Actions, Up: Semantics

	3841

	3842 3.5.4 Data Types of Values in Actions

	3843 -------------------------------------

	3844

	3845 If you have chosen a single data type for semantic values, the `$$' and

	3846 `$N' constructs always have that data type.

	3847

	3848 If you have used `%union' to specify a variety of data types, then

	3849 you must declare a choice among these types for each terminal or

	3850 nonterminal symbol that can have a semantic value. Then each time you

	3851 use `$$' or `$N', its data type is determined by which symbol it refers

	3852 to in the rule. In this example,

	3853

	3854 exp: ...

	3855 | exp '+' exp

	3856 { $$ = $1 + $3; }

	3857

	3858 `$1' and `$3' refer to instances of `exp', so they all have the data

	3859 type declared for the nonterminal symbol `exp'. If `$2' were used, it

	3860 would have the data type declared for the terminal symbol `'+'',

	3861 whatever that might be.

	3862

	3863 Alternatively, you can specify the data type when you refer to the

	3864 value, by inserting `<TYPE>' after the `$' at the beginning of the

	3865 reference. For example, if you have defined types as shown here:

	3866

	3867 %union {

	3868 int itype;

	3869 double dtype;

	3870 }

	3871

	3872 then you can write `$<itype>1' to refer to the first subunit of the

	3873 rule as an integer, or `$<dtype>1' to refer to it as a double.

	3874

	3875

	3876 File: bison.info, Node: Mid-Rule Actions, Prev: Action Types, Up: Semantics

	3877

	3878 3.5.5 Actions in Mid-Rule

	3879 -------------------------

	3880

	3881 Occasionally it is useful to put an action in the middle of a rule.

	3882 These actions are written just like usual end-of-rule actions, but they

	3883 are executed before the parser even recognizes the following components.

	3884

	3885 A mid-rule action may refer to the components preceding it using

	3886 `$N', but it may not refer to subsequent components because it is run

	3887 before they are parsed.

	3888

	3889 The mid-rule action itself counts as one of the components of the

	3890 rule. This makes a difference when there is another action later in

	3891 the same rule (and usually there is another at the end): you have to

	3892 count the actions along with the symbols when working out which number

	3893 N to use in `$N'.

	3894

	3895 The mid-rule action can also have a semantic value. The action can

	3896 set its value with an assignment to `$$', and actions later in the rule

	3897 can refer to the value using `$N'. Since there is no symbol to name

	3898 the action, there is no way to declare a data type for the value in

	3899 advance, so you must use the `$<...>N' construct to specify a data type

	3900 each time you refer to this value.

	3901

	3902 There is no way to set the value of the entire rule with a mid-rule

	3903 action, because assignments to `$$' do not have that effect. The only

	3904 way to set the value for the entire rule is with an ordinary action at

	3905 the end of the rule.

	3906

	3907 Here is an example from a hypothetical compiler, handling a `let'

	3908 statement that looks like `let (VARIABLE) STATEMENT' and serves to

	3909 create a variable named VARIABLE temporarily for the duration of

	3910 STATEMENT. To parse this construct, we must put VARIABLE into the

	3911 symbol table while STATEMENT is parsed, then remove it afterward. Here

	3912 is how it is done:

	3913

	3914 stmt: LET '(' var ')'

	3915 { $<context>$ = push_context ();

	3916 declare_variable ($3); }

	3917 stmt { $$ = $6;

	3918 pop_context ($<context>5); }

	3919

	3920 As soon as `let (VARIABLE)' has been recognized, the first action is

	3921 run. It saves a copy of the current semantic context (the list of

	3922 accessible variables) as its semantic value, using alternative

	3923 `context' in the data-type union. Then it calls `declare_variable' to

	3924 add the new variable to that list. Once the first action is finished,

	3925 the embedded statement `stmt' can be parsed. Note that the mid-rule

	3926 action is component number 5, so the `stmt' is component number 6.

	3927

	3928 After the embedded statement is parsed, its semantic value becomes

	3929 the value of the entire `let'-statement. Then the semantic value from

	3930 the earlier action is used to restore the prior list of variables. This

	3931 removes the temporary `let'-variable from the list so that it won't

	3932 appear to exist while the rest of the program is parsed.
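
One way the helpers used in these actions might behave is sketched below in C. The manual does not define them, so everything here is an assumption made for illustration: the context is taken to be the count of visible names, `declare_variable' takes a string rather than a semantic value, and `is_declared' and `MAX_VARS' are hypothetical additions.

```c
#include <string.h>

#define MAX_VARS 64

static const char *vars[MAX_VARS];  /* names currently in scope */
static int var_count;

/* Save the current context: here, the number of visible names.  */
int
push_context (void)
{
  return var_count;
}

/* Add NAME to the list of accessible variables.
   Returns 0 if the table is full, 1 otherwise.  */
int
declare_variable (const char *name)
{
  if (var_count == MAX_VARS)
    return 0;
  vars[var_count++] = name;
  return 1;
}

/* Restore a context saved by push_context, dropping later names.  */
void
pop_context (int saved)
{
  var_count = saved;
}

/* Return nonzero if NAME is currently visible.  */
int
is_declared (const char *name)
{
  int i;

  for (i = 0; i < var_count; i++)
    if (strcmp (vars[i], name) == 0)
      return 1;
  return 0;
}
```

With these definitions the value stored in `$<context>$' by the mid-rule action is exactly what the final action needs in order to discard the temporary variable.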

	3933

	3934 In the above example, if the parser initiates error recovery (*note

	3935 Error Recovery::) while parsing the tokens in the embedded statement

	3936 `stmt', it might discard the previous semantic context `$<context>5'

	3937 without restoring it. Thus, `$<context>5' needs a destructor (*note

	3938 Freeing Discarded Symbols: Destructor Decl.). However, Bison currently

	3939 provides no means to declare a destructor specific to a particular

	3940 mid-rule action's semantic value.

	3941

	3942 One solution is to bury the mid-rule action inside a nonterminal

	3943 symbol and to declare a destructor for that symbol:

	3944

	3945 %type <context> let

	3946 %destructor { pop_context ($$); } let

	3947

	3948 %%

	3949

	3950 stmt: let stmt

	3951 { $$ = $2;

	3952 pop_context ($1); }

	3953 ;

	3954

	3955 let: LET '(' var ')'

	3956 { $$ = push_context ();

	3957 declare_variable ($3); }

	3958 ;

	3959

	3960 Note that the action is now at the end of its rule. Any mid-rule

	3961 action can be converted to an end-of-rule action in this way, and this

	3962 is what Bison actually does to implement mid-rule actions.

	3963

	3964 Taking action before a rule is completely recognized often leads to

	3965 conflicts since the parser must commit to a parse in order to execute

	3966 the action. For example, the following two rules, without mid-rule

	3967 actions, can coexist in a working parser because the parser can shift

	3968 the open-brace token and look at what follows before deciding whether

	3969 there is a declaration or not:

	3970

	3971 compound: '{' declarations statements '}'

	3972 | '{' statements '}'

	3973 ;

	3974

	3975 But when we add a mid-rule action as follows, the rules become

	3976 nonfunctional:

	3977

	3978 compound: { prepare_for_local_variables (); }

	3979 '{' declarations statements '}'

	3980 | '{' statements '}'

	3981 ;

	3982

	3983 Now the parser is forced to decide whether to run the mid-rule action

	3984 when it has read no farther than the open-brace. In other words, it

	3985 must commit to using one rule or the other, without sufficient

	3986 information to do it correctly. (The open-brace token is what is called

	3987 the "lookahead" token at this time, since the parser is still deciding

	3988 what to do about it. *Note Lookahead Tokens: Lookahead.)

	3989

	3990 You might think that you could correct the problem by putting

	3991 identical actions into the two rules, like this:

	3992

	3993 compound: { prepare_for_local_variables (); }

	3994 '{' declarations statements '}'

	3995 | { prepare_for_local_variables (); }

	3996 '{' statements '}'

	3997 ;

	3998

	3999 But this does not help, because Bison does not realize that the two

	4000 actions are identical. (Bison never tries to understand the C code in

	4001 an action.)

	4002

	4003 If the grammar is such that a declaration can be distinguished from a

	4004 statement by the first token (which is true in C), then one solution

	4005 which does work is to put the action after the open-brace, like this:

	4006

	4007 compound: '{' { prepare_for_local_variables (); }

	4008 declarations statements '}'

	4009 | '{' statements '}'

	4010 ;

	4011

	4012 Now the first token of the following declaration or statement, which

	4013 would in any case tell Bison which rule to use, can still do so.

	4014

	4015 Another solution is to bury the action inside a nonterminal symbol

	4016 which serves as a subroutine:

	4017

	4018 subroutine: /* empty */

	4019 { prepare_for_local_variables (); }

	4020 ;

	4021

	4022 compound: subroutine

	4023 '{' declarations statements '}'

	4024 | subroutine

	4025 '{' statements '}'

	4026 ;

	4027

	4028 Now Bison can execute the action in the rule for `subroutine' without

	4029 deciding which rule for `compound' it will eventually use.

	4030

	4031

	4032 File: bison.info, Node: Locations, Next: Declarations, Prev: Semantics, Up: Grammar File

	4033

	4034 3.6 Tracking Locations

	4035 ======================

	4036

	4037 Though grammar rules and semantic actions are enough to write a fully

	4038 functional parser, it can be useful to process some additional

	4039 information, especially symbol locations.

	4040

	4041 The way locations are handled is defined by providing a data type,

	4042 and actions to take when rules are matched.

	4043

	4044 * Menu:

	4045

	4046 * Location Type:: Specifying a data type for locations.

	4047 * Actions and Locations:: Using locations in actions.

	4048 * Location Default Action:: Defining a general way to compute locations.

	4049

	4050

	4051 File: bison.info, Node: Location Type, Next: Actions and Locations, Up: Locations

	4052

	4053 3.6.1 Data Type of Locations

	4054 ----------------------------

	4055

	4056 Defining a data type for locations is much simpler than for semantic

	4057 values, since all tokens and groupings always use the same type.

	4058

	4059 You can specify the type of locations by defining a macro called

	4060 `YYLTYPE', just as you can specify the semantic value type by defining

	4061 a `YYSTYPE' macro (*note Value Type::). When `YYLTYPE' is not defined,

	4062 Bison uses a default structure type with four members:

	4063

	4064 typedef struct YYLTYPE

	4065 {

	4066 int first_line;

	4067 int first_column;

	4068 int last_line;

	4069 int last_column;

	4070 } YYLTYPE;

	4071

	4072 At the beginning of the parsing, Bison initializes all these fields

	4073 to 1 for `yylloc'.

	4074

	4075

	4076 File: bison.info, Node: Actions and Locations, Next: Location Default Action, Prev: Location Type, Up: Locations

	4077

	4078 3.6.2 Actions and Locations

	4079 ---------------------------

	4080

	4081 Actions are not only useful for defining language semantics, but also

	4082 for describing the behavior of the output parser with locations.

	4083

	4084 The most obvious way to build locations of syntactic groupings

	4085 is very similar to the way semantic values are computed. In a given

	4086 rule, several constructs can be used to access the locations of the

	4087 elements being matched. The location of the Nth component of the right

	4088 hand side is `@N', while the location of the left hand side grouping is

	4089 `@$'.

	4090

	4091 Here is a basic example using the default data type for locations:

	4092

	4093 exp: ...

	4094 | exp '/' exp

	4095 {

	4096 @$.first_column = @1.first_column;

	4097 @$.first_line = @1.first_line;

	4098 @$.last_column = @3.last_column;

	4099 @$.last_line = @3.last_line;

	4100 if ($3)

	4101 $$ = $1 / $3;

	4102 else

	4103 {

	4104 $$ = 1;

	4105 fprintf (stderr,

	4106 "Division by zero, l%d,c%d-l%d,c%d",

	4107 @3.first_line, @3.first_column,

	4108 @3.last_line, @3.last_column);

	4109 }

	4110 }

	4111

	4112 As for semantic values, there is a default action for locations that

	4113 is run each time a rule is matched. It sets the beginning of `@$' to

	4114 the beginning of the first symbol, and the end of `@$' to the end of the

	4115 last symbol.

	4116

	4117 With this default action, the location tracking can be fully

	4118 automatic. The example above can simply be rewritten this way:

	4119

	4120 exp: ...

	4121 | exp '/' exp

	4122 {

	4123 if ($3)

	4124 $$ = $1 / $3;

	4125 else

	4126 {

	4127 $$ = 1;

	4128 fprintf (stderr,

	4129 "Division by zero, l%d,c%d-l%d,c%d",

	4130 @3.first_line, @3.first_column,

	4131 @3.last_line, @3.last_column);

	4132 }

	4133 }

	4134

	4135 It is also possible to access the location of the lookahead token,

	4136 if any, from a semantic action. This location is stored in `yylloc'.

	4137 *Note Special Features for Use in Actions: Action Features.

	4138

	4139

	4140 File: bison.info, Node: Location Default Action, Prev: Actions and Locations, Up: Locations

	4141

	4142 3.6.3 Default Action for Locations

	4143 ----------------------------------

	4144

	4145 Actually, actions are not the best place to compute locations. Since

	4146 locations are much more general than semantic values, there is room in

	4147 the output parser to redefine the default action to take for each rule.

	4148 The `YYLLOC_DEFAULT' macro is invoked each time a rule is matched,

	4149 before the associated action is run. It is also invoked while

	4150 processing a syntax error, to compute the error's location. Before

	4151 reporting an unresolvable syntactic ambiguity, a GLR parser invokes

	4152 `YYLLOC_DEFAULT' recursively to compute the location of that ambiguity.

	4153

	4154 Most of the time, this macro is general enough that semantic

	4155 actions need no location-specific code.

	4156

	4157 The `YYLLOC_DEFAULT' macro takes three parameters. The first one is

	4158 the location of the grouping (the result of the computation). When a

	4159 rule is matched, the second parameter identifies locations of all right

	4160 hand side elements of the rule being matched, and the third parameter

	4161 is the size of the rule's right hand side. When a GLR parser reports

	4162 an ambiguity, which of multiple candidate right hand sides it passes to

	4163 `YYLLOC_DEFAULT' is undefined. When processing a syntax error, the

	4164 second parameter identifies locations of the symbols that were

	4165 discarded during error processing, and the third parameter is the

	4166 number of discarded symbols.

	4167

	4168 By default, `YYLLOC_DEFAULT' is defined this way:

	4169

	4170 # define YYLLOC_DEFAULT(Current, Rhs, N) \

	4171 do \

	4172 if (N) \

	4173 { \

	4174 (Current).first_line = YYRHSLOC(Rhs, 1).first_line; \

	4175 (Current).first_column = YYRHSLOC(Rhs, 1).first_column; \

	4176 (Current).last_line = YYRHSLOC(Rhs, N).last_line; \

	4177 (Current).last_column = YYRHSLOC(Rhs, N).last_column; \

	4178 } \

	4179 else \

	4180 { \

	4181 (Current).first_line = (Current).last_line = \

	4182 YYRHSLOC(Rhs, 0).last_line; \

	4183 (Current).first_column = (Current).last_column = \

	4184 YYRHSLOC(Rhs, 0).last_column; \

	4185 } \

	4186 while (0)

	4187

	4188 where `YYRHSLOC (rhs, k)' is the location of the Kth symbol in RHS

	4189 when K is positive, and the location of the symbol just before the

	4190 reduction when K and N are both zero.

	4191

	4192 When defining `YYLLOC_DEFAULT', you should consider that:

	4193

	4194 * All arguments are free of side-effects. However, only the first

	4195 one (the result) should be modified by `YYLLOC_DEFAULT'.

	4196

	4197 * For consistency with semantic actions, valid indexes within the

	4198 right hand side range from 1 to N. When N is zero, only 0 is a

	4199 valid index, and it refers to the symbol just before the reduction.

	4200 During error processing N is always positive.

	4201

	4202 * Your macro should parenthesize its arguments, if need be, since the

	4203 actual arguments may not be surrounded by parentheses. Also, your

	4204 macro should expand to something that can be used as a single

	4205 statement when it is followed by a semicolon.
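
The default definition can be exercised outside a parser with the C sketch below. It copies the macro shown above; the definition of `YYRHSLOC' as plain array indexing and the wrapper `span_of' are assumptions made for this illustration, with index 0 of the array standing for the symbol just before the reduction.

```c
typedef struct YYLTYPE
{
  int first_line;
  int first_column;
  int last_line;
  int last_column;
} YYLTYPE;

/* Assumption: right hand side locations are a plain array whose
   element 0 is the symbol just before the reduction.  */
#define YYRHSLOC(Rhs, K) ((Rhs)[K])

#define YYLLOC_DEFAULT(Current, Rhs, N)                              \
  do                                                                 \
    if (N)                                                           \
      {                                                              \
        (Current).first_line   = YYRHSLOC (Rhs, 1).first_line;       \
        (Current).first_column = YYRHSLOC (Rhs, 1).first_column;     \
        (Current).last_line    = YYRHSLOC (Rhs, N).last_line;        \
        (Current).last_column  = YYRHSLOC (Rhs, N).last_column;      \
      }                                                              \
    else                                                             \
      {                                                              \
        (Current).first_line = (Current).last_line =                 \
          YYRHSLOC (Rhs, 0).last_line;                               \
        (Current).first_column = (Current).last_column =             \
          YYRHSLOC (Rhs, 0).last_column;                             \
      }                                                              \
  while (0)

/* Hypothetical wrapper: compute the location of a grouping whose
   right hand side has N symbols stored in RHS[1..N].  */
YYLTYPE
span_of (const YYLTYPE *rhs, int n)
{
  YYLTYPE result;

  YYLLOC_DEFAULT (result, rhs, n);
  return result;
}
```

For a nonempty right hand side the result runs from the start of the first symbol to the end of the last; for an empty one it is an empty range placed at the end of the preceding symbol.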

	4206

	4207

	4208 File: bison.info, Node: Declarations, Next: Multiple Parsers, Prev: Locations, Up: Grammar File

	4209

	4210 3.7 Bison Declarations

	4211 ======================

	4212

	4213 The "Bison declarations" section of a Bison grammar defines the symbols

	4214 used in formulating the grammar and the data types of semantic values.

	4215 *Note Symbols::.

	4216

	4217 All token type names (but not single-character literal tokens such as

	4218 `'+'' and `'*'') must be declared. Nonterminal symbols must be

	4219 declared if you need to specify which data type to use for the semantic

	4220 value (*note More Than One Value Type: Multiple Types.).

	4221

	4222 The first rule in the file also specifies the start symbol, by

	4223 default. If you want some other symbol to be the start symbol, you

	4224 must declare it explicitly (*note Languages and Context-Free Grammars:

	4225 Language and Grammar.).

	4226

	4227 * Menu:

	4228

	4229 * Require Decl:: Requiring a Bison version.

	4230 * Token Decl:: Declaring terminal symbols.

	4231 * Precedence Decl:: Declaring terminals with precedence and associativity.

	4232 * Union Decl:: Declaring the set of all semantic value types.

	4233 * Type Decl:: Declaring the choice of type for a nonterminal symbol.

	4234 * Initial Action Decl:: Code run before parsing starts.

	4235 * Destructor Decl:: Declaring how symbols are freed.

	4236 * Expect Decl:: Suppressing warnings about parsing conflicts.

	4237 * Start Decl:: Specifying the start symbol.

	4238 * Pure Decl:: Requesting a reentrant parser.

	4239 * Push Decl:: Requesting a push parser.

	4240 * Decl Summary:: Table of all Bison declarations.

	4241

	4242

	4243 File: bison.info, Node: Require Decl, Next: Token Decl, Up: Declarations

	4244

	4245 3.7.1 Require a Version of Bison

	4246 --------------------------------

	4247

	4248 You may require a minimum version of Bison to process the grammar. If

	4249 the requirement is not met, `bison' exits with an error (exit status

	4250 63).

	4251

	4252 %require "VERSION"

	4253

	4254

	4255 File: bison.info, Node: Token Decl, Next: Precedence Decl, Prev: Require Decl, Up: Declarations

	4256

	4257 3.7.2 Token Type Names

	4258 ----------------------

	4259

	4260 The basic way to declare a token type name (terminal symbol) is as

	4261 follows:

	4262

	4263 %token NAME

	4264

	4265 Bison will convert this into a `#define' directive in the parser, so

	4266 that the function `yylex' (if it is in this file) can use the name NAME

	4267 to stand for this token type's code.

	4268

	4269 Alternatively, you can use `%left', `%right', or `%nonassoc' instead

	4270 of `%token', if you wish to specify associativity and precedence.

	4271 *Note Operator Precedence: Precedence Decl.

	4272

	4273 You can explicitly specify the numeric code for a token type by

	4274 appending a nonnegative decimal or hexadecimal integer value in the

	4275 field immediately following the token name:

	4276

	4277 %token NUM 300

	4278 %token XNUM 0x12d // a GNU extension

	4279

	4280 It is generally best, however, to let Bison choose the numeric codes for

	4281 all token types. Bison will automatically select codes that don't

	4282 conflict with each other or with normal characters.
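
As a rough illustration of how a hand-written `yylex' might use such codes, here is a minimal C sketch. The two `#define' lines stand in for what Bison would emit for the `%token NUM 300' and `%token XNUM 0x12d' declarations above. The function name `toy_lex', its string-cursor argument, and the choice to treat XNUM as "a hexadecimal literal" are assumptions made for this example only; they are not part of Bison's interface.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Stand-ins for the definitions Bison would generate from
   `%token NUM 300' and `%token XNUM 0x12d'.  */
#define NUM  300
#define XNUM 0x12d   /* 301 in decimal */

/* Hypothetical lexer: reads one token from *CURSOR and advances it.
   Returns 0 at end of input, NUM for a decimal integer, XNUM for a
   hexadecimal integer with a `0x' prefix, and the character code
   itself for any other character (a single-character literal token).  */
int
toy_lex (const char **cursor)
{
  const char *p;

  assert (cursor != NULL && *cursor != NULL);
  p = *cursor;

  while (*p == ' ' || *p == '\t')
    p++;

  if (*p == '\0')
    {
      *cursor = p;
      return 0;               /* Token 0 means end of input.  */
    }

  if (p[0] == '0' && (p[1] == 'x' || p[1] == 'X')
      && isxdigit ((unsigned char) p[2]))
    {
      p += 2;
      while (isxdigit ((unsigned char) *p))
        p++;
      *cursor = p;
      return XNUM;
    }

  if (isdigit ((unsigned char) *p))
    {
      while (isdigit ((unsigned char) *p))
        p++;
      *cursor = p;
      return NUM;
    }

  *cursor = p + 1;
  return (unsigned char) *p;  /* For example, '+' is its own code.  */
}
```

Because the named codes start above the range of ordinary characters, the same function can return both kinds of token without ambiguity.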

	4283

	4284 In the event that the stack type is a union, you must augment the

	4285 `%token' or other token declaration to include the data type

	4286 alternative delimited by angle-brackets (*note More Than One Value

	4287 Type: Multiple Types.).

	4288

	4289 For example:

	4290

	4291 %union { /* define stack type */

	4292 double val;

	4293 symrec *tptr;

	4294 }

	4295 %token <val> NUM /* define token NUM and its type */

	4296

	4297 You can associate a literal string token with a token type name by

	4298 writing the literal string at the end of a `%token' declaration which

	4299 declares the name. For example:

	4300

	4301 %token arrow "=>"

	4302

	4303 For example, a grammar for the C language might specify these names with

	4304 equivalent literal string tokens:

	4305

	4306 %token <operator> OR "||"

	4307 %token <operator> LE 134 "<="

	4308 %left OR "<="

	4309

	4310 Once you equate the literal string and the token name, you can use them

	4311 interchangeably in further declarations or the grammar rules. The

	4312 `yylex' function can use the token name or the literal string to obtain

	4313 the token type code number (*note Calling Convention::). Syntax error

	4314 messages passed to `yyerror' from the parser will reference the literal

	4315 string instead of the token name.

	4316

	4317 The token numbered as 0 corresponds to end of file; the following

	4318 line allows for nicer error messages referring to "end of file" instead

	4319 of "$end":

	4320

	4321 %token END 0 "end of file"

	4322

	4323

	4324 File: bison.info, Node: Precedence Decl, Next: Union Decl, Prev: Token Decl, Up: Declarations

	4325

	4326 3.7.3 Operator Precedence

	4327 -------------------------

	4328

	4329 Use the `%left', `%right' or `%nonassoc' declaration to declare a token

	4330 and specify its precedence and associativity, all at once. These are

	4331 called "precedence declarations". *Note Operator Precedence:

	4332 Precedence, for general information on operator precedence.

	4333

	4334 The syntax of a precedence declaration is nearly the same as that of

	4335 `%token': either

	4336

	4337 %left SYMBOLS...

	4338

	4339 or

	4340

	4341 %left <TYPE> SYMBOLS...

	4342

	4343 And indeed any of these declarations serves the purposes of `%token'.

	4344 But in addition, they specify the associativity and relative precedence

	4345 for all the SYMBOLS:

	4346

	4347 * The associativity of an operator OP determines how repeated uses

	4348 of the operator nest: whether `X OP Y OP Z' is parsed by grouping

	4349 X with Y first or by grouping Y with Z first. `%left' specifies

	4350 left-associativity (grouping X with Y first) and `%right'

	4351 specifies right-associativity (grouping Y with Z first).

	4352 `%nonassoc' specifies no associativity, which means that `X OP Y

	4353 OP Z' is considered a syntax error.

	4354

	4355 * The precedence of an operator determines how it nests with other

	4356 operators. All the tokens declared in a single precedence

	4357 declaration have equal precedence and nest together according to

	4358 their associativity. When two tokens declared in different

	4359 precedence declarations associate, the one declared later has the

	4360 higher precedence and is grouped first.
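
To make the two grouping rules concrete, the following C sketch is a small hand-written precedence-climbing evaluator. It is not code that Bison generates, and Bison resolves precedence in its parser tables rather than by recursion; the sketch only mimics the resulting grouping. The operator table, the single-digit operands, and all `toy_' names are assumptions made for the example. The table plays the role of three successive declarations, `%nonassoc '<'', `%left '-'' and `%right '^'', with later entries binding more tightly.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

enum toy_assoc { TOY_LEFT, TOY_RIGHT, TOY_NONASSOC };

struct toy_op { char op; int prec; enum toy_assoc assoc; };

/* Later entries have higher precedence, as with later declarations.  */
static const struct toy_op toy_ops[] = {
  { '<', 1, TOY_NONASSOC },
  { '-', 2, TOY_LEFT },
  { '^', 3, TOY_RIGHT },
};

static const struct toy_op *
toy_find_op (char c)
{
  size_t i;
  for (i = 0; i < sizeof toy_ops / sizeof toy_ops[0]; i++)
    if (toy_ops[i].op == c)
      return &toy_ops[i];
  return NULL;
}

static long
toy_apply (char op, long a, long b)
{
  long r = 1;
  switch (op)
    {
    case '<': return a < b;
    case '-': return a - b;
    default:                  /* '^': integer power */
      while (b-- > 0)
        r *= a;
      return r;
    }
}

/* Parse operands and operators whose precedence is at least MIN_PREC.  */
static long
toy_parse (const char **p, int min_prec, int *error)
{
  long lhs;

  if (!isdigit ((unsigned char) **p))
    {
      *error = 1;
      return 0;
    }
  lhs = *(*p)++ - '0';

  for (;;)
    {
      const struct toy_op *o = toy_find_op (**p);
      long rhs;

      if (o == NULL || o->prec < min_prec)
        return lhs;
      (*p)++;
      /* Right-associative: let the right operand absorb operators of
         the same precedence.  Otherwise require a strictly higher one.  */
      rhs = toy_parse (p, o->assoc == TOY_RIGHT ? o->prec : o->prec + 1,
                       error);
      if (*error)
        return 0;
      lhs = toy_apply (o->op, lhs, rhs);

      if (o->assoc == TOY_NONASSOC)
        {
          /* `X OP Y OP Z' is a syntax error for a nonassociative OP.  */
          const struct toy_op *next = toy_find_op (**p);
          if (next != NULL && next->prec == o->prec)
            {
              *error = 1;
              return 0;
            }
        }
    }
}

/* Returns 0 and stores the value in *RESULT, or -1 on a syntax error.  */
int
toy_eval (const char *s, long *result)
{
  int error = 0;
  long v;

  assert (s != NULL && result != NULL);
  v = toy_parse (&s, 0, &error);
  if (error || *s != '\0')
    return -1;
  *result = v;
  return 0;
}
```

With this table, `8-3-2' groups as `(8-3)-2', `2^3^2' groups as `2^(3^2)', and `1<2<3' is rejected.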

	4361

	4362 For backward compatibility, there is a confusing difference between

	4363 the argument lists of `%token' and precedence declarations. Only a

	4364 `%token' can associate a literal string with a token type name. A

	4365 precedence declaration always interprets a literal string as a

	4366 reference to a separate token. For example:

	4367

	4368 %left OR "<=" // Does not declare an alias.

	4369 %left OR 134 "<=" 135 // Declares 134 for OR and 135 for "<=".

	4370

	4371

	4372 File: bison.info, Node: Union Decl, Next: Type Decl, Prev: Precedence Decl, Up: Declarations

	4373

	4374 3.7.4 The Collection of Value Types

	4375 -----------------------------------

	4376

	4377 The `%union' declaration specifies the entire collection of possible

	4378 data types for semantic values. The keyword `%union' is followed by

	4379 braced code containing the same thing that goes inside a `union' in C.

	4380

	4381 For example:

	4382

	4383 %union {

	4384 double val;

	4385 symrec *tptr;

	4386 }

	4387

	4388 This says that the two alternative types are `double' and `symrec *'.

	4389 They are given names `val' and `tptr'; these names are used in the

	4390 `%token' and `%type' declarations to pick one of the types for a

	4391 terminal or nonterminal symbol (*note Nonterminal Symbols: Type Decl.).

	4392

	4393 As an extension to POSIX, a tag is allowed after the `union'. For

	4394 example:

	4395

	4396 %union value {

	4397 double val;

	4398 symrec *tptr;

	4399 }

	4400

	4401 specifies the union tag `value', so the corresponding C type is `union

	4402 value'. If you do not specify a tag, it defaults to `YYSTYPE'.

	4403

	4404 As another extension to POSIX, you may specify multiple `%union'

	4405 declarations; their contents are concatenated. However, only the first

	4406 `%union' declaration can specify a tag.

	4407

	4408 Note that, unlike making a `union' declaration in C, you need not

	4409 write a semicolon after the closing brace.

	4410

	4411 Instead of `%union', you can define and use your own union type

	4412 `YYSTYPE' if your grammar contains at least one `<TYPE>' tag. For

	4413 example, you can put the following into a header file `parser.h':

	4414

	4415 union YYSTYPE {

	4416 double val;

	4417 symrec *tptr;

	4418 };

	4419 typedef union YYSTYPE YYSTYPE;

	4420

	4421 and then your grammar can use the following instead of `%union':

	4422

	4423 %{

	4424 #include "parser.h"

	4425 %}

	4426 %type <val> expr

	4427 %token <tptr> ID
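
The following C sketch shows, under some assumptions, what the `<val>' and `<tptr>' tags amount to once the union is written by hand: each tag simply names the union member through which a semantic value is read or written. The `symrec' layout and the function `value_of_symbol' are hypothetical and exist only for this example; the real `symrec' type would come from your own program.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal symbol record, defined here only so that the
   example is self-contained.  */
typedef struct symrec
{
  const char *name;
  double value;
} symrec;

/* The same union as in the `parser.h' example.  */
union YYSTYPE
{
  double val;
  symrec *tptr;
};
typedef union YYSTYPE YYSTYPE;

/* Roughly what an action such as `$$ = $1->value;' does when $1 is
   declared with <tptr> and $$ with <val>: the tag selects the member.  */
YYSTYPE
value_of_symbol (YYSTYPE id)
{
  YYSTYPE result;

  assert (id.tptr != NULL);
  result.val = id.tptr->value;
  return result;
}
```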

	4428

	4429

	4430 File: bison.info, Node: Type Decl, Next: Initial Action Decl, Prev: Union Decl, Up: Declarations

	4431

	4432 3.7.5 Nonterminal Symbols

	4433 -------------------------

	4434

	4435 When you use `%union' to specify multiple value types, you must declare

	4436 the value type of each nonterminal symbol for which values are used.

	4437 This is done with a `%type' declaration, like this:

	4438

	4439 %type <TYPE> NONTERMINAL...

	4440

	4441 Here NONTERMINAL is the name of a nonterminal symbol, and TYPE is the

	4442 name given in the `%union' to the alternative that you want (*note The

	4443 Collection of Value Types: Union Decl.). You can give any number of

	4444 nonterminal symbols in the same `%type' declaration, if they have the

	4445 same value type. Use spaces to separate the symbol names.

	4446

	4447 You can also declare the value type of a terminal symbol. To do

	4448 this, use the same `<TYPE>' construction in a declaration for the

	4449 terminal symbol. All kinds of token declarations allow `<TYPE>'.

	4450

	4451

	4452 File: bison.info, Node: Initial Action Decl, Next: Destructor Decl, Prev: Type Decl, Up: Declarations

	4453

	4454 3.7.6 Performing Actions before Parsing

	4455 ---------------------------------------

	4456

	4457 Sometimes your parser needs to perform some initializations before

	4458 parsing. The `%initial-action' directive allows for such arbitrary

	4459 code.

	4460

	4461 -- Directive: %initial-action { CODE }

	4462 Declare that the braced CODE must be invoked before parsing each

	4463 time `yyparse' is called. The CODE may use `$$' and `@$' --

	4464 initial value and location of the lookahead -- and the

	4465 `%parse-param'.

	4466

	4467 For instance, if your locations use a file name, you may use

	4468

	4469 %parse-param { char const *file_name };

	4470 %initial-action

	4471 {

	4472 @$.initialize (file_name);

	4473 };

	4474

	4475

	4476 File: bison.info, Node: Destructor Decl, Next: Expect Decl, Prev: Initial Action Decl, Up: Declarations

	4477

	4478 3.7.7 Freeing Discarded Symbols

	4479 -------------------------------

	4480

	4481 During error recovery (*note Error Recovery::), symbols already pushed

	4482 on the stack and tokens coming from the rest of the file are discarded

	4483 until the parser falls on its feet. If the parser runs out of memory,

	4484 or if it returns via `YYABORT' or `YYACCEPT', all the symbols on the

	4485 stack must be discarded. Even if the parser succeeds, it must discard

	4486 the start symbol.

	4487

	4488 When discarded symbols convey heap based information, this memory is

	4489 lost. While this behavior can be tolerable for batch parsers, such as

	4490 in traditional compilers, it is unacceptable for programs like shells or

	4491 protocol implementations that may parse and execute indefinitely.

	4492

	4493 The `%destructor' directive defines code that is called when a

	4494 symbol is automatically discarded.

	4495

	4496 -- Directive: %destructor { CODE } SYMBOLS

	4497 Invoke the braced CODE whenever the parser discards one of the

	4498 SYMBOLS. Within CODE, `$$' designates the semantic value

	4499 associated with the discarded symbol, and `@$' designates its

	4500 location. The additional parser parameters are also available

	4501 (*note The Parser Function `yyparse': Parser Function.).

	4502

	4503 When a symbol is listed among SYMBOLS, its `%destructor' is called

	4504 a per-symbol `%destructor'. You may also define a per-type

	4505 `%destructor' by listing a semantic type tag among SYMBOLS. In

	4506 that case, the parser will invoke this CODE whenever it discards

	4507 any grammar symbol that has that semantic type tag unless that

	4508 symbol has its own per-symbol `%destructor'.

	4509

	4510 Finally, you can define two different kinds of default

	4511 `%destructor's. (These default forms are experimental. More user

	4512 feedback will help to determine whether they should become

	4513 permanent features.) You can place each of `<*>' and `<>' in the

	4514 SYMBOLS list of exactly one `%destructor' declaration in your

	4515 grammar file. The parser will invoke the CODE associated with one

	4516 of these whenever it discards any user-defined grammar symbol that

	4517 has no per-symbol and no per-type `%destructor'. The parser uses

	4518 the CODE for `<*>' in the case of such a grammar symbol for which

	4519 you have formally declared a semantic type tag (`%type' counts as

	4520 such a declaration, but `$<tag>$' does not). The parser uses the

	4521 CODE for `<>' in the case of such a grammar symbol that has no

	4522 declared semantic type tag.

	4523

	4524 For example:

	4525

	4526 %union { char *string; }

	4527 %token <string> STRING1

	4528 %token <string> STRING2

	4529 %type <string> string1

	4530 %type <string> string2

	4531 %union { char character; }

	4532 %token <character> CHR

	4533 %type <character> chr

	4534 %token TAGLESS

	4535

	4536 %destructor { } <character>

	4537 %destructor { free ($$); } <*>

	4538 %destructor { free ($$); printf ("%d", @$.first_line); } STRING1 string1

	4539 %destructor { printf ("Discarding tagless symbol.\n"); } <>

	4540

	4541 guarantees that, when the parser discards any user-defined symbol that

	4542 has a semantic type tag other than `<character>', it passes its

	4543 semantic value to `free' by default. However, when the parser discards

	4544 a `STRING1' or a `string1', it also prints its line number to `stdout'.

	4545 It performs only the second `%destructor' in this case, so it invokes

	4546 `free' only once. Finally, the parser merely prints a message whenever

	4547 it discards any symbol, such as `TAGLESS', that has no semantic type

	4548 tag.
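
The selection order described above can be modelled in a few lines of C. This sketch only illustrates the lookup order for the example grammar (per-symbol first, then per-type, then `<*>' or `<>'). The names `toy_symbol', `toy_dtor' and `toy_choose_destructor' are invented for the example and do not appear in Bison or in the parsers it generates.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum toy_dtor
{
  TOY_DTOR_PER_SYMBOL,      /* %destructor { ... } STRING1 string1 */
  TOY_DTOR_PER_TYPE,        /* %destructor { } <character> */
  TOY_DTOR_DEFAULT_TAGGED,  /* %destructor { free ($$); } <*> */
  TOY_DTOR_DEFAULT_TAGLESS  /* %destructor { printf (...); } <> */
};

struct toy_symbol
{
  const char *name;
  const char *tag;          /* NULL if no semantic type tag is declared */
};

/* Pick the destructor that applies to SYM in the example grammar.  */
enum toy_dtor
toy_choose_destructor (const struct toy_symbol *sym)
{
  assert (sym != NULL && sym->name != NULL);

  /* 1. A per-symbol %destructor always wins.  */
  if (strcmp (sym->name, "STRING1") == 0
      || strcmp (sym->name, "string1") == 0)
    return TOY_DTOR_PER_SYMBOL;

  /* 2. Otherwise a per-type %destructor for the symbol's tag.  */
  if (sym->tag != NULL && strcmp (sym->tag, "character") == 0)
    return TOY_DTOR_PER_TYPE;

  /* 3. Otherwise one of the two defaults, depending on whether the
     symbol has a declared tag at all.  */
  return sym->tag != NULL ? TOY_DTOR_DEFAULT_TAGGED
                          : TOY_DTOR_DEFAULT_TAGLESS;
}
```

Only one destructor is chosen per discarded symbol, which is why `free' runs exactly once for `STRING1'.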

	4549

	4550 A Bison-generated parser invokes the default `%destructor's only for

	4551 user-defined as opposed to Bison-defined symbols. For example, the

	4552 parser will not invoke either kind of default `%destructor' for the

	4553 special Bison-defined symbols `$accept', `$undefined', or `$end' (*note

	4554 Bison Symbols: Table of Symbols.), none of which you can reference in

	4555 your grammar. It also will not invoke either for the `error' token

	4556 (*note error: Table of Symbols.), which is always defined by Bison

	4557 regardless of whether you reference it in your grammar. However, it

	4558 may invoke one of them for the end token (token 0) if you redefine it

	4559 from `$end' to, for example, `END':

	4560

	4561 %token END 0

	4562

	4563 Finally, Bison will never invoke a `%destructor' for an unreferenced

	4564 mid-rule semantic value (*note Actions in Mid-Rule: Mid-Rule Actions.).

	4565 That is, Bison does not consider a mid-rule to have a semantic value if

	4566 you do not reference `$$' in the mid-rule's action or `$N' (where N is

	4567 the RHS symbol position of the mid-rule) in any later action in that

	4568 rule. However, if you do reference either, the Bison-generated parser

	4569 will invoke the `<>' `%destructor' whenever it discards the mid-rule

	4570 symbol.

	4571

	4572

	4573 "Discarded symbols" are the following:

	4574

	4575 * stacked symbols popped during the first phase of error recovery,

	4576

	4577 * incoming terminals during the second phase of error recovery,

	4578

	4579 * the current lookahead and the entire stack (except the current

	4580 right-hand side symbols) when the parser returns immediately, and

	4581

	4582 * the start symbol, when the parser succeeds.

	4583

	4584 The parser can "return immediately" because of an explicit call to

	4585 `YYABORT' or `YYACCEPT', or failed error recovery, or memory exhaustion.

	4586

	4587 Right-hand side symbols of a rule that explicitly triggers a syntax

	4588 error via `YYERROR' are not discarded automatically. As a rule of

	4589 thumb, destructors are invoked only when user actions cannot manage the

	4590 memory.

	4591

	4592

	4593 File: bison.info, Node: Expect Decl, Next: Start Decl, Prev: Destructor Decl, Up: Declarations

	4594

	4595 3.7.8 Suppressing Conflict Warnings

	4596 -----------------------------------

	4597

	4598 Bison normally warns if there are any conflicts in the grammar (*note

	4599 Shift/Reduce Conflicts: Shift/Reduce.), but most real grammars have

	4600 harmless shift/reduce conflicts which are resolved in a predictable way

	4601 and would be difficult to eliminate. It is desirable to suppress the

	4602 warning about these conflicts unless the number of conflicts changes.

	4603 You can do this with the `%expect' declaration.

	4604

	4605 The declaration looks like this:

	4606

	4607 %expect N

	4608

	4609 Here N is a decimal integer. The declaration says there should be N

	4610 shift/reduce conflicts and no reduce/reduce conflicts. Bison reports

	4611 an error if the number of shift/reduce conflicts differs from N, or if

	4612 there are any reduce/reduce conflicts.

	4613

	4614 For normal LALR(1) parsers, reduce/reduce conflicts are more

	4615 serious, and should be eliminated entirely. Bison will always report

	4616 reduce/reduce conflicts for these parsers. With GLR parsers, however,

	4617 both kinds of conflicts are routine; otherwise, there would be no need

	4618 to use GLR parsing. Therefore, it is also possible to specify an

	4619 expected number of reduce/reduce conflicts in GLR parsers, using the

	4620 declaration:

	4621

	4622 %expect-rr N

	4623

	4624 In general, using `%expect' involves these steps:

	4625

	4626 * Compile your grammar without `%expect'. Use the `-v' option to

	4627 get a verbose list of where the conflicts occur. Bison will also

	4628 print the number of conflicts.

	4629

	4630 * Check each of the conflicts to make sure that Bison's default

	4631 resolution is what you really want. If not, rewrite the grammar

	4632 and go back to the beginning.

	4633

	4634 * Add an `%expect' declaration, copying the number N from the number

	4635 which Bison printed. With GLR parsers, add an `%expect-rr'

	4636 declaration as well.

	4637

	4638 Now Bison will warn you if you introduce an unexpected conflict, but

	4639 will keep silent otherwise.

	4640

	4641

	4642 File: bison.info, Node: Start Decl, Next: Pure Decl, Prev: Expect Decl, Up: Declarations

	4643

	4644 3.7.9 The Start-Symbol

	4645 ----------------------

	4646

	4647 Bison assumes by default that the start symbol for the grammar is the

	4648 first nonterminal specified in the grammar specification section. The

	4649 programmer may override this default with the `%start' declaration

	4650 as follows:

	4651

	4652 %start SYMBOL

	4653

	4654

	4655 File: bison.info, Node: Pure Decl, Next: Push Decl, Prev: Start Decl, Up: Declarations

	4656

	4657 3.7.10 A Pure (Reentrant) Parser

	4658 --------------------------------

	4659

	4660 A "reentrant" program is one which does not alter in the course of

	4661 execution; in other words, it consists entirely of "pure" (read-only)

	4662 code. Reentrancy is important whenever asynchronous execution is

	4663 possible; for example, a nonreentrant program may not be safe to call

	4664 from a signal handler. In systems with multiple threads of control, a

	4665 nonreentrant program must be called only within interlocks.

	4666

	4667 Normally, Bison generates a parser which is not reentrant. This is

	4668 suitable for most uses, and it permits compatibility with Yacc. (The

	4669 standard Yacc interfaces are inherently nonreentrant, because they use

	4670 statically allocated variables for communication with `yylex',

	4671 including `yylval' and `yylloc'.)

	4672

	4673 Alternatively, you can generate a pure, reentrant parser. The Bison

	4674 declaration `%define api.pure' says that you want the parser to be

	4675 reentrant. It looks like this:

	4676

	4677 %define api.pure

	4678

	4679 The result is that the communication variables `yylval' and `yylloc'

	4680 become local variables in `yyparse', and a different calling convention

	4681 is used for the lexical analyzer function `yylex'. *Note Calling

	4682 Conventions for Pure Parsers: Pure Calling, for the details of this.

	4683 The variable `yynerrs' becomes local in `yyparse' in pull mode but it

	4684 becomes a member of `yypstate' in push mode (*note The Error Reporting

	4685 Function `yyerror': Error Reporting.). The convention for calling

	4686 `yyparse' itself is unchanged.

	4687

	4688 Whether the parser is pure has nothing to do with the grammar rules.

	4689 You can generate either a pure parser or a nonreentrant parser from any

	4690 valid grammar.

	4691

	4692

	4693 File: bison.info, Node: Push Decl, Next: Decl Summary, Prev: Pure Decl, Up: Declarations

	4694

	4695 3.7.11 A Push Parser

	4696 --------------------

	4697

	4698 (The current push parsing interface is experimental and may evolve.

	4699 More user feedback will help to stabilize it.)

	4700

	4701 A pull parser is called once and it takes control until all its input

	4702 is completely parsed. A push parser, on the other hand, is called each

	4703 time a new token is made available.

	4704

	4705 A push parser is typically useful when the parser is part of a main

	4706 event loop in the client's application. This is typically a

	4707 requirement of a GUI, when the main event loop needs to be triggered

	4708 within a certain time period.

	4709

	4710 Normally, Bison generates a pull parser. The following Bison

	4711 declaration says that you want the parser to be a push parser (*note

	4712 %define api.push_pull: Decl Summary.):

	4713

	4714 %define api.push_pull "push"

	4715

	4716 In almost all cases, you want to ensure that your push parser is also

	4717 a pure parser (*note A Pure (Reentrant) Parser: Pure Decl.). The only

	4718 time you should create an impure push parser is to have backwards

	4719 compatibility with the impure Yacc pull mode interface. Unless you know

	4720 what you are doing, your declarations should look like this:

	4721

	4722 %define api.pure

	4723 %define api.push_pull "push"

	4724

	4725 There is one major functional difference between the pure push

	4726 parser and the impure push parser. It is acceptable for a pure push

	4727 parser to have many parser instances, of the same type of parser, in

	4728 memory at the same time. An impure push parser should only use one

	4729 parser at a time.

	4730

	4731 When a push parser is selected, Bison will generate some new symbols

	4732 in the generated parser. `yypstate' is a structure that the generated

	4733 parser uses to store the parser's state. `yypstate_new' is the

	4734 function that will create a new parser instance. `yypstate_delete'

	4735 will free the resources associated with the corresponding parser

	4736 instance. Finally, `yypush_parse' is the function that should be

	4737 called whenever a token is available to pass to the parser. A trivial

	4738 example of using a pure push parser would look like this:

	4739

	4740 int status;

	4741 yypstate *ps = yypstate_new ();

	4742 do {

	4743   status = yypush_parse (ps, yylex (), NULL);

	4744 } while (status == YYPUSH_MORE);

	4745 yypstate_delete (ps);
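
Since the loop above only compiles against a Bison-generated parser, the following self-contained C sketch imitates the same calling pattern with a toy "parser" that accepts balanced parentheses. Everything prefixed `toy_' is a stand-in invented for this example, including the status values; none of it is Bison's actual implementation. The point is that all state lives in the object returned by `toy_pstate_new', so several instances can coexist, as with a pure push parser.

```c
#include <assert.h>
#include <stdlib.h>

/* Status codes for the toy only; the values are arbitrary.  */
enum { TOY_DONE = 0, TOY_ERROR = 1, TOY_PUSH_MORE = 2 };

/* All parser state is kept here, never in global variables.  */
typedef struct toy_pstate
{
  int depth;                /* number of currently open parentheses */
} toy_pstate;

toy_pstate *
toy_pstate_new (void)
{
  toy_pstate *ps = malloc (sizeof *ps);
  if (ps != NULL)
    ps->depth = 0;
  return ps;
}

void
toy_pstate_delete (toy_pstate *ps)
{
  free (ps);
}

/* Consume one TOKEN: '(' or ')', or 0 for end of input.  */
int
toy_push_parse (toy_pstate *ps, int token)
{
  assert (ps != NULL);
  switch (token)
    {
    case '(':
      ps->depth++;
      return TOY_PUSH_MORE;
    case ')':
      if (ps->depth == 0)
        return TOY_ERROR;
      ps->depth--;
      return TOY_PUSH_MORE;
    case 0:
      return ps->depth == 0 ? TOY_DONE : TOY_ERROR;
    default:
      return TOY_ERROR;
    }
}

/* Driver with the same shape as the loop in the example above.  */
int
toy_parse_string (const char *input)
{
  int status;
  toy_pstate *ps = toy_pstate_new ();

  assert (input != NULL);
  if (ps == NULL)
    return TOY_ERROR;
  do
    {
      int token = (unsigned char) *input;   /* 0 at end of string */
      if (token != 0)
        input++;
      status = toy_push_parse (ps, token);
    }
  while (status == TOY_PUSH_MORE);
  toy_pstate_delete (ps);
  return status;
}
```

In an event-driven program the `do' loop would be replaced by one call to the push function per incoming event.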

	4746

	4747 If you decide to use an impure push parser, a few things about

	4748 the generated parser will change. The `yychar' variable becomes a

	4749 global variable instead of a variable in the `yypush_parse' function.

	4750 For this reason, the signature of the `yypush_parse' function is

	4751 changed to remove the token as a parameter. A nonreentrant push parser

	4752 example would thus look like this:

	4753

	4754 extern int yychar;

	4755 int status;

	4756 yypstate *ps = yypstate_new ();

	4757 do {

	4758   yychar = yylex ();

	4759   status = yypush_parse (ps);

	4760 } while (status == YYPUSH_MORE);

	4761 yypstate_delete (ps);

	4762

	4763 That's it. Notice the next token is put into the global variable

	4764 `yychar' for use by the next invocation of the `yypush_parse' function.

	4765

	4766 Bison also supports both the push parser interface and the

	4767 pull parser interface in the same generated parser. In order to get

	4768 this functionality, you should replace the `%define api.push_pull

	4769 "push"' declaration with the `%define api.push_pull "both"'

	4770 declaration. Doing this will create all of the symbols mentioned

	4771 earlier along with the two extra symbols, `yyparse' and `yypull_parse'.

	4772 `yyparse' can be used exactly as it normally would be used. However,

	4773 the user should note that it is implemented in the generated parser by

	4774 calling `yypull_parse'. This makes the `yyparse' function that is

	4775 generated with the `%define api.push_pull "both"' declaration slower

	4776 than the normal `yyparse' function. If the user calls the

	4777 `yypull_parse' function it will parse the rest of the input stream. It

	4778 is possible to `yypush_parse' tokens to select a subgrammar and then

	4779 `yypull_parse' the rest of the input stream. If you would like to

	4780 switch back and forth between parsing styles, you would have to

	4781 write your own `yypull_parse' function that knows when to quit looking

	4782 for input. An example of using the `yypull_parse' function would look

	4783 like this:

	4784

	4785 yypstate *ps = yypstate_new ();

	4786 yypull_parse (ps); /* Will call the lexer */

	4787 yypstate_delete (ps);

	4788

	4789 Adding the `%define api.pure' declaration does exactly the same

	4790 thing to the generated parser with `%define api.push_pull "both"' as it

	4791 did for `%define api.push_pull "push"'.

	4792

	4793

	4794 File: bison.info, Node: Decl Summary, Prev: Push Decl, Up: Declarations

	4795

	4796 3.7.12 Bison Declaration Summary

	4797 --------------------------------

	4798

	4799 Here is a summary of the declarations used to define a grammar:

	4800

	4801 -- Directive: %union

	4802 Declare the collection of data types that semantic values may have

	4803 (*note The Collection of Value Types: Union Decl.).

	4804

	4805 -- Directive: %token

	4806 Declare a terminal symbol (token type name) with no precedence or

	4807 associativity specified (*note Token Type Names: Token Decl.).

	4808

	4809 -- Directive: %right

	4810 Declare a terminal symbol (token type name) that is

	4811 right-associative (*note Operator Precedence: Precedence Decl.).

	4812

	4813 -- Directive: %left

	4814 Declare a terminal symbol (token type name) that is

	4815 left-associative (*note Operator Precedence: Precedence Decl.).

	4816

	4817 -- Directive: %nonassoc

	4818 Declare a terminal symbol (token type name) that is nonassociative

	4819 (*note Operator Precedence: Precedence Decl.). Using it in a way

	4820 that would be associative is a syntax error.

	4821

	4822 -- Directive: %type

	4823 Declare the type of semantic values for a nonterminal symbol

	4824 (*note Nonterminal Symbols: Type Decl.).

	4825

	4826 -- Directive: %start

	4827 Specify the grammar's start symbol (*note The Start-Symbol: Start

	4828 Decl.).

	4829

	4830 -- Directive: %expect

	4831 Declare the expected number of shift/reduce conflicts (*note

	4832 Suppressing Conflict Warnings: Expect Decl.).

	4833

	4834

	4835 In order to change the behavior of `bison', use the following

	4836 directives:

	4837

	4838 -- Directive: %code {CODE}

	4839 This is the unqualified form of the `%code' directive. It inserts

	4840 CODE verbatim at a language-dependent default location in the

	4841 output(1).

	4842

	4843 For C/C++, the default location is the parser source code file

	4844 after the usual contents of the parser header file. Thus, `%code'

	4845 replaces the traditional Yacc prologue, `%{CODE%}', for most

	4846 purposes. For a detailed discussion, see *Note Prologue

	4847 Alternatives::.

	4848

	4849 For Java, the default location is inside the parser class.

	4850

	4851 (Like all the Yacc prologue alternatives, this directive is

	4852 experimental. More user feedback will help to determine whether

	4853 it should become a permanent feature.)

	4854

	4855 -- Directive: %code QUALIFIER {CODE}

	4856 This is the qualified form of the `%code' directive. If you need

	4857 to specify location-sensitive verbatim CODE that does not belong

	4858 at the default location selected by the unqualified `%code' form,

	4859 use this form instead.

	4860

	4861 QUALIFIER identifies the purpose of CODE and thus the location(s)

	4862 where Bison should generate it. Not all values of QUALIFIER are

	4863 available for all target languages:

	4864

	4865 * requires

	4866

	4867 * Language(s): C, C++

	4868

	4869 * Purpose: This is the best place to write dependency code

	4870 required for `YYSTYPE' and `YYLTYPE'. In other words,

	4871 it's the best place to define types referenced in

	4872 `%union' directives, and it's the best place to override

	4873 Bison's default `YYSTYPE' and `YYLTYPE' definitions.

	4874

	4875 * Location(s): The parser header file and the parser

	4876 source code file before the Bison-generated `YYSTYPE'

	4877 and `YYLTYPE' definitions.

	4878

	4879 * provides

	4880

	4881 * Language(s): C, C++

	4882

	4883 * Purpose: This is the best place to write additional

	4884 definitions and declarations that should be provided to

	4885 other modules.

	4886

	4887 * Location(s): The parser header file and the parser

	4888 source code file after the Bison-generated `YYSTYPE',

	4889 `YYLTYPE', and token definitions.

	4890

	4891 * top

	4892

	4893 * Language(s): C, C++

	4894

	4895 * Purpose: The unqualified `%code' or `%code requires'

	4896 should usually be more appropriate than `%code top'.

	4897 However, occasionally it is necessary to insert code

	4898 much nearer the top of the parser source code file. For

	4899 example:

	4900

	4901 %code top {

	4902 #define _GNU_SOURCE

	4903 #include <stdio.h>

	4904 }

	4905

	4906 * Location(s): Near the top of the parser source code file.

	4907

	4908 * imports

	4909

	4910 * Language(s): Java

	4911

	4912 * Purpose: This is the best place to write Java import

	4913 directives.

	4914

	4915 * Location(s): The parser Java file after any Java package

	4916 directive and before any class definitions.

	4917

	4918 (Like all the Yacc prologue alternatives, this directive is

	4919 experimental. More user feedback will help to determine whether

	4920 it should become a permanent feature.)

	4921

	4922 For a detailed discussion of how to use `%code' in place of the

	4923 traditional Yacc prologue for C/C++, see *Note Prologue

	4924 Alternatives::.

	4925

	4926 -- Directive: %debug

	4927 In the parser file, define the macro `YYDEBUG' to 1 if it is not

	4928 already defined, so that the debugging facilities are compiled.

	4929 *Note Tracing Your Parser: Tracing.

	4930

	4931 -- Directive: %define VARIABLE

	4932 -- Directive: %define VARIABLE "VALUE"

	4933 Define a variable to adjust Bison's behavior. The possible

	4934 choices for VARIABLE, as well as their meanings, depend on the

	4935 selected target language and/or the parser skeleton (*note

	4936 %language: Decl Summary, *note %skeleton: Decl Summary.).

	4937

	4938 Bison will warn if a VARIABLE is defined multiple times.

	4939

	4940 Omitting `"VALUE"' is always equivalent to specifying it as `""'.

	4941

	4942 Some VARIABLEs may be used as Booleans. In this case, Bison will

	4943 complain if the variable definition does not meet one of the

	4944 following four conditions:

	4945

	4946 1. `"VALUE"' is `"true"'.

	4947

	4948 2. `"VALUE"' is omitted (or is `""'). This is equivalent to

	4949 `"true"'.

	4950

	4951 3. `"VALUE"' is `"false"'.

	4952

	4953 4. VARIABLE is never defined. In this case, Bison selects a

	4954 default value, which may depend on the selected target

	4955 language and/or parser skeleton.
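
The four conditions can be summarised as a small decision function. The C sketch below is a model of the rules as stated here, not Bison's own code; the function name `toy_define_bool', its result constants, and the convention of passing NULL for "never defined" are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { TOY_INVALID = -1, TOY_FALSE = 0, TOY_TRUE = 1 };

/* Interpret a Boolean %define variable.
   VALUE is the text between the quotes, "" if the value was omitted,
   or NULL if the variable was never defined.  DEFAULT_VALUE (0 or 1)
   models the default picked for the target language or skeleton.  */
int
toy_define_bool (const char *value, int default_value)
{
  assert (default_value == 0 || default_value == 1);

  if (value == NULL)                      /* condition 4 */
    return default_value ? TOY_TRUE : TOY_FALSE;
  if (strcmp (value, "true") == 0)        /* condition 1 */
    return TOY_TRUE;
  if (value[0] == '\0')                   /* condition 2 */
    return TOY_TRUE;
  if (strcmp (value, "false") == 0)       /* condition 3 */
    return TOY_FALSE;
  return TOY_INVALID;                     /* Bison would complain here */
}
```

Under these rules a bare `%define api.pure' and `%define api.pure "true"' mean the same thing.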

	4956

	4957 Some of the accepted VARIABLEs are:

	4958

	4959 * api.pure

	4960

	4961 * Language(s): C

	4962

	4963 * Purpose: Request a pure (reentrant) parser program.

	4964 *Note A Pure (Reentrant) Parser: Pure Decl.

	4965

	4966 * Accepted Values: Boolean

	4967

	4968 * Default Value: `"false"'

	4969

	4970 * api.push_pull

	4971

	4972 * Language(s): C (LALR(1) only)

	4973

	4974 * Purpose: Requests a pull parser, a push parser, or both.

	4975 *Note A Push Parser: Push Decl. (The current push

	4976 parsing interface is experimental and may evolve. More

	4977 user feedback will help to stabilize it.)

	4978

	4979 * Accepted Values: `"pull"', `"push"', `"both"'

	4980

	4981 * Default Value: `"pull"'

	4982

	4983 * lr.keep_unreachable_states

	4984

	4985 * Language(s): all

	4986

	4987 * Purpose: Requests that Bison allow unreachable parser

	4988 states to remain in the parser tables. Bison considers

	4989 a state to be unreachable if there exists no sequence of

	4990 transitions from the start state to that state. A state

	4991 can become unreachable during conflict resolution if

	4992 Bison disables a shift action leading to it from a

	4993 predecessor state. Keeping unreachable states is

	4994 sometimes useful for analysis purposes, but they are

	4995 useless in the generated parser.

	4996

	4997 * Accepted Values: Boolean

	4998

	4999 * Default Value: `"false"'

	5000

	5001 * Caveats:

	5002

	5003 * Unreachable states may contain conflicts and may

	5004 use rules not used in any other state. Thus,

	5005 keeping unreachable states may induce warnings that

	5006 are irrelevant to your parser's behavior, and it

	5007 may eliminate warnings that are relevant. Of

	5008 course, the change in warnings may actually be

	5009 relevant to a parser table analysis that wants to

	5010 keep unreachable states, so this behavior will

	5011 likely remain in future Bison releases.

	5012

	5013 * While Bison is able to remove unreachable states,

	5014 it is not guaranteed to remove other kinds of

	5015 useless states. Specifically, when Bison disables

	5016 reduce actions during conflict resolution, some

	5017 goto actions may become useless, and thus some

	5018 additional states may become useless. If Bison

	5019 were to compute which goto actions were useless and

	5020 then disable those actions, it could identify such

	5021 states as unreachable and then remove those states.

	5022 However, Bison does not compute which goto actions

	5023 are useless.

	5024

	5025 * namespace

	5026

	5027 * Language(s): C++

	5028

	5029 * Purpose: Specifies the namespace for the parser class.

	5030 For example, if you specify:

	5031

	5032 %define namespace "foo::bar"

	5033

	5034 Bison uses `foo::bar' verbatim in references such as:

	5035

	5036 foo::bar::parser::semantic_type

	5037

	5038 However, to open a namespace, Bison removes any leading

	5039 `::' and then splits on any remaining occurrences:

	5040

	5041 namespace foo { namespace bar {

	5042 class position;

	5043 class location;

	5044 } }

	5045

	5046 * Accepted Values: Any absolute or relative C++ namespace

	5047 reference without a trailing `"::"'. For example,

	5048 `"foo"' or `"::foo::bar"'.

	5049

	5050 * Default Value: The value specified by `%name-prefix',

	5051 which defaults to `yy'. This usage of `%name-prefix' is

	5052 for backward compatibility and can be confusing since

	5053 `%name-prefix' also specifies the textual prefix for the

	5054 lexical analyzer function. Thus, if you specify

	5055 `%name-prefix', it is best to also specify `%define

	5056 namespace' so that `%name-prefix' _only_ affects the

	5057 lexical analyzer function. For example, if you specify:

	5058

	5059 %define namespace "foo"

	5060 %name-prefix "bar::"

	5061

	5062 The parser namespace is `foo' and `yylex' is referenced

	5063 as `bar::lex'.

	5064

	5065

	5066 -- Directive: %defines

	5067 Write a header file containing macro definitions for the token type

	5068 names defined in the grammar as well as a few other declarations.

	5069 If the parser output file is named `NAME.c' then this file is

	5070 named `NAME.h'.

	5071

	5072 For C parsers, the output header declares `YYSTYPE' unless

	5073 `YYSTYPE' is already defined as a macro or you have used a

	5074 `<TYPE>' tag without using `%union'. Therefore, if you are using

	5075 a `%union' (*note More Than One Value Type: Multiple Types.) with

	5076 components that require other definitions, or if you have defined

	5077 a `YYSTYPE' macro or type definition (*note Data Types of Semantic

	5078 Values: Value Type.), you need to arrange for these definitions to

	5079 be propagated to all modules, e.g., by putting them in a

	5080 prerequisite header that is included both by your parser and by

	5081 any other module that needs `YYSTYPE'.

	5082

	5083 Unless your parser is pure, the output header declares `yylval' as

	5084 an external variable. *Note A Pure (Reentrant) Parser: Pure Decl.

	5085

	5086 If you have also used locations, the output header declares

	5087 `YYLTYPE' and `yylloc' using a protocol similar to that of the

	5088 `YYSTYPE' macro and `yylval'. *Note Tracking Locations: Locations.

	5089

	5090 This output file is normally essential if you wish to put the

	5091 definition of `yylex' in a separate source file, because `yylex'

	5092 typically needs to be able to refer to the above-mentioned

	5093 declarations and to the token type codes. *Note Semantic Values

	5094 of Tokens: Token Values.

	5095

	5096 If you have declared `%code requires' or `%code provides', the

	5097 output header also contains their code. *Note %code: Decl Summary.

	5098

	5099 -- Directive: %defines DEFINES-FILE

	5100 Same as above, but save in the file DEFINES-FILE.

	5101

	5102 -- Directive: %destructor

	5103 Specify how the parser should reclaim the memory associated with

	5104 discarded symbols. *Note Freeing Discarded Symbols: Destructor

	5105 Decl.

	5106

	5107 -- Directive: %file-prefix "PREFIX"

	5108 Specify a prefix to use for all Bison output file names. The

	5109 names are chosen as if the input file were named `PREFIX.y'.

	5110

	5111 -- Directive: %language "LANGUAGE"

	5112 Specify the programming language for the generated parser.

	5113 Currently supported languages include C, C++, and Java. LANGUAGE

	5114 is case-insensitive.

	5115

	5116 This directive is experimental and its effect may be modified in

	5117 future releases.

	5118

	5119 -- Directive: %locations

	5120 Generate the code processing the locations (*note Special Features

	5121 for Use in Actions: Action Features.). This mode is enabled as

	5122 soon as the grammar uses the special `@N' tokens, but if your

	5123 grammar does not use them, using `%locations' allows for more

	5124 accurate syntax error messages.

	5125

	5126 -- Directive: %name-prefix "PREFIX"

	5127 Rename the external symbols used in the parser so that they start

	5128 with PREFIX instead of `yy'. The precise list of symbols renamed

	5129 in C parsers is `yyparse', `yylex', `yyerror', `yynerrs',

	5130 `yylval', `yychar', `yydebug', and (if locations are used)

	5131 `yylloc'. If you use a push parser, `yypush_parse',

	5132 `yypull_parse', `yypstate', `yypstate_new' and `yypstate_delete'

	5133 will also be renamed. For example, if you use `%name-prefix

	5134 "c_"', the names become `c_parse', `c_lex', and so on. For C++

	5135 parsers, see the `%define namespace' documentation in this section.

	5136 *Note Multiple Parsers in the Same Program: Multiple Parsers.

	5137

	5138 -- Directive: %no-lines

	5139 Don't generate any `#line' preprocessor commands in the parser

	5140 file. Ordinarily Bison writes these commands in the parser file

	5141 so that the C compiler and debuggers will associate errors and

	5142 object code with your source file (the grammar file). This

	5143 directive causes them to associate errors with the parser file,

	5144 treating it as an independent source file in its own right.

	5145

	5146 -- Directive: %output "FILE"

	5147 Specify FILE for the parser file.

	5148

	5149 -- Directive: %pure-parser

	5150 Deprecated version of `%define api.pure' (*note %define: Decl

	5151 Summary.), for which Bison is more careful to warn about

	5152 unreasonable usage.

	5153

	5154 -- Directive: %require "VERSION"

	5155 Require version VERSION or higher of Bison. *Note Require a

	5156 Version of Bison: Require Decl.

	5157

	5158 -- Directive: %skeleton "FILE"

	5159 Specify the skeleton to use.

	5160

	5161 If FILE does not contain a `/', FILE is the name of a skeleton

	5162 file in the Bison installation directory. If it does, FILE is an

	5163 absolute file name or a file name relative to the directory of the

	5164 grammar file. This is similar to how most shells resolve commands.

	5165

	5166 -- Directive: %token-table

	5167 Generate an array of token names in the parser file. The name of

	5168 the array is `yytname'; `yytname[I]' is the name of the token

	5169 whose internal Bison token code number is I. The first three

	5170 elements of `yytname' correspond to the predefined tokens `"$end"',

	5171 `"error"', and `"$undefined"'; after these come the symbols

	5172 defined in the grammar file.

	5173

	5174 The name in the table includes all the characters needed to

	5175 represent the token in Bison. For single-character literals and

	5176 literal strings, this includes the surrounding quoting characters

	5177 and any escape sequences. For example, the Bison single-character

	5178 literal `'+'' corresponds to a three-character name, represented

	5179 in C as `"'+'"'; and the Bison two-character literal string `"\\/"'

	5180 corresponds to a five-character name, represented in C as

	5181 `"\"\\\\/\""'.
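The character counts in the paragraph above can be checked with a short C sketch. The two constants below are stand-alone strings written out for illustration; they are not a real Bison-generated `yytname' table, and `token_name_length` is a hypothetical helper.

```c
#include <assert.h>
#include <string.h>

/* The two names from the text, written as C string literals. */
static char const plus_name[] = "'+'";          /* Bison literal '+'   */
static char const slash_name[] = "\"\\\\/\"";   /* Bison literal "\\/" */

/* Return the number of characters in a token name as it would be
   stored in the table (quotes and escapes included).  */
static size_t
token_name_length (char const *name)
{
  assert (name != NULL);
  return strlen (name);
}
```

Here `token_name_length (plus_name)` is 3 and `token_name_length (slash_name)` is 5: a double quote, two backslashes, a slash, and a closing double quote.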

	5182

	5183 When you specify `%token-table', Bison also generates macro

	5184 definitions for macros `YYNTOKENS', `YYNNTS', `YYNRULES', and

	5185 `YYNSTATES':

	5186

	5187 `YYNTOKENS'

	5188 The highest token number, plus one.

	5189

	5190 `YYNNTS'

	5191 The number of nonterminal symbols.

	5192

	5193 `YYNRULES'

	5194 The number of grammar rules.

	5195

	5196 `YYNSTATES'

	5197 The number of parser states (*note Parser States::).

	5198

	5199 -- Directive: %verbose

	5200 Write an extra output file containing verbose descriptions of the

	5201 parser states and what is done for each type of lookahead token in

	5202 that state. *Note Understanding Your Parser: Understanding, for

	5203 more information.

	5204

	5205 -- Directive: %yacc

	5206 Pretend the option `--yacc' was given, i.e., imitate Yacc,

	5207 including its naming conventions. *Note Bison Options::, for more.

	5208

	5209 ---------- Footnotes ----------

	5210

	5211 (1) The default location is actually skeleton-dependent; writers

	5212 of non-standard skeletons however should choose the default location

	5213 consistently with the behavior of the standard Bison skeletons.

	5214

	5215

	5216 File: bison.info, Node: Multiple Parsers, Prev: Declarations, Up: Grammar File

	5217

	5218 3.8 Multiple Parsers in the Same Program

	5219 ========================================

	5220

	5221 Most programs that use Bison parse only one language and therefore

	5222 contain only one Bison parser. But what if you want to parse more than

	5223 one language with the same program? Then you need to avoid a name

	5224 conflict between different definitions of `yyparse', `yylval', and so

	5225 on.

	5226

	5227 The easy way to do this is to use the option `-p PREFIX' (*note

	5228 Invoking Bison: Invocation.). This renames the interface functions and

	5229 variables of the Bison parser to start with PREFIX instead of `yy'.

	5230 You can use this to give each parser distinct names that do not

	5231 conflict.

	5232

	5233 The precise list of symbols renamed is `yyparse', `yylex',

	5234 `yyerror', `yynerrs', `yylval', `yylloc', `yychar' and `yydebug'. If

	5235 you use a push parser, `yypush_parse', `yypull_parse', `yypstate',

	5236 `yypstate_new' and `yypstate_delete' will also be renamed. For

	5237 example, if you use `-p c', the names become `cparse', `clex', and so

	5238 on.

	5239

	5240 *All the other variables and macros associated with Bison are not

	5241 renamed.* These others are not global; there is no conflict if the same

	5242 name is used in different parsers. For example, `YYSTYPE' is not

	5243 renamed, but defining this in different ways in different parsers causes

	5244 no trouble (*note Data Types of Semantic Values: Value Type.).

	5245

	5246 The `-p' option works by adding macro definitions to the beginning

	5247 of the parser source file, defining `yyparse' as `PREFIXparse', and so

	5248 on. This effectively substitutes one name for the other in the entire

	5249 parser file.
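The macro substitution described above can be sketched in a few lines of C. This is a simplified illustration for the prefix `c', covering only two of the renamed symbols; the function body is a placeholder, not generated parser code.

```c
/* What `-p c' effectively places at the top of the parser file. */
#define yyparse cparse
#define yylval  clval

/* The parser file is written in terms of the `yy' names...  */
int yylval;

int
yyparse (void)
{
  yylval = 42;   /* Placeholder body; a real parser does the parsing. */
  return 0;
}

/* ...but after preprocessing, the symbols that actually exist are
   `cparse' and `clval', so a second parser built with a different
   prefix can be linked into the same program without a clash.  */
```

Code in other source files then calls `cparse ()` and reads `clval` directly.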

	5250

	5251

	5252 File: bison.info, Node: Interface, Next: Algorithm, Prev: Grammar File, Up: Top

	5253

	5254 4 Parser C-Language Interface

	5255 *****************************

	5256

	5257 The Bison parser is actually a C function named `yyparse'. Here we

	5258 describe the interface conventions of `yyparse' and the other functions

	5259 that it needs to use.

	5260

	5261 Keep in mind that the parser uses many C identifiers starting with

	5262 `yy' and `YY' for internal purposes. If you use such an identifier

	5263 (aside from those in this manual) in an action or in the epilogue of the

	5264 grammar file, you are likely to run into trouble.

	5265

	5266 * Menu:

	5267

	5268 * Parser Function:: How to call `yyparse' and what it returns.

	5269 * Push Parser Function:: How to call `yypush_parse' and what it returns.

	5270 * Pull Parser Function:: How to call `yypull_parse' and what it returns.

	5271 * Parser Create Function:: How to call `yypstate_new' and what it returns.

	5272 * Parser Delete Function:: How to call `yypstate_delete' and what it returns.

	5273 * Lexical:: You must supply a function `yylex'

	5274 which reads tokens.

	5275 * Error Reporting:: You must supply a function `yyerror'.

	5276 * Action Features:: Special features for use in actions.

	5277 * Internationalization:: How to let the parser speak in the user's

	5278 native language.

	5279

	5280

	5281 File: bison.info, Node: Parser Function, Next: Push Parser Function, Up: Interface

	5282

	5283 4.1 The Parser Function `yyparse'

	5284 =================================

	5285

	5286 You call the function `yyparse' to cause parsing to occur. This

	5287 function reads tokens, executes actions, and ultimately returns when it

	5288 encounters end-of-input or an unrecoverable syntax error. You can also

	5289 write an action which directs `yyparse' to return immediately without

	5290 reading further.

	5291

	5292 -- Function: int yyparse (void)

	5293 The value returned by `yyparse' is 0 if parsing was successful

	5294 (return is due to end-of-input).

	5295

	5296 The value is 1 if parsing failed because of invalid input, i.e.,

	5297 input that contains a syntax error or that causes `YYABORT' to be

	5298 invoked.

	5299

	5300 The value is 2 if parsing failed due to memory exhaustion.
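A caller might interpret these three return values along the following lines. This is a minimal sketch; `describe_parse_status` is a hypothetical helper name, not part of the Bison interface.

```c
/* Map a `yyparse' return value to a short description. */
static char const *
describe_parse_status (int status)
{
  switch (status)
    {
    case 0:
      return "success";                    /* End-of-input reached. */
    case 1:
      return "invalid input or YYABORT";   /* Syntax error or YYABORT. */
    case 2:
      return "memory exhaustion";
    default:
      return "unexpected status";          /* Not documented for `yyparse'. */
    }
}
```

A program would typically call it as `describe_parse_status (yyparse ())` when reporting the outcome of a parse.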

	5301

	5302 In an action, you can cause immediate return from `yyparse' by using

	5303 these macros:

	5304

	5305 -- Macro: YYACCEPT

	5306 Return immediately with value 0 (to report success).

	5307

	5308 -- Macro: YYABORT

	5309 Return immediately with value 1 (to report failure).

	5310

	5311 If you use a reentrant parser, you can optionally pass additional

	5312 parameter information to it in a reentrant way. To do so, use the

	5313 declaration `%parse-param':

	5314

	5315 -- Directive: %parse-param {ARGUMENT-DECLARATION}

	5316 Declare that an argument declared by the braced-code

	5317 ARGUMENT-DECLARATION is an additional `yyparse' argument. The

	5318 ARGUMENT-DECLARATION is used when declaring functions or

	5319 prototypes. The last identifier in ARGUMENT-DECLARATION must be

	5320 the argument name.

	5321

	5322 Here's an example. Write this in the parser:

	5323

	5324 %parse-param {int *nastiness}

	5325 %parse-param {int *randomness}

	5326

	5327 Then call the parser like this:

	5328

	5329 {

	5330 int nastiness, randomness;

	5331 ... /* Store proper data in `nastiness' and `randomness'. */

	5332 value = yyparse (&nastiness, &randomness);

	5333 ...

	5334 }

	5335

	5336 In the grammar actions, use expressions like this to refer to the data:

	5337

	5338 exp: ... { ...; *randomness += 1; ... }

	5339

	5340

	5341 File: bison.info, Node: Push Parser Function, Next: Pull Parser Function, Prev: Parser Function, Up: Interface

	5342

	5343 4.2 The Push Parser Function `yypush_parse'

	5344 ===========================================

	5345

	5346 (The current push parsing interface is experimental and may evolve.

	5347 More user feedback will help to stabilize it.)

	5348

	5349 You call the function `yypush_parse' to parse a single token. This

	5350 function is available if either the `%define api.push_pull "push"' or

	5351 `%define api.push_pull "both"' declaration is used. *Note A Push

	5352 Parser: Push Decl.

	5353

	5354 -- Function: int yypush_parse (yypstate *yyps)

	5355 The value returned by `yypush_parse' is the same as for `yyparse'

	5356 with the following exception. `yypush_parse' will return

	5357 `YYPUSH_MORE' if more input is required to finish parsing the

	5358 grammar.
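The control flow implied by a "more input required" status can be sketched with a toy stand-in. Nothing below is Bison's API: `toy_pstate`, `toy_push_parse`, `toy_drive` and the status values are hypothetical, and the toy simply accepts any run of positive token codes ended by a zero or negative code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes; the values are arbitrary. */
enum { TOY_DONE = 0, TOY_PUSH_MORE = 99 };

struct toy_pstate { int tokens_seen; };

/* Consume one token.  A zero or negative code means end-of-input. */
static int
toy_push_parse (struct toy_pstate *ps, int token)
{
  assert (ps != NULL);
  if (token <= 0)
    return TOY_DONE;
  ps->tokens_seen++;
  return TOY_PUSH_MORE;
}

/* Feed tokens one at a time until the parser stops asking for more. */
static int
toy_drive (struct toy_pstate *ps, int const *tokens)
{
  int status;
  size_t i = 0;

  do
    status = toy_push_parse (ps, tokens[i++]);
  while (status == TOY_PUSH_MORE);
  return status;
}
```

The point of the sketch is the loop shape: the caller owns the token source and keeps pushing while the parser reports that it needs more input.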

	5359

	5360

	5361 File: bison.info, Node: Pull Parser Function, Next: Parser Create Function, Prev: Push Parser Function, Up: Interface

	5362

	5363 4.3 The Pull Parser Function `yypull_parse'

	5364 ===========================================

	5365

	5366 (The current push parsing interface is experimental and may evolve.

	5367 More user feedback will help to stabilize it.)

	5368

	5369 You call the function `yypull_parse' to parse the rest of the input

	5370 stream. This function is available if the `%define api.push_pull

	5371 "both"' declaration is used. *Note A Push Parser: Push Decl.

	5372

	5373 -- Function: int yypull_parse (yypstate *yyps)

	5374 The value returned by `yypull_parse' is the same as for `yyparse'.

	5375

	5376

	5377 File: bison.info, Node: Parser Create Function, Next: Parser Delete Function, Prev: Pull Parser Function, Up: Interface

	5378

	5379 4.4 The Parser Create Function `yypstate_new'

	5380 =============================================

	5381

	5382 (The current push parsing interface is experimental and may evolve.

	5383 More user feedback will help to stabilize it.)

	5384

	5385 You call the function `yypstate_new' to create a new parser instance.

	5386 This function is available if either the `%define api.push_pull "push"'

	5387 or `%define api.push_pull "both"' declaration is used. *Note A Push

	5388 Parser: Push Decl.

	5389

	5390 -- Function: yypstate *yypstate_new (void)

	5391 The function will return a valid parser instance if there was

	5392 memory available or 0 if no memory was available. In impure mode,

	5393 it will also return 0 if a parser instance is currently allocated.

	5394

	5395

	5396 File: bison.info, Node: Parser Delete Function, Next: Lexical, Prev: Parser Create Function, Up: Interface

	5397

	5398 4.5 The Parser Delete Function `yypstate_delete'

	5399 ================================================

	5400

	5401 (The current push parsing interface is experimental and may evolve.

	5402 More user feedback will help to stabilize it.)

	5403

	5404 You call the function `yypstate_delete' to delete a parser instance.

	5405 This function is available if either the `%define api.push_pull "push"' or

	5406 `%define api.push_pull "both"' declaration is used. *Note A Push

	5407 Parser: Push Decl.

	5408

	5409 -- Function: void yypstate_delete (yypstate *yyps)

	5410 This function will reclaim the memory associated with a parser

	5411 instance. After this call, you should no longer attempt to use

	5412 the parser instance.

	5413

	5414

	5415 File: bison.info, Node: Lexical, Next: Error Reporting, Prev: Parser Delete Function, Up: Interface

	5416

	5417 4.6 The Lexical Analyzer Function `yylex'

	5418 =========================================

	5419

	5420 The "lexical analyzer" function, `yylex', recognizes tokens from the

	5421 input stream and returns them to the parser. Bison does not create

	5422 this function automatically; you must write it so that `yyparse' can

	5423 call it. The function is sometimes referred to as a lexical scanner.

	5424

	5425 In simple programs, `yylex' is often defined at the end of the Bison

	5426 grammar file. If `yylex' is defined in a separate source file, you

	5427 need to arrange for the token-type macro definitions to be available

	5428 there. To do this, use the `-d' option when you run Bison, so that it

	5429 will write these macro definitions into a separate header file

	5430 `NAME.tab.h' which you can include in the other source files that need

	5431 it. *Note Invoking Bison: Invocation.

	5432

	5433 * Menu:

	5434

	5435 * Calling Convention:: How `yyparse' calls `yylex'.

	5436 * Token Values:: How `yylex' must return the semantic value

	5437 of the token it has read.

	5438 * Token Locations:: How `yylex' must return the text location

	5439 (line number, etc.) of the token, if the

	5440 actions want that.

	5441 * Pure Calling:: How the calling convention differs in a pure parser

	5442 (*note A Pure (Reentrant) Parser: Pure Decl.).

	5443

	5444

	5445 File: bison.info, Node: Calling Convention, Next: Token Values, Up: Lexical

	5446

	5447 4.6.1 Calling Convention for `yylex'

	5448 ------------------------------------

	5449

	5450 The value that `yylex' returns must be the positive numeric code for

	5451 the type of token it has just found; a zero or negative value signifies

	5452 end-of-input.

	5453

	5454 When a token is referred to in the grammar rules by a name, that name

	5455 in the parser file becomes a C macro whose definition is the proper

	5456 numeric code for that token type. So `yylex' can use the name to

	5457 indicate that type. *Note Symbols::.

	5458

	5459 When a token is referred to in the grammar rules by a character

	5460 literal, the numeric code for that character is also the code for the

	5461 token type. So `yylex' can simply return that character code, possibly

	5462 converted to `unsigned char' to avoid sign-extension. The null

	5463 character must not be used this way, because its code is zero and that

	5464 signifies end-of-input.

	5465

	5466 Here is an example showing these things:

	5467

	5468 int

	5469 yylex (void)

	5470 {

	5471 ...

	5472 if (c == EOF) /* Detect end-of-input. */

	5473 return 0;

	5474 ...

	5475 if (c == '+' || c == '-')

	5476 return c; /* Assume token type for `+' is '+'. */

	5477 ...

	5478 return INT; /* Return the type of the token. */

	5479 ...

	5480 }

	5481

	5482 This interface has been designed so that the output from the `lex'

	5483 utility can be used without change as the definition of `yylex'.
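The calling convention described above can be sketched as a complete, if tiny, `yylex'. Several things here are assumptions made for the sketch: the value 258 for `INT' (normally the macro comes from the Bison-generated parser or header), the helper `set_input`, and reading from a string instead of a stream.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

#define INT 258                         /* stand-in token code */

static char const *input_cursor = "";   /* hypothetical input source */

static void
set_input (char const *text)
{
  assert (text != NULL);
  input_cursor = text;
}

int
yylex (void)
{
  int c;

  /* Skip blanks. */
  while (*input_cursor == ' ' || *input_cursor == '\t')
    input_cursor++;

  if (*input_cursor == '\0')            /* Detect end-of-input. */
    return 0;

  c = (unsigned char) *input_cursor;    /* Avoid sign-extension. */
  if (isdigit (c))
    {
      while (isdigit ((unsigned char) *input_cursor))
        input_cursor++;
      return INT;                       /* A named token type. */
    }

  input_cursor++;
  return c;                             /* A character-literal token. */
}
```

The `unsigned char' conversion matters for bytes above 127: without it, a platform with signed `char' would hand the parser a negative code, which signifies end-of-input.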

	5484

	5485 If the grammar uses literal string tokens, there are two ways that

	5486 `yylex' can determine the token type codes for them:

	5487

	5488 * If the grammar defines symbolic token names as aliases for the

	5489 literal string tokens, `yylex' can use these symbolic names like

	5490 all others. In this case, the use of the literal string tokens in

	5491 the grammar file has no effect on `yylex'.

	5492

	5493 * `yylex' can find the multicharacter token in the `yytname' table.

	5494 The index of the token in the table is the token type's code. The

	5495 name of a multicharacter token is recorded in `yytname' with a

	5496 double-quote, the token's characters, and another double-quote.

	5497 The token's characters are escaped as necessary to be suitable as

	5498 input to Bison.

	5499

	5500 Here's code for looking up a multicharacter token in `yytname',

	5501 assuming that the characters of the token are stored in

	5502 `token_buffer', and assuming that the token does not contain any

	5503 characters like `"' that require escaping.

	5504

	5505 for (i = 0; i < YYNTOKENS; i++)

	5506 {

	5507 if (yytname[i] != 0

	5508 && yytname[i][0] == '"'

	5509 && ! strncmp (yytname[i] + 1, token_buffer,

	5510 strlen (token_buffer))

	5511 && yytname[i][strlen (token_buffer) + 1] == '"'

	5512 && yytname[i][strlen (token_buffer) + 2] == 0)

	5513 break;

	5514 }

	5515

	5516 The `yytname' table is generated only if you use the

	5517 `%token-table' declaration. *Note Decl Summary::.
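For experimenting outside a generated parser, the lookup loop above can be wrapped in a function and run against a hand-written table. The `yytname' contents and the `YYNTOKENS' value below are invented for this sketch, and `find_string_token` is a hypothetical name; in a real parser both the table and the macro come from Bison.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the table generated by %token-table. */
static char const *const yytname[] =
  { "$end", "error", "$undefined", "INT", "\"<=\"", "\"<\"", 0 };
#define YYNTOKENS 6

/* Return the index of the literal string token whose characters are
   in TOKEN_BUFFER, or -1 if there is none.  As in the text, the
   token must not contain characters that need escaping.  */
static int
find_string_token (char const *token_buffer)
{
  size_t len;
  int i;

  assert (token_buffer != NULL);
  len = strlen (token_buffer);
  for (i = 0; i < YYNTOKENS; i++)
    {
      if (yytname[i] != 0
          && yytname[i][0] == '"'
          && ! strncmp (yytname[i] + 1, token_buffer, len)
          && yytname[i][len + 1] == '"'
          && yytname[i][len + 2] == 0)
        return i;
    }
  return -1;
}
```

The two trailing checks are what keep `<' from matching the entry for `<=': the character after the compared prefix must be the closing double-quote, followed by the end of the name.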

	5518

	5519

	5520 File: bison.info, Node: Token Values, Next: Token Locations, Prev: Calling Convention, Up: Lexical

	5521

	5522 4.6.2 Semantic Values of Tokens

	5523 -------------------------------

	5524

	5525 In an ordinary (nonreentrant) parser, the semantic value of the token

	5526 must be stored into the global variable `yylval'. When you are using

	5527 just one data type for semantic values, `yylval' has that type. Thus,

	5528 if the type is `int' (the default), you might write this in `yylex':

	5529

	5530 ...

	5531 yylval = value; /* Put value onto Bison stack. */

	5532 return INT; /* Return the type of the token. */

	5533 ...

	5534

	5535 When you are using multiple data types, `yylval''s type is a union

	5536 made from the `%union' declaration (*note The Collection of Value

	5537 Types: Union Decl.). So when you store a token's value, you must use

	5538 the proper member of the union. If the `%union' declaration looks like

	5539 this:

	5540

	5541 %union {

	5542 int intval;

	5543 double val;

	5544 symrec *tptr;

	5545 }

	5546

	5547 then the code in `yylex' might look like this:

	5548

	5549 ...

	5550 yylval.intval = value; /* Put value onto Bison stack. */

	5551 return INT; /* Return the type of the token. */

	5552 ...
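A self-contained sketch of the union case follows. It assumes a hand-written `YYSTYPE' with only the two numeric members (the `symrec *' member is left out so the example compiles on its own), stand-in token codes for `INT' and a hypothetical `NUM', and a hypothetical helper `lex_number`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define INT 258   /* stand-in token codes for this sketch */
#define NUM 259

typedef union
{
  int intval;
  double val;
} YYSTYPE;

YYSTYPE yylval;   /* Global, as in an ordinary (nonreentrant) parser. */

/* Store the value of the number spelled by TEXT in the union member
   that matches its token type, and return that token type.  */
static int
lex_number (char const *text)
{
  assert (text != NULL);
  if (strchr (text, '.') != NULL)
    {
      yylval.val = strtod (text, NULL);            /* double member */
      return NUM;
    }
  yylval.intval = (int) strtol (text, NULL, 10);   /* int member */
  return INT;
}
```

The member written must agree with the type the grammar declares for the returned token, since the parser reads the value back through that member.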

	5553

	5554

	5555 File: bison.info, Node: Token Locations, Next: Pure Calling, Prev: Token Values, Up: Lexical

	5556

	5557 4.6.3 Textual Locations of Tokens

	5558 ---------------------------------

	5559

	5560 If you are using the `@N'-feature (*note Tracking Locations:

	5561 Locations.) in actions to keep track of the textual locations of tokens

	5562 and groupings, then you must provide this information in `yylex'. The

	5563 function `yyparse' expects to find the textual location of a token just

	5564 parsed in the global variable `yylloc'. So `yylex' must store the

	5565 proper data in that variable.

	5566

	5567 By default, the value of `yylloc' is a structure and you need only

	5568 initialize the members that are going to be used by the actions. The

	5569 four members are called `first_line', `first_column', `last_line' and

	5570 `last_column'. Note that the use of this feature makes the parser

	5571 noticeably slower.

	5572

	5573 The data type of `yylloc' has the name `YYLTYPE'.
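One way a lexer might maintain the four members is sketched below. The `YYLTYPE' definition is written by hand to match the member names given above (Bison normally supplies it), `update_location` is a hypothetical helper, and the choice to let `last_column' point one past the token's final character is a convention of this sketch only.

```c
#include <assert.h>
#include <stddef.h>

typedef struct
{
  int first_line;
  int first_column;
  int last_line;
  int last_column;
} YYLTYPE;

YYLTYPE yylloc = { 1, 1, 1, 1 };

/* Advance `yylloc' over the LEN characters of TEXT: the new token
   starts where the previous one ended.  */
static void
update_location (char const *text, size_t len)
{
  size_t i;

  assert (text != NULL);
  yylloc.first_line = yylloc.last_line;
  yylloc.first_column = yylloc.last_column;
  for (i = 0; i < len; i++)
    {
      if (text[i] == '\n')
        {
          yylloc.last_line++;       /* A newline starts a fresh line. */
          yylloc.last_column = 1;
        }
      else
        yylloc.last_column++;
    }
}
```

A `yylex' built this way would call `update_location` on the text of each token just before returning the token type.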

	5574

	5575

	5576 File: bison.info, Node: Pure Calling, Prev: Token Locations, Up: Lexical

	5577

	5578 4.6.4 Calling Conventions for Pure Parsers

	5579 ------------------------------------------

	5580

	5581 When you use the Bison declaration `%define api.pure' to request a

	5582 pure, reentrant parser, the global communication variables `yylval' and

	5583 `yylloc' cannot be used. (*Note A Pure (Reentrant) Parser: Pure Decl.)

	5584 In such parsers the two global variables are replaced by pointers

	5585 passed as arguments to `yylex'. You must declare them as shown here,

	5586 and pass the information back by storing it through those pointers.

	5587

	5588 int

	5589 yylex (YYSTYPE *lvalp, YYLTYPE *llocp)

	5590 {

	5591 ...

	5592 *lvalp = value; /* Put value onto Bison stack. */

	5593 return INT; /* Return the type of the token. */

	5594 ...

	5595 }

	5596

	5597 If the grammar file does not use the `@' constructs to refer to

	5598 textual locations, then the type `YYLTYPE' will not be defined. In

	5599 this case, omit the second argument; `yylex' will be called with only

	5600 one argument.

	5601

	5602 If you wish to pass the additional parameter data to `yylex', use

	5603 `%lex-param' just like `%parse-param' (*note Parser Function::).

	5604

	5605 -- Directive: %lex-param {ARGUMENT-DECLARATION}

	5606 Declare that the braced-code ARGUMENT-DECLARATION is an additional

	5607 `yylex' argument declaration.

	5608

	5609 For instance:

	5610

	5611 %parse-param {int *nastiness}

	5612 %lex-param {int *nastiness}

	5613 %parse-param {int *randomness}

	5614

	5615 results in the following signature:

	5616

	5617 int yylex (int *nastiness);

	5618 int yyparse (int *nastiness, int *randomness);

	5619

	5620 If `%define api.pure' is added:

	5621

	5622 int yylex (YYSTYPE *lvalp, int *nastiness);

	5623 int yyparse (int *nastiness, int *randomness);

	5624

	5625 and finally, if both `%define api.pure' and `%locations' are used:

	5626

	5627 int yylex (YYSTYPE *lvalp, YYLTYPE *llocp, int *nastiness);

	5628 int yyparse (int *nastiness, int *randomness);
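To make the pure calling convention concrete, here is a minimal sketch of a `yylex' that takes the semantic value, the location, and one extra `%lex-param' argument through pointers. The `YYSTYPE' and `YYLTYPE' definitions and the value of `INT' are hand-written stand-ins, and the body is a placeholder that always reports the same token.

```c
#include <assert.h>
#include <stddef.h>

#define INT 258                  /* stand-in token code */

typedef int YYSTYPE;             /* stand-in for the generated type */
typedef struct
{
  int first_line;
  int first_column;
  int last_line;
  int last_column;
} YYLTYPE;

int
yylex (YYSTYPE *lvalp, YYLTYPE *llocp, int *nastiness)
{
  assert (lvalp != NULL && llocp != NULL && nastiness != NULL);

  *lvalp = 7;                    /* Semantic value goes through the pointer. */
  llocp->first_line = llocp->last_line = 1;
  llocp->first_column = 1;
  llocp->last_column = 2;
  *nastiness += 1;               /* Caller-owned state, no globals involved. */
  return INT;
}
```

Because every piece of state arrives as an argument, two parsers can call such a function concurrently without interfering with each other.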

	5629

	5630

	5631 File: bison.info, Node: Error Reporting, Next: Action Features, Prev: Lexical, Up: Interface

	5632

	5633 4.7 The Error Reporting Function `yyerror'

	5634 ==========================================

	5635

	5636 The Bison parser detects a "syntax error" or "parse error" whenever it

	5637 reads a token which cannot satisfy any syntax rule. An action in the

	5638 grammar can also explicitly proclaim an error, using the macro

	5639 `YYERROR' (*note Special Features for Use in Actions: Action Features.).

	5640

	5641 The Bison parser expects to report the error by calling an error

	5642 reporting function named `yyerror', which you must supply. It is

	5643 called by `yyparse' whenever a syntax error is found, and it receives

	5644 one argument. For a syntax error, the string is normally

	5645 `"syntax error"'.

	5646

	5647 If you invoke the directive `%error-verbose' in the Bison

	5648 declarations section (*note The Bison Declarations Section: Bison

	5649 Declarations.), then Bison provides a more verbose and specific error

	5650 message string instead of just plain `"syntax error"'.

	5651

	5652 The parser can detect one other kind of error: memory exhaustion.

	5653 This can happen when the input contains constructions that are very

	5654 deeply nested. It isn't likely you will encounter this, since the Bison

	5655 parser normally extends its stack automatically up to a very large

	5656 limit. But if memory is exhausted, `yyparse' calls `yyerror' in the

	5657 usual fashion, except that the argument string is `"memory exhausted"'.

	5658

	5659 In some cases diagnostics like `"syntax error"' are translated

	5660 automatically from English to some other language before they are

	5661 passed to `yyerror'. *Note Internationalization::.

	5662

	5663 The following definition suffices in simple programs:

	5664

	5665 void

	5666 yyerror (char const *s)

	5667 {

	5668 fprintf (stderr, "%s\n", s);

	5669 }

	5670

	5671 After `yyerror' returns to `yyparse', the latter will attempt error

	5672 recovery if you have written suitable error recovery grammar rules

	5673 (*note Error Recovery::). If recovery is impossible, `yyparse' will

	5674 immediately return 1.

	5675

	5676 Obviously, in location tracking pure parsers, `yyerror' should have

	5677 access to the current location. This is indeed the case for the GLR

	5678 parsers, but not for the Yacc parser, for historical reasons. I.e., if

	5679 `%locations %define api.pure' is passed then the prototypes for

	5680 `yyerror' are:

	5681

	5682 void yyerror (char const *msg); /* Yacc parsers. */

	5683 void yyerror (YYLTYPE *locp, char const *msg); /* GLR parsers. */

	5684

	5685 If `%parse-param {int *nastiness}' is used, then:

	5686

	5687 void yyerror (int *nastiness, char const *msg); /* Yacc parsers. */

	5688 void yyerror (int *nastiness, char const *msg); /* GLR parsers. */

	5689

	5690 Finally, GLR and Yacc parsers share the same `yyerror' calling

	5691 convention for absolutely pure parsers, i.e., when both the calling

	5692 convention of `yylex' _and_ the calling convention of `yyparse' are

	5693 pure. For example:

	5694

	5695 /* Location tracking. */

	5696 %locations

	5697 /* Pure yylex. */

	5698 %define api.pure

	5699 %lex-param {int *nastiness}

	5700 /* Pure yyparse. */

	5701 %parse-param {int *nastiness}

	5702 %parse-param {int *randomness}

	5703

	5704 results in the following signatures for all the parser kinds:

	5705

	5706 int yylex (YYSTYPE *lvalp, YYLTYPE *llocp, int *nastiness);

	5707 int yyparse (int *nastiness, int *randomness);

	5708 void yyerror (YYLTYPE *locp,

	5709 int *nastiness, int *randomness,

	5710 char const *msg);

	5711

	5712 The prototypes are only indications of how the code produced by Bison

	5713 uses `yyerror'. Bison-generated code always ignores the returned

	5714 value, so `yyerror' can return any type, including `void'. Also,

	5715 `yyerror' can be a variadic function; that is why the message is always

	5716 passed last.
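A variadic `yyerror' could look like the sketch below, which treats the message as a `printf'-style format. The `last_error` buffer is a hypothetical addition so the formatted text can be inspected; a message that itself contains `%' would be misread as a conversion, and this sketch does not guard against that.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical buffer holding the most recent formatted message. */
static char last_error[128];

void
yyerror (char const *msg, ...)
{
  va_list args;

  assert (msg != NULL);
  va_start (args, msg);
  /* Format into the buffer; overlong messages are truncated. */
  vsnprintf (last_error, sizeof last_error, msg, args);
  va_end (args);
  fprintf (stderr, "%s\n", last_error);
}
```

The parser's own calls pass just the message, while hand-written calls in actions can add arguments, for example `yyerror ("line %d: %s", 3, "unexpected token")`.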

	5717

	5718 Traditionally `yyerror' returns an `int' that is always ignored, but

	5719 this is purely for historical reasons, and `void' is preferable since

	5720 it more accurately describes the return type for `yyerror'.

	5721

	5722 The variable `yynerrs' contains the number of syntax errors reported

	5723 so far. Normally this variable is global; but if you request a pure

	5724 parser (*note A Pure (Reentrant) Parser: Pure Decl.) then it is a

	5725 local variable which only the actions can access.

	5726

	5727

	5728 File: bison.info, Node: Action Features, Next: Internationalization, Prev: Error Reporting, Up: Interface

	5729

	5730 4.8 Special Features for Use in Actions

	5731 =======================================

	5732

	5733 Here is a table of Bison constructs, variables and macros that are

	5734 useful in actions.

	5735

	5736 -- Variable: $$

	5737 Acts like a variable that contains the semantic value for the

	5738 grouping made by the current rule. *Note Actions::.

	5739

	5740 -- Variable: $N

	5741 Acts like a variable that contains the semantic value for the Nth

	5742 component of the current rule. *Note Actions::.

	5743

	5744 -- Variable: $<TYPEALT>$

	5745 Like `$$' but specifies alternative TYPEALT in the union specified

	5746 by the `%union' declaration. *Note Data Types of Values in

	5747 Actions: Action Types.

	5748

	5749 -- Variable: $<TYPEALT>N

	5750 Like `$N' but specifies alternative TYPEALT in the union specified

	5751 by the `%union' declaration. *Note Data Types of Values in

	5752 Actions: Action Types.

	5753

	5754 -- Macro: YYABORT;

	5755 Return immediately from `yyparse', indicating failure. *Note The

	5756 Parser Function `yyparse': Parser Function.

	5757

	5758 -- Macro: YYACCEPT;

	5759 Return immediately from `yyparse', indicating success. *Note The

	5760 Parser Function `yyparse': Parser Function.

	5761

	5762 -- Macro: YYBACKUP (TOKEN, VALUE);

	5763 Unshift a token. This macro is allowed only for rules that reduce

	5764 a single value, and only when there is no lookahead token. It is

	5765 also disallowed in GLR parsers. It installs a lookahead token

	5766 with token type TOKEN and semantic value VALUE; then it discards

	5767 the value that was going to be reduced by this rule.

	5768

	5769 If the macro is used when it is not valid, such as when there is a

	5770 lookahead token already, then it reports a syntax error with a

	5771 message `cannot back up' and performs ordinary error recovery.

	5772

	5773 In either case, the rest of the action is not executed.

	5774

	5775 -- Macro: YYEMPTY

	5776 Value stored in `yychar' when there is no lookahead token.

	5777

	5778 -- Macro: YYEOF

	5779 Value stored in `yychar' when the lookahead is the end of the input

	5780 stream.

	5781

	5782 -- Macro: YYERROR;

	5783 Cause an immediate syntax error. This statement initiates error

	5784 recovery just as if the parser itself had detected an error;

	5785 however, it does not call `yyerror', and does not print any

	5786 message. If you want to print an error message, call `yyerror'

	5787 explicitly before the `YYERROR;' statement. *Note Error

	5788 Recovery::.

	5789

	5790 -- Macro: YYRECOVERING

	5791 The expression `YYRECOVERING ()' yields 1 when the parser is

	5792 recovering from a syntax error, and 0 otherwise. *Note Error

	5793 Recovery::.

	5794

	5795 -- Variable: yychar

	5796 Variable containing either the lookahead token, or `YYEOF' when the

	5797 lookahead is the end of the input stream, or `YYEMPTY' when no

	5798 lookahead has been performed so the next token is not yet known.

	5799 Do not modify `yychar' in a deferred semantic action (*note GLR

	5800 Semantic Actions::). *Note Lookahead Tokens: Lookahead.

	5801

	5802 -- Macro: yyclearin;

	5803 Discard the current lookahead token. This is useful primarily in

	5804 error rules. Do not invoke `yyclearin' in a deferred semantic

	5805 action (*note GLR Semantic Actions::). *Note Error Recovery::.

	5806

	5807 -- Macro: yyerrok;

	5808 Resume generating error messages immediately for subsequent syntax

	5809 errors. This is useful primarily in error rules. *Note Error

	5810 Recovery::.

	5811

	5812 -- Variable: yylloc

	5813 Variable containing the lookahead token location when `yychar' is

	5814 not set to `YYEMPTY' or `YYEOF'. Do not modify `yylloc' in a

	5815 deferred semantic action (*note GLR Semantic Actions::). *Note

	5816 Actions and Locations: Actions and Locations.

	5817

	5818 -- Variable: yylval

	5819 Variable containing the lookahead token semantic value when

	5820 `yychar' is not set to `YYEMPTY' or `YYEOF'. Do not modify

	5821 `yylval' in a deferred semantic action (*note GLR Semantic

	5822 Actions::). *Note Actions: Actions.

	5823

	5824 -- Value: @$

	5825 Acts like a structure variable containing information on the

	5826 textual location of the grouping made by the current rule. *Note

	5827 Tracking Locations: Locations.

	5828

	5829

	5830 -- Value: @N

	5831 Acts like a structure variable containing information on the

	5832 textual location of the Nth component of the current rule. *Note

	5833 Tracking Locations: Locations.

	5834

	5835

	5836 File: bison.info, Node: Internationalization, Prev: Action Features, Up: Interface

	5837

	5838 4.9 Parser Internationalization

	5839 ===============================

	5840

	5841 A Bison-generated parser can print diagnostics, including error and

	5842 tracing messages. By default, they appear in English. However, Bison

	5843 also supports outputting diagnostics in the user's native language. To

	5844 make this work, the user should set the usual environment variables.

	5845 *Note The User's View: (gettext)Users. For example, the shell command

	5846 `export LC_ALL=fr_CA.UTF-8' might set the user's locale to French

	5847 Canadian using the UTF-8 encoding. The exact set of available locales

	5848 depends on the user's installation.

	5849

	5850 The maintainer of a package that uses a Bison-generated parser

	5851 enables the internationalization of the parser's output through the

	5852 following steps. Here we assume a package that uses GNU Autoconf and

	5853 GNU Automake.

	5854

	5855 1. Into the directory containing the GNU Autoconf macros used by the

	5856 package--often called `m4'--copy the `bison-i18n.m4' file

	5857 installed by Bison under `share/aclocal/bison-i18n.m4' in Bison's

	5858 installation directory. For example:

	5859

	5860 cp /usr/local/share/aclocal/bison-i18n.m4 m4/bison-i18n.m4

	5861

	5862 2. In the top-level `configure.ac', after the `AM_GNU_GETTEXT'

	5863 invocation, add an invocation of `BISON_I18N'. This macro is

	5864 defined in the file `bison-i18n.m4' that you copied earlier. It

	5865 causes `configure' to find the value of the `BISON_LOCALEDIR'

	5866 variable, and it defines the source-language symbol `YYENABLE_NLS'

	5867 to enable translations in the Bison-generated parser.

	5868

	5869 3. In the `main' function of your program, designate the directory

	5870 containing Bison's runtime message catalog, through a call to

	5871 `bindtextdomain' with domain name `bison-runtime'. For example:

	5872

	5873 bindtextdomain ("bison-runtime", BISON_LOCALEDIR);

	5874

	5875 Typically this appears after any other call `bindtextdomain

	5876 (PACKAGE, LOCALEDIR)' that your package already has. Here we rely

	5877 on `BISON_LOCALEDIR' to be defined as a string through the

	5878 `Makefile'.

	5879

	5880 4. In the `Makefile.am' that controls the compilation of the `main'

	5881 function, make `BISON_LOCALEDIR' available as a C preprocessor

	5882 macro, either in `DEFS' or in `AM_CPPFLAGS'. For example:

	5883

	5884 DEFS = @DEFS@ -DBISON_LOCALEDIR='"$(BISON_LOCALEDIR)"'

	5885

	5886 or:

	5887

	5888 AM_CPPFLAGS = -DBISON_LOCALEDIR='"$(BISON_LOCALEDIR)"'

	5889

	5890 5. Finally, invoke the command `autoreconf' to generate the build

	5891 infrastructure.

	5892

	5893

	5894 File: bison.info, Node: Algorithm, Next: Error Recovery, Prev: Interface, Up: Top

	5895

	5896 5 The Bison Parser Algorithm

	5897 ****************************

	5898

	5899 As Bison reads tokens, it pushes them onto a stack along with their

	5900 semantic values. The stack is called the "parser stack". Pushing a

	5901 token is traditionally called "shifting".

	5902

	5903 For example, suppose the infix calculator has read `1 + 5 *', with a

	5904 `3' to come. The stack will have four elements, one for each token

	5905 that was shifted.

	5906

	5907 But the stack does not always have an element for each token read.

	5908 When the last N tokens and groupings shifted match the components of a

	5909 grammar rule, they can be combined according to that rule. This is

	5910 called "reduction". Those tokens and groupings are replaced on the

	5911 stack by a single grouping whose symbol is the result (left hand side)

	5912 of that rule. Running the rule's action is part of the process of

	5913 reduction, because this is what computes the semantic value of the

	5914 resulting grouping.

	5915

	5916 For example, if the infix calculator's parser stack contains this:

	5917

	5918 1 + 5 * 3

	5919

	5920 and the next input token is a newline character, then the last three

	5921 elements can be reduced to 15 via the rule:

	5922

	5923 expr: expr '*' expr;

	5924

	5925 Then the stack contains just these three elements:

	5926

	5927 1 + 15

	5928

	5929 At this point, another reduction can be made, resulting in the single

	5930 value 16. Then the newline token can be shifted.

	5931

	5932 The parser tries, by shifts and reductions, to reduce the entire

	5933 input down to a single grouping whose symbol is the grammar's

	5934 start-symbol (*note Languages and Context-Free Grammars: Language and

	5935 Grammar.).

	5936

	5937 This kind of parser is known in the literature as a bottom-up parser.

	5938

	5939 * Menu:

	5940

	5941 * Lookahead:: Parser looks one token ahead when deciding what to do.

	5942 * Shift/Reduce:: Conflicts: when either shifting or reduction is valid.

	5943 * Precedence:: Operator precedence works by resolving conflicts.

	5944 * Contextual Precedence:: When an operator's precedence depends on context.

	5945 * Parser States:: The parser is a finite-state-machine with stack.

	5946 * Reduce/Reduce:: When two rules are applicable in the same situation.

	5947 * Mystery Conflicts:: Reduce/reduce conflicts that look unjustified.

	5948 * Generalized LR Parsing:: Parsing arbitrary context-free grammars.

	5949 * Memory Management:: What happens when memory is exhausted. How to avoid it.

	5950

	5951

	5952 File: bison.info, Node: Lookahead, Next: Shift/Reduce, Up: Algorithm

	5953

	5954 5.1 Lookahead Tokens

	5955 ====================

	5956

	5957 The Bison parser does _not_ always reduce immediately as soon as the

	5958 last N tokens and groupings match a rule. This is because such a

	5959 simple strategy is inadequate to handle most languages. Instead, when a

	5960 reduction is possible, the parser sometimes "looks ahead" at the next

	5961 token in order to decide what to do.

	5962

	5963 When a token is read, it is not immediately shifted; first it

	5964 becomes the "lookahead token", which is not on the stack. Now the

	5965 parser can perform one or more reductions of tokens and groupings on

	5966 the stack, while the lookahead token remains off to the side. When no

	5967 more reductions should take place, the lookahead token is shifted onto

	5968 the stack. This does not mean that all possible reductions have been

	5969 done; depending on the token type of the lookahead token, some rules

	5970 may choose to delay their application.

	5971

	5972 Here is a simple case where lookahead is needed. These three rules

	5973 define expressions which contain binary addition operators and postfix

	5974 unary factorial operators (`!'), and allow parentheses for grouping.

	5975

	5976 expr: term '+' expr

	5977 | term

	5978 ;

	5979

	5980 term: '(' expr ')'

	5981 | term '!'

	5982 | NUMBER

	5983 ;

	5984

	5985 Suppose that the tokens `1 + 2' have been read and shifted; what

	5986 should be done? If the following token is `)', then the first three

	5987 tokens must be reduced to form an `expr'. This is the only valid

	5988 course, because shifting the `)' would produce a sequence of symbols

	5989 `term ')'', and no rule allows this.

	5990

	5991 If the following token is `!', then it must be shifted immediately so

	5992 that `2 !' can be reduced to make a `term'. If instead the parser were

	5993 to reduce before shifting, `1 + 2' would become an `expr'. It would

	5994 then be impossible to shift the `!' because doing so would produce on

	5995 the stack the sequence of symbols `expr '!''. No rule allows that

	5996 sequence.
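
The shift-or-reduce decision described above can be sketched as a small hand-written shift/reduce loop for this three-rule grammar. This is only an illustration under stated assumptions: the names `parse', `next_token' and `item' are hypothetical, the decisions are hard-coded rather than read from generated tables as Bison does, and factorial overflow is not checked.

```c
#include <assert.h>
#include <ctype.h>

/* Grammar symbols.  Single-character tokens use their character code.  */
enum { END = 0, NUMBER = 256, TERM, EXPR };

typedef struct { int sym; long val; } item;

static long
factorial (long n)
{
  long r = 1;
  while (n > 1)
    r *= n--;
  return r;
}

/* Read the next token from *SP; for NUMBER, store its value in *VAL.  */
static int
next_token (char const **sp, long *val)
{
  char const *s = *sp;
  int tok;
  *val = 0;
  while (*s == ' ')
    ++s;
  if (isdigit ((unsigned char) *s))
    {
      while (isdigit ((unsigned char) *s))
        *val = *val * 10 + (*s++ - '0');
      tok = NUMBER;
    }
  else if (*s == '\0')
    tok = END;
  else
    tok = (unsigned char) *s++;
  *sp = s;
  return tok;
}

/* Return 0 and store the value in *RESULT, or -1 on a syntax error.  */
int
parse (char const *s, long *result)
{
  item st[128];                 /* The parser stack.  */
  int n = 0;
  long lval;
  int look;

  assert (s && result);
  look = next_token (&s, &lval);        /* The lookahead token.  */
  for (;;)
    {
      int top = n ? st[n - 1].sym : -1;
      int shift = 0;

      if (top == NUMBER)                /* Reduce term: NUMBER.  */
        st[n - 1].sym = TERM;
      else if (top == '!')              /* Reduce term: term '!'.  */
        {
          --n;
          st[n - 1].val = factorial (st[n - 1].val);
        }
      else if (top == ')')              /* Reduce term: '(' expr ')'.  */
        {
          st[n - 3].sym = TERM;
          st[n - 3].val = st[n - 2].val;
          n -= 2;
        }
      else if (top == TERM)
        {
          /* The lookahead decides: shift '!' or '+', else reduce.  */
          if (look == '!' || look == '+')
            shift = 1;
          else
            st[n - 1].sym = EXPR;       /* Reduce expr: term.  */
        }
      else if (top == EXPR && n >= 3 && st[n - 2].sym == '+')
        {                               /* Reduce expr: term '+' expr.  */
          st[n - 3].sym = EXPR;
          st[n - 3].val += st[n - 1].val;
          n -= 2;
        }
      else if (top == EXPR && look == ')' && n >= 2 && st[n - 2].sym == '(')
        shift = 1;
      else if (top == EXPR && look == END && n == 1)
        {
          *result = st[0].val;          /* Accept.  */
          return 0;
        }
      else if (top == EXPR)
        return -1;
      else if (look == NUMBER || look == '(')
        shift = 1;                      /* Stack empty, or top is '+' or '('.  */
      else
        return -1;

      if (shift)
        {
          if (n == 128)
            return -1;
          st[n].sym = look;
          st[n].val = lval;
          ++n;
          look = next_token (&s, &lval);
        }
    }
}
```

With the input `1 + 2 !', the `!' is shifted while `2' is still a `term' on the stack, so the factorial applies to `2' alone and the result is 3.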

	5997

	5998 The lookahead token is stored in the variable `yychar'. Its

	5999 semantic value and location, if any, are stored in the variables

	6000 `yylval' and `yylloc'. *Note Special Features for Use in Actions:

	6001 Action Features.

	6002

	6003

	6004 File: bison.info, Node: Shift/Reduce, Next: Precedence, Prev: Lookahead, Up: Algorithm

	6005

	6006 5.2 Shift/Reduce Conflicts

	6007 ==========================

	6008

	6009 Suppose we are parsing a language which has if-then and if-then-else

	6010 statements, with a pair of rules like this:

	6011

	6012 if_stmt:

	6013 IF expr THEN stmt

	6014 | IF expr THEN stmt ELSE stmt

	6015 ;

	6016

	6017 Here we assume that `IF', `THEN' and `ELSE' are terminal symbols for

	6018 specific keyword tokens.

	6019

	6020 When the `ELSE' token is read and becomes the lookahead token, the

	6021 contents of the stack (assuming the input is valid) are just right for

	6022 reduction by the first rule. But it is also legitimate to shift the

	6023 `ELSE', because that would lead to eventual reduction by the second

	6024 rule.

	6025

	6026 This situation, where either a shift or a reduction would be valid,

	6027 is called a "shift/reduce conflict". Bison is designed to resolve

	6028 these conflicts by choosing to shift, unless otherwise directed by

	6029 operator precedence declarations. To see the reason for this, let's

	6030 contrast it with the other alternative.

	6031

	6032 Since the parser prefers to shift the `ELSE', the result is to attach

	6033 the else-clause to the innermost if-statement, making these two inputs

	6034 equivalent:

	6035

	6036 if x then if y then win (); else lose;

	6037

	6038 if x then do; if y then win (); else lose; end;

	6039

	6040 But if the parser chose to reduce when possible rather than shift,

	6041 the result would be to attach the else-clause to the outermost

	6042 if-statement, making these two inputs equivalent:

	6043

	6044 if x then if y then win (); else lose;

	6045

	6046 if x then do; if y then win (); end; else lose;

	6047

	6048 The conflict exists because the grammar as written is ambiguous:

	6049 either parsing of the simple nested if-statement is legitimate. The

	6050 established convention is that these ambiguities are resolved by

	6051 attaching the else-clause to the innermost if-statement; this is what

	6052 Bison accomplishes by choosing to shift rather than reduce. (It would

	6053 ideally be cleaner to write an unambiguous grammar, but that is very

	6054 hard to do in this case.) This particular ambiguity was first

	6055 encountered in the specifications of Algol 60 and is called the

	6056 "dangling `else'" ambiguity.

	6057

	6058 To avoid warnings from Bison about predictable, legitimate

	6059 shift/reduce conflicts, use the `%expect N' declaration. There will be

	6060 no warning as long as the number of shift/reduce conflicts is exactly N.

	6061 *Note Suppressing Conflict Warnings: Expect Decl.

	6062

	6063 The definition of `if_stmt' above is solely to blame for the

	6064 conflict, but the conflict does not actually appear without additional

	6065 rules. Here is a complete Bison input file that actually manifests the

	6066 conflict:

	6067

	6068 %token IF THEN ELSE variable

	6069 %%

	6070 stmt: expr

	6071 | if_stmt

	6072 ;

	6073

	6074 if_stmt:

	6075 IF expr THEN stmt

	6076 | IF expr THEN stmt ELSE stmt

	6077 ;

	6078

	6079 expr: variable

	6080 ;

	6081

	6082

	6083 File: bison.info, Node: Precedence, Next: Contextual Precedence, Prev: Shift/Reduce, Up: Algorithm

	6084

	6085 5.3 Operator Precedence

	6086 =======================

	6087

	6088 Another situation where shift/reduce conflicts appear is in arithmetic

	6089 expressions. Here shifting is not always the preferred resolution; the

	6090 Bison declarations for operator precedence allow you to specify when to

	6091 shift and when to reduce.

	6092

	6093 * Menu:

	6094

	6095 * Why Precedence:: An example showing why precedence is needed.

	6096 * Using Precedence:: How to specify precedence in Bison grammars.

	6097 * Precedence Examples:: How these features are used in the previous example.

	6098 * How Precedence:: How they work.

	6099

	6100

	6101 File: bison.info, Node: Why Precedence, Next: Using Precedence, Up: Precedence

	6102

	6103 5.3.1 When Precedence is Needed

	6104 -------------------------------

	6105

	6106 Consider the following ambiguous grammar fragment (ambiguous because the

	6107 input `1 - 2 * 3' can be parsed in two different ways):

	6108

	6109 expr: expr '-' expr

	6110 | expr '*' expr

	6111 | expr '<' expr

	6112 | '(' expr ')'

	6113 ...

	6114 ;

	6115

	6116 Suppose the parser has seen the tokens `1', `-' and `2'; should it

	6117 reduce them via the rule for the subtraction operator? It depends on

	6118 the next token. Of course, if the next token is `)', we must reduce;

	6119 shifting is invalid because no single rule can reduce the token

	6120 sequence `- 2 )' or anything starting with that. But if the next token

	6121 is `*' or `<', we have a choice: either shifting or reduction would

	6122 allow the parse to complete, but with different results.

	6123

	6124 To decide which one Bison should do, we must consider the results.

	6125 If the next operator token OP is shifted, then it must be reduced first

	6126 in order to permit another opportunity to reduce the difference. The

	6127 result is (in effect) `1 - (2 OP 3)'. On the other hand, if the

	6128 subtraction is reduced before shifting OP, the result is

	6129 `(1 - 2) OP 3'. Clearly, then, the choice of shift or reduce should

	6130 depend on the relative precedence of the operators `-' and OP: `*'

	6131 should be shifted first, but not `<'.

	6132

	6133 What about input such as `1 - 2 - 5'; should this be `(1 - 2) - 5'

	6134 or should it be `1 - (2 - 5)'? For most operators we prefer the

	6135 former, which is called "left association". The latter alternative,

	6136 "right association", is desirable for assignment operators. The choice

	6137 of left or right association is a matter of whether the parser chooses

	6138 to shift or reduce when the stack contains `1 - 2' and the lookahead

	6139 token is `-': shifting yields right association, reducing yields left.

	6140

	6141

	6142 File: bison.info, Node: Using Precedence, Next: Precedence Examples, Prev: Why Precedence, Up: Precedence

	6143

	6144 5.3.2 Specifying Operator Precedence

	6145 ------------------------------------

	6146

	6147 Bison allows you to specify these choices with the operator precedence

	6148 declarations `%left' and `%right'. Each such declaration contains a

	6149 list of tokens, which are operators whose precedence and associativity

	6150 are being declared. The `%left' declaration makes all those operators

	6151 left-associative and the `%right' declaration makes them

	6152 right-associative. A third alternative is `%nonassoc', which declares

	6153 that it is a syntax error to find the same operator twice "in a row".

	6154

	6155 The relative precedence of different operators is controlled by the

	6156 order in which they are declared. The first `%left' or `%right'

	6157 declaration in the file declares the operators whose precedence is

	6158 lowest, the next such declaration declares the operators whose

	6159 precedence is a little higher, and so on.

	6160

	6161

	6162 File: bison.info, Node: Precedence Examples, Next: How Precedence, Prev: Using Precedence, Up: Precedence

	6163

	6164 5.3.3 Precedence Examples

	6165 -------------------------

	6166

	6167 In our example, we would want the following declarations:

	6168

	6169 %left '<'

	6170 %left '-'

	6171 %left '*'

	6172

	6173 In a more complete example, which supports other operators as well,

	6174 we would declare them in groups of equal precedence. For example,

	6175 `'+'' is declared with `'-'':

	6176

	6177 %left '<' '>' '=' NE LE GE

	6178 %left '+' '-'

	6179 %left '*' '/'

	6180

	6181 (Here `NE' and so on stand for the operators for "not equal" and so on.

	6182 We assume that these tokens are more than one character long and

	6183 therefore are represented by names, not character literals.)

	6184

	6185

	6186 File: bison.info, Node: How Precedence, Prev: Precedence Examples, Up: Precedence

	6187

	6188 5.3.4 How Precedence Works

	6189 --------------------------

	6190

	6191 The first effect of the precedence declarations is to assign precedence

	6192 levels to the terminal symbols declared. The second effect is to assign

	6193 precedence levels to certain rules: each rule gets its precedence from

	6194 the last terminal symbol mentioned in the components. (You can also

	6195 specify explicitly the precedence of a rule. *Note Context-Dependent

	6196 Precedence: Contextual Precedence.)

	6197

	6198 Finally, the resolution of conflicts works by comparing the

	6199 precedence of the rule being considered with that of the lookahead

	6200 token. If the token's precedence is higher, the choice is to shift.

	6201 If the rule's precedence is higher, the choice is to reduce. If they

	6202 have equal precedence, the choice is made based on the associativity of

	6203 that precedence level. The verbose output file made by `-v' (*note

	6204 Invoking Bison: Invocation.) says how each conflict was resolved.

	6205

	6206 Not all rules and not all tokens have precedence. If either the

	6207 rule or the lookahead token has no precedence, then the default is to

	6208 shift.
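
The comparison rule above can be sketched in C as a hand-written operator-precedence loop. This is not Bison's generated code, only an illustration of the same decision: the operator on top of the stack stands for the rule being considered, and the next operator is the lookahead token. The function names and the `^' operator are assumptions made for the example; the table corresponds to the hypothetical declarations `%left '-'', `%left '*'', `%right '^''. Input is assumed to be well formed.

```c
#include <assert.h>
#include <ctype.h>

enum assoc { LEFT, RIGHT };

/* Precedence levels: a later declaration means a higher level.  */
static int
prec (int op)
{
  switch (op)
    {
    case '-': return 1;
    case '*': return 2;
    case '^': return 3;
    default:  return 0;         /* End of input: lowest.  */
    }
}

static enum assoc
assoc_of (int op)
{
  return op == '^' ? RIGHT : LEFT;
}

static long
apply (int op, long a, long b)
{
  long r = 1;
  switch (op)
    {
    case '-': return a - b;
    case '*': return a * b;
    default:                    /* '^': integer power.  */
      while (b-- > 0)
        r *= a;
      return r;
    }
}

long
eval (char const *s)
{
  long vals[64];                /* Operand stack.  */
  int ops[64];                  /* Operator stack.  */
  int nv = 0, no = 0;

  assert (s);
  for (;;)
    {
      int look;
      while (*s == ' ')
        ++s;
      if (isdigit ((unsigned char) *s))
        {
          long v = 0;
          while (isdigit ((unsigned char) *s))
            v = v * 10 + (*s++ - '0');
          assert (nv < 64);
          vals[nv++] = v;       /* Shift a number.  */
          continue;
        }
      look = (unsigned char) *s;
      assert (look == '\0' || prec (look) > 0);
      /* Reduce while the rule's precedence is higher than the
         lookahead's, or equal and the level is left-associative.  */
      while (no > 0
             && (prec (ops[no - 1]) > prec (look)
                 || (prec (ops[no - 1]) == prec (look)
                     && assoc_of (look) == LEFT)))
        {
          long a, b;
          assert (nv >= 2);
          b = vals[--nv];
          a = vals[--nv];
          vals[nv++] = apply (ops[--no], a, b);
        }
      if (look == '\0')
        break;
      assert (no < 64);
      ops[no++] = look;         /* Otherwise shift the operator.  */
      ++s;
    }
  assert (nv == 1);
  return vals[0];
}
```

For instance, `eval ("1 - 2 - 5")' reduces the first subtraction before shifting the second `-', giving `(1 - 2) - 5', while `eval ("2 ^ 3 ^ 2")' shifts and so groups to the right.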

	6209

	6210

	6211 File: bison.info, Node: Contextual Precedence, Next: Parser States, Prev: Precedence, Up: Algorithm

	6212

	6213 5.4 Context-Dependent Precedence

	6214 ================================

	6215

	6216 Often the precedence of an operator depends on the context. This sounds

	6217 outlandish at first, but it is really very common. For example, a minus

	6218 sign typically has a very high precedence as a unary operator, and a

	6219 somewhat lower precedence (lower than multiplication) as a binary

	6220 operator.

	6221

	6222 The Bison precedence declarations, `%left', `%right' and

	6223 `%nonassoc', can only be used once for a given token; so a token has

	6224 only one precedence declared in this way. For context-dependent

	6225 precedence, you need to use an additional mechanism: the `%prec'

	6226 modifier for rules.

	6227

	6228 The `%prec' modifier declares the precedence of a particular rule by

	6229 specifying a terminal symbol whose precedence should be used for that

	6230 rule. It's not necessary for that symbol to appear otherwise in the

	6231 rule. The modifier's syntax is:

	6232

	6233 %prec TERMINAL-SYMBOL

	6234

	6235 and it is written after the components of the rule. Its effect is to

	6236 assign the rule the precedence of TERMINAL-SYMBOL, overriding the

	6237 precedence that would be deduced for it in the ordinary way. The

	6238 altered rule precedence then affects how conflicts involving that rule

	6239 are resolved (*note Operator Precedence: Precedence.).

	6240

	6241 Here is how `%prec' solves the problem of unary minus. First,

	6242 declare a precedence for a fictitious terminal symbol named `UMINUS'.

	6243 There are no tokens of this type, but the symbol serves to stand for its

	6244 precedence:

	6245

	6246 ...

	6247 %left '+' '-'

	6248 %left '*'

	6249 %left UMINUS

	6250

	6251 Now the precedence of `UMINUS' can be used in specific rules:

	6252

	6253 exp: ...

	6254 | exp '-' exp

	6255 ...

	6256 | '-' exp %prec UMINUS

	6257

	6258

	6259 File: bison.info, Node: Parser States, Next: Reduce/Reduce, Prev: Contextual Precedence, Up: Algorithm

	6260

	6261 5.5 Parser States

	6262 =================

	6263

	6264 The function `yyparse' is implemented using a finite-state machine.

	6265 The values pushed on the parser stack are not simply token type codes;

	6266 they represent the entire sequence of terminal and nonterminal symbols

	6267 at or near the top of the stack. The current state collects all the

	6268 information about previous input which is relevant to deciding what to

	6269 do next.

	6270

	6271 Each time a lookahead token is read, the current parser state

	6272 together with the type of lookahead token are looked up in a table.

	6273 This table entry can say, "Shift the lookahead token." In this case,

	6274 it also specifies the new parser state, which is pushed onto the top of

	6275 the parser stack. Or it can say, "Reduce using rule number N." This

	6276 means that a certain number of tokens or groupings are taken off the

	6277 top of the stack, and replaced by one grouping. In other words, that

	6278 number of states are popped from the stack, and one new state is pushed.

	6279

	6280 There is one other alternative: the table can say that the lookahead

	6281 token is erroneous in the current state. This causes error processing

	6282 to begin (*note Error Recovery::).
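
The table lookup described above can be sketched with a tiny table-driven recognizer. Everything here is an assumption made for illustration: the toy grammar `s: '(' s ')' | 'x'', the hand-built tables, the encoding of actions (positive means shift to that state, negative means reduce by that rule, zero means error) and the name `accepts'. The tables Bison generates are laid out differently and are compressed.

```c
#include <assert.h>

/* Toy grammar (assumption):
     rule 1:  s: '(' s ')'
     rule 2:  s: 'x'  */
enum { T_LPAREN, T_RPAREN, T_X, T_END, N_TOKENS };
enum { ACCEPT = 100, N_STATES = 6 };

/* action[state][lookahead]: >0 shift to that state, <0 reduce by
   rule -N, 0 error, ACCEPT accept.  */
static int const action[N_STATES][N_TOKENS] = {
  /* 0 */ { 1,  0, 2,  0 },
  /* 1 */ { 1,  0, 2,  0 },
  /* 2 */ { 0, -2, 0, -2 },
  /* 3 */ { 0,  0, 0, ACCEPT },
  /* 4 */ { 0,  5, 0,  0 },
  /* 5 */ { 0, -1, 0, -1 },
};

/* State to enter after reducing to `s', indexed by the uncovered state.  */
static int const goto_s[N_STATES] = { 3, 4, 0, 0, 0, 0 };

/* Number of right-hand side symbols of each rule (index 0 unused).  */
static int const rule_len[] = { 0, 3, 1 };

static int
token_of (char c)
{
  switch (c)
    {
    case '(':  return T_LPAREN;
    case ')':  return T_RPAREN;
    case 'x':  return T_X;
    case '\0': return T_END;
    default:   return -1;
    }
}

/* Return 1 if S is a sentence of the toy grammar, 0 otherwise.  */
int
accepts (char const *s)
{
  int stack[64];                /* Stack of parser states.  */
  int top = 0;

  assert (s);
  stack[0] = 0;
  for (;;)
    {
      int look = token_of (*s);
      int act;
      if (look < 0)
        return 0;
      act = action[stack[top]][look];
      if (act == ACCEPT)
        return 1;
      if (act == 0)
        return 0;               /* Lookahead is erroneous in this state.  */
      if (act > 0)
        {
          if (top + 1 == 64)
            return 0;
          stack[++top] = act;   /* Shift: push the new state.  */
          ++s;
        }
      else
        {
          top -= rule_len[-act];                /* Pop one state per symbol.  */
          stack[top + 1] = goto_s[stack[top]];  /* Push one new state.  */
          ++top;
        }
    }
}
```

Note that the stack holds only state numbers; each state already summarizes the symbols beneath it, which is why no symbol stack is needed to decide the next action.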

	6283

	6284

	6285 File: bison.info, Node: Reduce/Reduce, Next: Mystery Conflicts, Prev: Parser States, Up: Algorithm

	6286

	6287 5.6 Reduce/Reduce Conflicts

	6288 ===========================

	6289

	6290 A reduce/reduce conflict occurs if there are two or more rules that

	6291 apply to the same sequence of input. This usually indicates a serious

	6292 error in the grammar.

	6293

	6294 For example, here is an erroneous attempt to define a sequence of

	6295 zero or more `word' groupings.

	6296

	6297 sequence: /* empty */

	6298 { printf ("empty sequence\n"); }

	6299 | maybeword

	6300 | sequence word

	6301 { printf ("added word %s\n", $2); }

	6302 ;

	6303

	6304 maybeword: /* empty */

	6305 { printf ("empty maybeword\n"); }

	6306 | word

	6307 { printf ("single word %s\n", $1); }

	6308 ;

	6309

	6310 The error is an ambiguity: there is more than one way to parse a single

	6311 `word' into a `sequence'. It could be reduced to a `maybeword' and

	6312 then into a `sequence' via the second rule. Alternatively,

	6313 nothing-at-all could be reduced into a `sequence' via the first rule,

	6314 and this could be combined with the `word' using the third rule for

	6315 `sequence'.

	6316

	6317 There is also more than one way to reduce nothing-at-all into a

	6318 `sequence'. This can be done directly via the first rule, or

	6319 indirectly via `maybeword' and then the second rule.

	6320

	6321 You might think that this is a distinction without a difference,

	6322 because it does not change whether any particular input is valid or

	6323 not. But it does affect which actions are run. One parsing order runs

	6324 the second rule's action; the other runs the first rule's action and

	6325 the third rule's action. In this example, the output of the program

	6326 changes.

	6327

	6328 Bison resolves a reduce/reduce conflict by choosing to use the rule

	6329 that appears first in the grammar, but it is very risky to rely on

	6330 this. Every reduce/reduce conflict must be studied and usually

	6331 eliminated. Here is the proper way to define `sequence':

	6332

	6333 sequence: /* empty */

	6334 { printf ("empty sequence\n"); }

	6335 | sequence word

	6336 { printf ("added word %s\n", $2); }

	6337 ;

	6338

	6339 Here is another common error that yields a reduce/reduce conflict:

	6340

	6341 sequence: /* empty */

	6342 | sequence words

	6343 | sequence redirects

	6344 ;

	6345

	6346 words: /* empty */

	6347 | words word

	6348 ;

	6349

	6350 redirects:/* empty */

	6351 | redirects redirect

	6352 ;

	6353

	6354 The intention here is to define a sequence which can contain either

	6355 `word' or `redirect' groupings. The individual definitions of

	6356 `sequence', `words' and `redirects' are error-free, but the three

	6357 together make a subtle ambiguity: even an empty input can be parsed in

	6358 infinitely many ways!

	6359

	6360 Consider: nothing-at-all could be a `words'. Or it could be two

	6361 `words' in a row, or three, or any number. It could equally well be a

	6362 `redirects', or two, or any number. Or it could be a `words' followed

	6363 by three `redirects' and another `words'. And so on.

	6364

	6365 Here are two ways to correct these rules. First, to make it a

	6366 single level of sequence:

	6367

	6368 sequence: /* empty */

	6369 | sequence word

	6370 | sequence redirect

	6371 ;

	6372

	6373 Second, to prevent either a `words' or a `redirects' from being

	6374 empty:

	6375

	6376 sequence: /* empty */

	6377 | sequence words

	6378 | sequence redirects

	6379 ;

	6380

	6381 words: word

	6382 | words word

	6383 ;

	6384

	6385 redirects:redirect

	6386 | redirects redirect

	6387 ;

	6388

	6389

	6390 File: bison.info, Node: Mystery Conflicts, Next: Generalized LR Parsing, Prev: Reduce/Reduce, Up: Algorithm

	6391

	6392 5.7 Mysterious Reduce/Reduce Conflicts

	6393 ======================================

	6394

	6395 Sometimes reduce/reduce conflicts can occur that don't look warranted.

	6396 Here is an example:

	6397

	6398 %token ID

	6399

	6400 %%

	6401 def: param_spec return_spec ','

	6402 ;

	6403 param_spec:

	6404 type

	6405 | name_list ':' type

	6406 ;

	6407 return_spec:

	6408 type

	6409 | name ':' type

	6410 ;

	6411 type: ID

	6412 ;

	6413 name: ID

	6414 ;

	6415 name_list:

	6416 name

	6417 | name ',' name_list

	6418 ;

	6419

	6420 It would seem that this grammar can be parsed with only a single

	6421 token of lookahead: when a `param_spec' is being read, an `ID' is a

	6422 `name' if a comma or colon follows, or a `type' if another `ID'

	6423 follows. In other words, this grammar is LR(1).

	6424

	6425 However, Bison, like most parser generators, cannot actually handle

	6426 all LR(1) grammars. In this grammar, two contexts, that after an `ID'

	6427 at the beginning of a `param_spec' and likewise at the beginning of a

	6428 `return_spec', are similar enough that Bison assumes they are the same.

	6429 They appear similar because the same set of rules would be active--the

	6430 rule for reducing to a `name' and that for reducing to a `type'. Bison

	6431 is unable to determine at that stage of processing that the rules would

	6432 require different lookahead tokens in the two contexts, so it makes a

	6433 single parser state for them both. Combining the two contexts causes a

	6434 conflict later. In parser terminology, this occurrence means that the

	6435 grammar is not LALR(1).

	6436

	6437 In general, it is better to fix deficiencies than to document them.

	6438 But this particular deficiency is intrinsically hard to fix; parser

	6439 generators that can handle LR(1) grammars are hard to write and tend to

	6440 produce parsers that are very large. In practice, Bison is more useful

	6441 as it is now.

	6442

	6443 When the problem arises, you can often fix it by identifying the two

	6444 parser states that are being confused, and adding something to make them

	6445 look distinct. In the above example, adding one rule to `return_spec'

	6446 as follows makes the problem go away:

	6447

	6448 %token BOGUS

	6449 ...

	6450 %%

	6451 ...

	6452 return_spec:

	6453 type

	6454 | name ':' type

	6455 /* This rule is never used. */

	6456 | ID BOGUS

	6457 ;

	6458

	6459 This corrects the problem because it introduces the possibility of an

	6460 additional active rule in the context after the `ID' at the beginning of

	6461 `return_spec'. This rule is not active in the corresponding context in

	6462 a `param_spec', so the two contexts receive distinct parser states. As

	6463 long as the token `BOGUS' is never generated by `yylex', the added rule

	6464 cannot alter the way actual input is parsed.

	6465

	6466 In this particular example, there is another way to solve the

	6467 problem: rewrite the rule for `return_spec' to use `ID' directly

	6468 instead of via `name'. This also causes the two confusing contexts to

	6469 have different sets of active rules, because the one for `return_spec'

	6470 activates the altered rule for `return_spec' rather than the one for

	6471 `name'.

	6472

	6473 param_spec:

	6474         type

	6475         | name_list ':' type

	6476         ;

	6477 return_spec:

	6478         type

	6479         | ID ':' type

	6480         ;

	6481

	6482 For a more detailed exposition of LALR(1) parsers and parser

	6483 generators, please see: Frank DeRemer and Thomas Pennello, Efficient

	6484 Computation of LALR(1) Look-Ahead Sets, `ACM Transactions on

	6485 Programming Languages and Systems', Vol. 4, No. 4 (October 1982), pp.

	6486 615-649 `http://doi.acm.org/10.1145/69622.357187'.

	6487

	6488

	6489 File: bison.info, Node: Generalized LR Parsing, Next: Memory Management, Prev: Mystery Conflicts, Up: Algorithm

	6490

	6491 5.8 Generalized LR (GLR) Parsing

	6492 ================================

	6493

	6494 Bison produces _deterministic_ parsers that choose uniquely when to

	6495 reduce and which reduction to apply based on a summary of the preceding

	6496 input and on one extra token of lookahead. As a result, normal Bison

	6497 handles a proper subset of the family of context-free languages.

	6498 Ambiguous grammars, since they have strings with more than one possible

	6499 sequence of reductions, cannot have deterministic parsers in this sense.

	6500 The same is true of languages that require more than one symbol of

	6501 lookahead, since the parser lacks the information necessary to make a

	6502 decision at the point it must be made in a shift-reduce parser.

	6503 Finally, as previously mentioned (*note Mystery Conflicts::), there are

	6504 languages where Bison's particular choice of how to summarize the input

	6505 seen so far loses necessary information.

	6506

	6507 When you use the `%glr-parser' declaration in your grammar file,

	6508 Bison generates a parser that uses a different algorithm, called

	6509 Generalized LR (or GLR). A Bison GLR parser uses the same basic

	6510 algorithm for parsing as an ordinary Bison parser, but behaves

	6511 differently in cases where there is a shift-reduce conflict that has not

	6512 been resolved by precedence rules (*note Precedence::) or a

	6513 reduce-reduce conflict. When a GLR parser encounters such a situation,

	6514 it effectively _splits_ into several parsers, one for each possible

	6515 shift or reduction. These parsers then proceed as usual, consuming

	6516 tokens in lock-step. Some of the stacks may encounter other conflicts

	6517 and split further, with the result that instead of a sequence of states,

	6518 a Bison GLR parsing stack is what is in effect a tree of states.

	6519

	6520 In effect, each stack represents a guess as to what the proper parse

	6521 is. Additional input may indicate that a guess was wrong, in which case

	6522 the appropriate stack silently disappears. Otherwise, the semantic

	6523 actions generated in each stack are saved, rather than being executed

	6524 immediately. When a stack disappears, its saved semantic actions never

	6525 get executed. When a reduction causes two stacks to become equivalent,

	6526 their sets of semantic actions are both saved with the state that

	6527 results from the reduction. We say that two stacks are equivalent when

	6528 they both represent the same sequence of states, and each pair of

	6529 corresponding states represents a grammar symbol that produces the same

	6530 segment of the input token stream.

	6531

	6532 Whenever the parser makes a transition from having multiple states

	6533 to having one, it reverts to the normal LALR(1) parsing algorithm,

	6534 after resolving and executing the saved-up actions. At this

	6535 transition, some of the states on the stack will have semantic values

	6536 that are sets (actually multisets) of possible actions. The parser

	6537 tries to pick one of the actions by first finding one whose rule has

	6538 the highest dynamic precedence, as set by the `%dprec' declaration.

	6539 Otherwise, if the alternative actions are not ordered by precedence,

	6540 but the same merging function is declared for both rules by the

	6541 `%merge' declaration, Bison resolves and evaluates both and then calls

	6542 the merge function on the result. Otherwise, it reports an ambiguity.
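
The resolution order just described can be sketched in C. This is only a
model under stated assumptions: the `YYSTYPE' below is a stand-in for the
type Bison would generate, and `count_merge' and `pick_by_dprec' are
hypothetical names. The two-argument shape of the merge function follows
the form a `%merge' function is expected to have; the tie handling in
`pick_by_dprec' is a simplification, not Bison's actual code.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the Bison-generated semantic value type (assumption). */
typedef struct {
    int alternatives;   /* how many parses this value summarizes */
} YYSTYPE;

/* Hypothetical merge function, as could be named by `%merge <count_merge>'.
   It receives the values of two ambiguous reductions and returns the
   single value that replaces them. */
static YYSTYPE count_merge(YYSTYPE x0, YYSTYPE x1)
{
    YYSTYPE merged;
    merged.alternatives = x0.alternatives + x1.alternatives;
    return merged;
}

/* Model of the `%dprec' choice: return the index of the alternative with
   the unique highest dynamic precedence, or -1 when the highest value is
   tied, in which case a merge function (or an ambiguity report) is the
   only way forward. */
static int pick_by_dprec(const int *dprec, size_t n)
{
    assert(n > 0);
    size_t best = 0;
    int tied = 0;
    for (size_t i = 1; i < n; i++) {
        if (dprec[i] > dprec[best]) {
            best = i;
            tied = 0;
        } else if (dprec[i] == dprec[best]) {
            tied = 1;
        }
    }
    return tied ? -1 : (int) best;
}
```

In this model a caller would try `pick_by_dprec' first and fall back to
`count_merge' only when it returns -1.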

	6543

	6544 It is possible to use a data structure for the GLR parsing tree that

	6545 permits the processing of any LALR(1) grammar in linear time (in the

	6546 size of the input), any unambiguous (not necessarily LALR(1)) grammar in

	6547 quadratic worst-case time, and any general (possibly ambiguous)

	6548 context-free grammar in cubic worst-case time. However, Bison currently

	6549 uses a simpler data structure that requires time proportional to the

	6550 length of the input times the maximum number of stacks required for any

	6551 prefix of the input. Thus, really ambiguous or nondeterministic

	6552 grammars can require exponential time and space to process. Such badly

	6553 behaving examples, however, are not generally of practical interest.

	6554 Usually, nondeterminism in a grammar is local--the parser is "in doubt"

	6555 only for a few tokens at a time. Therefore, the current data structure

	6556 should generally be adequate. On LALR(1) portions of a grammar, in

	6557 particular, it is only slightly slower than with the default Bison

	6558 parser.

	6559

	6560 For a more detailed exposition of GLR parsers, please see: Elizabeth

	6561 Scott, Adrian Johnstone and Shamsa Sadaf Hussain, Tomita-Style

	6562 Generalised LR Parsers, Royal Holloway, University of London,

	6563 Department of Computer Science, TR-00-12,

	6564 `http://www.cs.rhul.ac.uk/research/languages/publications/tomita_style_1.ps',

	6565 (2000-12-24).

	6566

	6567

	6568 File: bison.info, Node: Memory Management, Prev: Generalized LR Parsing, Up: Algorithm

	6569

	6570 5.9 Memory Management, and How to Avoid Memory Exhaustion

	6571 =========================================================

	6572

	6573 The Bison parser stack can run out of memory if too many tokens are

	6574 shifted and not reduced. When this happens, the parser function

	6575 `yyparse' calls `yyerror' and then returns 2.
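
A caller can tell memory exhaustion apart from an ordinary syntax error by
inspecting the value returned by `yyparse'. The helper below is a minimal
sketch; `describe_parse_result' is a hypothetical name, and the strings
are illustrative only.

```c
/* Map the return value of `yyparse' to a short description:
   0 means success, 1 means the parse failed on invalid input,
   and 2 means the parser ran out of memory. */
static const char *describe_parse_result(int status)
{
    switch (status) {
    case 0:  return "success";
    case 1:  return "syntax error";
    case 2:  return "memory exhausted";
    default: return "unexpected status";
    }
}
```

A driver might print `describe_parse_result (yyparse ())' and, on status
2, suggest rewriting right-recursive rules or raising `YYMAXDEPTH'.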

	6576

	6577 Because Bison parsers have growing stacks, hitting the upper limit

	6578 usually results from using a right recursion instead of a left

	6579 recursion; see *Note Recursive Rules: Recursion.

	6580

	6581 By defining the macro `YYMAXDEPTH', you can control how deep the

	6582 parser stack can become before memory is exhausted. Define the macro

	6583 with a value that is an integer. This value is the maximum number of

	6584 tokens that can be shifted (and not reduced) before overflow.

	6585

	6586 The stack space allowed is not necessarily allocated. If you

	6587 specify a large value for `YYMAXDEPTH', the parser normally allocates a

	6588 small stack at first, and then makes it bigger by stages as needed.

	6589 This increasing allocation happens automatically and silently.

	6590 Therefore, you do not need to make `YYMAXDEPTH' painfully small merely

	6591 to save space for ordinary inputs that do not need much stack.

	6592

	6593 However, do not allow `YYMAXDEPTH' to be a value so large that

	6594 arithmetic overflow could occur when calculating the size of the stack

	6595 space. Also, do not allow `YYMAXDEPTH' to be less than `YYINITDEPTH'.

	6596

	6597 The default value of `YYMAXDEPTH', if you do not define it, is 10000.

	6598

	6599 You can control how much stack is allocated initially by defining the

	6600 macro `YYINITDEPTH' to a positive integer. For the C LALR(1) parser,

	6601 this value must be a compile-time constant unless you are assuming C99

	6602 or some other target language or compiler that allows variable-length

	6603 arrays. The default is 200.

	6604

	6605 Do not allow `YYINITDEPTH' to be greater than `YYMAXDEPTH'.
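
The relationship between the two macros can be sketched as follows. The
default values are the ones stated above; the compile-time check encodes
the rule that `YYINITDEPTH' must not exceed `YYMAXDEPTH'. The function
`next_stack_size' is hypothetical, and growth by doubling is an
assumption made for illustration rather than a statement about the
generated parser.

```c
#include <stddef.h>

#ifndef YYINITDEPTH
# define YYINITDEPTH 200      /* default initial depth */
#endif
#ifndef YYMAXDEPTH
# define YYMAXDEPTH 10000     /* default maximum depth */
#endif

#if YYINITDEPTH > YYMAXDEPTH
# error "YYINITDEPTH must not be greater than YYMAXDEPTH"
#endif

/* Return the capacity to use after the stack of size `current' fills up,
   or 0 if the stack is already at YYMAXDEPTH, in which case the parser
   would have to report memory exhaustion. */
static size_t next_stack_size(size_t current)
{
    const size_t max = YYMAXDEPTH;
    if (current >= max)
        return 0;
    size_t grown = current * 2;          /* assumed growth policy */
    return grown > max ? max : grown;
}
```

With the defaults, this model grows the stack from 200 to 400, 800, and
so on, clamps the last step to 10000, and then refuses to grow further.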

	6606

	6607 Because of semantic differences between C and C++, the stacks of LALR(1)

	6608 parsers in C produced by Bison cannot grow when compiled by C++

	6609 compilers. In this precise case (compiling a C parser as C++), you

	6610 should increase `YYINITDEPTH'. The Bison maintainers hope to fix

	6611 this deficiency in a future release.

	6612

	6613

	6614 File: bison.info, Node: Error Recovery, Next: Context Dependency, Prev: Algorithm, Up: Top

	6615

	6616 6 Error Recovery

	6617 ****************

	6618

	6619 It is not usually acceptable to have a program terminate on a syntax

	6620 error. For example, a compiler should recover sufficiently to parse the

	6621 rest of the input file and check it for errors; a calculator should

	6622 accept another expression.

	6623

	6624 In a simple interactive command parser where each input is one line,

	6625 it may be sufficient to allow `yyparse' to return 1 on error and have

	6626 the caller ignore the rest of the input line when that happens (and

	6627 then call `yyparse' again). But this is inadequate for a compiler,

	6628 because it forgets all the syntactic context leading up to the error.

	6629 A syntax error deep within a function in the compiler input should not

	6630 cause the compiler to treat the following line like the beginning of a

	6631 source file.

	6632

	6633 You can define how to recover from a syntax error by writing rules to

	6634 recognize the special token `error'. This is a terminal symbol that is

	6635 always defined (you need not declare it) and reserved for error

	6636 handling. The Bison parser generates an `error' token whenever a

	6637 syntax error happens; if you have provided a rule to recognize this

	6638 token in the current context, the parse can continue.

	6639

	6640 For example:

	6641

	6642 stmnts: /* empty string */

	6643         | stmnts '\n'

	6644         | stmnts exp '\n'

	6645         | stmnts error '\n'

	6646

	6647 The fourth rule in this example says that an error followed by a

	6648 newline makes a valid addition to any `stmnts'.

	6649

	6650 What happens if a syntax error occurs in the middle of an `exp'? The

	6651 error recovery rule, interpreted strictly, applies to the precise

	6652 sequence of a `stmnts', an `error' and a newline. If an error occurs in

	6653 the middle of an `exp', there will probably be some additional tokens

	6654 and subexpressions on the stack after the last `stmnts', and there will

	6655 be tokens to read before the next newline. So the rule is not

	6656 applicable in the ordinary way.

	6657

	6658 But Bison can force the situation to fit the rule, by discarding

	6659 part of the semantic context and part of the input. First it discards

	6660 states and objects from the stack until it gets back to a state in

	6661 which the `error' token is acceptable. (This means that the

	6662 subexpressions already parsed are discarded, back to the last complete

	6663 `stmnts'.) At this point the `error' token can be shifted. Then, if

	6664 the old lookahead token is not acceptable to be shifted next, the

	6665 parser reads tokens and discards them until it finds a token which is

	6666 acceptable. In this example, Bison reads and discards input until the

	6667 next newline so that the fourth rule can apply. Note that discarded

	6668 symbols are possible sources of memory leaks; see *Note Freeing

	6669 Discarded Symbols: Destructor Decl, for a means to reclaim this memory.
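
The two discarding phases described above can be modeled in a few lines
of C. This is a toy simulation, not the generated parser: the `recovery'
structure, the `recover' function, and the `can_shift_error' array are
all hypothetical, and real states and tokens are reduced to plain
integers.

```c
#include <stddef.h>

/* Hypothetical result of one recovery attempt. */
struct recovery {
    size_t stack_depth;   /* states kept on the stack */
    size_t next_token;    /* index of the first token not discarded */
    int    recovered;     /* 1 if both phases succeeded */
};

/* Phase 1: pop states until one can shift `error'.
   Phase 2: discard input tokens until one is acceptable after `error'.
   `can_shift_error[i]' says whether the state at depth i + 1 accepts the
   `error' token; `sync' is the token that may follow it (the newline in
   the `stmnts' example). */
static struct recovery recover(const int *can_shift_error, size_t depth,
                               const int *tokens, size_t ntokens,
                               size_t pos, int sync)
{
    struct recovery r = { depth, pos, 0 };

    while (r.stack_depth > 0 && !can_shift_error[r.stack_depth - 1])
        r.stack_depth--;                 /* discard a state and its value */
    if (r.stack_depth == 0)
        return r;                        /* no state accepts `error' */

    while (r.next_token < ntokens && tokens[r.next_token] != sync)
        r.next_token++;                  /* discard an input token */
    r.recovered = r.next_token < ntokens;
    return r;
}
```

Every state popped in phase 1 and every token skipped in phase 2 carries
a semantic value, which is why discarded symbols can leak memory unless
they are reclaimed.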

	6670

	6671 The choice of error rules in the grammar is a choice of strategies

	6672 for error recovery. A simple and useful strategy is simply to skip the

	6673 rest of the current input line or current statement if an error is

	6674 detected:

	6675

	6676 stmnt: error ';' /* On error, skip until ';' is read. */

	6677

	6678 It is also useful to recover to the matching close-delimiter of an

	6679 opening-delimiter that has already been parsed. Otherwise the

	6680 close-delimiter will probably appear to be unmatched, and generate

	6681 another, spurious error message:

	6682

	6683 primary: '(' expr ')'

	6684         | '(' error ')'

	6685 ...

	6686 ;

	6687

	6688 Error recovery strategies are necessarily guesses. When they guess

	6689 wrong, one syntax error often leads to another. In the above example,

	6690 the error recovery rule guesses that an error is due to bad input

	6691 within one `stmnt'. Suppose that instead a spurious semicolon is

	6692 inserted in the middle of a valid `stmnt'. After the error recovery

	6693 rule recovers from the first error, another syntax error will be found

	6694 straightaway, since the text following the spurious semicolon is also

	6695 an invalid `stmnt'.

	6696

	6697 To prevent an outpouring of error messages, the parser will output

	6698 no error message for another syntax error that happens shortly after

	6699 the first; only after three consecutive input tokens have been

	6700 successfully shifted will error messages resume.

	6701

	6702 Note that rules which accept the `error' token may have actions, just

	6703 as any other rules can.

	6704

	6705 You can make error messages resume immediately by using the macro

	6706 `yyerrok' in an action. If you do this in the error rule's action, no

	6707 error messages will be suppressed. This macro requires no arguments;

	6708 `yyerrok;' is a valid C statement.

	6709

	6710 The previous lookahead token is reanalyzed immediately after an

	6711 error. If this is unacceptable, then the macro `yyclearin' may be used

	6712 to clear this token. Write the statement `yyclearin;' in the error

	6713 rule's action. *Note Special Features for Use in Actions: Action

	6714 Features.

	6715

	6716 For example, suppose that on a syntax error, an error handling

	6717 routine is called that advances the input stream to some point where

	6718 parsing should once again commence. The next symbol returned by the

	6719 lexical scanner is probably correct. The previous lookahead token

	6720 ought to be discarded with `yyclearin;'.

	6721

	6722 The expression `YYRECOVERING ()' yields 1 when the parser is

	6723 recovering from a syntax error, and 0 otherwise. Syntax error

	6724 diagnostics are suppressed while recovering from a syntax error.
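
The interplay of message suppression, `yyerrok' and `YYRECOVERING ()'
can be modeled with a single counter. The sketch below is a simplified
model with hypothetical names (`error_state', `on_syntax_error',
`on_shift', `model_yyerrok', `model_recovering'); it is not the code
Bison generates, and it only tracks whether a message would be printed.

```c
/* Model of the parser's error-recovery status. */
struct error_state {
    int countdown;   /* tokens still to shift before messages resume */
    int reported;    /* number of messages actually printed */
};

/* Called on each syntax error. */
static void on_syntax_error(struct error_state *s)
{
    if (s->countdown == 0)
        s->reported++;       /* not recovering: report the error */
    s->countdown = 3;        /* three shifts are needed to leave recovery */
}

/* Called after each successfully shifted input token. */
static void on_shift(struct error_state *s)
{
    if (s->countdown > 0)
        s->countdown--;
}

/* Counterpart of `yyerrok;' in this model: leave recovery at once. */
static void model_yyerrok(struct error_state *s)
{
    s->countdown = 0;
}

/* Counterpart of `YYRECOVERING ()': 1 while recovering, 0 otherwise. */
static int model_recovering(const struct error_state *s)
{
    return s->countdown != 0;
}
```

In this model a second error arriving after only one shifted token is
counted as suppressed, whereas an error that follows `model_yyerrok' is
reported immediately.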

	6725

	6726

	6727 File: bison.info, Node: Context Dependency, Next: Debugging, Prev: Error Recovery, Up: Top

	6728

	6729 7 Handling Context Dependencies

	6730 *******************************

	6731

	6732 The Bison paradigm is to parse tokens first, then group them into larger

	6733 syntactic units. In many languages, the meaning of a token is affected

	6734 by its context. Although this violates the Bison paradigm, certain

	6735 techniques (known as "kludges") may enable you to write Bison parsers

	6736 for such languages.

	6737

	6738 * Menu:

	6739

	6740 * Semantic Tokens:: Token parsing can depend on the semantic context.

	6741 * Lexical Tie-ins:: Token parsing can depend on the syntactic context.

	6742 * Tie-in Recovery:: Lexical tie-ins have implications for how

	6743 error recovery rules must be written.

	6744

	6745 (Actually, "kludge" means any technique that gets its job done but is

	6746 neither clean nor robust.)

	6747

	6748

	6749 File: bison.info, Node: Semantic Tokens, Next: Lexical Tie-ins, Up: Context Dependency

	6750

	6751 7.1 Semantic Info in Token Types

	6752 ================================

	6753

	6754 The C language has a context dependency: the way an identifier is used

	6755 depends on what its current meaning is. For example, consider this:

	6756

	6757 foo (x);

	6758

	6759 This looks like a function call statement, but if `foo' is a typedef

	6760 name, then this is actually a declaration of `x'. How can a Bison

	6761 parser for C decide how to parse this input?

	6762

	6763 The method used in GNU C is to have two different token types,

	6764 `IDENTIFIER' and `TYPENAME'. When `yylex' finds an identifier, it

	6765 looks up the current declaration of the identifier in order to decide

	6766 which token type to return: `TYPENAME' if the identifier is declared as

	6767 a typedef, `IDENTIFIER' otherwise.
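
The lookup performed by `yylex' can be sketched as follows. This is not
the GNU C implementation: the token codes, the fixed-size table and the
names `declare_typedef' and `classify_identifier' are assumptions made
for illustration, and scoping is ignored entirely.

```c
#include <stddef.h>
#include <string.h>

enum token_type { IDENTIFIER = 258, TYPENAME = 259 };  /* hypothetical codes */

/* Toy table of the names currently declared as typedefs. */
static const char *typedef_names[16];
static size_t typedef_count;

/* Record `name' as a typedef; called when a typedef declaration is parsed. */
static void declare_typedef(const char *name)
{
    if (typedef_count < sizeof typedef_names / sizeof typedef_names[0])
        typedef_names[typedef_count++] = name;
}

/* What `yylex' would do once it has scanned an identifier: choose the
   token type from the identifier's current declaration. */
static enum token_type classify_identifier(const char *name)
{
    for (size_t i = 0; i < typedef_count; i++)
        if (strcmp(typedef_names[i], name) == 0)
            return TYPENAME;
    return IDENTIFIER;
}
```

The table is filled by parser actions and read by the lexer, so the
token type returned for `foo' changes once `typedef int foo;' has been
parsed.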

	6768

	6769 The grammar rules can then express the context dependency by the

	6770 choice of token type to recognize. `IDENTIFIER' is accepted as an

	6771 expression, but `TYPENAME' is not. `TYPENAME' can start a declaration,

	6772 but `IDENTIFIER' cannot. In contexts where the meaning of the

	6773 identifier is _not_ significant, such as in declarations that can

	6774 shadow a typedef name, either `TYPENAME' or `IDENTIFIER' is

	6775 accepted--there is one rule for each of the two token types.

	6776

	6777 This technique is simple to use if the decision of which kinds of

	6778 identifiers to allow is made at a place close to where the identifier is

	6779 parsed. But in C this is not always so: C allows a declaration to

	6780 redeclare a typedef name provided an explicit type has been specified

	6781 earlier:

	6782

	6783 typedef int foo, bar;

	6784 int baz (void)

	6785 {

	6786   static bar (bar);      /* redeclare `bar' as static variable */

	6787   extern foo foo (foo);  /* redeclare `foo' as function */

	6788   return foo (bar);

	6789 }

	6790

	6791 Unfortunately, the name being declared is separated from the

	6792 declaration construct itself by a complicated syntactic structure--the

	6793 "declarator".

	6794

	6795 As a result, part of the Bison parser for C needs to be duplicated,

	6796 with all the nonterminal names changed: once for parsing a declaration

	6797 in which a typedef name can be redefined, and once for parsing a

	6798 declaration in which that can't be done. Here is a part of the

	6799 duplication, with actions omitted for brevity:

	6800

	6801 initdcl:

	6802         declarator maybeasm '='

	6803         init

	6804         | declarator maybeasm

	6805         ;

	6806

	6807 notype_initdcl:

	6808         notype_declarator maybeasm '='

	6809         init

	6810         | notype_declarator maybeasm

	6811         ;

	6812

	6813 Here `initdcl' can redeclare a typedef name, but `notype_initdcl'

	6814 cannot. The distinction between `declarator' and `notype_declarator'

	6815 is the same sort of thing.

	6816

	6817 There is some similarity between this technique and a lexical tie-in

	6818 (described next), in that information which alters the lexical analysis

	6819 is changed during parsing by other parts of the program. The

	6820 difference is that here the information is global, and is used for other

	6821 purposes in the program. A true lexical tie-in has a special-purpose

	6822 flag controlled by the syntactic context.

	6823

	6824

	6825 File: bison.info, Node: Lexical Tie-ins, Next: Tie-in Recovery, Prev: Semantic Tokens, Up: Context Dependency

	6826

	6827 7.2 Lexical Tie-ins

	6828 ===================

	6829

	6830 One way to handle context-dependency is the "lexical tie-in": a flag

	6831 which is set by Bison actions, whose purpose is to alter the way tokens

	6832 are parsed.

	6833

	6834 For example, suppose we have a language vaguely like C, but with a

	6835 special construct `hex (HEX-EXPR)'. After the keyword `hex' comes an

	6836 expression in parentheses in which all integers are hexadecimal. In

	6837 particular, the token `a1b' must be treated as an integer rather than

	6838 as an identifier if it appears in that context. Here is how you can do

	6839 it:

	6840

	6841 %{

	6842 int hexflag;

	6843 int yylex (void);

	6844 void yyerror (char const *);

	6845 %}

	6846 %%

	6847 ...

	6848 expr:   IDENTIFIER

	6849         | constant

	6850         | HEX '('

	6851                 { hexflag = 1; }

	6852           expr ')'

	6853                 { hexflag = 0;

	6854                   $$ = $4; }

	6855         | expr '+' expr

	6856                 { $$ = make_sum ($1, $3); }

	6857         ...

	6858         ;

	6859

	6860 constant:

	6861         INTEGER

	6862         | STRING

	6863         ;

	6864

	6865 Here we assume that `yylex' looks at the value of `hexflag'; when it is

	6866 nonzero, all integers are parsed in hexadecimal, and tokens starting

	6867 with letters are parsed as integers if possible.

	6868

	6869 The declaration of `hexflag' shown in the prologue of the parser file

	6870 is needed to make it accessible to the actions (*note The Prologue:

	6871 Prologue.). You must also write the code in `yylex' to obey the flag.
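
The part of `yylex' that obeys the flag might look like the sketch
below. It is an assumption-laden illustration: the token codes and the
function `classify_word' are hypothetical, and a real lexer would also
have to read the word from the input and set `yylval'.

```c
#include <stdlib.h>

enum { IDENTIFIER = 258, INTEGER = 259 };   /* hypothetical token codes */

int hexflag;   /* set and cleared by the parser actions */

/* Classify one word that starts with a letter or a digit.  When the
   whole word is a number in the current base, store its value in
   *value and return INTEGER; otherwise return IDENTIFIER. */
static int classify_word(const char *word, long *value)
{
    char *end;
    long v = strtol(word, &end, hexflag ? 16 : 10);

    if (end != word && *end == '\0') {
        *value = v;
        return INTEGER;
    }
    return IDENTIFIER;
}
```

With `hexflag' set, `a1b' is returned as the integer 0xa1b; with the
flag clear, the same word is an identifier. A word such as `foo' stays
an identifier in both modes because only its first letter is a
hexadecimal digit.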

	6872

	6873

	6874 File: bison.info, Node: Tie-in Recovery, Prev: Lexical Tie-ins, Up: Context Dependency

	6875

	6876 7.3 Lexical Tie-ins and Error Recovery

	6877 ======================================

	6878

	6879 Lexical tie-ins make strict demands on any error recovery rules you

	6880 have. *Note Error Recovery::.

	6881

	6882 The reason for this is that the purpose of an error recovery rule is

	6883 to abort the parsing of one construct and resume in some larger

	6884 construct. For example, in C-like languages, a typical error recovery

	6885 rule is to skip tokens until the next semicolon, and then start a new

	6886 statement, like this:

	6887

	6888 stmt: expr ';'

	6889         | IF '(' expr ')' stmt { ... }

	6890         ...

	6891         | error ';'

	6892                 { hexflag = 0; }

	6893         ;

	6894

	6895 If there is a syntax error in the middle of a `hex (EXPR)'

	6896 construct, this error rule will apply, and then the action for the

	6897 completed `hex (EXPR)' will never run. So `hexflag' would remain set

	6898 for the entire rest of the input, or until the next `hex' keyword,

	6899 causing identifiers to be misinterpreted as integers.

	6900

	6901 To avoid this problem, the error recovery rule itself clears

	6902 `hexflag'.

	6903

	6904 There may also be an error recovery rule that works within

	6905 expressions. For example, there could be a rule which applies within

	6906 parentheses and skips to the close-parenthesis:

	6907

	6908 expr: ...

	6909         | '(' expr ')'

	6910                 { $$ = $2; }

	6911         | '(' error ')'

	6912 ...

	6913

	6914 If this rule acts within the `hex' construct, it is not going to

	6915 abort that construct (since it applies to an inner level of parentheses

	6916 within the construct). Therefore, it should not clear the flag: the

	6917 rest of the `hex' construct should be parsed with the flag still in

	6918 effect.

	6919

	6920 What if there is an error recovery rule which might abort out of the

	6921 `hex' construct or might not, depending on circumstances? There is no

	6922 way you can write the action to determine whether a `hex' construct is

	6923 being aborted or not. So if you are using a lexical tie-in, you had

	6924 better make sure your error recovery rules are not of this kind. Each

	6925 rule must be such that you can be sure that it always will, or always

	6926 won't, have to clear the flag.

	6927

	6928

	6929 File: bison.info, Node: Debugging, Next: Invocation, Prev: Context Dependency, Up: Top

	6930

	6931 8 Debugging Your Parser

	6932 ***********************

	6933

	6934 Developing a parser can be a challenge, especially if you don't

	6935 understand the algorithm (*note The Bison Parser Algorithm:

	6936 Algorithm.). Even so, sometimes a detailed description of the automaton

	6937 can help (*note Understanding Your Parser: Understanding.), or tracing

	6938 the execution of the parser can give some insight on why it behaves

	6939 improperly (*note Tracing Your Parser: Tracing.).

	6940

	6941 * Menu:

	6942

	6943 * Understanding:: Understanding the structure of your parser.

	6944 * Tracing:: Tracing the execution of your parser.

	6945

	6946

	6947 File: bison.info, Node: Understanding, Next: Tracing, Up: Debugging

	6948

	6949 8.1 Understanding Your Parser

	6950 =============================

	6951

	6952 As documented elsewhere (*note The Bison Parser Algorithm: Algorithm.)

	6953 Bison parsers are "shift/reduce automata". In some cases (much more

	6954 frequent than one would hope), looking at this automaton is required to

	6955 tune or simply fix a parser. Bison provides two different

	6956 representations of it, either textual or graphical (as a DOT file).

	6957

	6958 The textual file is generated when the options `--report' or

	6959 `--verbose' are specified; see *Note Invoking Bison: Invocation. Its

	6960 name is made by removing `.tab.c' or `.c' from the parser output file

	6961 name, and adding `.output' instead. Therefore, if the input file is

	6962 `foo.y', then the parser file is called `foo.tab.c' by default. As a

	6963 consequence, the verbose output file is called `foo.output'.
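
The naming rule can be written out as a small C function. This is a
sketch of the rule as stated above, not code taken from Bison;
`strip_suffix' and `report_file_name' are hypothetical names, and the
256-byte limit is an arbitrary choice for the example.

```c
#include <stdio.h>
#include <string.h>

/* Remove `suffix' from the end of `name' if present; return 1 if removed. */
static int strip_suffix(char *name, const char *suffix)
{
    size_t n = strlen(name), s = strlen(suffix);

    if (n >= s && strcmp(name + n - s, suffix) == 0) {
        name[n - s] = '\0';
        return 1;
    }
    return 0;
}

/* Write the report file name for `parser_file' into `out' (of size
   `outsz').  Return 0 on success, -1 if a name does not fit. */
static int report_file_name(const char *parser_file, char *out, size_t outsz)
{
    char base[256];

    if (strlen(parser_file) >= sizeof base)
        return -1;
    strcpy(base, parser_file);
    if (!strip_suffix(base, ".tab.c"))   /* try the longer suffix first */
        strip_suffix(base, ".c");

    int n = snprintf(out, outsz, "%s.output", base);
    return (n < 0 || (size_t) n >= outsz) ? -1 : 0;
}
```

For the parser file `foo.tab.c' this yields `foo.output', matching the
example in the text.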

	6964

	6965 The following grammar file, `calc.y', will be used in the sequel:

	6966

	6967 %token NUM STR

	6968 %left '+' '-'

	6969 %left '*'

	6970 %%

	6971 exp: exp '+' exp

	6972    | exp '-' exp

	6973    | exp '*' exp

	6974    | exp '/' exp

	6975    | NUM

	6976    ;

	6977 useless: STR;

	6978 %%

	6979

	6980 `bison' reports:

	6981

	6982 calc.y: warning: 1 nonterminal and 1 rule useless in grammar

	6983 calc.y:11.1-7: warning: nonterminal useless in grammar: useless

	6984 calc.y:11.10-12: warning: rule useless in grammar: useless: STR

	6985 calc.y: conflicts: 7 shift/reduce

	6986

	6987 When given `--report=state', in addition to `calc.tab.c', it creates

	6988 a file `calc.output' with contents detailed below. The order of the

	6989 output and the exact presentation might vary, but the interpretation is

	6990 the same.

	6991

	6992 The first section includes details on conflicts that were solved

	6993 thanks to precedence and/or associativity:

	6994

	6995 Conflict in state 8 between rule 2 and token '+' resolved as reduce.

	6996 Conflict in state 8 between rule 2 and token '-' resolved as reduce.

	6997 Conflict in state 8 between rule 2 and token '*' resolved as shift.

	6998 ...

	6999

	7000

	7001 The next section lists states that still have conflicts.

	7002

	7003 State 8 conflicts: 1 shift/reduce

	7004 State 9 conflicts: 1 shift/reduce

	7005 State 10 conflicts: 1 shift/reduce

	7006 State 11 conflicts: 4 shift/reduce

	7007

	7008 The next section reports useless tokens, nonterminals and rules. Useless

	7009 nonterminals and rules are removed in order to produce a smaller parser,

	7010 but useless tokens are preserved, since they might be used by the

	7011 scanner (note the difference between "useless" and "unused" below):

	7012

	7013 Nonterminals useless in grammar:

	7014 useless

	7015

	7016 Terminals unused in grammar:

	7017 STR

	7018

	7019 Rules useless in grammar:

	7020 #6 useless: STR;

	7021

	7022 The next section reproduces the exact grammar that Bison used:

	7023

	7024 Grammar

	7025

	7026 Number, Line, Rule

	7027 0 5 $accept -> exp $end

	7028 1 5 exp -> exp '+' exp

	7029 2 6 exp -> exp '-' exp

	7030 3 7 exp -> exp '*' exp

	7031 4 8 exp -> exp '/' exp

	7032 5 9 exp -> NUM

	7033

	7034 and reports the uses of the symbols:

	7035

	7036 Terminals, with rules where they appear

	7037

	7038 $end (0) 0

	7039 '*' (42) 3

	7040 '+' (43) 1

	7041 '-' (45) 2

	7042 '/' (47) 4

	7043 error (256)

	7044 NUM (258) 5

	7045

	7046 Nonterminals, with rules where they appear

	7047

	7048 $accept (8)

	7049 on left: 0

	7050 exp (9)

	7051 on left: 1 2 3 4 5, on right: 0 1 2 3 4

	7052

	7053 Bison then proceeds onto the automaton itself, describing each state

	7054 with its set of "items", also known as "pointed rules". Each item is a

	7055 production rule together with a point (marked by `.') that indicates the

	7056 position of the input cursor.

	7057

	7058 state 0

	7059

	7060 $accept -> . exp $ (rule 0)

	7061

	7062 NUM shift, and go to state 1

	7063

	7064 exp go to state 2

	7065

	7066 This reads as follows: "state 0 corresponds to being at the very

	7067 beginning of the parsing, in the initial rule, right before the start

	7068 symbol (here, `exp'). When the parser returns to this state right

	7069 after having reduced a rule that produced an `exp', the control flow

	7070 jumps to state 2. If there is no such transition on a nonterminal

	7071 symbol, and the lookahead is a `NUM', then this token is shifted on the

	7072 parse stack, and the control flow jumps to state 1. Any other

	7073 lookahead triggers a syntax error."

	7074

	7075 Even though the only active rule in state 0 seems to be rule 0, the

	7076 report lists `NUM' as a lookahead token because `NUM' can be at the

	7077 beginning of any rule deriving an `exp'. By default Bison reports the

	7078 so-called "core" or "kernel" of the item set, but if you want to see

	7079 more detail you can invoke `bison' with `--report=itemset' to list all

	7080 the items, including those that can be derived:

	7081

	7082 state 0

	7083

	7084 $accept -> . exp $ (rule 0)

	7085 exp -> . exp '+' exp (rule 1)

	7086 exp -> . exp '-' exp (rule 2)

	7087 exp -> . exp '*' exp (rule 3)

	7088 exp -> . exp '/' exp (rule 4)

	7089 exp -> . NUM (rule 5)

	7090

	7091 NUM shift, and go to state 1

	7092

	7093 exp go to state 2

	7094

	7095 In the state 1...

	7096

	7097 state 1

	7098

	7099 exp -> NUM . (rule 5)

	7100

	7101 $default reduce using rule 5 (exp)

	7102

	7103 the rule 5, `exp: NUM;', is completed. Whatever the lookahead token

	7104 (`$default'), the parser will reduce it. If it was coming from state

	7105 0, then, after this reduction it will return to state 0, and will jump

	7106 to state 2 (`exp: go to state 2').

	7107

	7108 state 2

	7109

	7110 $accept -> exp . $ (rule 0)

	7111 exp -> exp . '+' exp (rule 1)

	7112 exp -> exp . '-' exp (rule 2)

	7113 exp -> exp . '*' exp (rule 3)

	7114 exp -> exp . '/' exp (rule 4)

	7115

	7116 $ shift, and go to state 3

	7117 '+' shift, and go to state 4

	7118 '-' shift, and go to state 5

	7119 '*' shift, and go to state 6

	7120 '/' shift, and go to state 7

	7121

	7122 In state 2, the automaton can only shift a symbol. For instance,

	7123 because of the item `exp -> exp . '+' exp', if the lookahead is `+', it

	7124 will be shifted on the parse stack, and the automaton control will jump

	7125 to state 4, corresponding to the item `exp -> exp '+' . exp'. Since

	7126 there is no default action, any other token than those listed above

	7127 will trigger a syntax error.
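
The action part of state 2 can be read as a lookup from the lookahead
token to the next state. The C function below transcribes the listing
above; `state2_action' and the name `END_OF_INPUT' are hypothetical, the
token code 0 for end of input comes from the terminals list earlier in
the report, and -1 is an arbitrary marker for a syntax error.

```c
/* Token code of `$end', as shown in the list of terminals. */
enum { END_OF_INPUT = 0 };

/* Return the state reached by shifting `lookahead' in state 2, or -1
   for a syntax error (state 2 has no default action). */
static int state2_action(int lookahead)
{
    switch (lookahead) {
    case END_OF_INPUT: return 3;
    case '+':          return 4;
    case '-':          return 5;
    case '*':          return 6;
    case '/':          return 7;
    default:           return -1;
    }
}
```

Because every entry is a shift, the lookahead alone determines what the
parser does in this state.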

	7128

	7129 The state 3 is named the "final state", or the "accepting state":

	7130

	7131 state 3

	7132

	7133 $accept -> exp $ . (rule 0)

	7134

	7135 $default accept

	7136

	7137 the initial rule is completed (the start symbol and the end of input

	7138 were read), the parsing exits successfully.

	7139

	7140 The interpretation of states 4 to 7 is straightforward, and is left

	7141 to the reader.

	7142

	7143 state 4

	7144

	7145 exp -> exp '+' . exp (rule 1)

	7146

	7147 NUM shift, and go to state 1

	7148

	7149 exp go to state 8

	7150

	7151 state 5

	7152

	7153 exp -> exp '-' . exp (rule 2)

	7154

	7155 NUM shift, and go to state 1

	7156

	7157 exp go to state 9

	7158

	7159 state 6

	7160

	7161 exp -> exp '*' . exp (rule 3)

	7162

	7163 NUM shift, and go to state 1

	7164

	7165 exp go to state 10

	7166

	7167 state 7

	7168

	7169 exp -> exp '/' . exp (rule 4)

	7170

	7171 NUM shift, and go to state 1

	7172

	7173 exp go to state 11

	7174

	7175 As was announced at the beginning of the report, `State 8 conflicts: 1

	7176 shift/reduce':

	7177

	7178 state 8

	7179

	7180 exp -> exp . '+' exp (rule 1)

	7181 exp -> exp '+' exp . (rule 1)

	7182 exp -> exp . '-' exp (rule 2)

	7183 exp -> exp . '*' exp (rule 3)

	7184 exp -> exp . '/' exp (rule 4)

	7185

	7186 '*' shift, and go to state 6

	7187 '/' shift, and go to state 7

	7188

	7189 '/' [reduce using rule 1 (exp)]

	7190 $default reduce using rule 1 (exp)

	7191

	7192 Indeed, there are two actions associated to the lookahead `/':

	7193 either shifting (and going to state 7), or reducing rule 1. The

	7194 conflict means that either the grammar is ambiguous, or the parser lacks

	7195 information to make the right decision. Indeed the grammar is

	7196 ambiguous, as, since we did not specify the precedence of `/', the

	7197 sentence `NUM + NUM / NUM' can be parsed as `NUM + (NUM / NUM)', which

	7198 corresponds to shifting `/', or as `(NUM + NUM) / NUM', which

	7199 corresponds to reducing rule 1.
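The two readings can be made concrete with a small sketch. The C++ fragment below is illustrative only and is not Bison output; the function names `shift_reading' and `reduce_reading' are invented for this example. It evaluates `NUM + NUM / NUM' under each grouping to show that the two parses generally yield different values.

```cpp
// Illustrative only: the two parses of `NUM + NUM / NUM'.
// Neither function is produced by Bison; the names are hypothetical.

// Shifting `/' first groups the input as NUM + (NUM / NUM).
int shift_reading(int a, int b, int c) { return a + (b / c); }

// Reducing rule 1 first groups the input as (NUM + NUM) / NUM.
int reduce_reading(int a, int b, int c) { return (a + b) / c; }
```

With integer inputs 1, 4 and 2, the first grouping gives 3 and the second gives 2, which is why the parser cannot treat the two actions as interchangeable.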

	7200

	7201 Because in LALR(1) parsing a single decision can be made, Bison

	7202 arbitrarily chose to disable the reduction, see *Note Shift/Reduce

	7203 Conflicts: Shift/Reduce. Discarded actions are reported in between

	7204 square brackets.

	7205

	7206 Note that all the previous states had a single possible action:

	7207 either shifting the next token and going to the corresponding state, or

	7208 reducing a single rule. In the other cases, i.e., when shifting _and_

	7209 reducing is possible or when _several_ reductions are possible, the

	7210 lookahead is required to select the action. State 8 is one such state:

	7211 if the lookahead is `*' or `/' then the action is shifting, otherwise

	7212 the action is reducing rule 1. In other words, the first two items,

	7213 corresponding to rule 1, are not eligible when the lookahead token is

	7214 `*', since we specified that `*' has higher precedence than `+'. More

	7215 generally, some items are eligible only with some set of possible

	7216 lookahead tokens. When run with `--report=lookahead', Bison specifies

	7217 these lookahead tokens:

	7218

	7219 state 8

	7220

	7221 exp -> exp . '+' exp (rule 1)

	7222 exp -> exp '+' exp . [$, '+', '-', '/'] (rule 1)

	7223 exp -> exp . '-' exp (rule 2)

	7224 exp -> exp . '*' exp (rule 3)

	7225 exp -> exp . '/' exp (rule 4)

	7226

	7227 '*' shift, and go to state 6

	7228 '/' shift, and go to state 7

	7229

	7230 '/' [reduce using rule 1 (exp)]

	7231 $default reduce using rule 1 (exp)
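The decision recorded for state 8 can be modelled as a lookup on the lookahead token. The following C++ sketch is a hypothetical model written for this explanation (the `Action' enumeration and `state8_action' are not part of Bison); a real generated parser encodes the same information in its tables.

```cpp
// Hypothetical model of the choice made in state 8 of the example
// automaton.  A real Bison parser stores this in generated tables.
enum class Action { ShiftToState6, ShiftToState7, ReduceRule1 };

Action state8_action(char lookahead) {
    switch (lookahead) {
    case '*': return Action::ShiftToState6;  // '*' shift, and go to state 6
    case '/': return Action::ShiftToState7;  // shift kept, reduction discarded
    default:  return Action::ReduceRule1;    // $default reduce using rule 1
    }
}
```

The bracketed reduction on `/' in the report has no counterpart here, since a discarded action is never taken.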

	7232

	7233 The remaining states are similar:

	7234

	7235 state 9

	7236

	7237 exp -> exp . '+' exp (rule 1)

	7238 exp -> exp . '-' exp (rule 2)

	7239 exp -> exp '-' exp . (rule 2)

	7240 exp -> exp . '*' exp (rule 3)

	7241 exp -> exp . '/' exp (rule 4)

	7242

	7243 '*' shift, and go to state 6

	7244 '/' shift, and go to state 7

	7245

	7246 '/' [reduce using rule 2 (exp)]

	7247 $default reduce using rule 2 (exp)

	7248

	7249 state 10

	7250

	7251 exp -> exp . '+' exp (rule 1)

	7252 exp -> exp . '-' exp (rule 2)

	7253 exp -> exp . '*' exp (rule 3)

	7254 exp -> exp '*' exp . (rule 3)

	7255 exp -> exp . '/' exp (rule 4)

	7256

	7257 '/' shift, and go to state 7

	7258

	7259 '/' [reduce using rule 3 (exp)]

	7260 $default reduce using rule 3 (exp)

	7261

	7262 state 11

	7263

	7264 exp -> exp . '+' exp (rule 1)

	7265 exp -> exp . '-' exp (rule 2)

	7266 exp -> exp . '*' exp (rule 3)

	7267 exp -> exp . '/' exp (rule 4)

	7268 exp -> exp '/' exp . (rule 4)

	7269

	7270 '+' shift, and go to state 4

	7271 '-' shift, and go to state 5

	7272 '*' shift, and go to state 6

	7273 '/' shift, and go to state 7

	7274

	7275 '+' [reduce using rule 4 (exp)]

	7276 '-' [reduce using rule 4 (exp)]

	7277 '*' [reduce using rule 4 (exp)]

	7278 '/' [reduce using rule 4 (exp)]

	7279 $default reduce using rule 4 (exp)

	7280

	7281 Observe that state 11 contains conflicts not only due to the lack of

	7282 precedence of `/' with respect to `+', `-', and `*', but also because

	7283 the associativity of `/' is not specified.

	7284

	7285

	7286 File: bison.info, Node: Tracing, Prev: Understanding, Up: Debugging

	7287

	7288 8.2 Tracing Your Parser

	7289 =======================

	7290

	7291 If a Bison grammar compiles properly but doesn't do what you want when

	7292 it runs, the `yydebug' parser-trace feature can help you figure out why.

	7293

	7294 There are several means to enable compilation of trace facilities:

	7295

	7296 the macro `YYDEBUG'

	7297 Define the macro `YYDEBUG' to a nonzero value when you compile the

	7298 parser. This is compliant with POSIX Yacc. You could use

	7299 `-DYYDEBUG=1' as a compiler option or you could put `#define

	7300 YYDEBUG 1' in the prologue of the grammar file (*note The

	7301 Prologue: Prologue.).

	7302

	7303 the option `-t', `--debug'

	7304 Use the `-t' option when you run Bison (*note Invoking Bison:

	7305 Invocation.). This is POSIX compliant too.

	7306

	7307 the directive `%debug'

	7308 Add the `%debug' directive (*note Bison Declaration Summary: Decl

	7309 Summary.). This is a Bison extension, which will prove useful

	7310 when Bison will output parsers for languages that don't use a

	7311 preprocessor. Unless POSIX and Yacc portability matter to you,

	7312 this is the preferred solution.

	7313

	7314 We suggest that you always enable the debug option so that debugging

	7315 is always possible.

	7316

	7317 The trace facility outputs messages with macro calls of the form

	7318 `YYFPRINTF (stderr, FORMAT, ARGS)' where FORMAT and ARGS are the usual

	7319 `printf' format and variadic arguments. If you define `YYDEBUG' to a

	7320 nonzero value but do not define `YYFPRINTF', `<stdio.h>' is

	7321 automatically included and `YYFPRINTF' is defined to `fprintf'.

	7322

	7323 Once you have compiled the program with trace facilities, the way to

	7324 request a trace is to store a nonzero value in the variable `yydebug'.

	7325 You can do this by making the C code do it (in `main', perhaps), or you

	7326 can alter the value with a C debugger.
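As a rough sketch of the first approach, the fragment below turns tracing on when a command-line flag is present. The flag name `--trace' and the helper `enable_trace_from_args' are assumptions of this example. In a real program `yydebug' is provided by the generated parser and would be declared with `extern int yydebug;'; it is defined locally here only so that the sketch stands alone.

```cpp
#include <cstring>

// Stand-in for the variable defined by the generated parser when it is
// compiled with YYDEBUG nonzero.  In real code, write instead:
//   extern int yydebug;
int yydebug = 0;

// Hypothetical helper: request a trace when `--trace' appears among the
// command-line arguments.  The flag name is an assumption of this sketch.
void enable_trace_from_args(int argc, const char* const* argv) {
    for (int i = 1; i < argc; ++i)
        if (std::strcmp(argv[i], "--trace") == 0)
            yydebug = 1;
}
```

Calling this helper at the start of `main', before `yyparse', would keep the trace off by default while leaving it one flag away.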

	7327

	7328 Each step taken by the parser when `yydebug' is nonzero produces a

	7329 line or two of trace information, written on `stderr'. The trace

	7330 messages tell you these things:

	7331

	7332 * Each time the parser calls `yylex', what kind of token was read.

	7333

	7334 * Each time a token is shifted, the depth and complete contents of

	7335 the state stack (*note Parser States::).

	7336

	7337 * Each time a rule is reduced, which rule it is, and the complete

	7338 contents of the state stack afterward.

	7339

	7340 To make sense of this information, it helps to refer to the listing

	7341 file produced by the Bison `-v' option (*note Invoking Bison:

	7342 Invocation.). This file shows the meaning of each state in terms of

	7343 positions in various rules, and also what each state will do with each

	7344 possible input token. As you read the successive trace messages, you

	7345 can see that the parser is functioning according to its specification in

	7346 the listing file. Eventually you will arrive at the place where

	7347 something undesirable happens, and you will see which parts of the

	7348 grammar are to blame.

	7349

	7350 The parser file is a C program and you can use C debuggers on it,

	7351 but it's not easy to interpret what it is doing. The parser function

	7352 is a finite-state machine interpreter, and aside from the actions it

	7353 executes the same code over and over. Only the values of variables

	7354 show where in the grammar it is working.

	7355

	7356 The debugging information normally gives the token type of each token

	7357 read, but not its semantic value. You can optionally define a macro

	7358 named `YYPRINT' to provide a way to print the value. If you define

	7359 `YYPRINT', it should take three arguments. The parser will pass a

	7360 standard I/O stream, the numeric code for the token type, and the token

	7361 value (from `yylval').

	7362

	7363 Here is an example of `YYPRINT' suitable for the multi-function

	7364 calculator (*note Declarations for `mfcalc': Mfcalc Declarations.):

	7365

	7366 %{

	7367 static void print_token_value (FILE *, int, YYSTYPE);

	7368 #define YYPRINT(file, type, value) print_token_value (file, type, value)

	7369 %}

	7370

	7371 ... %% ... %% ...

	7372

	7373 static void

	7374 print_token_value (FILE *file, int type, YYSTYPE value)

	7375 {

	7376 if (type == VAR)

	7377 fprintf (file, "%s", value.tptr->name);

	7378 else if (type == NUM)

	7379 fprintf (file, "%g", value.val);

	7380 }

	7381

	7382

	7383 File: bison.info, Node: Invocation, Next: Other Languages, Prev: Debugging, Up: Top

	7384

	7385 9 Invoking Bison

	7386 ****************

	7387

	7388 The usual way to invoke Bison is as follows:

	7389

	7390 bison INFILE

	7391

	7392 Here INFILE is the grammar file name, which usually ends in `.y'.

	7393 The parser file's name is made by replacing the `.y' with `.tab.c' and

	7394 removing any leading directory. Thus, `bison foo.y' yields

	7395 `foo.tab.c', and `bison hack/foo.y' also yields

	7396 `foo.tab.c'. It's also possible, in case you are writing C++ code

	7397 instead of C in your grammar file, to name it `foo.ypp' or `foo.y++'.

	7398 Then, the output files will take an extension like the given one as

	7399 input (respectively `foo.tab.cpp' and `foo.tab.c++'). This feature

	7400 takes effect with all options that manipulate file names like `-o' or

	7401 `-d'.

	7402

	7403 For example:

	7404

	7405 bison -d INFILE.YXX

	7406 will produce `infile.tab.cxx' and `infile.tab.hxx', and

	7407

	7408 bison -d -o OUTPUT.C++ INFILE.Y

	7409 will produce `output.c++' and `output.h++'.
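The default naming rule can be sketched as follows. This is not Bison's own code: `parser_file_name' is a hypothetical helper that only mirrors the rule stated above (drop the leading directory, turn `.y' into `.tab.c', keep the rest of the extension). The fallback for names that do not end in a `.y'-style extension is an assumption of the sketch, since this section does not describe that case.

```cpp
#include <string>

// Sketch of the default naming rule described above, not Bison's code.
std::string parser_file_name(const std::string& grammar) {
    // Remove any leading directory.
    std::string::size_type slash = grammar.find_last_of('/');
    std::string base =
        (slash == std::string::npos) ? grammar : grammar.substr(slash + 1);

    // Locate the extension; it must start with `.y' for the rule to apply.
    std::string::size_type dot = base.rfind('.');
    if (dot == std::string::npos || dot + 1 >= base.size()
        || base[dot + 1] != 'y')
        return base + ".tab.c";   // assumption: behavior not covered here

    // `.y' becomes `.tab.c'; the remainder (`pp', `++', `xx') is kept.
    return base.substr(0, dot) + ".tab.c" + base.substr(dot + 2);
}
```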

	7410

	7411 For compatibility with POSIX, the standard Bison distribution also

	7412 contains a shell script called `yacc' that invokes Bison with the `-y'

	7413 option.

	7414

	7415 * Menu:

	7416

	7417 * Bison Options:: All the options described in detail,

	7418 in alphabetical order by short options.

	7419 * Option Cross Key:: Alphabetical list of long options.

	7420 * Yacc Library:: Yacc-compatible `yylex' and `main'.

	7421

	7422

	7423 File: bison.info, Node: Bison Options, Next: Option Cross Key, Up: Invocation

	7424

	7425 9.1 Bison Options

	7426 =================

	7427

	7428 Bison supports both traditional single-letter options and mnemonic long

	7429 option names. Long option names are indicated with `--' instead of

	7430 `-'. Abbreviations for option names are allowed as long as they are

	7431 unique. When a long option takes an argument, like `--file-prefix',

	7432 connect the option name and the argument with `='.

	7433

	7434 Here is a list of options that can be used with Bison, alphabetized

	7435 by short option. It is followed by a cross key alphabetized by long

	7436 option.

	7437

	7438 Operation modes:

	7439 `-h'

	7440 `--help'

	7441 Print a summary of the command-line options to Bison and exit.

	7442

	7443 `-V'

	7444 `--version'

	7445 Print the version number of Bison and exit.

	7446

	7447 `--print-localedir'

	7448 Print the name of the directory containing locale-dependent data.

	7449

	7450 `--print-datadir'

	7451 Print the name of the directory containing skeletons and XSLT.

	7452

	7453 `-y'

	7454 `--yacc'

	7455 Act more like the traditional Yacc command. This can cause

	7456 different diagnostics to be generated, and may change behavior in

	7457 other minor ways. Most importantly, imitate Yacc's output file

	7458 name conventions, so that the parser output file is called

	7459 `y.tab.c', and the other outputs are called `y.output' and

	7460 `y.tab.h'. Also, if generating an LALR(1) parser in C, generate

	7461 `#define' statements in addition to an `enum' to associate token

	7462 numbers with token names. Thus, the following shell script can

	7463 substitute for Yacc, and the Bison distribution contains such a

	7464 script for compatibility with POSIX:

	7465

	7466 #! /bin/sh

	7467 bison -y "$@"

	7468

	7469 The `-y'/`--yacc' option is intended for use with traditional Yacc

	7470 grammars. If your grammar uses a Bison extension like

	7471 `%glr-parser', Bison might not be Yacc-compatible even if this

	7472 option is specified.

	7473

	7474 `-W'

	7475 `--warnings'

	7476 Output warnings falling in CATEGORY. CATEGORY can be one of:

	7477 `midrule-values'

	7478 Warn about mid-rule values that are set but not used within

	7479 any of the actions of the parent rule. For example, warn

	7480 about unused `$2' in:

	7481

	7482 exp: '1' { $$ = 1; } '+' exp { $$ = $1 + $4; };

	7483

	7484 Also warn about mid-rule values that are used but not set.

	7485 For example, warn about unset `$$' in the mid-rule action in:

	7486

	7487 exp: '1' { $1 = 1; } '+' exp { $$ = $2 + $4; };

	7488

	7489 These warnings are not enabled by default since they

	7490 sometimes prove to be false alarms in existing grammars

	7491 employing the Yacc constructs `$0' or `$-N' (where N is some

	7492 positive integer).

	7493

	7494 `yacc'

	7495 Incompatibilities with POSIX Yacc.

	7496

	7497 `all'

	7498 All the warnings.

	7499

	7500 `none'

	7501 Turn off all the warnings.

	7502

	7503 `error'

	7504 Treat warnings as errors.

	7505

	7506 A category can be turned off by prefixing its name with `no-'. For

	7507 instance, `-Wno-yacc' will hide the warnings about incompatibilities

	7508 with POSIX Yacc.

	7509

	7510 Tuning the parser:

	7511

	7512 `-t'

	7513 `--debug'

	7514 In the parser file, define the macro `YYDEBUG' to 1 if it is not

	7515 already defined, so that the debugging facilities are compiled.

	7516 *Note Tracing Your Parser: Tracing.

	7517

	7518 `-L LANGUAGE'

	7519 `--language=LANGUAGE'

	7520 Specify the programming language for the generated parser, as if

	7521 `%language' was specified (*note Bison Declaration Summary: Decl

	7522 Summary.). Currently supported languages include C, C++, and Java.

	7523 LANGUAGE is case-insensitive.

	7524

	7525 This option is experimental and its effect may be modified in

	7526 future releases.

	7527

	7528 `--locations'

	7529 Pretend that `%locations' was specified. *Note Decl Summary::.

	7530

	7531 `-p PREFIX'

	7532 `--name-prefix=PREFIX'

	7533 Pretend that `%name-prefix "PREFIX"' was specified. *Note Decl

	7534 Summary::.

	7535

	7536 `-l'

	7537 `--no-lines'

	7538 Don't put any `#line' preprocessor commands in the parser file.

	7539 Ordinarily Bison puts them in the parser file so that the C

	7540 compiler and debuggers will associate errors with your source

	7541 file, the grammar file. This option causes them to associate

	7542 errors with the parser file, treating it as an independent source

	7543 file in its own right.

	7544

	7545 `-S FILE'

	7546 `--skeleton=FILE'

	7547 Specify the skeleton to use, similar to `%skeleton' (*note Bison

	7548 Declaration Summary: Decl Summary.).

	7549

	7550 If FILE does not contain a `/', FILE is the name of a skeleton

	7551 file in the Bison installation directory. If it does, FILE is an

	7552 absolute file name or a file name relative to the current working

	7553 directory. This is similar to how most shells resolve commands.

	7554

	7555 `-k'

	7556 `--token-table'

	7557 Pretend that `%token-table' was specified. *Note Decl Summary::.

	7558

	7559 Adjust the output:

	7560

	7561 `--defines[=FILE]'

	7562 Pretend that `%defines' was specified, i.e., write an extra output

	7563 file containing macro definitions for the token type names defined

	7564 in the grammar, as well as a few other declarations. *Note Decl

	7565 Summary::.

	7566

	7567 `-d'

	7568 This is the same as `--defines' except `-d' does not accept a FILE

	7569 argument since POSIX Yacc requires that `-d' can be bundled with

	7570 other short options.

	7571

	7572 `-b FILE-PREFIX'

	7573 `--file-prefix=PREFIX'

	7574 Pretend that `%file-prefix' was specified, i.e., specify prefix to

	7575 use for all Bison output file names. *Note Decl Summary::.

	7576

	7577 `-r THINGS'

	7578 `--report=THINGS'

	7579 Write an extra output file containing verbose description of the

	7580 comma separated list of THINGS among:

	7581

	7582 `state'

	7583 Description of the grammar, conflicts (resolved and

	7584 unresolved), and LALR automaton.

	7585

	7586 `lookahead'

	7587 Implies `state' and augments the description of the automaton

	7588 with each rule's lookahead set.

	7589

	7590 `itemset'

	7591 Implies `state' and augments the description of the automaton

	7592 with the full set of items for each state, instead of its

	7593 core only.

	7594

	7595 `--report-file=FILE'

	7596 Specify the FILE for the verbose description.

	7597

	7598 `-v'

	7599 `--verbose'

	7600 Pretend that `%verbose' was specified, i.e., write an extra output

	7601 file containing verbose descriptions of the grammar and parser.

	7602 *Note Decl Summary::.

	7603

	7604 `-o FILE'

	7605 `--output=FILE'

	7606 Specify the FILE for the parser file.

	7607

	7608 The other output files' names are constructed from FILE as

	7609 described under the `-v' and `-d' options.

	7610

	7611 `-g[FILE]'

	7612 `--graph[=FILE]'

	7613 Output a graphical representation of the LALR(1) grammar automaton

	7614 computed by Bison, in Graphviz (http://www.graphviz.org/) DOT

	7615 (http://www.graphviz.org/doc/info/lang.html) format. `FILE' is

	7616 optional. If omitted and the grammar file is `foo.y', the output

	7617 file will be `foo.dot'.

	7618

	7619 `-x[FILE]'

	7620 `--xml[=FILE]'

	7621 Output an XML report of the LALR(1) automaton computed by Bison.

	7622 `FILE' is optional. If omitted and the grammar file is `foo.y',

	7623 the output file will be `foo.xml'. (The current XML schema is

	7624 experimental and may evolve. More user feedback will help to

	7625 stabilize it.)

	7626

	7627

	7628 File: bison.info, Node: Option Cross Key, Next: Yacc Library, Prev: Bison Options, Up: Invocation

	7629

	7630 9.2 Option Cross Key

	7631 ====================

	7632

	7633 Here is a list of options, alphabetized by long option, to help you find

	7634 the corresponding short option.

	7635

	7636 Long Option Short Option

	7637 -------------------------------------------------

	7638 `--debug' `-t'

	7639 `--defines[=FILE]'

	7640 `--file-prefix=PREFIX' `-b' PREFIX

	7641 `--graph[=FILE]' `-g[FILE]'

	7642 `--help' `-h'

	7643 `--language=LANGUAGE' `-L' LANGUAGE

	7644 `--locations'

	7645 `--name-prefix=PREFIX' `-p' PREFIX

	7646 `--no-lines' `-l'

	7647 `--output=FILE' `-o' FILE

	7648 `--print-datadir'

	7649 `--print-localedir'

	7650 `--report-file=FILE'

	7651 `--report=THINGS' `-r' THINGS

	7652 `--skeleton=FILE' `-S' FILE

	7653 `--token-table' `-k'

	7654 `--verbose' `-v'

	7655 `--version' `-V'

	7656 `--warnings' `-W'

	7657 `--xml[=FILE]' `-x[FILE]'

	7658 `--yacc' `-y'

	7659

	7660

	7661 File: bison.info, Node: Yacc Library, Prev: Option Cross Key, Up: Invocation

	7662

	7663 9.3 Yacc Library

	7664 ================

	7665

	7666 The Yacc library contains default implementations of the `yyerror' and

	7667 `main' functions. These default implementations are normally not

	7668 useful, but POSIX requires them. To use the Yacc library, link your

	7669 program with the `-ly' option. Note that Bison's implementation of the

	7670 Yacc library is distributed under the terms of the GNU General Public

	7671 License (*note Copying::).

	7672

	7673 If you use the Yacc library's `yyerror' function, you should declare

	7674 `yyerror' as follows:

	7675

	7676 int yyerror (char const *);

	7677

	7678 Bison ignores the `int' value returned by this `yyerror'. If you

	7679 use the Yacc library's `main' function, your `yyparse' function should

	7680 have the following type signature:

	7681

	7682 int yyparse (void);

	7683

	7684

	7685 File: bison.info, Node: Other Languages, Next: FAQ, Prev: Invocation, Up: Top

	7686

	7687 10 Parsers Written In Other Languages

	7688 *************************************

	7689

	7690 * Menu:

	7691

	7692 * C++ Parsers:: The interface to generate C++ parser classes

	7693 * Java Parsers:: The interface to generate Java parser classes

	7694

	7695

	7696 File: bison.info, Node: C++ Parsers, Next: Java Parsers, Up: Other Languages

	7697

	7698 10.1 C++ Parsers

	7699 ================

	7700

	7701 * Menu:

	7702

	7703 * C++ Bison Interface:: Asking for C++ parser generation

	7704 * C++ Semantic Values:: %union vs. C++

	7705 * C++ Location Values:: The position and location classes

	7706 * C++ Parser Interface:: Instantiating and running the parser

	7707 * C++ Scanner Interface:: Exchanges between yylex and parse

	7708 * A Complete C++ Example:: Demonstrating their use

	7709

	7710

	7711 File: bison.info, Node: C++ Bison Interface, Next: C++ Semantic Values, Up: C++ Parsers

	7712

	7713 10.1.1 C++ Bison Interface

	7714 --------------------------

	7715

	7716 The C++ LALR(1) parser is selected using the skeleton directive,

	7717 `%skeleton "lalr1.cc"', or the synonymous command-line option

	7718 `--skeleton=lalr1.cc'. *Note Decl Summary::.

	7719

	7720 When run, `bison' will create several entities in the `yy' namespace. Use

	7721 the `%define namespace' directive to change the namespace name, see

	7722 *Note Decl Summary::. The various classes are generated in the

	7723 following files:

	7724

	7725 `position.hh'

	7726 `location.hh'

	7727 The definition of the classes `position' and `location', used for

	7728 location tracking. *Note C++ Location Values::.

	7729

	7730 `stack.hh'

	7731 An auxiliary class `stack' used by the parser.

	7732

	7733 `FILE.hh'

	7734 `FILE.cc'

	7735 (Assuming the extension of the input file was `.yy'.) The

	7736 declaration and implementation of the C++ parser class. The

	7737 basename and extension of these two files follow the same rules as

	7738 with regular C parsers (*note Invocation::).

	7739

	7740 The header is _mandatory_; you must either pass `-d'/`--defines'

	7741 to `bison', or use the `%defines' directive.

	7742

	7743 All these files are documented using Doxygen; run `doxygen' for

	7744 complete and accurate documentation.

	7745

	7746

	7747 File: bison.info, Node: C++ Semantic Values, Next: C++ Location Values, Prev: C++ Bison Interface, Up: C++ Parsers

	7748

	7749 10.1.2 C++ Semantic Values

	7750 --------------------------

	7751

	7752 The `%union' directive works as for C, see *Note The Collection of

	7753 Value Types: Union Decl. In particular it produces a genuine

	7754 `union'(1), which has a few specific features in C++.

	7755 - The type `YYSTYPE' is defined but its use is discouraged: rather

	7756 you should refer to the parser's encapsulated type

	7757 `yy::parser::semantic_type'.

	7758

	7759 - Non POD (Plain Old Data) types cannot be used. C++ forbids any

	7760 instance of classes with constructors in unions: only _pointers_

	7761 to such objects are allowed.

	7762

	7763 Because objects have to be stored via pointers, memory is not

	7764 reclaimed automatically: using the `%destructor' directive is the only

	7765 means to avoid leaks. *Note Freeing Discarded Symbols: Destructor Decl.
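The consequence of storing objects by pointer can be sketched in plain C++. The union below is a hypothetical stand-in for what a `%union' with an `int' member and a `std::string*' member would produce; the names `semantic_value', `ival', `sval' and `discard_string' are invented here. The helper shows roughly what a `%destructor { delete $$; }' action for the string member amounts to.

```cpp
#include <string>

// Hypothetical stand-in for a `%union' holding a non-POD value by pointer.
union semantic_value {
    int          ival;
    std::string* sval;   // only the pointer lives in the union
};

// Roughly what a `%destructor { delete $$; } <sval>' action would do
// when the parser discards a symbol carrying a string.
void discard_string(semantic_value& v) {
    delete v.sval;       // without this the string would leak
    v.sval = nullptr;
}
```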

	7766

	7767 ---------- Footnotes ----------

	7768

	7769 (1) In the future techniques to allow complex types within

	7770 pseudo-unions (similar to Boost variants) might be implemented to

	7771 alleviate these issues.

	7772

	7773

	7774 File: bison.info, Node: C++ Location Values, Next: C++ Parser Interface, Prev: C++ Semantic Values, Up: C++ Parsers

	7775

	7776 10.1.3 C++ Location Values

	7777 --------------------------

	7778

	7779 When the directive `%locations' is used, the C++ parser supports

	7780 location tracking, see *Note Locations Overview: Locations. Two

	7781 auxiliary classes define a `position', a single point in a file, and a

	7782 `location', a range composed of a pair of `position's (possibly

	7783 spanning several files).

	7784

	7785 -- Method on position: std::string* file

	7786 The name of the file. It will always be handled as a pointer, the

	7787 parser will never duplicate nor deallocate it. As an experimental

	7788 feature you may change it to `TYPE*' using `%define filename_type

	7789 "TYPE"'.

	7790

	7791 -- Method on position: unsigned int line

	7792 The line, starting at 1.

	7793

	7794 -- Method on position: unsigned int lines (int HEIGHT = 1)

	7795 Advance by HEIGHT lines, resetting the column number.

	7796

	7797 -- Method on position: unsigned int column

	7798 The column, starting at 0.

	7799

	7800 -- Method on position: unsigned int columns (int WIDTH = 1)

	7801 Advance by WIDTH columns, without changing the line number.

	7802

	7803 -- Method on position: position& operator+= (position& POS, int WIDTH)

	7804 -- Method on position: position operator+ (const position& POS, int

	7805 WIDTH)

	7806 -- Method on position: position& operator-= (const position& POS, int

	7807 WIDTH)

	7808 -- Method on position: position operator- (position& POS, int WIDTH)

	7809 Various forms of syntactic sugar for `columns'.

	7810

	7811 -- Method on position: std::ostream& operator<< (std::ostream& O, const

	7812 position& P)

	7813 Report P on O like this: `FILE:LINE.COLUMN', or `LINE.COLUMN' if

	7814 FILE is null.

	7815

	7816 -- Method on location: position begin

	7817 -- Method on location: position end

	7818 The first, inclusive, position of the range, and the first beyond.

	7819

	7820 -- Method on location: unsigned int columns (int WIDTH = 1)

	7821 -- Method on location: unsigned int lines (int HEIGHT = 1)

	7822 Advance the `end' position.

	7823

	7824 -- Method on location: location operator+ (const location& BEGIN,

	7825 const location& END)

	7826 -- Method on location: location operator+ (const location& BEGIN, int

	7827 WIDTH)

	7828 -- Method on location: location operator+= (const location& LOC, int

	7829 WIDTH)

	7830 Various forms of syntactic sugar.

	7831

	7832 -- Method on location: void step ()

	7833 Move `begin' onto `end'.
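How these members cooperate in a scanner can be sketched with simplified stand-ins. The two structs below are written for this example and are not the classes generated in `position.hh' and `location.hh'; they follow the description above (lines start at 1, columns at 0) and omit the file name handling and operators. The helper `advance_over' is hypothetical.

```cpp
#include <string>

// Simplified stand-ins for the generated classes, for illustration only.
struct position {
    unsigned int line = 1;     // lines start at 1
    unsigned int column = 0;   // columns start at 0, as stated above

    void lines(int height = 1)  { line += height; column = 0; }
    void columns(int width = 1) { column += width; }
};

struct location {
    position begin, end;

    void step()                 { begin = end; }        // move begin onto end
    void columns(int width = 1) { end.columns(width); } // advance end only
    void lines(int height = 1)  { end.lines(height); }  // advance end only
};

// Typical scanner bookkeeping: start a new token where the previous one
// stopped, then extend `end' over the matched text.
location advance_over(location loc, const std::string& token_text) {
    loc.step();
    loc.columns(static_cast<int>(token_text.size()));
    return loc;
}
```

After scanning a five-character token at the start of a line, `begin' stays at column 0 and `end' sits at column 5, the first position beyond the token.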

	7834

	7835

	7836 File: bison.info, Node: C++ Parser Interface, Next: C++ Scanner Interface, Prev: C++ Location Values, Up: C++ Parsers

	7837

	7838 10.1.4 C++ Parser Interface

	7839 ---------------------------

	7840

	7841 The output files `OUTPUT.hh' and `OUTPUT.cc' declare and define the

	7842 parser class in the namespace `yy'. The class name defaults to

	7843 `parser', but may be changed using `%define parser_class_name "NAME"'.

	7844 The interface of this class is detailed below. It can be extended

	7845 using the `%parse-param' feature: its semantics is slightly changed

	7846 since it describes an additional member of the parser class, and an

	7847 additional argument for its constructor.

	7848

	7849 -- Type of parser: semantic_type

	7850 -- Type of parser: location_type

	7851 The types for semantic values and locations.

	7852

	7853 -- Method on parser: parser (TYPE1 ARG1, ...)

	7854 Build a new parser object. There are no arguments by default,

	7855 unless `%parse-param {TYPE1 ARG1}' was used.

	7856

	7857 -- Method on parser: int parse ()

	7858 Run the syntactic analysis, and return 0 on success, 1 otherwise.

	7859

	7860 -- Method on parser: std::ostream& debug_stream ()

	7861 -- Method on parser: void set_debug_stream (std::ostream& O)

	7862 Get or set the stream used for tracing the parsing. It defaults to

	7863 `std::cerr'.

	7864

	7865 -- Method on parser: debug_level_type debug_level ()

	7866 -- Method on parser: void set_debug_level (debug_level_type L)

	7867 Get or set the tracing level. Currently its value is either 0, no

	7868 trace, or nonzero, full tracing.

	7869

	7870 -- Method on parser: void error (const location_type& L, const

	7871 std::string& M)

	7872 The definition for this member function must be supplied by the

	7873 user: the parser uses it to report a parser error occurring at L,

	7874 described by M.

	7875

	7876

	7877 File: bison.info, Node: C++ Scanner Interface, Next: A Complete C++ Example, Prev: C++ Parser Interface, Up: C++ Parsers

	7878

	7879 10.1.5 C++ Scanner Interface

	7880 ----------------------------

	7881

	7882 The parser invokes the scanner by calling `yylex'. Contrary to C

	7883 parsers, C++ parsers are always pure: there is no point in using the

	7884 `%define api.pure' directive. Therefore the interface is as follows.

	7885

	7886 -- Method on parser: int yylex (semantic_type& YYLVAL,

	7887 location_type& YYLLOC, TYPE1 ARG1, ...)

	7888 Return the next token. Its type is the return value, its semantic

	7889 value and location being YYLVAL and YYLLOC. Invocations of

	7890 `%lex-param {TYPE1 ARG1}' yield additional arguments.

	7891

	7892

	7893 File: bison.info, Node: A Complete C++ Example, Prev: C++ Scanner Interface, Up: C++ Parsers

	7894

	7895 10.1.6 A Complete C++ Example

	7896 -----------------------------

	7897

	7898 This section demonstrates the use of a C++ parser with a simple but

	7899 complete example. This example should be available on your system,

	7900 ready to compile, in the directory "../bison/examples/calc++". It

	7901 focuses on the use of Bison, therefore the design of the various C++

	7902 classes is very naive: no accessors, no encapsulation of members etc.

	7903 We will use a Lex scanner, and more precisely, a Flex scanner, to

	7904 demonstrate the various interactions. A hand-written scanner is

	7905 actually easier to interface with.

	7906

	7907 * Menu:

	7908

	7909 * Calc++ --- C++ Calculator:: The specifications

	7910 * Calc++ Parsing Driver:: An active parsing context

	7911 * Calc++ Parser:: A parser class

	7912 * Calc++ Scanner:: A pure C++ Flex scanner

	7913 * Calc++ Top Level:: Conducting the band

	7914

	7915

	7916 File: bison.info, Node: Calc++ --- C++ Calculator, Next: Calc++ Parsing Driver, Up: A Complete C++ Example

	7917

	7918 10.1.6.1 Calc++ -- C++ Calculator

	7919 .................................

	7920

	7921 Of course the grammar is dedicated to arithmetic, a single expression,

	7922 possibly preceded by variable assignments. An environment containing

	7923 possibly predefined variables such as `one' and `two', is exchanged

	7924 with the parser. An example of valid input follows.

	7925

	7926 three := 3

	7927 seven := one + two * three

	7928 seven * seven
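To make the intended meaning of this input explicit, the sketch below evaluates it by hand, without any parser. It assumes the usual precedence of `*' over `+' and the predefined values 1 and 2 for `one' and `two'; the function name `evaluate_sample' is invented for this example.

```cpp
#include <map>
#include <string>

// Hand evaluation of the sample input above; no parser is involved.
int evaluate_sample() {
    std::map<std::string, int> variables;
    variables["one"] = 1;      // predefined in the environment
    variables["two"] = 2;      // predefined in the environment

    variables["three"] = 3;    // three := 3
    variables["seven"] =       // seven := one + two * three
        variables["one"] + variables["two"] * variables["three"];

    // seven * seven is the final expression, hence the result.
    return variables["seven"] * variables["seven"];
}
```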

	7929

	7930

	7931 File: bison.info, Node: Calc++ Parsing Driver, Next: Calc++ Parser, Prev: Calc++ --- C++ Calculator, Up: A Complete C++ Example

	7932

	7933 10.1.6.2 Calc++ Parsing Driver

	7934 ..............................

	7935

	7936 To support a pure interface with the parser (and the scanner) the

	7937 technique of the "parsing context" is convenient: a structure

	7938 containing all the data to exchange. Since, in addition to simply

	7939 launching the parser, there are several auxiliary tasks to execute (open

	7940 the file for parsing, instantiate the parser etc.), we recommend

	7941 transforming the simple parsing context structure into a fully blown

	7942 "parsing driver" class.

	7943

	7944 The declaration of this driver class, `calc++-driver.hh', is as

	7945 follows. The first part includes the CPP guard and imports the

	7946 required standard library components, and the declaration of the parser

	7947 class.

	7948

	7949 #ifndef CALCXX_DRIVER_HH

	7950 # define CALCXX_DRIVER_HH

	7951 # include <string>

	7952 # include <map>

	7953 # include "calc++-parser.hh"

	7954

	7955 Then comes the declaration of the scanning function. Flex expects the

	7956 signature of `yylex' to be defined in the macro `YY_DECL', and the C++

	7957 parser expects it to be declared. We can factor both as follows.

	7958

	7959 // Tell Flex the lexer's prototype ...

	7960 # define YY_DECL \

	7961 yy::calcxx_parser::token_type \

	7962 yylex (yy::calcxx_parser::semantic_type* yylval, \

	7963 yy::calcxx_parser::location_type* yylloc, \

	7964 calcxx_driver& driver)

	7965 // ... and declare it for the parser's sake.

	7966 YY_DECL;

	7967

	7968 The `calcxx_driver' class is then declared with its most obvious

	7969 members.

	7970

	7971 // Conducting the whole scanning and parsing of Calc++.

	7972 class calcxx_driver

	7973 {

	7974 public:

	7975 calcxx_driver ();

	7976 virtual ~calcxx_driver ();

	7977

	7978 std::map<std::string, int> variables;

	7979

	7980 int result;

	7981

	7982 To encapsulate the coordination with the Flex scanner, it is useful to

	7983 have two members function to open and close the scanning phase.

	7984

	7985 // Handling the scanner.

	7986 void scan_begin ();

	7987 void scan_end ();

	7988 bool trace_scanning;

	7989

	7990 Similarly for the parser itself.

	7991

	7992 // Run the parser. Return 0 on success.

	7993 int parse (const std::string& f);

	7994 std::string file;

	7995 bool trace_parsing;

	7996

	7997 To demonstrate pure handling of parse errors, instead of simply dumping

	7998 them on the standard error output, we will pass them to the compiler

	7999 driver using the following two member functions. Finally, we close the

	8000 class declaration and CPP guard.

	8001

	8002 // Error handling.

	8003 void error (const yy::location& l, const std::string& m);

	8004 void error (const std::string& m);

	8005 };

	8006 #endif // ! CALCXX_DRIVER_HH

	8007

	8008 The implementation of the driver is straightforward. The `parse'

	8009 member function deserves some attention. The `error' functions are

	8010 simple stubs; they should actually register the located error messages

	8011 and set error state.

	8012

	8013 #include "calc++-driver.hh"

	8014 #include "calc++-parser.hh"

	8015

	8016 calcxx_driver::calcxx_driver ()

	8017 : trace_scanning (false), trace_parsing (false)

	8018 {

	8019 variables["one"] = 1;

	8020 variables["two"] = 2;

	8021 }

	8022

	8023 calcxx_driver::~calcxx_driver ()

	8024 {

	8025 }

	8026

	8027 int

	8028 calcxx_driver::parse (const std::string &f)

	8029 {

	8030 file = f;

	8031 scan_begin ();

	8032 yy::calcxx_parser parser (*this);

	8033 parser.set_debug_level (trace_parsing);

	8034 int res = parser.parse ();

	8035 scan_end ();

	8036 return res;

	8037 }

	8038

	8039 void

	8040 calcxx_driver::error (const yy::location& l, const std::string& m)

	8041 {

	8042 std::cerr << l << ": " << m << std::endl;

	8043 }

	8044

	8045 void

	8046 calcxx_driver::error (const std::string& m)

	8047 {

	8048 std::cerr << m << std::endl;

	8049 }

	8050

	8051

	8052 File: bison.info, Node: Calc++ Parser, Next: Calc++ Scanner, Prev: Calc++ Parsing Driver, Up: A Complete C++ Example

	8053

	8054 10.1.6.3 Calc++ Parser

	8055 ......................

	8056

	8057 The parser definition file `calc++-parser.yy' starts by asking for the

	8058 C++ LALR(1) skeleton, the creation of the parser header file, and

	8059 specifies the name of the parser class. Because the C++ skeleton

	8060 changed several times, it is safer to require the version you designed

	8061 the grammar for.

	8062

	8063 %skeleton "lalr1.cc" /* -- C++ -- */

	8064 %require "2.4.1"

	8065 %defines

	8066 %define parser_class_name "calcxx_parser"

	8067

	8068 Then come the declarations/inclusions needed to define the `%union'.

	8069 Because the parser uses the parsing driver and vice versa, the two

	8070 cannot each include the other's header. Because the driver's header

	8071 needs detailed knowledge about the parser class (in particular its

	8072 inner types), it is the parser's header which will simply use a forward

	8073 declaration of the driver. *Note %code: Decl Summary.

	8074

	8075 %code requires {

	8076 # include <string>

	8077 class calcxx_driver;

	8078 }

	8079

	8080 The driver is passed by reference to the parser and to the scanner.

	8081 This provides a simple but effective pure interface, not relying on

	8082 global variables.

	8083

	8084 // The parsing context.

	8085 %parse-param { calcxx_driver& driver }

	8086 %lex-param { calcxx_driver& driver }

	8087

	8088 Then we request the location tracking feature, and initialize the first

	8089 location's file name. Afterwards new locations are computed relative

	8090 to the previous locations: the file name will be automatically

	8091 propagated.

	8092

	8093 %locations

	8094 %initial-action

	8095 {

	8096 // Initialize the initial location.

	8097 @$.begin.filename = @$.end.filename = &driver.file;

	8098 };

	8099

	8100 Use the two following directives to enable parser tracing and verbose

	8101 error messages.

	8102

	8103 %debug

	8104 %error-verbose

	8105

	8106 Semantic values cannot use "real" objects, but only pointers to them.

	8107

	8108 // Symbols.

	8109 %union

	8110 {

	8111 int ival;

	8112 std::string *sval;

	8113 };

	8114

	8115 The code between `%code {' and `}' is output in the `*.cc' file; it

	8116 needs detailed knowledge about the driver.

	8117

	8118 %code {

	8119 # include "calc++-driver.hh"

	8120 }

	8121

	8122 The token numbered as 0 corresponds to end of file; the following line

	8123 allows for nicer error messages referring to "end of file" instead of

	8124 "$end". Similarly, user-friendly names are provided for each symbol.

	8125 Note that the token names are prefixed by `TOKEN_' to avoid name

	8126 clashes.

	8127

	8128 %token END 0 "end of file"

	8129 %token ASSIGN ":="

	8130 %token <sval> IDENTIFIER "identifier"

	8131 %token <ival> NUMBER "number"

	8132 %type <ival> exp

	8133

	8134 To enable memory deallocation during error recovery, use `%destructor'.

	8135

	8136 %printer { debug_stream () << *$$; } "identifier"

	8137 %destructor { delete $$; } "identifier"

	8138

	8139 %printer { debug_stream () << $$; } <ival>

	8140

	8141 The grammar itself is straightforward.

	8142

	8143 %%

	8144 %start unit;

	8145 unit: assignments exp { driver.result = $2; };

	8146

	8147 assignments: assignments assignment {}

	8148 | /* Nothing. */ {};

	8149

	8150 assignment:

	8151 "identifier" ":=" exp

	8152 { driver.variables[*$1] = $3; delete $1; };

	8153

	8154 %left '+' '-';

	8155 %left '*' '/';

	8156 exp: exp '+' exp { $$ = $1 + $3; }

	8157 | exp '-' exp { $$ = $1 - $3; }

	8158 | exp '*' exp { $$ = $1 * $3; }

	8159 | exp '/' exp { $$ = $1 / $3; }

	8160 | "identifier" { $$ = driver.variables[*$1]; delete $1; }

	8161 | "number" { $$ = $1; };

	8162 %%

	8163

	8164 Finally the `error' member function registers the errors to the driver.

	8165

	8166 void

	8167 yy::calcxx_parser::error (const yy::calcxx_parser::location_type& l,

	8168 const std::string& m)

	8169 {

	8170 driver.error (l, m);

	8171 }

	8172

	8173

	8174 File: bison.info, Node: Calc++ Scanner, Next: Calc++ Top Level, Prev: Calc++ Parser, Up: A Complete C++ Example

	8175

	8176 10.1.6.4 Calc++ Scanner

	8177 .......................

	8178

	8179 The Flex scanner first includes the driver declaration, then the

	8180 parser's to get the set of defined tokens.

	8181

	8182 %{ /* -- C++ -- */

	8183 # include <cstdlib>

	8184 # include <errno.h>

	8185 # include <limits.h>

	8186 # include <string>

	8187 # include "calc++-driver.hh"

	8188 # include "calc++-parser.hh"

	8189

	8190 /* Work around an incompatibility in flex (at least versions

	8191 2.5.31 through 2.5.33): it generates code that does

	8192 not conform to C89. See Debian bug 333231

	8193 <http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=333231>. */

	8194 # undef yywrap

	8195 # define yywrap() 1

	8196

	8197 /* By default yylex returns int, we use token_type.

	8198 Unfortunately yyterminate by default returns 0, which is

	8199 not of token_type. */

	8200 #define yyterminate() return token::END

	8201 %}

	8202

	8203 Because there is no `#include'-like feature, we don't need `yywrap'; we

	8204 don't need `unput' either; and because we parse an actual file, this is not an

	8205 interactive session with the user. Finally, we enable the scanner

	8206 tracing features.

	8207

	8208 %option noyywrap nounput batch debug

	8209

	8210 Abbreviations allow for more readable rules.

	8211

	8212 id [a-zA-Z][a-zA-Z_0-9]*

	8213 int [0-9]+

	8214 blank [ \t]

	8215

	8216 The following paragraph suffices to track locations accurately. Each

	8217 time `yylex' is invoked, the begin position is moved onto the end

	8218 position. Then when a pattern is matched, the end position is advanced

	8219 by its width. In case it matched ends of lines, the end cursor is

	8220 adjusted, and each time blanks are matched, the begin cursor is moved

	8221 onto the end cursor to effectively ignore the blanks preceding tokens.

	8222 Comments would be treated equally.

	8223

	8224 %{

	8225 # define YY_USER_ACTION yylloc->columns (yyleng);

	8226 %}

	8227 %%

	8228 %{

	8229 yylloc->step ();

	8230 %}

	8231 {blank}+ yylloc->step ();

	8232 [\n]+ yylloc->lines (yyleng); yylloc->step ();

	8233

	8234 The rules are simple; just note the use of the driver to report errors.

	8235 It is convenient to use a typedef to shorten

	8236 `yy::calcxx_parser::token::identifier' into `token::identifier' for

	8237 instance.

	8238

	8239 %{

	8240 typedef yy::calcxx_parser::token token;

	8241 %}

	8242 /* Convert ints to the actual type of tokens. */

	8243 [-+*/] return yy::calcxx_parser::token_type (yytext[0]);

	8244 ":=" return token::ASSIGN;

	8245 {int} {

	8246 errno = 0;

	8247 long n = strtol (yytext, NULL, 10);

	8248 if (! (INT_MIN <= n && n <= INT_MAX && errno != ERANGE))

	8249 driver.error (*yylloc, "integer is out of range");

	8250 yylval->ival = n;

	8251 return token::NUMBER;

	8252 }

	8253 {id} yylval->sval = new std::string (yytext); return token::IDENTIFIER;

	8254 . driver.error (*yylloc, "invalid character");

	8255 %%

	8256

	8257 Finally, because the scanner-related member functions of the driver depend on

	8258 the scanner's data, it is simpler to implement them in this file.

	8259

	8260 void

	8261 calcxx_driver::scan_begin ()

	8262 {

	8263 yy_flex_debug = trace_scanning;

	8264 if (file == "-")

	8265 yyin = stdin;

	8266 else if (!(yyin = fopen (file.c_str (), "r")))

	8267 {

	8268 error (std::string ("cannot open ") + file);

	8269 exit (1);

	8270 }

	8271 }

	8272

	8273 void

	8274 calcxx_driver::scan_end ()

	8275 {

	8276 fclose (yyin);

	8277 }

	8278

	8279

	8280 File: bison.info, Node: Calc++ Top Level, Prev: Calc++ Scanner, Up: A Complete C++ Example

	8281

	8282 10.1.6.5 Calc++ Top Level

	8283 .........................

	8284

	8285 The top level file, `calc++.cc', poses no problem.

	8286

	8287 #include <iostream>

	8288 #include "calc++-driver.hh"

	8289

	8290 int

	8291 main (int argc, char *argv[])

	8292 {

	8293 calcxx_driver driver;

	8294 for (++argv; argv[0]; ++argv)

	8295 if (*argv == std::string ("-p"))

	8296 driver.trace_parsing = true;

	8297 else if (*argv == std::string ("-s"))

	8298 driver.trace_scanning = true;

	8299 else if (!driver.parse (*argv))

	8300 std::cout << driver.result << std::endl;

	8301 }

	8302

	8303

	8304 File: bison.info, Node: Java Parsers, Prev: C++ Parsers, Up: Other Languages

	8305

	8306 10.2 Java Parsers

	8307 =================

	8308

	8309 * Menu:

	8310

	8311 * Java Bison Interface:: Asking for Java parser generation

	8312 * Java Semantic Values:: %type and %token vs. Java

	8313 * Java Location Values:: The position and location classes

	8314 * Java Parser Interface:: Instantiating and running the parser

	8315 * Java Scanner Interface:: Specifying the scanner for the parser

	8316 * Java Action Features:: Special features for use in actions

	8317 * Java Differences:: Differences between C/C++ and Java Grammars

	8318 * Java Declarations Summary:: List of Bison declarations used with Java

	8319

	8320

	8321 File: bison.info, Node: Java Bison Interface, Next: Java Semantic Values, Up: Java Parsers

	8322

	8323 10.2.1 Java Bison Interface

	8324 ---------------------------

	8325

	8326 (The current Java interface is experimental and may evolve. More user

	8327 feedback will help to stabilize it.)

	8328

	8329 The Java parser skeletons are selected using the `%language "Java"'

	8330 directive or the `-L java'/`--language=java' option.

	8331

	8332 When generating a Java parser, `bison BASENAME.y' will create a

	8333 single Java source file named `BASENAME.java'. Using an input file

	8334 without a `.y' suffix is currently broken. The basename of the output

	8335 file can be changed by the `%file-prefix' directive or the

	8336 `-b'/`--file-prefix' option. The entire output file name can be

	8337 changed by the `%output' directive or the `-o'/`--output' option. The

	8338 output file contains a single class for the parser.

	8339

	8340 You can create documentation for generated parsers using Javadoc.

	8341

	8342 Contrary to C parsers, Java parsers do not use global variables; the

	8343 state of the parser is always local to an instance of the parser class.

	8344 Therefore, all Java parsers are "pure", and the `%pure-parser' and

	8345 `%define api.pure' directives do not do anything when used in Java.

	8346

	8347 Push parsers are currently unsupported in Java and `%define

	8348 api.push_pull' has no effect.

	8349

	8350 GLR parsers are currently unsupported in Java. Do not use the

	8351 `%glr-parser' directive.

	8352

	8353 No header file can be generated for Java parsers. Do not use the

	8354 `%defines' directive or the `-d'/`--defines' options.

	8355

	8356 Currently, support for debugging and verbose errors is always

	8357 compiled in. Thus the `%debug' and `%token-table' directives and the

	8358 `-t'/`--debug' and `-k'/`--token-table' options have no effect. This

	8359 may change in the future to eliminate unused code in the generated

	8360 parser, so use `%debug' and `%error-verbose' explicitly if needed.

	8361 Also, in the future the `%token-table' directive might enable a public

	8362 interface to access the token names and codes.

	8363

	8364

	8365 File: bison.info, Node: Java Semantic Values, Next: Java Location Values, Prev: Java Bison Interface, Up: Java Parsers

	8366

	8367 10.2.2 Java Semantic Values

	8368 ---------------------------

	8369

	8370 There is no `%union' directive in Java parsers. Instead, the semantic

	8371 values' types (class names) should be specified in the `%type' or

	8372 `%token' directive:

	8373

	8374 %type <Expression> expr assignment_expr term factor

	8375 %type <Integer> number

	8376

	8377 By default, the semantic stack is declared to have `Object' members,

	8378 which means that the class types you specify can be of any class. To

	8379 improve the type safety of the parser, you can declare the common

	8380 superclass of all the semantic values using the `%define stype'

	8381 directive. For example, after the following declaration:

	8382

	8383 %define stype "ASTNode"

	8384

	8385 any `%type' or `%token' specifying a semantic type which is not a

	8386 subclass of `ASTNode' will cause a compile-time error.

	8387

	8388 Types used in the directives may be qualified with a package name.

	8389 Primitive data types are accepted for Java version 1.5 or later. Note

	8390 that in this case the autoboxing feature of Java 1.5 will be used.

	8391 Generic types may not be used; this is due to a limitation in the

	8392 implementation of Bison, and may change in future releases.

	8393

	8394 Java parsers do not support `%destructor', since the language adopts

	8395 garbage collection. The parser will try to hold references to semantic

	8396 values for as little time as needed.

	8397

	8398 Java parsers do not support `%printer', as `toString()' can be used

	8399 to print the semantic values. This however may change (in a

	8400 backwards-compatible way) in future versions of Bison.

	8401

	8402

	8403 File: bison.info, Node: Java Location Values, Next: Java Parser Interface, Prev: Java Semantic Values, Up: Java Parsers

	8404

	8405 10.2.3 Java Location Values

	8406 ---------------------------

	8407

	8408 When the directive `%locations' is used, the Java parser supports

	8409 location tracking, see *Note Locations Overview: Locations. An

	8410 auxiliary user-defined class defines a "position", a single point in a

	8411 file; Bison itself defines a class representing a "location", a range

	8412 composed of a pair of positions (possibly spanning several files). The

	8413 location class is an inner class of the parser; the name is `Location'

	8414 by default, and may also be renamed using `%define location_type

	8415 "CLASS-NAME"'.

	8416

	8417 The location class treats the position as a completely opaque value.

	8418 By default, the class name is `Position', but this can be changed with

	8419 `%define position_type "CLASS-NAME"'. This class must be supplied by

	8420 the user.

	8421

	8422 -- Instance Variable of Location: Position begin

	8423 -- Instance Variable of Location: Position end

	8424 The first, inclusive, position of the range, and the first beyond.

	8425

	8426 -- Constructor on Location: Location (Position LOC)

	8427 Create a `Location' denoting an empty range located at a given

	8428 point.

	8429

	8430 -- Constructor on Location: Location (Position BEGIN, Position END)

	8431 Create a `Location' from the endpoints of the range.

	8432

	8433 -- Method on Location: String toString ()

	8434 Prints the range represented by the location. For this to work

	8435 properly, the position class should override the `equals' and

	8436 `toString' methods appropriately.
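
The requirements above on the user-supplied position class can be sketched as follows. This is only an illustration under stated assumptions: the `line' and `column' fields are invented for the example, and `SketchLocation' is a hypothetical stand-in that mirrors the documented members of the generated `Location' class; the text printed by the real class may differ.

```java
// Hypothetical user-supplied position class: a single point in a file.
// The field names (line, column) are assumptions made for this sketch.
final class Position {
    final int line;
    final int column;

    Position(int line, int column) {
        this.line = line;
        this.column = column;
    }

    // Value equality, so that an empty range (begin equals end) is detectable.
    @Override
    public boolean equals(Object other) {
        if (!(other instanceof Position)) {
            return false;
        }
        Position p = (Position) other;
        return line == p.line && column == p.column;
    }

    @Override
    public int hashCode() {
        return 31 * line + column;
    }

    @Override
    public String toString() {
        return line + "." + column;
    }
}

// Stand-in for the generated Location class, reduced to the documented
// members: begin, end, two constructors and toString.
final class SketchLocation {
    final Position begin; // first position of the range, inclusive
    final Position end;   // first position beyond the range

    SketchLocation(Position loc) {
        this(loc, loc); // empty range located at a given point
    }

    SketchLocation(Position begin, Position end) {
        this.begin = begin;
        this.end = end;
    }

    // Relies on Position.equals and Position.toString.
    @Override
    public String toString() {
        return begin.equals(end) ? begin.toString() : begin + "-" + end;
    }
}
```

Without the `equals' override, two distinct `Position' objects denoting the same point would not compare equal, and the empty range would be printed as a pair of identical endpoints.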

	8437

	8438

	8439 File: bison.info, Node: Java Parser Interface, Next: Java Scanner Interface, Prev: Java Location Values, Up: Java Parsers

	8440

	8441 10.2.4 Java Parser Interface

	8442 ----------------------------

	8443

	8444 The name of the generated parser class defaults to `YYParser'. The

	8445 `YY' prefix may be changed using the `%name-prefix' directive or the

	8446 `-p'/`--name-prefix' option. Alternatively, use `%define

	8447 parser_class_name "NAME"' to give a custom name to the class. The

	8448 interface of this class is detailed below.

	8449

	8450 By default, the parser class has package visibility. A declaration

	8451 `%define public' will change to public visibility. Remember that,

	8452 according to the Java language specification, the name of the `.java'

	8453 file should match the name of the class in this case. Similarly, you

	8454 can use `abstract', `final' and `strictfp' with the `%define'

	8455 declaration to add other modifiers to the parser class.

	8456

	8457 The Java package name of the parser class can be specified using the

	8458 `%define package' directive. The superclass and the implemented

	8459 interfaces of the parser class can be specified with the `%define

	8460 extends' and `%define implements' directives.

	8461

	8462 The parser class defines an inner class, `Location', that is used

	8463 for location tracking (see *Note Java Location Values::), and an inner

	8464 interface, `Lexer' (see *Note Java Scanner Interface::). Other than

	8465 these inner class/interface, and the members described in the interface

	8466 below, all the other members and fields are preceded with a `yy' or

	8467 `YY' prefix to avoid clashes with user code.

	8468

	8469 The parser class can be extended using the `%parse-param' directive.

	8470 Each occurrence of the directive will add a `protected final' field to

	8471 the parser class, and an argument to its constructor, which initializes

	8472 the field automatically.

	8473

	8474 Token names defined by `%token' and the predefined `EOF' token name

	8475 are added as constant fields to the parser class.

	8476

	8477 -- Constructor on YYParser: YYParser (LEX_PARAM, ..., PARSE_PARAM,

	8478 ...)

	8479 Build a new parser object with embedded `%code lexer'. There are

	8480 no parameters, unless `%parse-param's and/or `%lex-param's are

	8481 used.

	8482

	8483 -- Constructor on YYParser: YYParser (Lexer LEXER, PARSE_PARAM, ...)

	8484 Build a new parser object using the specified scanner. There are

	8485 no additional parameters unless `%parse-param's are used.

	8486

	8487 If the scanner is defined by `%code lexer', this constructor is

	8488 declared `protected' and is called automatically with a scanner

	8489 created with the correct `%lex-param's.

	8490

	8491 -- Method on YYParser: boolean parse ()

	8492 Run the syntactic analysis, and return `true' on success, `false'

	8493 otherwise.

	8494

	8495 -- Method on YYParser: boolean recovering ()

	8496 During the syntactic analysis, return `true' if recovering from a

	8497 syntax error. *Note Error Recovery::.

	8498

	8499 -- Method on YYParser: java.io.PrintStream getDebugStream ()

	8500 -- Method on YYParser: void setDebugStream (java.io.PrintStream O)

	8501 Get or set the stream used for tracing the parsing. It defaults to

	8502 `System.err'.

	8503

	8504 -- Method on YYParser: int getDebugLevel ()

	8505 -- Method on YYParser: void setDebugLevel (int L)

	8506 Get or set the tracing level. Currently its value is either 0, no

	8507 trace, or nonzero, full tracing.

	8508

	8509

	8510 File: bison.info, Node: Java Scanner Interface, Next: Java Action Features, Prev: Java Parser Interface, Up: Java Parsers

	8511

	8512 10.2.5 Java Scanner Interface

	8513 -----------------------------

	8514

	8515 There are two possible ways to interface a Bison-generated Java parser

	8516 with a scanner: the scanner may be defined by `%code lexer', or defined

	8517 elsewhere. In either case, the scanner has to implement the `Lexer'

	8518 inner interface of the parser class.

	8519

	8520 In the first case, the body of the scanner class is placed in `%code

	8521 lexer' blocks. If you want to pass parameters from the parser

	8522 constructor to the scanner constructor, specify them with `%lex-param';

	8523 they are passed before `%parse-param's to the constructor.

	8524

	8525 In the second case, the scanner has to implement the `Lexer'

	8526 interface, which is defined within the parser class (e.g.,

	8527 `YYParser.Lexer'). The constructor of the parser object will then

	8528 accept an object implementing the interface; `%lex-param' is not used

	8529 in this case.

	8530

	8531 In both cases, the scanner has to implement the following methods.

	8532

	8533 -- Method on Lexer: void yyerror (Location LOC, String MSG)

	8534 This method is defined by the user to emit an error message. The

	8535 first parameter is omitted if location tracking is not active.

	8536 Its type can be changed using `%define location_type "CLASS-NAME"'.

	8537

	8538 -- Method on Lexer: int yylex ()

	8539 Return the next token. Its type is the return value; its semantic

	8540 value and location are saved and returned by the other methods in

	8541 the interface.

	8542

	8543 Use `%define lex_throws' to specify any uncaught exceptions.

	8544 Default is `java.io.IOException'.

	8545

	8546 -- Method on Lexer: Position getStartPos ()

	8547 -- Method on Lexer: Position getEndPos ()

	8548 Return respectively the first position of the last token that

	8549 `yylex' returned, and the first position beyond it. These methods

	8550 are not needed unless location tracking is active.

	8551

	8552 The return type can be changed using `%define position_type

	8553 "CLASS-NAME"'.

	8554

	8555 -- Method on Lexer: Object getLVal ()

	8556 Return the semantic value of the last token that `yylex' returned.

	8557

	8558 The return type can be changed using `%define stype "CLASS-NAME"'.
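
As an illustration of the second case (a scanner defined elsewhere), here is a minimal hand-written scanner without location tracking. It is a sketch under assumptions: `SketchLexer' is a hypothetical mirror of the documented methods (in a real project the scanner would implement the interface nested in the generated parser, e.g. `YYParser.Lexer'), the codes for `NUMBER' and `IDENTIFIER' are invented, and single characters are assumed to be returned as their own character codes, following the usual Yacc convention.

```java
// Mirror of the documented Lexer methods, without location tracking.
// The real interface is nested in the generated parser class and must
// not be redeclared by hand; this copy only keeps the sketch standalone.
interface SketchLexer {
    int EOF = 0;          // end of input
    int NUMBER = 258;     // hypothetical token code
    int IDENTIFIER = 259; // hypothetical token code

    void yyerror(String msg);
    int yylex() throws java.io.IOException;
    Object getLVal();
}

// Scanner for integers, identifiers and one-character operators.
final class CalcScanner implements SketchLexer {
    private final java.io.PushbackReader in;
    private Object lval; // semantic value of the last token returned

    CalcScanner(java.io.Reader reader) {
        this.in = new java.io.PushbackReader(reader);
    }

    @Override
    public void yyerror(String msg) {
        System.err.println(msg);
    }

    @Override
    public Object getLVal() {
        return lval;
    }

    @Override
    public int yylex() throws java.io.IOException {
        lval = null;
        int c = in.read();
        while (c == ' ' || c == '\t' || c == '\n' || c == '\r') {
            c = in.read(); // skip blanks and line ends
        }
        if (c == -1) {
            return EOF;
        }
        if (Character.isDigit(c)) {
            StringBuilder text = new StringBuilder();
            while (c != -1 && Character.isDigit(c)) {
                text.append((char) c);
                c = in.read();
            }
            if (c != -1) {
                in.unread(c); // give back the character after the number
            }
            try {
                lval = Integer.valueOf(text.toString());
            } catch (NumberFormatException e) {
                yyerror("integer is out of range");
                lval = Integer.valueOf(0);
            }
            return NUMBER;
        }
        if (Character.isLetter(c)) {
            StringBuilder text = new StringBuilder();
            while (c != -1 && (Character.isLetterOrDigit(c) || c == '_')) {
                text.append((char) c);
                c = in.read();
            }
            if (c != -1) {
                in.unread(c);
            }
            lval = text.toString();
            return IDENTIFIER;
        }
        return c; // any other character stands for itself
    }
}
```

The parser would call `yylex' to obtain the token type and then `getLVal' to fetch the value saved by that call, which is why the scanner keeps the last semantic value in a field.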

	8559

	8560

	8561 File: bison.info, Node: Java Action Features, Next: Java Differences, Prev: Java Scanner Interface, Up: Java Parsers

	8562

	8563 10.2.6 Special Features for Use in Java Actions

	8564 -----------------------------------------------

	8565

	8566 The following special constructs can be used in Java actions. Other

	8567 analogous C action features are currently unavailable for Java.

	8568

	8569 Use `%define throws' to specify any uncaught exceptions from parser

	8570 actions, and initial actions specified by `%initial-action'.

	8571

	8572 -- Variable: $N

	8573 The semantic value for the Nth component of the current rule.

	8574 This may not be assigned to. *Note Java Semantic Values::.

	8575

	8576 -- Variable: $<TYPEALT>N

	8577 Like `$N' but specifies an alternative type TYPEALT. *Note Java

	8578 Semantic Values::.

	8579

	8580 -- Variable: $$

	8581 The semantic value for the grouping made by the current rule. As a

	8582 value, this is in the base type (`Object' or as specified by

	8583 `%define stype') and is not cast to the declared subtype because

	8584 casts are not allowed on the left-hand side of Java assignments.

	8585 Use an explicit Java cast if the correct subtype is needed. *Note

	8586 Java Semantic Values::.

	8587

	8588 -- Variable: $<TYPEALT>$

	8589 Same as `$$' since Java always allows assigning to the base type.

	8590 Perhaps we should use this and `$<>$' for the value and `$$' for

	8591 setting the value but there is currently no easy way to distinguish

	8592 these constructs. *Note Java Semantic Values::.

	8593

	8594 -- Variable: @N

	8595 The location information of the Nth component of the current rule.

	8596 This may not be assigned to. *Note Java Location Values::.

	8597

	8598 -- Variable: @$

	8599 The location information of the grouping made by the current rule.

	8600 *Note Java Location Values::.

	8601

	8602 -- Statement: return YYABORT;

	8603 Return immediately from the parser, indicating failure. *Note

	8604 Java Parser Interface::.

	8605

	8606 -- Statement: return YYACCEPT;

	8607 Return immediately from the parser, indicating success. *Note

	8608 Java Parser Interface::.

	8609

	8610 -- Statement: return YYERROR;

	8611 Start error recovery without printing an error message. *Note

	8612 Error Recovery::.

	8613

	8614 -- Statement: return YYFAIL;

	8615 Print an error message and start error recovery. *Note Error

	8616 Recovery::.

	8617

	8618 -- Function: boolean recovering ()

	8619 Return whether error recovery is being done. In this state, the

	8620 parser reads tokens until it reaches a known state, and then

	8621 restarts normal operation. *Note Error Recovery::.

	8622

	8623 -- Function: protected void yyerror (String msg)

	8624 -- Function: protected void yyerror (Position pos, String msg)

	8625 -- Function: protected void yyerror (Location loc, String msg)

	8626 Print an error message using the `yyerror' method of the scanner

	8627 instance in use.
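
The remark on `$$' in the list above (base type on assignment, explicit cast on use) can be pictured with plain Java. This is a sketch only: `SemanticValueSketch', its list-based stack and the local variable `yyval' are hypothetical names chosen for the example and say nothing about the code Bison actually generates.

```java
// Stand-in for a parser's semantic stack; Object is the default base type.
final class SemanticValueSketch {
    private final java.util.ArrayList<Object> stack = new java.util.ArrayList<Object>();

    void push(Object value) {
        stack.add(value);
    }

    int depth() {
        return stack.size();
    }

    // Reading a stored value as an Integer needs an explicit cast.
    int top() {
        return (Integer) stack.get(stack.size() - 1);
    }

    // Roughly what an action such as `$$ = $1 + $3;' amounts to when
    // $1 and $3 are declared <Integer>.
    Object reduceAdd() {
        int n = stack.size();
        Integer left = (Integer) stack.get(n - 3);  // plays the role of $1
        Integer right = (Integer) stack.get(n - 1); // plays the role of $3
        Object yyval = left + right; // plays the role of $$: no cast needed
        stack.subList(n - 3, n).clear(); // pop the three right-hand side values
        stack.add(yyval);
        return yyval;
    }
}
```

Assigning to the base-typed variable compiles without a cast thanks to autoboxing, whereas every later use of the value as an `Integer' has to cast it back.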

	8628

	8629

	8630 File: bison.info, Node: Java Differences, Next: Java Declarations Summary, Prev: Java Action Features, Up: Java Parsers

	8631

	8632 10.2.7 Differences between C/C++ and Java Grammars

	8633 --------------------------------------------------

	8634

	8635 The different structure of the Java language forces several differences

	8636 between C/C++ grammars, and grammars designed for Java parsers. This

	8637 section summarizes these differences.

	8638

	8639 * Java lacks a preprocessor, so the `YYERROR', `YYACCEPT', `YYABORT'

	8640 symbols (*note Table of Symbols::) obviously cannot be macros.

	8641 Instead, they should be preceded by `return' when they appear in

	8642 an action. The actual definition of these symbols is opaque to

	8643 the Bison grammar, and it might change in the future. The only

	8644 meaningful operation that you can do, is to return them. See

	8645 *note Java Action Features::.

	8646

	8647 Note that of these three symbols, only `YYACCEPT' and `YYABORT'

	8648 will cause a return from the `yyparse' method(1).

	8649

	8650 * Java lacks unions, so `%union' has no effect. Instead, semantic

	8651 values have a common base type: `Object' or as specified by

	8652 `%define stype'. Angle brackets on `%token', `%type', `$N' and `$$'

	8653 specify subtypes rather than fields of a union. The type of

	8654 `$$', even with angle brackets, is the base type since Java casts

	8655 are not allowed on the left-hand side of assignments. Also, `$N'

	8656 and `@N' are not allowed on the left-hand side of assignments. See

	8657 *note Java Semantic Values:: and *note Java Action Features::.

	8658

	8659 * The prolog declarations have a different meaning than in C/C++

	8660 code.

	8661 `%code imports'

	8662 blocks are placed at the beginning of the Java source code.

	8663 They may include copyright notices. For `package'

	8664 declarations, it is suggested to use `%define package'

	8665 instead.

	8666

	8667 unqualified `%code'

	8668 blocks are placed inside the parser class.

	8669

	8670 `%code lexer'

	8671 blocks, if specified, should include the implementation of the

	8672 scanner. If there is no such block, the scanner can be any

	8673 class that implements the appropriate interface (see *note

	8674 Java Scanner Interface::).

	8675

	8676 Other `%code' blocks are not supported in Java parsers. In

	8677 particular, `%{ ... %}' blocks should not be used and may give an

	8678 error in future versions of Bison.

	8679

	8680 The epilogue has the same meaning as in C/C++ code and it can be

	8681 used to define other classes used by the parser _outside_ the

	8682 parser class.

	8683

	8684 ---------- Footnotes ----------

	8685

	8686 (1) Java parsers include the actions in a method separate from

	8687 `yyparse' in order to have an intuitive syntax that corresponds to

	8688 these C macros.
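
The arrangement described in this footnote can be pictured as follows. The sketch is purely illustrative: the class, the method names, the `YYCONTINUE' code and all numeric values are assumptions, since the actual definitions are opaque and internal to the generated parser.

```java
// Illustration of actions living in their own method and reporting back
// to the parsing loop through returned status codes.
final class ActionDispatchSketch {
    // Hypothetical values; the real ones are internal to Bison.
    static final int YYCONTINUE = 0; // hypothetical: action ended normally
    static final int YYACCEPT = 1;
    static final int YYABORT = 2;
    static final int YYERROR = 3;

    private int errors = 0;

    int errorCount() {
        return errors;
    }

    // Stand-in for the method that holds the user actions. A statement
    // such as `return YYABORT;' only leaves this method.
    private int action(int rule, int value) {
        switch (rule) {
            case 1:
                if (value < 0) {
                    return YYABORT;
                }
                break;
            case 2:
                if (value == 0) {
                    return YYERROR;
                }
                break;
            case 3:
                return YYACCEPT;
            default:
                break;
        }
        return YYCONTINUE;
    }

    // Stand-in for the parsing loop, fed with {rule, value} pairs.
    boolean parse(int[][] reductions) {
        for (int[] r : reductions) {
            switch (action(r[0], r[1])) {
                case YYACCEPT:
                    return true;  // leave the parser, success
                case YYABORT:
                    return false; // leave the parser, failure
                case YYERROR:
                    errors++;     // error recovery would start here
                    break;        // the parser itself does not return
                default:
                    break;
            }
        }
        return true;
    }
}
```

Only the first two codes make the loop return, which matches the remark above that `YYERROR' does not cause a return from the parsing method.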

	8689

	8690

	8691 File: bison.info, Node: Java Declarations Summary, Prev: Java Differences, Up: Java Parsers

	8692

	8693 10.2.8 Java Declarations Summary

	8694 --------------------------------

	8695

	8696 This summary only includes declarations that are specific to Java or have a special

	8697 meaning when used in a Java parser.

	8698

	8699 -- Directive: %language "Java"

	8700 Generate a Java class for the parser.

	8701

	8702 -- Directive: %lex-param {TYPE NAME}

	8703 A parameter for the lexer class defined by `%code lexer' _only_,

	8704 added as parameters to the lexer constructor and the parser

	8705 constructor that _creates_ a lexer. Default is none. *Note Java

	8706 Scanner Interface::.

	8707

	8708 -- Directive: %name-prefix "PREFIX"

	8709 The prefix of the parser class name `PREFIXParser' if `%define

	8710 parser_class_name' is not used. Default is `YY'. *Note Java

	8711 Bison Interface::.

	8712

	8713 -- Directive: %parse-param {TYPE NAME}

	8714 A parameter for the parser class added as parameters to

	8715 constructor(s) and as fields initialized by the constructor(s).

	8716 Default is none. *Note Java Parser Interface::.

	8717

	8718 -- Directive: %token <TYPE> TOKEN ...

	8719 Declare tokens. Note that the angle brackets enclose a Java

	8720 _type_. *Note Java Semantic Values::.

	8721

	8722 -- Directive: %type <TYPE> NONTERMINAL ...

	8723 Declare the type of nonterminals. Note that the angle brackets

	8724 enclose a Java _type_. *Note Java Semantic Values::.

	8725

	8726 -- Directive: %code { CODE ... }

	8727 Code appended to the inside of the parser class. *Note Java

	8728 Differences::.

	8729

	8730 -- Directive: %code imports { CODE ... }

	8731 Code inserted just after the `package' declaration. *Note Java

	8732 Differences::.

	8733

	8734 -- Directive: %code lexer { CODE ... }

	8735 Code added to the body of an inner lexer class within the parser

	8736 class. *Note Java Scanner Interface::.

	8737

	8738 -- Directive: %% CODE ...

	8739 Code (after the second `%%') appended to the end of the file,

	8740 _outside_ the parser class. *Note Java Differences::.

	8741

	8742 -- Directive: %{ CODE ... %}

	8743 Not supported. Use `%code imports' instead. *Note Java

	8744 Differences::.

	8745

	8746 -- Directive: %define abstract

	8747 Whether the parser class is declared `abstract'. Default is false.

	8748 *Note Java Bison Interface::.

	8749

	8750 -- Directive: %define extends "SUPERCLASS"

	8751 The superclass of the parser class. Default is none. *Note Java

	8752 Bison Interface::.

	8753

	8754 -- Directive: %define final

	8755 Whether the parser class is declared `final'. Default is false.

	8756 *Note Java Bison Interface::.

	8757

	8758 -- Directive: %define implements "INTERFACES"

	8759 The implemented interfaces of the parser class, a comma-separated

	8760 list. Default is none. *Note Java Bison Interface::.

	8761

	8762 -- Directive: %define lex_throws "EXCEPTIONS"

	8763 The exceptions thrown by the `yylex' method of the lexer, a

	8764 comma-separated list. Default is `java.io.IOException'. *Note

	8765 Java Scanner Interface::.

	8766

	8767 -- Directive: %define location_type "CLASS"

	8768 The name of the class used for locations (a range between two

	8769 positions). This class is generated as an inner class of the

	8770 parser class by `bison'. Default is `Location'. *Note Java

	8771 Location Values::.

	8772

	8773 -- Directive: %define package "PACKAGE"

	8774 The package to put the parser class in. Default is none. *Note

	8775 Java Bison Interface::.

	8776

	8777 -- Directive: %define parser_class_name "NAME"

	8778 The name of the parser class. Default is `YYParser' or

	8779 `NAME-PREFIXParser'. *Note Java Bison Interface::.

	8780

	8781 -- Directive: %define position_type "CLASS"

	8782 The name of the class used for positions. This class must be

	8783 supplied by the user. Default is `Position'. *Note Java Location

	8784 Values::.

	8785

	8786 -- Directive: %define public

	8787 Whether the parser class is declared `public'. Default is false.

	8788 *Note Java Bison Interface::.

	8789

	8790 -- Directive: %define stype "CLASS"

	8791 The base type of semantic values. Default is `Object'. *Note

	8792 Java Semantic Values::.

	8793

	8794 -- Directive: %define strictfp

	8795 Whether the parser class is declared `strictfp'. Default is false.

	8796 *Note Java Bison Interface::.

	8797

	8798 -- Directive: %define throws "EXCEPTIONS"

	8799 The exceptions thrown by user-supplied parser actions and

	8800 `%initial-action', a comma-separated list. Default is none.

	8801 *Note Java Parser Interface::.

	8802

	8803

	8804 File: bison.info, Node: FAQ, Next: Table of Symbols, Prev: Other Languages, Up: Top

	8805

	8806 11 Frequently Asked Questions

	8807 *****************************

	8808

	8809 Several questions about Bison come up occasionally. Here some of them

	8810 are addressed.

	8811

	8812 * Menu:

	8813

	8814 * Memory Exhausted:: Breaking the Stack Limits

	8815 * How Can I Reset the Parser:: `yyparse' Keeps some State

	8816 * Strings are Destroyed:: `yylval' Loses Track of Strings

	8817 * Implementing Gotos/Loops:: Control Flow in the Calculator

	8818 * Multiple start-symbols:: Factoring closely related grammars

	8819 * Secure? Conform?:: Is Bison POSIX safe?

	8820 * I can't build Bison:: Troubleshooting

	8821 * Where can I find help?:: Troubleshouting

	8822 * Bug Reports:: Troublereporting

	8823 * More Languages:: Parsers in C++, Java, and so on

	8824 * Beta Testing:: Experimenting with development versions

	8825 * Mailing Lists:: Meeting other Bison users

	8826

	8827

	8828 File: bison.info, Node: Memory Exhausted, Next: How Can I Reset the Parser, Up: FAQ

	8829

	8830 11.1 Memory Exhausted

	8831 =====================

	8832

	8833 My parser returns with error with a `memory exhausted'

	8834 message. What can I do?

	8835

	8836 This question is already addressed elsewhere, *Note Recursive Rules:

	8837 Recursion.

	8838

	8839

	8840 File: bison.info, Node: How Can I Reset the Parser, Next: Strings are Destroyed, Prev: Memory Exhausted, Up: FAQ

	8841

	8842 11.2 How Can I Reset the Parser

	8843 ===============================

	8844

	8845 The following phenomenon has several symptoms, resulting in the

	8846 following typical questions:

	8847

	8848 I invoke `yyparse' several times, and on correct input it works

	8849 properly; but when a parse error is found, all the other calls fail

	8850 too. How can I reset the error flag of `yyparse'?

	8851

	8852 or

	8853

	8854 My parser includes support for an `#include'-like feature, in

	8855 which case I run `yyparse' from `yyparse'. This fails

	8856 although I did specify `%define api.pure'.

	8857

	8858 These problems typically come not from Bison itself, but from

	8859 Lex-generated scanners. Because these scanners use large buffers for

	8860 speed, they might not notice a change of input file. As a

	8861 demonstration, consider the following source file, `first-line.l':

	8862

	8863

	8864 %{

	8865 #include <stdio.h>

	8866 #include <stdlib.h>

	8867 %}

	8868 %%

	8869 .*\n ECHO; return 1;

	8870 %%

	8871 int

	8872 yyparse (char const *file)

	8873 {

	8874   yyin = fopen (file, "r");

	8875   if (!yyin)

	8876     exit (2);

	8877   /* One token only. */

	8878   yylex ();

	8879   if (fclose (yyin) != 0)

	8880     exit (3);

	8881   return 0;

	8882 }

	8883

	8884 int

	8885 main (void)

	8886 {

	8887   yyparse ("input");

	8888   yyparse ("input");

	8889   return 0;

	8890 }

	8891

	8892 If the file `input' contains

	8893

	8894

	8895 input:1: Hello,

	8896 input:2: World!

	8897

	8898 then instead of getting the first line twice, you get:

	8899

	8900 $ flex -ofirst-line.c first-line.l

	8901 $ gcc -ofirst-line first-line.c -ll

	8902 $ ./first-line

	8903 input:1: Hello,

	8904 input:2: World!

	8905

	8906 Therefore, whenever you change `yyin', you must tell the

	8907 Lex-generated scanner to discard its current buffer and switch to the

	8908 new one. This depends upon your implementation of Lex; see its

	8909 documentation for more. For Flex, it suffices to call

	8910 `YY_FLUSH_BUFFER' after each change to `yyin'. If your Flex-generated

	8911 scanner needs to read from several input streams to handle features

	8912 like include files, you might consider using Flex functions like

	8913 `yy_switch_to_buffer' that manipulate multiple input buffers.

	8914

	8915 If your Flex-generated scanner uses start conditions (*note Start

	8916 conditions: (flex)Start conditions.), you might also want to reset the

	8917 scanner's state, i.e., go back to the initial start condition, through

	8918 a call to `BEGIN (0)'.
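
The buffering effect described above can be reproduced without Lex. The following C sketch is not Flex code and makes no claim about how Flex is implemented: `Scanner', `scanner_flush', and `scanner_line' are hypothetical names invented for this illustration, with `scanner_flush' playing the role that `YY_FLUSH_BUFFER' plays in a Flex-generated scanner.

```c
#include <assert.h>
#include <stdio.h>

enum { BUF_SIZE = 4096 };

/* Hypothetical stand-in for a Lex-generated scanner: it reads its
   input stream in large chunks, as real scanners do for speed.  */
typedef struct
{
  FILE *in;            /* plays the role of yyin */
  char buf[BUF_SIZE];  /* characters already read from some stream */
  size_t len;          /* number of valid characters in buf */
  size_t pos;          /* index of the next unread character in buf */
} Scanner;

/* Discard buffered input; analogous in spirit to YY_FLUSH_BUFFER.  */
void
scanner_flush (Scanner *s)
{
  s->len = 0;
  s->pos = 0;
}

/* Copy the next line (including its newline) into OUT, truncating
   it if OUT is too small.  Return 1 if a line was read, 0 at end
   of input.  */
int
scanner_line (Scanner *s, char *out, size_t out_size)
{
  size_t n = 0;
  assert (out_size > 0);
  for (;;)
    {
      if (s->pos == s->len)
        {
          /* Buffer exhausted: only now is s->in consulted.  */
          s->len = fread (s->buf, 1, BUF_SIZE, s->in);
          s->pos = 0;
          if (s->len == 0)
            break;
        }
      char c = s->buf[s->pos++];
      if (n + 1 < out_size)
        out[n++] = c;
      if (c == '\n')
        break;
    }
  out[n] = '\0';
  return n > 0;
}
```

If `in' is pointed at a second stream after one line has been read, the next call to `scanner_line' still returns the second line of the first stream, because that text is already in `buf'. Calling `scanner_flush' right after changing `in' makes the following call read from the new stream.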

	8919

	8920

	8921 File: bison.info, Node: Strings are Destroyed, Next: Implementing Gotos/Loops, Prev: How Can I Reset the Parser, Up: FAQ

	8922

	8923 11.3 Strings are Destroyed

	8924 ==========================

	8925

	8926 My parser seems to destroy old strings, or maybe it loses track of

	8927 them. Instead of reporting `"foo", "bar"', it reports

	8928 `"bar", "bar"', or even `"foo\nbar", "bar"'.

	8929

	8930 This error is probably the single most frequent "bug report" sent to

	8931 Bison lists, but it stems only from a misunderstanding of the role

	8932 of the scanner. Consider the following Lex code:

	8933

	8934

	8935 %{

	8936 #include <stdio.h>

	8937 char *yylval = NULL;

	8938 %}

	8939 %%

	8940 .* yylval = yytext; return 1;

	8941 \n /* IGNORE */

	8942 %%

	8943 int

	8944 main ()

	8945 {

	8946   /* Similar to using $1, $2 in a Bison action. */

	8947   char *fst = (yylex (), yylval);

	8948   char *snd = (yylex (), yylval);

	8949   printf ("\"%s\", \"%s\"\n", fst, snd);

	8950   return 0;

	8951 }

	8952

	8953 If you compile and run this code, you get:

	8954

	8955 $ flex -osplit-lines.c split-lines.l

	8956 $ gcc -osplit-lines split-lines.c -ll

	8957 $ printf 'one\ntwo\n' | ./split-lines

	8958 "one

	8959 two", "two"

	8960

	8961 this is because `yytext' is a buffer provided for _reading_ in the

	8962 action, but if you want to keep it, you have to duplicate it (e.g.,

	8963 using `strdup'). Note that the output may depend on how your

	8964 implementation of Lex handles `yytext'. For instance, when given the

	8965 Lex compatibility option `-l' (which triggers the option `%array') Flex

	8966 generates a different behavior:

	8967

	8968 $ flex -l -osplit-lines.c split-lines.l

	8969 $ gcc -osplit-lines split-lines.c -ll

	8970 $ printf 'one\ntwo\n' | ./split-lines

	8971 "two", "two"
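
A minimal sketch of the fix, written in plain C rather than Lex so that it can be compiled on its own. The hypothetical `fake_yylex' reuses one static buffer, much as a scanner reuses `yytext', and `save_text' (also a name invented here) makes the private copy that the parser should keep. Because `strdup' is a POSIX function rather than an ISO C one, the sketch copies with `malloc' and `memcpy' instead.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for yytext: one buffer, overwritten by every token.  */
static char text_buffer[64];

/* Hypothetical scanner step: store TOKEN in the shared buffer and
   return a pointer to that buffer, as `yylval = yytext' would.  */
char *
fake_yylex (char const *token)
{
  assert (strlen (token) < sizeof text_buffer);
  strcpy (text_buffer, token);
  return text_buffer;
}

/* Return a freshly allocated copy of TEXT, or NULL if allocation
   fails.  The caller owns the copy and must free it.  */
char *
save_text (char const *text)
{
  size_t size = strlen (text) + 1;
  char *copy = malloc (size);
  if (copy)
    memcpy (copy, text, size);
  return copy;
}
```

A pointer returned by `fake_yylex' changes meaning as soon as the next token is scanned, whereas a pointer returned by `save_text' keeps its contents. In a real Lex action, the corresponding change would be to assign a copy of `yytext' to `yylval' rather than `yytext' itself, and to free that copy once the parser no longer needs it.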

	8972

	8973

	8974 File: bison.info, Node: Implementing Gotos/Loops, Next: Multiple start-symbols, Prev: Strings are Destroyed, Up: FAQ

	8975

	8976 11.4 Implementing Gotos/Loops

	8977 =============================

	8978

	8979 My simple calculator supports variables, assignments, and functions,

	8980 but how can I implement gotos, or loops?

	8981

	8982 Although very pedagogical, the examples included in the document blur

	8983 the distinction to make between the parser--whose job is to recover the

	8984 structure of a text and to transmit it to subsequent modules of the

	8985 program--and the processing (such as the execution) of this structure.

	8986 This works well with so-called straight-line programs, i.e., precisely

	8987 those that have a straightforward execution model: execute simple

	8988 instructions one after the other.

	8989

	8990 If you want a richer model, you will probably need to use the parser

	8991 to construct a tree that does represent the structure it has recovered;

	8992 this tree is usually called the "abstract syntax tree", or "AST" for

	8993 short. Then, walking through this tree, traversing it in various ways,

	8994 will enable treatments such as its execution or its translation, which

	8995 will result in an interpreter or a compiler.
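
As a hedged illustration of that approach, here is a minimal C sketch of an AST with a tree-walking evaluator. None of it comes from the Bison examples: the node kinds, the constructor names, and the store of 26 variables named `a' to `z' (which assumes a character set where those letters are contiguous, such as ASCII) are assumptions made for this sketch. In a real grammar, the constructors would be called from rule actions, and `eval' would run only after parsing has finished.

```c
#include <assert.h>
#include <stdlib.h>

typedef enum { NUM, VAR, ADD, SUB, ASSIGN, SEQ, WHILE } Kind;

typedef struct Node
{
  Kind kind;
  int value;           /* NUM: literal; VAR, ASSIGN: variable index */
  struct Node *left;   /* operand, condition, or first statement */
  struct Node *right;  /* operand, loop body, or second statement */
} Node;

static int vars[26];   /* one slot per variable `a' .. `z' */

/* Allocate a node.  A parser action would build nodes through the
   helpers below instead of computing a value directly.  */
Node *
node_new (Kind kind, int value, Node *left, Node *right)
{
  Node *n = malloc (sizeof *n);
  assert (n != NULL);
  n->kind = kind;
  n->value = value;
  n->left = left;
  n->right = right;
  return n;
}

Node *num (int v) { return node_new (NUM, v, NULL, NULL); }
Node *var (char name) { return node_new (VAR, name - 'a', NULL, NULL); }
Node *assign (char name, Node *e) { return node_new (ASSIGN, name - 'a', e, NULL); }
Node *binary (Kind k, Node *l, Node *r) { return node_new (k, 0, l, r); }

/* Walk the tree.  Because the loop body is a subtree, it can be
   evaluated as many times as the condition requires.  */
int
eval (Node const *n)
{
  int result = 0;
  switch (n->kind)
    {
    case NUM:    return n->value;
    case VAR:    return vars[n->value];
    case ADD:    return eval (n->left) + eval (n->right);
    case SUB:    return eval (n->left) - eval (n->right);
    case ASSIGN: return vars[n->value] = eval (n->left);
    case SEQ:    eval (n->left); return eval (n->right);
    case WHILE:
      while (eval (n->left) != 0)
        result = eval (n->right);
      return result;
    }
  return result;
}

/* Free a tree built with node_new.  */
void
node_free (Node *n)
{
  if (!n)
    return;
  node_free (n->left);
  node_free (n->right);
  free (n);
}
```

The point of the design is that a `WHILE' node keeps its body as data, so the evaluator can revisit it; an action that executes statements as they are reduced sees each statement only once.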

	8996

	8997 This topic is way beyond the scope of this manual, and the reader is

	8998 invited to consult the dedicated literature.

	8999

	9000

	9001 File: bison.info, Node: Multiple start-symbols, Next: Secure? Conform?, Prev: Implementing Gotos/Loops, Up: FAQ

	9002

	9003 11.5 Multiple start-symbols

	9004 ===========================

	9005

	9006 I have several closely related grammars, and I would like to share their

	9007 implementations. In fact, I could use a single grammar but with

	9008 multiple entry points.

	9009

	9010 Bison does not support multiple start-symbols, but there is a very

	9011 simple means to simulate them. If `foo' and `bar' are the two pseudo

	9012 start-symbols, then introduce two new tokens, say `START_FOO' and

	9013 `START_BAR', and use them as switches from the real start-symbol:

	9014

	9015 %token START_FOO START_BAR;

	9016 %start start;

	9017 start: START_FOO foo

	9018      | START_BAR bar;

	9019

	9020 These tokens prevent the introduction of new conflicts. As far as

	9021 the parser goes, that is all that is needed.

	9022

	9023 Now the difficult part is ensuring that the scanner will send these

	9024 tokens first. If your scanner is hand-written, that should be

	9025 straightforward. If your scanner is generated by Lex, then there is a

	9026 simple means to do it: recall that anything between `%{ ... %}' after

	9027 the first `%%' is copied verbatim at the top of the generated `yylex'

	9028 function. Make sure a variable `start_token' is available in the

	9029 scanner (e.g., a global variable or using `%lex-param' etc.), and use

	9030 the following:

	9031

	9032 /* Prologue. */

	9033 %%

	9034 %{

	9035   if (start_token)

	9036     {

	9037       int t = start_token;

	9038       start_token = 0;

	9039       return t;

	9040     }

	9041 %}

	9042 /* The rules. */
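
For a hand-written scanner, the same idea might look like the following C sketch. It is only an illustration under stated assumptions: the token codes are arbitrary values chosen here (in a real project they come from the header that Bison generates), `scan_begin' is a hypothetical helper, and the scanner recognizes nothing but blank-separated words.

```c
#include <assert.h>
#include <ctype.h>

/* Token codes.  These values are assumptions for the sketch.  */
enum { END = 0, START_FOO = 258, START_BAR = 259, WORD = 260 };

static int start_token;     /* pseudo start-symbol to announce, or 0 */
static char const *input;   /* remaining text to scan */

/* Select the entry point and the text to scan (hypothetical helper).  */
void
scan_begin (int entry, char const *text)
{
  assert (entry == START_FOO || entry == START_BAR);
  start_token = entry;
  input = text;
}

/* Hand-written scanner: return the pending start token first, then
   one WORD per run of non-blank characters, then END.  */
int
yylex (void)
{
  if (start_token)
    {
      int t = start_token;
      start_token = 0;
      return t;
    }
  while (isspace ((unsigned char) *input))
    input++;
  if (*input == '\0')
    return END;
  while (*input != '\0' && !isspace ((unsigned char) *input))
    input++;
  return WORD;
}
```

A caller would invoke `scan_begin (START_FOO, text)' or `scan_begin (START_BAR, text)' before `yyparse', so that the first token the parser sees selects the `foo' or the `bar' branch of `start'.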

	9043

	9044

	9045 File: bison.info, Node: Secure? Conform?, Next: I can't build Bison, Prev: Multiple start-symbols, Up: FAQ

	9046

	9047 11.6 Secure? Conform?

	9048 ======================

	9049

	9050 Is Bison secure? Does it conform to POSIX?

	9051

	9052 If you're looking for a guarantee or certification, we don't provide

	9053 it. However, Bison is intended to be a reliable program that conforms

	9054 to the POSIX specification for Yacc. If you run into problems, please

	9055 send us a bug report.

	9056

	9057

	9058 File: bison.info, Node: I can't build Bison, Next: Where can I find help?, Prev: Secure? Conform?, Up: FAQ

	9059

	9060 11.7 I can't build Bison

	9061 ========================

	9062

	9063 I can't build Bison because `make' complains that

	9064 `msgfmt' is not found.

	9065 What should I do?

	9066

	9067 Like most GNU packages with internationalization support, that

	9068 feature is turned on by default. If you have problems building in the

	9069 `po' subdirectory, it indicates that your system's internationalization

	9070 support is lacking. You can re-configure Bison with `--disable-nls' to

	9071 turn off this support, or you can install GNU gettext from

	9072 `ftp://ftp.gnu.org/gnu/gettext/' and re-configure Bison. See the file

	9073 `ABOUT-NLS' for more information.

	9074

	9075

	9076 File: bison.info, Node: Where can I find help?, Next: Bug Reports, Prev: I can't build Bison, Up: FAQ

	9077

	9078 11.8 Where can I find help?

	9079 ===========================

	9080

	9081 I'm having trouble using Bison. Where can I find help?

	9082

	9083 First, read this fine manual. Beyond that, you can send mail to

	9084 <help-bison@gnu.org>. This mailing list is intended to be populated

	9085 with people who are willing to answer questions about using and

	9086 installing Bison. Please keep in mind that (most of) the people on the

	9087 list have aspects of their lives which are not related to Bison (!), so

	9088 you may not receive an answer to your question right away. This can be

	9089 frustrating, but please try not to honk them off; remember that any

	9090 help they provide is purely voluntary and out of the kindness of their

	9091 hearts.

	9092

	9093

	9094 File: bison.info, Node: Bug Reports, Next: More Languages, Prev: Where can I find help?, Up: FAQ

	9095

	9096 11.9 Bug Reports

	9097 ================

	9098

	9099 I found a bug. What should I include in the bug report?

	9100

	9101 Before you send a bug report, make sure you are using the latest

	9102 version. Check `ftp://ftp.gnu.org/pub/gnu/bison/' or one of its

	9103 mirrors. Be sure to include the version number in your bug report. If

	9104 the bug is present in the latest version but not in a previous version,

	9105 try to determine the most recent version which did not contain the bug.

	9106

	9107 If the bug is parser-related, you should include the smallest grammar

	9108 you can which demonstrates the bug. The grammar file should also be

	9109 complete (i.e., I should be able to run it through Bison without having

	9110 to edit or add anything). The smaller and simpler the grammar, the

	9111 easier it will be to fix the bug.

	9112

	9113 Include information about your compilation environment, including

	9114 your operating system's name and version and your compiler's name and

	9115 version. If you have trouble compiling, you should also include a

	9116 transcript of the build session, starting with the invocation of

	9117 `configure'. Depending on the nature of the bug, you may be asked to

	9118 send additional files as well (such as `config.h' or `config.cache').

	9119

	9120 Patches are most welcome, but not required. That is, do not

	9121 hesitate to send a bug report just because you cannot provide a fix.

	9122

	9123 Send bug reports to <bug-bison@gnu.org>.

	9124

	9125

	9126 File: bison.info, Node: More Languages, Next: Beta Testing, Prev: Bug Reports, Up: FAQ

	9127

	9128 11.10 More Languages

	9129 ====================

	9130

	9131 Will Bison ever have C++ and Java support? How about INSERT YOUR

	9132 FAVORITE LANGUAGE HERE?

	9133

	9134 C++ and Java support is there now, and is documented. We'd love to

	9135 add other languages; contributions are welcome.

	9136

	9137

	9138 File: bison.info, Node: Beta Testing, Next: Mailing Lists, Prev: More Languages, Up: FAQ

	9139

	9140 11.11 Beta Testing

	9141 ==================

	9142

	9143 What is involved in being a beta tester?

	9144

	9145 It's not terribly involved. Basically, you would download a test

	9146 release, compile it, and use it to build and run a parser or two. After

	9147 that, you would submit either a bug report or a message saying that

	9148 everything is okay. It is important to report successes as well as

	9149 failures because test releases eventually become mainstream releases,

	9150 but only if they are adequately tested. If no one tests, development is

	9151 essentially halted.

	9152

	9153 Beta testers are particularly needed for operating systems to which

	9154 the developers do not have easy access. They currently have easy

	9155 access to recent GNU/Linux and Solaris versions. Reports about other

	9156 operating systems are especially welcome.

	9157

	9158

	9159 File: bison.info, Node: Mailing Lists, Prev: Beta Testing, Up: FAQ

	9160

	9161 11.12 Mailing Lists

	9162 ===================

	9163

	9164 How do I join the help-bison and bug-bison mailing lists?

	9165

	9166 See `http://lists.gnu.org/'.

	9167

	9168

	9169 File: bison.info, Node: Table of Symbols, Next: Glossary, Prev: FAQ, Up: Top

	9170

	9171 Appendix A Bison Symbols

	9172 ************************

	9173

	9174 -- Variable: @$

	9175 In an action, the location of the left-hand side of the rule.

	9176 *Note Locations Overview: Locations.

	9177

	9178 -- Variable: @N

	9179 In an action, the location of the N-th symbol of the right-hand

	9180 side of the rule. *Note Locations Overview: Locations.

	9181

	9182 -- Variable: $$

	9183 In an action, the semantic value of the left-hand side of the rule.

	9184 *Note Actions::.

	9185

	9186 -- Variable: $N

	9187 In an action, the semantic value of the N-th symbol of the

	9188 right-hand side of the rule. *Note Actions::.

	9189

	9190 -- Delimiter: %%

	9191 Delimiter used to separate the grammar rule section from the Bison

	9192 declarations section or the epilogue. *Note The Overall Layout of

	9193 a Bison Grammar: Grammar Layout.

	9194

	9195 -- Delimiter: %{CODE%}

	9196 All code listed between `%{' and `%}' is copied directly to the

	9197 output file uninterpreted. Such code forms the prologue of the

	9198 input file. *Note Outline of a Bison Grammar: Grammar Outline.

	9199

	9200 -- Construct: /*...*/

	9201 Comment delimiters, as in C.

	9202

	9203 -- Delimiter: :

	9204 Separates a rule's result from its components. *Note Syntax of

	9205 Grammar Rules: Rules.

	9206

	9207 -- Delimiter: ;

	9208 Terminates a rule. *Note Syntax of Grammar Rules: Rules.

	9209

	9210 -- Delimiter: |

	9211 Separates alternate rules for the same result nonterminal. *Note

	9212 Syntax of Grammar Rules: Rules.

	9213

	9214 -- Directive: <*>

	9215 Used to define a default tagged `%destructor' or default tagged

	9216 `%printer'.

	9217

	9218 This feature is experimental. More user feedback will help to

	9219 determine whether it should become a permanent feature.

	9220

	9221 *Note Freeing Discarded Symbols: Destructor Decl.

	9222

	9223 -- Directive: <>

	9224 Used to define a default tagless `%destructor' or default tagless

	9225 `%printer'.

	9226

	9227 This feature is experimental. More user feedback will help to

	9228 determine whether it should become a permanent feature.

	9229

	9230 *Note Freeing Discarded Symbols: Destructor Decl.

	9231

	9232 -- Symbol: $accept

	9233 The predefined nonterminal whose only rule is `$accept: START

	9234 $end', where START is the start symbol. *Note The Start-Symbol:

	9235 Start Decl. It cannot be used in the grammar.

	9236

	9237 -- Directive: %code {CODE}

	9238 -- Directive: %code QUALIFIER {CODE}

	9239 Insert CODE verbatim into output parser source. *Note %code: Decl

	9240 Summary.

	9241

	9242 -- Directive: %debug

	9243 Equip the parser for debugging. *Note Decl Summary::.

	9244

	9248 -- Directive: %define DEFINE-VARIABLE

	9249 -- Directive: %define DEFINE-VARIABLE VALUE

	9250 Define a variable to adjust Bison's behavior. *Note %define: Decl

	9251 Summary.

	9252

	9253 -- Directive: %defines

	9254 Bison declaration to create a header file meant for the scanner.

	9255 *Note Decl Summary::.

	9256

	9257 -- Directive: %defines DEFINES-FILE

	9258 Same as above, but save in the file DEFINES-FILE. *Note Decl

	9259 Summary::.

	9260

	9261 -- Directive: %destructor

	9262 Specify how the parser should reclaim the memory associated with

	9263 discarded symbols. *Note Freeing Discarded Symbols: Destructor

	9264 Decl.

	9265

	9266 -- Directive: %dprec

	9267 Bison declaration to assign a precedence to a rule that is used at

	9268 parse time to resolve reduce/reduce conflicts. *Note Writing GLR

	9269 Parsers: GLR Parsers.

	9270

	9271 -- Symbol: $end

	9272 The predefined token marking the end of the token stream. It

	9273 cannot be used in the grammar.

	9274

	9275 -- Symbol: error

	9276 A token name reserved for error recovery. This token may be used

	9277 in grammar rules so as to allow the Bison parser to recognize an

	9278 error in the grammar without halting the process. In effect, a

	9279 sentence containing an error may be recognized as valid. On a

	9280 syntax error, the token `error' becomes the current lookahead

	9281 token. Actions corresponding to `error' are then executed, and

	9282 the lookahead token is reset to the token that originally caused

	9283 the violation. *Note Error Recovery::.

	9284

	9285 -- Directive: %error-verbose

	9286 Bison declaration to request verbose, specific error message

	9287 strings when `yyerror' is called.

	9288

	9289 -- Directive: %file-prefix "PREFIX"

	9290 Bison declaration to set the prefix of the output files. *Note

	9291 Decl Summary::.

	9292

	9293 -- Directive: %glr-parser

	9294 Bison declaration to produce a GLR parser. *Note Writing GLR

	9295 Parsers: GLR Parsers.

	9296

	9297 -- Directive: %initial-action

	9298 Run user code before parsing. *Note Performing Actions before

	9299 Parsing: Initial Action Decl.

	9300

	9301 -- Directive: %language

	9302 Specify the programming language for the generated parser. *Note

	9303 Decl Summary::.

	9304

	9305 -- Directive: %left

	9306 Bison declaration to assign left associativity to token(s). *Note

	9307 Operator Precedence: Precedence Decl.

	9308

	9309 -- Directive: %lex-param {ARGUMENT-DECLARATION}

	9310 Bison declaration to specify an additional parameter that

	9311 `yylex' should accept. *Note Calling Conventions for Pure

	9312 Parsers: Pure Calling.

	9313

	9314 -- Directive: %merge

	9315 Bison declaration to assign a merging function to a rule. If

	9316 there is a reduce/reduce conflict with a rule having the same

	9317 merging function, the function is applied to the two semantic

	9318 values to get a single result. *Note Writing GLR Parsers: GLR

	9319 Parsers.

	9320

	9321 -- Directive: %name-prefix "PREFIX"

	9322 Bison declaration to rename the external symbols. *Note Decl

	9323 Summary::.

	9324

	9325 -- Directive: %no-lines

	9326 Bison declaration to avoid generating `#line' directives in the

	9327 parser file. *Note Decl Summary::.

	9328

	9329 -- Directive: %nonassoc

	9330 Bison declaration to assign nonassociativity to token(s). *Note

	9331 Operator Precedence: Precedence Decl.

	9332

	9333 -- Directive: %output "FILE"

	9334 Bison declaration to set the name of the parser file. *Note Decl

	9335 Summary::.

	9336

	9337 -- Directive: %parse-param {ARGUMENT-DECLARATION}

	9338 Bison declaration to specify an additional parameter that

	9339 `yyparse' should accept. *Note The Parser Function `yyparse':

	9340 Parser Function.

	9341

	9342 -- Directive: %prec

	9343 Bison declaration to assign a precedence to a specific rule.

	9344 *Note Context-Dependent Precedence: Contextual Precedence.

	9345

	9346 -- Directive: %pure-parser

	9347 Deprecated version of `%define api.pure' (*note %define: Decl

	9348 Summary.), for which Bison is more careful to warn about

	9349 unreasonable usage.

	9350

	9351 -- Directive: %require "VERSION"

	9352 Require version VERSION or higher of Bison. *Note Require a

	9353 Version of Bison: Require Decl.

	9354

	9355 -- Directive: %right

	9356 Bison declaration to assign right associativity to token(s).

	9357 *Note Operator Precedence: Precedence Decl.

	9358

	9359 -- Directive: %skeleton

	9360 Specify the skeleton to use; usually for development. *Note Decl

	9361 Summary::.

	9362

	9363 -- Directive: %start

	9364 Bison declaration to specify the start symbol. *Note The

	9365 Start-Symbol: Start Decl.

	9366

	9367 -- Directive: %token

	9368 Bison declaration to declare token(s) without specifying

	9369 precedence. *Note Token Type Names: Token Decl.

	9370

	9371 -- Directive: %token-table

	9372 Bison declaration to include a token name table in the parser file.

	9373 *Note Decl Summary::.

	9374

	9375 -- Directive: %type

	9376 Bison declaration to declare nonterminals. *Note Nonterminal

	9377 Symbols: Type Decl.

	9378

	9379 -- Symbol: $undefined

	9380 The predefined token onto which all undefined values returned by

	9381 `yylex' are mapped. It cannot be used in the grammar, rather, use

	9382 `error'.

	9383

	9384 -- Directive: %union

	9385 Bison declaration to specify several possible data types for

	9386 semantic values. *Note The Collection of Value Types: Union Decl.

	9387

	9388 -- Macro: YYABORT

	9389 Macro to pretend that an unrecoverable syntax error has occurred,

	9390 by making `yyparse' return 1 immediately. The error reporting

	9391 function `yyerror' is not called. *Note The Parser Function

	9392 `yyparse': Parser Function.

	9393

	9394 For Java parsers, this functionality is invoked using `return

	9395 YYABORT;' instead.

	9396

	9397 -- Macro: YYACCEPT

	9398 Macro to pretend that a complete utterance of the language has been

	9399 read, by making `yyparse' return 0 immediately. *Note The Parser

	9400 Function `yyparse': Parser Function.

	9401

	9402 For Java parsers, this functionality is invoked using `return

	9403 YYACCEPT;' instead.

	9404

	9405 -- Macro: YYBACKUP

	9406 Macro to discard a value from the parser stack and fake a lookahead

	9407 token. *Note Special Features for Use in Actions: Action Features.

	9408

	9409 -- Variable: yychar

	9410 External integer variable that contains the integer value of the

	9411 lookahead token. (In a pure parser, it is a local variable within

	9412 `yyparse'.) Error-recovery rule actions may examine this variable.

	9413 *Note Special Features for Use in Actions: Action Features.

	9414

	9415 -- Macro: yyclearin

	9416 Macro used in error-recovery rule actions. It clears the previous

	9417 lookahead token. *Note Error Recovery::.

	9418

	9419 -- Macro: YYDEBUG

	9420 Macro to define to equip the parser with tracing code. *Note

	9421 Tracing Your Parser: Tracing.

	9422

	9423 -- Variable: yydebug

	9424 External integer variable set to zero by default. If `yydebug' is

	9425 given a nonzero value, the parser will output information on input

	9426 symbols and parser action. *Note Tracing Your Parser: Tracing.

	9427

	9428 -- Macro: yyerrok

	9429 Macro to cause parser to recover immediately to its normal mode

	9430 after a syntax error. *Note Error Recovery::.

	9431

	9432 -- Macro: YYERROR

	9433 Macro to pretend that a syntax error has just been detected: call

	9434 `yyerror' and then perform normal error recovery if possible

	9435 (*note Error Recovery::), or (if recovery is impossible) make

	9436 `yyparse' return 1. *Note Error Recovery::.

	9437

	9438 For Java parsers, this functionality is invoked using `return

	9439 YYERROR;' instead.

	9440

	9441 -- Function: yyerror

	9442 User-supplied function to be called by `yyparse' on error. *Note

	9443 The Error Reporting Function `yyerror': Error Reporting.

	9444

	9445 -- Macro: YYERROR_VERBOSE

	9446 An obsolete macro that you define with `#define' in the prologue

	9447 to request verbose, specific error message strings when `yyerror'

	9448 is called. It doesn't matter what definition you use for

	9449 `YYERROR_VERBOSE', just whether you define it. Using

	9450 `%error-verbose' is preferred.

	9451

	9452 -- Macro: YYINITDEPTH

	9453 Macro for specifying the initial size of the parser stack. *Note

	9454 Memory Management::.

	9455

	9456 -- Function: yylex

	9457 User-supplied lexical analyzer function, called with no arguments

	9458 to get the next token. *Note The Lexical Analyzer Function

	9459 `yylex': Lexical.

	9460

	9461 -- Macro: YYLEX_PARAM

	9462 An obsolete macro for specifying an extra argument (or list of

	9463 extra arguments) for `yyparse' to pass to `yylex'. The use of this

	9464 macro is deprecated, and is supported only for Yacc-like parsers.

	9465 *Note Calling Conventions for Pure Parsers: Pure Calling.

	9466

	9467 -- Variable: yylloc

	9468 External variable in which `yylex' should place the line and column

	9469 numbers associated with a token. (In a pure parser, it is a local

	9470 variable within `yyparse', and its address is passed to `yylex'.)

	9471 You can ignore this variable if you don't use the `@' feature in

	9472 the grammar actions. *Note Textual Locations of Tokens: Token

	9473 Locations. In semantic actions, it stores the location of the

	9474 lookahead token. *Note Actions and Locations: Actions and

	9475 Locations.

	9476

	9477 -- Type: YYLTYPE

	9478 Data type of `yylloc'; by default, a structure with four members.

	9479 *Note Data Types of Locations: Location Type.

	9480

	9481 -- Variable: yylval

	9482 External variable in which `yylex' should place the semantic value

	9483 associated with a token. (In a pure parser, it is a local

	9484 variable within `yyparse', and its address is passed to `yylex'.)

	9485 *Note Semantic Values of Tokens: Token Values. In semantic

	9486 actions, it stores the semantic value of the lookahead token.

	9487 *Note Actions: Actions.

	9488

	9489 -- Macro: YYMAXDEPTH

	9490 Macro for specifying the maximum size of the parser stack. *Note

	9491 Memory Management::.

	9492

	9493 -- Variable: yynerrs

	9494 Global variable which Bison increments each time it reports a

	9495 syntax error. (In a pure parser, it is a local variable within

	9496 `yyparse'. In a pure push parser, it is a member of yypstate.)

	9497 *Note The Error Reporting Function `yyerror': Error Reporting.

	9498

	9499 -- Function: yyparse

	9500 The parser function produced by Bison; call this function to start

	9501 parsing. *Note The Parser Function `yyparse': Parser Function.

	9502

	9503 -- Function: yypstate_delete

	9504 The function to delete a parser instance, produced by Bison in

	9505 push mode; call this function to delete the memory associated with

	9506 a parser. *Note The Parser Delete Function `yypstate_delete':

	9507 Parser Delete Function. (The current push parsing interface is

	9508 experimental and may evolve. More user feedback will help to

	9509 stabilize it.)

	9510

	9511 -- Function: yypstate_new

	9512 The function to create a parser instance, produced by Bison in

	9513 push mode; call this function to create a new parser. *Note The

	9514 Parser Create Function `yypstate_new': Parser Create Function.

	9515 (The current push parsing interface is experimental and may evolve.

	9516 More user feedback will help to stabilize it.)

	9517

	9518 -- Function: yypull_parse

	9519 The parser function produced by Bison in push mode; call this

	9520 function to parse the rest of the input stream. *Note The Pull

	9521 Parser Function `yypull_parse': Pull Parser Function. (The

	9522 current push parsing interface is experimental and may evolve.

	9523 More user feedback will help to stabilize it.)

	9524

	9525 -- Function: yypush_parse

	9526 The parser function produced by Bison in push mode; call this

	9527 function to parse a single token. *Note The Push Parser Function

	9528 `yypush_parse': Push Parser Function. (The current push parsing

	9529 interface is experimental and may evolve. More user feedback will

	9530 help to stabilize it.)

	9531

	9532 -- Macro: YYPARSE_PARAM

	9533 An obsolete macro for specifying the name of a parameter that

	9534 `yyparse' should accept. The use of this macro is deprecated, and

	9535 is supported only for Yacc-like parsers. *Note Calling

	9536 Conventions for Pure Parsers: Pure Calling.

	9537

	9538 -- Macro: YYRECOVERING

	9539 The expression `YYRECOVERING ()' yields 1 when the parser is

	9540 recovering from a syntax error, and 0 otherwise. *Note Special

	9541 Features for Use in Actions: Action Features.

	9542

	9543 -- Macro: YYSTACK_USE_ALLOCA

	9544 Macro used to control the use of `alloca' when the C LALR(1)

	9545 parser needs to extend its stacks. If defined to 0, the parser

	9546 will use `malloc' to extend its stacks. If defined to 1, the

	9547 parser will use `alloca'. Values other than 0 and 1 are reserved

	9548 for future Bison extensions. If not defined, `YYSTACK_USE_ALLOCA'

	9549 defaults to 0.

	9550

	9551 In the all-too-common case where your code may run on a host with a

	9552 limited stack and with unreliable stack-overflow checking, you

	9553 should set `YYMAXDEPTH' to a value that cannot possibly result in

	9554 unchecked stack overflow on any of your target hosts when `alloca'

	9555 is called. You can inspect the code that Bison generates in order

	9556 to determine the proper numeric values. This will require some

	9557 expertise in low-level implementation details.

	9558

	9559 -- Type: YYSTYPE

	9560 Data type of semantic values; `int' by default. *Note Data Types

	9561 of Semantic Values: Value Type.

	9562

	9563

	9564 File: bison.info, Node: Glossary, Next: Copying This Manual, Prev: Table of Symbols, Up: Top

	9565

	9566 Appendix B Glossary

	9567 *******************

	9568

	9569 Backus-Naur Form (BNF; also called "Backus Normal Form")

	9570 Formal method of specifying context-free grammars originally

	9571 proposed by John Backus, and slightly improved by Peter Naur in

	9572 his 1960-01-02 committee document contributing to what became the

	9573 Algol 60 report. *Note Languages and Context-Free Grammars:

	9574 Language and Grammar.

	9575

	9576 Context-free grammars

	9577 Grammars specified as rules that can be applied regardless of

	9578 context. Thus, if there is a rule which says that an integer can

	9579 be used as an expression, integers are allowed _anywhere_ an

	9580 expression is permitted. *Note Languages and Context-Free

	9581 Grammars: Language and Grammar.

	9582

	9583 Dynamic allocation

	9584 Allocation of memory that occurs during execution, rather than at

	9585 compile time or on entry to a function.

	9586

	9587 Empty string

	9588 Analogous to the empty set in set theory, the empty string is a

	9589 character string of length zero.

	9590

	9591 Finite-state stack machine

	9592 A "machine" that has discrete states in which it is said to exist

	9593 at each instant in time. As input to the machine is processed, the

	9594 machine moves from state to state as specified by the logic of the

	9595 machine. In the case of the parser, the input is the language

	9596 being parsed, and the states correspond to various stages in the

	9597 grammar rules. *Note The Bison Parser Algorithm: Algorithm.

	9598

	9599 Generalized LR (GLR)

	9600 A parsing algorithm that can handle all context-free grammars,

	9601 including those that are not LALR(1). It resolves situations that

	9602 Bison's usual LALR(1) algorithm cannot by effectively splitting

	9603 off multiple parsers, trying all possible parsers, and discarding

	9604 those that fail in the light of additional right context. *Note

	9605 Generalized LR Parsing: Generalized LR Parsing.

	9606

	9607 Grouping

	9608 A language construct that is (in general) grammatically divisible;

	9609 for example, `expression' or `declaration' in C. *Note Languages

	9610 and Context-Free Grammars: Language and Grammar.

	9611

	9612 Infix operator

	9613 An arithmetic operator that is placed between the operands on

	9614 which it performs some operation.

	9615

	9616 Input stream

	9617 A continuous flow of data between devices or programs.

	9618

	9619 Language construct

	9620 One of the typical usage schemas of the language. For example,

	9621 one of the constructs of the C language is the `if' statement.

	9622 *Note Languages and Context-Free Grammars: Language and Grammar.

	9623

	9624 Left associativity

	9625 Operators having left associativity are analyzed from left to

	9626 right: `a+b+c' first computes `a+b' and then combines with `c'.

	9627 *Note Operator Precedence: Precedence.
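The effect of associativity is easier to see with subtraction than with addition, because the grouping changes the result. The following C fragment is only an illustrative sketch; the function names `group_left' and `group_right' are hypothetical and do not come from Bison.

```c
#include <assert.h>  /* used by the checks in the note below */

/* Left-associative grouping: (a - b) - c.
   This is how C itself parses a - b - c. */
int group_left(int a, int b, int c)
{
    return (a - b) - c;
}

/* Right-associative grouping: a - (b - c). */
int group_right(int a, int b, int c)
{
    return a - (b - c);
}
```

With the operands 10, 4 and 3, `group_left' yields 3 while `group_right' yields 9, so a grammar must state which grouping it intends.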

	9628

	9629 Left recursion

	9630 A rule whose result symbol is also its first component symbol; for

	9631 example, `expseq1 : expseq1 ',' exp;'. *Note Recursive Rules:

	9632 Recursion.

	9633

	9634 Left-to-right parsing

	9635 Parsing a sentence of a language by analyzing it token by token

	9636 from left to right. *Note The Bison Parser Algorithm: Algorithm.

	9637

	9638 Lexical analyzer (scanner)

	9639 A function that reads an input stream and returns tokens one by

	9640 one. *Note The Lexical Analyzer Function `yylex': Lexical.

	9641

	9642 Lexical tie-in

	9643 A flag, set by actions in the grammar rules, which alters the way

	9644 tokens are parsed. *Note Lexical Tie-ins::.

	9645

	9646 Literal string token

	9647 A token which consists of two or more fixed characters. *Note

	9648 Symbols::.

	9649

	9650 Lookahead token

	9651 A token already read but not yet shifted. *Note Lookahead Tokens:

	9652 Lookahead.

	9653

	9654 LALR(1)

	9655 The class of context-free grammars that Bison (like most other

	9656 parser generators) can handle; a subset of LR(1). *Note

	9657 Mysterious Reduce/Reduce Conflicts: Mystery Conflicts.

	9658

	9659 LR(1)

	9660 The class of context-free grammars in which at most one token of

	9661 lookahead is needed to disambiguate the parsing of any piece of

	9662 input.

	9663

	9664 Nonterminal symbol

	9665 A grammar symbol standing for a grammatical construct that can be

	9666 expressed through rules in terms of smaller constructs; in other

	9667 words, a construct that is not a token. *Note Symbols::.

	9668

	9669 Parser

	9670 A function that recognizes valid sentences of a language by

	9671 analyzing the syntax structure of a set of tokens passed to it

	9672 from a lexical analyzer.

	9673

	9674 Postfix operator

	9675 An arithmetic operator that is placed after the operands upon

	9676 which it performs some operation.

	9677

	9678 Reduction

	9679 Replacing a string of nonterminals and/or terminals with a single

	9680 nonterminal, according to a grammar rule. *Note The Bison Parser

	9681 Algorithm: Algorithm.

	9682

	9683 Reentrant

	9684 A reentrant subprogram is a subprogram which can be invoked any

	9685 number of times in parallel, without interference between the

	9686 various invocations. *Note A Pure (Reentrant) Parser: Pure Decl.

	9687

	9688 Reverse Polish notation

	9689 A language in which all operators are postfix operators.

	9690

	9691 Right recursion

	9692 A rule whose result symbol is also its last component symbol; for

	9693 example, `expseq1: exp ',' expseq1;'. *Note Recursive Rules:

	9694 Recursion.

	9695

	9696 Semantics

	9697 In computer languages, the semantics are specified by the actions

	9698 taken for each instance of the language, i.e., the meaning of each

	9699 statement. *Note Defining Language Semantics: Semantics.

	9700

	9701 Shift

	9702 A parser is said to shift when it makes the choice of analyzing

	9703 further input from the stream rather than reducing immediately some

	9704 already-recognized rule. *Note The Bison Parser Algorithm:

	9705 Algorithm.

	9706

	9707 Single-character literal

	9708 A single character that is recognized and interpreted as is.

	9709 *Note From Formal Rules to Bison Input: Grammar in Bison.

	9710

	9711 Start symbol

	9712 The nonterminal symbol that stands for a complete valid utterance

	9713 in the language being parsed. The start symbol is usually listed

	9714 as the first nonterminal symbol in a language specification.

	9715 *Note The Start-Symbol: Start Decl.

	9716

	9717 Symbol table

	9718 A data structure where symbol names and associated data are stored

	9719 during parsing to allow for recognition and use of existing

	9720 information in repeated uses of a symbol. *Note Multi-function

	9721 Calc::.
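A minimal symbol table can be kept as a linked list, as sketched below in C. The structure layout and the names `sym_lookup' and `sym_intern' are hypothetical and chosen for this illustration only; a real parser may well prefer a hash table.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One entry in a singly linked symbol table (hypothetical layout). */
struct symbol {
    char *name;
    double value;
    struct symbol *next;
};

/* Find NAME in TABLE; return its entry, or NULL if absent. */
struct symbol *sym_lookup(struct symbol *table, const char *name)
{
    struct symbol *p;

    for (p = table; p != NULL; p = p->next)
        if (strcmp(p->name, name) == 0)
            return p;
    return NULL;
}

/* Return the entry for NAME, creating it with value 0 at the head
   of *TABLE on first use.  Return NULL if memory runs out. */
struct symbol *sym_intern(struct symbol **table, const char *name)
{
    struct symbol *p;

    assert(table != NULL && name != NULL);

    p = sym_lookup(*table, name);
    if (p != NULL)
        return p;

    p = malloc(sizeof *p);
    if (p == NULL)
        return NULL;
    p->name = malloc(strlen(name) + 1);
    if (p->name == NULL) {
        free(p);
        return NULL;
    }
    strcpy(p->name, name);
    p->value = 0.0;
    p->next = *table;
    *table = p;
    return p;
}
```

Interning the same name twice returns the same entry, so a value stored on the first use of a symbol is found again on every later use.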

	9722

	9723 Syntax error

	9724 An error encountered during parsing of an input stream due to

	9725 invalid syntax. *Note Error Recovery::.

	9726

	9727 Token

	9728 A basic, grammatically indivisible unit of a language. The symbol

	9729 that describes a token in the grammar is a terminal symbol. The

	9730 input of the Bison parser is a stream of tokens which comes from

	9731 the lexical analyzer. *Note Symbols::.

	9732

	9733 Terminal symbol

	9734 A grammar symbol that has no rules in the grammar and therefore is

	9735 grammatically indivisible. The piece of text it represents is a

	9736 token. *Note Languages and Context-Free Grammars: Language and

	9737 Grammar.

	9738

	9739

	9740 File: bison.info, Node: Copying This Manual, Next: Index, Prev: Glossary, Up: Top

	9741

	9742 Appendix C Copying This Manual

	9743 ******************************

	9744

	9745 Version 1.2, November 2002

	9746

	9747 Copyright (C) 2000,2001,2002 Free Software Foundation, Inc.

	9748 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA

	9749

	9750 Everyone is permitted to copy and distribute verbatim copies

	9751 of this license document, but changing it is not allowed.

	9752

	9753 0. PREAMBLE

	9754

	9755 The purpose of this License is to make a manual, textbook, or other

	9756 functional and useful document "free" in the sense of freedom: to

	9757 assure everyone the effective freedom to copy and redistribute it,

	9758 with or without modifying it, either commercially or

	9759 noncommercially. Secondarily, this License preserves for the

	9760 author and publisher a way to get credit for their work, while not

	9761 being considered responsible for modifications made by others.

	9762

	9763 This License is a kind of "copyleft", which means that derivative

	9764 works of the document must themselves be free in the same sense.

	9765 It complements the GNU General Public License, which is a copyleft

	9766 license designed for free software.

	9767

	9768 We have designed this License in order to use it for manuals for

	9769 free software, because free software needs free documentation: a

	9770 free program should come with manuals providing the same freedoms

	9771 that the software does. But this License is not limited to

	9772 software manuals; it can be used for any textual work, regardless

	9773 of subject matter or whether it is published as a printed book.

	9774 We recommend this License principally for works whose purpose is

	9775 instruction or reference.

	9776

	9777 1. APPLICABILITY AND DEFINITIONS

	9778

	9779 This License applies to any manual or other work, in any medium,

	9780 that contains a notice placed by the copyright holder saying it

	9781 can be distributed under the terms of this License. Such a notice

	9782 grants a world-wide, royalty-free license, unlimited in duration,

	9783 to use that work under the conditions stated herein. The

	9784 "Document", below, refers to any such manual or work. Any member

	9785 of the public is a licensee, and is addressed as "you". You

	9786 accept the license if you copy, modify or distribute the work in a

	9787 way requiring permission under copyright law.

	9788

	9789 A "Modified Version" of the Document means any work containing the

	9790 Document or a portion of it, either copied verbatim, or with

	9791 modifications and/or translated into another language.

	9792

	9793 A "Secondary Section" is a named appendix or a front-matter section

	9794 of the Document that deals exclusively with the relationship of the

	9795 publishers or authors of the Document to the Document's overall

	9796 subject (or to related matters) and contains nothing that could

	9797 fall directly within that overall subject. (Thus, if the Document

	9798 is in part a textbook of mathematics, a Secondary Section may not

	9799 explain any mathematics.) The relationship could be a matter of

	9800 historical connection with the subject or with related matters, or

	9801 of legal, commercial, philosophical, ethical or political position

	9802 regarding them.

	9803

	9804 The "Invariant Sections" are certain Secondary Sections whose

	9805 titles are designated, as being those of Invariant Sections, in

	9806 the notice that says that the Document is released under this

	9807 License. If a section does not fit the above definition of

	9808 Secondary then it is not allowed to be designated as Invariant.

	9809 The Document may contain zero Invariant Sections. If the Document

	9810 does not identify any Invariant Sections then there are none.

	9811

	9812 The "Cover Texts" are certain short passages of text that are

	9813 listed, as Front-Cover Texts or Back-Cover Texts, in the notice

	9814 that says that the Document is released under this License. A

	9815 Front-Cover Text may be at most 5 words, and a Back-Cover Text may

	9816 be at most 25 words.

	9817

	9818 A "Transparent" copy of the Document means a machine-readable copy,

	9819 represented in a format whose specification is available to the

	9820 general public, that is suitable for revising the document

	9821 straightforwardly with generic text editors or (for images

	9822 composed of pixels) generic paint programs or (for drawings) some

	9823 widely available drawing editor, and that is suitable for input to

	9824 text formatters or for automatic translation to a variety of

	9825 formats suitable for input to text formatters. A copy made in an

	9826 otherwise Transparent file format whose markup, or absence of

	9827 markup, has been arranged to thwart or discourage subsequent

	9828 modification by readers is not Transparent. An image format is

	9829 not Transparent if used for any substantial amount of text. A

	9830 copy that is not "Transparent" is called "Opaque".

	9831

	9832 Examples of suitable formats for Transparent copies include plain

	9833 ASCII without markup, Texinfo input format, LaTeX input format,

	9834 SGML or XML using a publicly available DTD, and

	9835 standard-conforming simple HTML, PostScript or PDF designed for

	9836 human modification. Examples of transparent image formats include

	9837 PNG, XCF and JPG. Opaque formats include proprietary formats that

	9838 can be read and edited only by proprietary word processors, SGML or

	9839 XML for which the DTD and/or processing tools are not generally

	9840 available, and the machine-generated HTML, PostScript or PDF

	9841 produced by some word processors for output purposes only.

	9842

	9843 The "Title Page" means, for a printed book, the title page itself,

	9844 plus such following pages as are needed to hold, legibly, the

	9845 material this License requires to appear in the title page. For

	9846 works in formats which do not have any title page as such, "Title

	9847 Page" means the text near the most prominent appearance of the

	9848 work's title, preceding the beginning of the body of the text.

	9849

	9850 A section "Entitled XYZ" means a named subunit of the Document

	9851 whose title either is precisely XYZ or contains XYZ in parentheses

	9852 following text that translates XYZ in another language. (Here XYZ

	9853 stands for a specific section name mentioned below, such as

	9854 "Acknowledgements", "Dedications", "Endorsements", or "History".)

	9855 To "Preserve the Title" of such a section when you modify the

	9856 Document means that it remains a section "Entitled XYZ" according

	9857 to this definition.

	9858

	9859 The Document may include Warranty Disclaimers next to the notice

	9860 which states that this License applies to the Document. These

	9861 Warranty Disclaimers are considered to be included by reference in

	9862 this License, but only as regards disclaiming warranties: any other

	9863 implication that these Warranty Disclaimers may have is void and

	9864 has no effect on the meaning of this License.

	9865

	9866 2. VERBATIM COPYING

	9867

	9868 You may copy and distribute the Document in any medium, either

	9869 commercially or noncommercially, provided that this License, the

	9870 copyright notices, and the license notice saying this License

	9871 applies to the Document are reproduced in all copies, and that you

	9872 add no other conditions whatsoever to those of this License. You

	9873 may not use technical measures to obstruct or control the reading

	9874 or further copying of the copies you make or distribute. However,

	9875 you may accept compensation in exchange for copies. If you

	9876 distribute a large enough number of copies you must also follow

	9877 the conditions in section 3.

	9878

	9879 You may also lend copies, under the same conditions stated above,

	9880 and you may publicly display copies.

	9881

	9882 3. COPYING IN QUANTITY

	9883

	9884 If you publish printed copies (or copies in media that commonly

	9885 have printed covers) of the Document, numbering more than 100, and

	9886 the Document's license notice requires Cover Texts, you must

	9887 enclose the copies in covers that carry, clearly and legibly, all

	9888 these Cover Texts: Front-Cover Texts on the front cover, and

	9889 Back-Cover Texts on the back cover. Both covers must also clearly

	9890 and legibly identify you as the publisher of these copies. The

	9891 front cover must present the full title with all words of the

	9892 title equally prominent and visible. You may add other material

	9893 on the covers in addition. Copying with changes limited to the

	9894 covers, as long as they preserve the title of the Document and

	9895 satisfy these conditions, can be treated as verbatim copying in

	9896 other respects.

	9897

	9898 If the required texts for either cover are too voluminous to fit

	9899 legibly, you should put the first ones listed (as many as fit

	9900 reasonably) on the actual cover, and continue the rest onto

	9901 adjacent pages.

	9902

	9903 If you publish or distribute Opaque copies of the Document

	9904 numbering more than 100, you must either include a

	9905 machine-readable Transparent copy along with each Opaque copy, or

	9906 state in or with each Opaque copy a computer-network location from

	9907 which the general network-using public has access to download

	9908 using public-standard network protocols a complete Transparent

	9909 copy of the Document, free of added material. If you use the

	9910 latter option, you must take reasonably prudent steps, when you

	9911 begin distribution of Opaque copies in quantity, to ensure that

	9912 this Transparent copy will remain thus accessible at the stated

	9913 location until at least one year after the last time you

	9914 distribute an Opaque copy (directly or through your agents or

	9915 retailers) of that edition to the public.

	9916

	9917 It is requested, but not required, that you contact the authors of

	9918 the Document well before redistributing any large number of

	9919 copies, to give them a chance to provide you with an updated

	9920 version of the Document.

	9921

	9922 4. MODIFICATIONS

	9923

	9924 You may copy and distribute a Modified Version of the Document

	9925 under the conditions of sections 2 and 3 above, provided that you

	9926 release the Modified Version under precisely this License, with

	9927 the Modified Version filling the role of the Document, thus

	9928 licensing distribution and modification of the Modified Version to

	9929 whoever possesses a copy of it. In addition, you must do these

	9930 things in the Modified Version:

	9931

	9932 A. Use in the Title Page (and on the covers, if any) a title

	9933 distinct from that of the Document, and from those of

	9934 previous versions (which should, if there were any, be listed

	9935 in the History section of the Document). You may use the

	9936 same title as a previous version if the original publisher of

	9937 that version gives permission.

	9938

	9939 B. List on the Title Page, as authors, one or more persons or

	9940 entities responsible for authorship of the modifications in

	9941 the Modified Version, together with at least five of the

	9942 principal authors of the Document (all of its principal

	9943 authors, if it has fewer than five), unless they release you

	9944 from this requirement.

	9945

	9946 C. State on the Title page the name of the publisher of the

	9947 Modified Version, as the publisher.

	9948

	9949 D. Preserve all the copyright notices of the Document.

	9950

	9951 E. Add an appropriate copyright notice for your modifications

	9952 adjacent to the other copyright notices.

	9953

	9954 F. Include, immediately after the copyright notices, a license

	9955 notice giving the public permission to use the Modified

	9956 Version under the terms of this License, in the form shown in

	9957 the Addendum below.

	9958

	9959 G. Preserve in that license notice the full lists of Invariant

	9960 Sections and required Cover Texts given in the Document's

	9961 license notice.

	9962

	9963 H. Include an unaltered copy of this License.

	9964

	9965 I. Preserve the section Entitled "History", Preserve its Title,

	9966 and add to it an item stating at least the title, year, new

	9967 authors, and publisher of the Modified Version as given on

	9968 the Title Page. If there is no section Entitled "History" in

	9969 the Document, create one stating the title, year, authors,

	9970 and publisher of the Document as given on its Title Page,

	9971 then add an item describing the Modified Version as stated in

	9972 the previous sentence.

	9973

	9974 J. Preserve the network location, if any, given in the Document

	9975 for public access to a Transparent copy of the Document, and

	9976 likewise the network locations given in the Document for

	9977 previous versions it was based on. These may be placed in

	9978 the "History" section. You may omit a network location for a

	9979 work that was published at least four years before the

	9980 Document itself, or if the original publisher of the version

	9981 it refers to gives permission.

	9982

	9983 K. For any section Entitled "Acknowledgements" or "Dedications",

	9984 Preserve the Title of the section, and preserve in the

	9985 section all the substance and tone of each of the contributor

	9986 acknowledgements and/or dedications given therein.

	9987

	9988 L. Preserve all the Invariant Sections of the Document,

	9989 unaltered in their text and in their titles. Section numbers

	9990 or the equivalent are not considered part of the section

	9991 titles.

	9992

	9993 M. Delete any section Entitled "Endorsements". Such a section

	9994 may not be included in the Modified Version.

	9995

	9996 N. Do not retitle any existing section to be Entitled

	9997 "Endorsements" or to conflict in title with any Invariant

	9998 Section.

	9999

	10000 O. Preserve any Warranty Disclaimers.

	10001

	10002 If the Modified Version includes new front-matter sections or

	10003 appendices that qualify as Secondary Sections and contain no

	10004 material copied from the Document, you may at your option

	10005 designate some or all of these sections as invariant. To do this,

	10006 add their titles to the list of Invariant Sections in the Modified

	10007 Version's license notice. These titles must be distinct from any

	10008 other section titles.

	10009

	10010 You may add a section Entitled "Endorsements", provided it contains

	10011 nothing but endorsements of your Modified Version by various

	10012 parties--for example, statements of peer review or that the text

	10013 has been approved by an organization as the authoritative

	10014 definition of a standard.

	10015

	10016 You may add a passage of up to five words as a Front-Cover Text,

	10017 and a passage of up to 25 words as a Back-Cover Text, to the end

	10018 of the list of Cover Texts in the Modified Version. Only one

	10019 passage of Front-Cover Text and one of Back-Cover Text may be

	10020 added by (or through arrangements made by) any one entity. If the

	10021 Document already includes a cover text for the same cover,

	10022 previously added by you or by arrangement made by the same entity

	10023 you are acting on behalf of, you may not add another; but you may

	10024 replace the old one, on explicit permission from the previous

	10025 publisher that added the old one.

	10026

	10027 The author(s) and publisher(s) of the Document do not by this

	10028 License give permission to use their names for publicity for or to

	10029 assert or imply endorsement of any Modified Version.

	10030

	10031 5. COMBINING DOCUMENTS

	10032

	10033 You may combine the Document with other documents released under

	10034 this License, under the terms defined in section 4 above for

	10035 modified versions, provided that you include in the combination

	10036 all of the Invariant Sections of all of the original documents,

	10037 unmodified, and list them all as Invariant Sections of your

	10038 combined work in its license notice, and that you preserve all

	10039 their Warranty Disclaimers.

	10040

	10041 The combined work need only contain one copy of this License, and

	10042 multiple identical Invariant Sections may be replaced with a single

	10043 copy. If there are multiple Invariant Sections with the same name

	10044 but different contents, make the title of each such section unique

	10045 by adding at the end of it, in parentheses, the name of the

	10046 original author or publisher of that section if known, or else a

	10047 unique number. Make the same adjustment to the section titles in

	10048 the list of Invariant Sections in the license notice of the

	10049 combined work.

	10050

	10051 In the combination, you must combine any sections Entitled

	10052 "History" in the various original documents, forming one section

	10053 Entitled "History"; likewise combine any sections Entitled

	10054 "Acknowledgements", and any sections Entitled "Dedications". You

	10055 must delete all sections Entitled "Endorsements."

	10056

	10057 6. COLLECTIONS OF DOCUMENTS

	10058

	10059 You may make a collection consisting of the Document and other

	10060 documents released under this License, and replace the individual

	10061 copies of this License in the various documents with a single copy

	10062 that is included in the collection, provided that you follow the

	10063 rules of this License for verbatim copying of each of the

	10064 documents in all other respects.

	10065

	10066 You may extract a single document from such a collection, and

	10067 distribute it individually under this License, provided you insert

	10068 a copy of this License into the extracted document, and follow

	10069 this License in all other respects regarding verbatim copying of

	10070 that document.

	10071

	10072 7. AGGREGATION WITH INDEPENDENT WORKS

	10073

	10074 A compilation of the Document or its derivatives with other

	10075 separate and independent documents or works, in or on a volume of

	10076 a storage or distribution medium, is called an "aggregate" if the

	10077 copyright resulting from the compilation is not used to limit the

	10078 legal rights of the compilation's users beyond what the individual

	10079 works permit. When the Document is included in an aggregate, this

	10080 License does not apply to the other works in the aggregate which

	10081 are not themselves derivative works of the Document.

	10082

	10083 If the Cover Text requirement of section 3 is applicable to these

	10084 copies of the Document, then if the Document is less than one half

	10085 of the entire aggregate, the Document's Cover Texts may be placed

	10086 on covers that bracket the Document within the aggregate, or the

	10087 electronic equivalent of covers if the Document is in electronic

	10088 form. Otherwise they must appear on printed covers that bracket

	10089 the whole aggregate.

	10090

	10091 8. TRANSLATION

	10092

	10093 Translation is considered a kind of modification, so you may

	10094 distribute translations of the Document under the terms of section

	10095 4. Replacing Invariant Sections with translations requires special

	10096 permission from their copyright holders, but you may include

	10097 translations of some or all Invariant Sections in addition to the

	10098 original versions of these Invariant Sections. You may include a

	10099 translation of this License, and all the license notices in the

	10100 Document, and any Warranty Disclaimers, provided that you also

	10101 include the original English version of this License and the

	10102 original versions of those notices and disclaimers. In case of a

	10103 disagreement between the translation and the original version of

	10104 this License or a notice or disclaimer, the original version will

	10105 prevail.

	10106

	10107 If a section in the Document is Entitled "Acknowledgements",

	10108 "Dedications", or "History", the requirement (section 4) to

	10109 Preserve its Title (section 1) will typically require changing the

	10110 actual title.

	10111

	10112 9. TERMINATION

	10113

	10114 You may not copy, modify, sublicense, or distribute the Document

	10115 except as expressly provided for under this License. Any other

	10116 attempt to copy, modify, sublicense or distribute the Document is

	10117 void, and will automatically terminate your rights under this

	10118 License. However, parties who have received copies, or rights,

	10119 from you under this License will not have their licenses

	10120 terminated so long as such parties remain in full compliance.

	10121

	10122 10. FUTURE REVISIONS OF THIS LICENSE

	10123

	10124 The Free Software Foundation may publish new, revised versions of

	10125 the GNU Free Documentation License from time to time. Such new

	10126 versions will be similar in spirit to the present version, but may

	10127 differ in detail to address new problems or concerns. See

	10128 `http://www.gnu.org/copyleft/'.

	10129

	10130 Each version of the License is given a distinguishing version

	10131 number. If the Document specifies that a particular numbered

	10132 version of this License "or any later version" applies to it, you

	10133 have the option of following the terms and conditions either of

	10134 that specified version or of any later version that has been

	10135 published (not as a draft) by the Free Software Foundation. If

	10136 the Document does not specify a version number of this License,

	10137 you may choose any version ever published (not as a draft) by the

	10138 Free Software Foundation.

	10139

	10140 ADDENDUM: How to use this License for your documents

	10141 ====================================================

	10142

	10143 To use this License in a document you have written, include a copy of

	10144 the License in the document and put the following copyright and license

	10145 notices just after the title page:

	10146

	10147 Copyright (C) YEAR YOUR NAME.

	10148 Permission is granted to copy, distribute and/or modify this document

	10149 under the terms of the GNU Free Documentation License, Version 1.2

	10150 or any later version published by the Free Software Foundation;

	10151 with no Invariant Sections, no Front-Cover Texts, and no Back-Cover

	10152 Texts. A copy of the license is included in the section entitled ``GNU

	10153 Free Documentation License''.

	10154

	10155 If you have Invariant Sections, Front-Cover Texts and Back-Cover

	10156 Texts, replace the "with...Texts." line with this:

	10157

	10158 with the Invariant Sections being LIST THEIR TITLES, with

	10159 the Front-Cover Texts being LIST, and with the Back-Cover Texts

	10160 being LIST.

	10161

	10162 If you have Invariant Sections without Cover Texts, or some other

	10163 combination of the three, merge those two alternatives to suit the

	10164 situation.

	10165

	10166 If your document contains nontrivial examples of program code, we

	10167 recommend releasing these examples in parallel under your choice of

	10168 free software license, such as the GNU General Public License, to

	10169 permit their use in free software.

	10170

	10171

	10172 File: bison.info, Node: Index, Prev: Copying This Manual, Up: Top

	10173

	10174 Index

	10175 *****

	10176

	10177 [index]

	10178 * Menu:

	10179

	10180 * $ <1>: Table of Symbols. (line 19)

	10181 * $ <2>: Action Features. (line 14)

	10182 * $: Java Action Features.

	10183 (line 13)

	10184 * $$ <1>: Action Features. (line 10)

	10185 * $$ <2>: Java Action Features.

	10186 (line 21)

	10187 * $$ <3>: Actions. (line 6)

	10188 * $$: Table of Symbols. (line 15)

	10189 * $< <1>: Java Action Features.

	10190 (line 17)

	10191 * $< <2>: Action Features. (line 23)

	10192 * $< <3>: Java Action Features.

	10193 (line 29)

	10194 * $<: Action Features. (line 18)

	10195 * $accept: Table of Symbols. (line 65)

	10196 * $end: Table of Symbols. (line 104)

	10197 * $N: Actions. (line 6)

	10198 * $undefined: Table of Symbols. (line 212)

	10199 * % <1>: Java Declarations Summary.

	10200 (line 53)

	10201 * %: Table of Symbols. (line 28)

	10202 * %% <1>: Table of Symbols. (line 23)

	10203 * %%: Java Declarations Summary.

	10204 (line 49)

	10205 * %code <1>: Table of Symbols. (line 71)

	10206 * %code <2>: Prologue Alternatives.

	10207 (line 6)

	10208 * %code <3>: Java Declarations Summary.

	10209 (line 37)

	10210 * %code <4>: Calc++ Parser. (line 64)

	10211 * %code: Decl Summary. (line 63)

	10212 * %code imports <1>: Java Declarations Summary.

	10213 (line 41)

	10214 * %code imports: Decl Summary. (line 115)

	10215 * %code lexer: Java Declarations Summary.

	10216 (line 45)

	10217 * %code provides <1>: Prologue Alternatives.

	10218 (line 6)

	10219 * %code provides: Decl Summary. (line 303)

	10220 * %code requires <1>: Decl Summary. (line 72)

	10221 * %code requires <2>: Calc++ Parser. (line 17)

	10222 * %code requires: Prologue Alternatives.

	10223 (line 6)

	10224 * %code top <1>: Decl Summary. (line 98)

	10225 * %code top: Prologue Alternatives.

	10226 (line 6)

	10227 * %debug <1>: Table of Symbols. (line 78)

	10228 * %debug <2>: Tracing. (line 23)

	10229 * %debug <3>: Decl Summary. (line 134)

	10230 * %debug: Table of Symbols. (line 75)

	10231 * %define <1>: Table of Symbols. (line 81)

	10232 * %define <2>: Decl Summary. (line 140)

	10233 * %define: Table of Symbols. (line 82)

	10234 * %define abstract: Java Declarations Summary.

	10235 (line 57)

	10236 * %define api.pure <1>: Decl Summary. (line 166)

	10237 * %define api.pure: Pure Decl. (line 6)

	10238 * %define api.push_pull <1>: Push Decl. (line 6)

	10239 * %define api.push_pull: Decl Summary. (line 177)

	10240 * %define extends: Java Declarations Summary.

	10241 (line 61)

	10242 * %define final: Java Declarations Summary.

	10243 (line 65)

	10244 * %define implements: Java Declarations Summary.

	10245 (line 69)

	10246 * %define lex_throws: Java Declarations Summary.

	10247 (line 73)

	10248 * %define location_type: Java Declarations Summary.

	10249 (line 78)

	10250 * %define lr.keep_unreachable_states: Decl Summary. (line 190)

	10251 * %define namespace <1>: Decl Summary. (line 232)

	10252 * %define namespace: C++ Bison Interface. (line 10)

	10253 * %define package: Java Declarations Summary.

	10254 (line 84)

	10255 * %define parser_class_name: Java Declarations Summary.

	10256 (line 88)

	10257 * %define position_type: Java Declarations Summary.

	10258 (line 92)

	10259 * %define public: Java Declarations Summary.

	10260 (line 97)

	10261 * %define strictfp: Java Declarations Summary.

	10262 (line 105)

	10263 * %define stype: Java Declarations Summary.

	10264 (line 101)

	10265 * %define throws: Java Declarations Summary.

	10266 (line 109)

	10267 * %defines <1>: Table of Symbols. (line 90)

	10268 * %defines <2>: Decl Summary. (line 307)

	10269 * %defines: Table of Symbols. (line 86)

	10270 * %destructor <1>: Destructor Decl. (line 22)

	10271 * %destructor <2>: Decl Summary. (line 310)

	10272 * %destructor <3>: Destructor Decl. (line 6)

	10273 * %destructor <4>: Mid-Rule Actions. (line 59)

	10274 * %destructor <5>: Table of Symbols. (line 94)

	10275 * %destructor: Destructor Decl. (line 22)

	10276 * %dprec <1>: Table of Symbols. (line 99)

	10277 * %dprec: Merging GLR Parses. (line 6)

	10278 * %error-verbose <1>: Table of Symbols. (line 118)

	10279 * %error-verbose: Error Reporting. (line 17)

	10280 * %expect <1>: Decl Summary. (line 38)

	10281 * %expect: Expect Decl. (line 6)

	10282 * %expect-rr <1>: Expect Decl. (line 6)

	10283 * %expect-rr: Simple GLR Parsers. (line 6)

	10284 * %file-prefix <1>: Decl Summary. (line 315)

	10285 * %file-prefix: Table of Symbols. (line 122)

	10286 * %glr-parser <1>: Simple GLR Parsers. (line 6)

	10287 * %glr-parser <2>: Table of Symbols. (line 126)

	10288 * %glr-parser: GLR Parsers. (line 6)

	10289 * %initial-action <1>: Table of Symbols. (line 130)

	10290 * %initial-action: Initial Action Decl. (line 11)

	10291 * %language <1>: Decl Summary. (line 319)

	10292 * %language: Table of Symbols. (line 134)

	10293 * %language "Java": Java Declarations Summary.

	10294 (line 10)

	10295 * %left <1>: Using Precedence. (line 6)

	10296 * %left <2>: Decl Summary. (line 21)

	10297 * %left: Table of Symbols. (line 138)

	10298 * %lex-param <1>: Table of Symbols. (line 142)

	10299 * %lex-param <2>: Pure Calling. (line 31)

	10300 * %lex-param: Java Declarations Summary.

	10301 (line 13)

	10302 * %locations: Decl Summary. (line 327)

	10303 * %merge <1>: Merging GLR Parses. (line 6)

	10304 * %merge: Table of Symbols. (line 147)

	10305 * %name-prefix <1>: Java Declarations Summary.

	10306 (line 19)

	10307 * %name-prefix <2>: Decl Summary. (line 334)

	10308 * %name-prefix: Table of Symbols. (line 154)

	10309 * %no-lines <1>: Decl Summary. (line 346)

	10310 * %no-lines: Table of Symbols. (line 158)

	10311 * %nonassoc <1>: Table of Symbols. (line 162)

	10312 * %nonassoc <2>: Using Precedence. (line 6)

	10313 * %nonassoc: Decl Summary. (line 25)

	10314 * %output <1>: Decl Summary. (line 354)

	10315 * %output: Table of Symbols. (line 166)

	10316 * %parse-param <1>: Java Declarations Summary.

	10317 (line 24)

	10318 * %parse-param <2>: Parser Function. (line 36)

	10319 * %parse-param <3>: Table of Symbols. (line 170)

	10320 * %parse-param: Parser Function. (line 36)

	10321 * %prec <1>: Table of Symbols. (line 175)

	10322 * %prec: Contextual Precedence.

	10323 (line 6)

	10324 * %pure-parser <1>: Table of Symbols. (line 179)

	10325 * %pure-parser: Decl Summary. (line 357)

	10326 * %require <1>: Table of Symbols. (line 184)

	10327 * %require <2>: Require Decl. (line 6)

	10328 * %require: Decl Summary. (line 362)

	10329 * %right <1>: Using Precedence. (line 6)

	10330 * %right <2>: Decl Summary. (line 17)

	10331 * %right: Table of Symbols. (line 188)

	10332 * %skeleton <1>: Decl Summary. (line 366)

	10333 * %skeleton: Table of Symbols. (line 192)

	10334 * %start <1>: Table of Symbols. (line 196)

	10335 * %start <2>: Decl Summary. (line 34)

	10336 * %start: Start Decl. (line 6)

	10337 * %token <1>: Decl Summary. (line 13)

	10338 * %token <2>: Token Decl. (line 6)

	10339 * %token <3>: Java Declarations Summary.

	10340 (line 29)

	10341 * %token: Table of Symbols. (line 200)

	10342 * %token-table <1>: Decl Summary. (line 374)

	10343 * %token-table: Table of Symbols. (line 204)

	10344 * %type <1>: Java Declarations Summary.

	10345 (line 33)

	10346 * %type <2>: Type Decl. (line 6)

	10347 * %type <3>: Table of Symbols. (line 208)

	10348 * %type: Decl Summary. (line 30)

	10349 * %union <1>: Decl Summary. (line 9)

	10350 * %union <2>: Union Decl. (line 6)

	10351 * %union: Table of Symbols. (line 217)

	10352 * %verbose: Decl Summary. (line 407)

	10353 * %yacc: Decl Summary. (line 413)

	10354 * *yypstate_new: Parser Create Function.

	10355 (line 15)

	10356 * /*: Table of Symbols. (line 33)

	10357 * :: Table of Symbols. (line 36)

	10358 * ;: Table of Symbols. (line 40)

	10359 * <*> <1>: Destructor Decl. (line 6)

	10360 * <*>: Table of Symbols. (line 47)

	10361 * <> <1>: Destructor Decl. (line 6)

	10362 * <>: Table of Symbols. (line 56)

	10363 * @$ <1>: Action Features. (line 98)

	10364 * @$ <2>: Java Action Features.

	10365 (line 39)

	10366 * @$ <3>: Table of Symbols. (line 7)

	10367 * @$: Actions and Locations.

	10368 (line 6)

	10369 * @N <1>: Action Features. (line 104)

	10370 * @N <2>: Actions and Locations.

	10371 (line 6)

	10372 * @N <3>: Table of Symbols. (line 11)

	10373 * @N <4>: Action Features. (line 104)

	10374 * @N: Java Action Features.

	10375 (line 35)

	10376 * abstract syntax tree: Implementing Gotos/Loops.

	10377 (line 17)

	10378 * action: Actions. (line 6)

	10379 * action data types: Action Types. (line 6)

	10380 * action features summary: Action Features. (line 6)

	10381 * actions in mid-rule <1>: Mid-Rule Actions. (line 6)

	10382 * actions in mid-rule: Destructor Decl. (line 88)

	10383 * actions, location: Actions and Locations.

	10384 (line 6)

	10385 * actions, semantic: Semantic Actions. (line 6)

	10386 * additional C code section: Epilogue. (line 6)

	10387 * algorithm of parser: Algorithm. (line 6)

	10388 * ambiguous grammars <1>: Generalized LR Parsing.

	10389 (line 6)

	10390 * ambiguous grammars: Language and Grammar.

	10391 (line 33)

	10392 * associativity: Why Precedence. (line 33)

	10393 * AST: Implementing Gotos/Loops.

	10394 (line 17)

	10395 * Backus-Naur form: Language and Grammar.

	10396 (line 16)

	10397 * begin of Location: Java Location Values.

	10398 (line 21)

	10399 * begin on location: C++ Location Values. (line 44)

	10400 * Bison declaration summary: Decl Summary. (line 6)

	10401 * Bison declarations: Declarations. (line 6)

	10402 * Bison declarations (introduction): Bison Declarations. (line 6)

	10403 * Bison grammar: Grammar in Bison. (line 6)

	10404 * Bison invocation: Invocation. (line 6)

	10405 * Bison parser: Bison Parser. (line 6)

	10406 * Bison parser algorithm: Algorithm. (line 6)

	10407 * Bison symbols, table of: Table of Symbols. (line 6)

	10408 * Bison utility: Bison Parser. (line 6)

	10409 * bison-i18n.m4: Internationalization.

	10410 (line 20)

	10411 * bison-po: Internationalization.

	10412 (line 6)

	10413 * BISON_I18N: Internationalization.

	10414 (line 27)

	10415 * BISON_LOCALEDIR: Internationalization.

	10416 (line 27)

	10417 * BNF: Language and Grammar.

	10418 (line 16)

	10419 * braced code: Rules. (line 31)

	10420 * C code, section for additional: Epilogue. (line 6)

	10421 * C-language interface: Interface. (line 6)

	10422 * calc: Infix Calc. (line 6)

	10423 * calculator, infix notation: Infix Calc. (line 6)

	10424 * calculator, location tracking: Location Tracking Calc.

	10425 (line 6)

	10426 * calculator, multi-function: Multi-function Calc. (line 6)

	10427 * calculator, simple: RPN Calc. (line 6)

	10428 * character token: Symbols. (line 31)

	10429 * column on position: C++ Location Values. (line 25)

	10430 * columns on location: C++ Location Values. (line 48)

	10431 * columns on position: C++ Location Values. (line 28)

	10432 * compiling the parser: Rpcalc Compile. (line 6)

	10433 * conflicts <1>: Shift/Reduce. (line 6)

	10434 * conflicts <2>: Merging GLR Parses. (line 6)

	10435 * conflicts <3>: GLR Parsers. (line 6)

	10436 * conflicts: Simple GLR Parsers. (line 6)

	10437 * conflicts, reduce/reduce: Reduce/Reduce. (line 6)

	10438 * conflicts, suppressing warnings of: Expect Decl. (line 6)

	10439 * context-dependent precedence: Contextual Precedence.

	10440 (line 6)

	10441 * context-free grammar: Language and Grammar.

	10442 (line 6)

	10443 * controlling function: Rpcalc Main. (line 6)

	10444 * core, item set: Understanding. (line 129)

	10445 * dangling else: Shift/Reduce. (line 6)

	10446 * data type of locations: Location Type. (line 6)

	10447 * data types in actions: Action Types. (line 6)

	10448 * data types of semantic values: Value Type. (line 6)

	10449 * debug_level on parser: C++ Parser Interface.

	10450 (line 31)

	10451 * debug_stream on parser: C++ Parser Interface.

	10452 (line 26)

	10453 * debugging: Tracing. (line 6)

	10454 * declaration summary: Decl Summary. (line 6)

	10455 * declarations: Prologue. (line 6)

	10456 * declarations section: Prologue. (line 6)

	10457 * declarations, Bison: Declarations. (line 6)

	10458 * declarations, Bison (introduction): Bison Declarations. (line 6)

	10459 * declaring literal string tokens: Token Decl. (line 6)

	10460 * declaring operator precedence: Precedence Decl. (line 6)

	10461 * declaring the start symbol: Start Decl. (line 6)

	10462 * declaring token type names: Token Decl. (line 6)

	10463 * declaring value types: Union Decl. (line 6)

	10464 * declaring value types, nonterminals: Type Decl. (line 6)

	10465 * default action: Actions. (line 50)

	10466 * default data type: Value Type. (line 6)

	10467 * default location type: Location Type. (line 6)

	10468 * default stack limit: Memory Management. (line 30)

	10469 * default start symbol: Start Decl. (line 6)

	10470 * deferred semantic actions: GLR Semantic Actions.

	10471 (line 6)

	10472 * defining language semantics: Semantics. (line 6)

	10473 * discarded symbols: Destructor Decl. (line 98)

	10474 * discarded symbols, mid-rule actions: Mid-Rule Actions. (line 59)

	10475 * else, dangling: Shift/Reduce. (line 6)

	10476 * end of Location: Java Location Values.

	10477 (line 22)

	10478 * end on location: C++ Location Values. (line 45)

	10479 * epilogue: Epilogue. (line 6)

	10480 * error <1>: Error Recovery. (line 20)

	10481 * error: Table of Symbols. (line 108)

	10482 * error on parser: C++ Parser Interface.

	10483 (line 37)

	10484 * error recovery: Error Recovery. (line 6)

	10485 * error recovery, mid-rule actions: Mid-Rule Actions. (line 59)

	10486 * error recovery, simple: Simple Error Recovery.

	10487 (line 6)

	10488 * error reporting function: Error Reporting. (line 6)

	10489 * error reporting routine: Rpcalc Error. (line 6)

	10490 * examples, simple: Examples. (line 6)

	10491 * exercises: Exercises. (line 6)

	10492 * file format: Grammar Layout. (line 6)

	10493 * file on position: C++ Location Values. (line 13)

	10494 * finite-state machine: Parser States. (line 6)

	10495 * formal grammar: Grammar in Bison. (line 6)

	10496 * format of grammar file: Grammar Layout. (line 6)

	10497 * freeing discarded symbols: Destructor Decl. (line 6)

	10498 * frequently asked questions: FAQ. (line 6)

	10499 * generalized LR (GLR) parsing <1>: Generalized LR Parsing.

	10500 (line 6)

	10501 * generalized LR (GLR) parsing <2>: Language and Grammar.

	10502 (line 33)

	10503 * generalized LR (GLR) parsing: GLR Parsers. (line 6)

	10504 * generalized LR (GLR) parsing, ambiguous grammars: Merging GLR Parses.

	10505 (line 6)

	10506 * generalized LR (GLR) parsing, unambiguous grammars: Simple GLR Parsers.

	10507 (line 6)

	10508 * getDebugLevel on YYParser: Java Parser Interface.

	10509 (line 67)

	10510 * getDebugStream on YYParser: Java Parser Interface.

	10511 (line 62)

	10512 * getEndPos on Lexer: Java Scanner Interface.

	10513 (line 39)

	10514 * getLVal on Lexer: Java Scanner Interface.

	10515 (line 47)

	10516 * getStartPos on Lexer: Java Scanner Interface.

	10517 (line 38)

	10518 * gettext: Internationalization.

	10519 (line 6)

	10520 * glossary: Glossary. (line 6)

	10521 * GLR parsers and inline: Compiler Requirements.

	10522 (line 6)

	10523 * GLR parsers and yychar: GLR Semantic Actions.

	10524 (line 10)

	10525 * GLR parsers and yyclearin: GLR Semantic Actions.

	10526 (line 18)

	10527 * GLR parsers and YYERROR: GLR Semantic Actions.

	10528 (line 28)

	10529 * GLR parsers and yylloc: GLR Semantic Actions.

	10530 (line 10)

	10531 * GLR parsers and YYLLOC_DEFAULT: Location Default Action.

	10532 (line 6)

	10533 * GLR parsers and yylval: GLR Semantic Actions.

	10534 (line 10)

	10535 * GLR parsing <1>: Language and Grammar.

	10536 (line 33)

	10537 * GLR parsing <2>: Generalized LR Parsing.

	10538 (line 6)

	10539 * GLR parsing: GLR Parsers. (line 6)

	10540 * GLR parsing, ambiguous grammars: Merging GLR Parses. (line 6)

	10541 * GLR parsing, unambiguous grammars: Simple GLR Parsers. (line 6)

	10542 * grammar file: Grammar Layout. (line 6)

	10543 * grammar rule syntax: Rules. (line 6)

	10544 * grammar rules section: Grammar Rules. (line 6)

	10545 * grammar, Bison: Grammar in Bison. (line 6)

	10546 * grammar, context-free: Language and Grammar.

	10547 (line 6)

	10548 * grouping, syntactic: Language and Grammar.

	10549 (line 47)

	10550 * i18n: Internationalization.

	10551 (line 6)

	10552 * infix notation calculator: Infix Calc. (line 6)

	10553 * inline: Compiler Requirements.

	10554 (line 6)

	10555 * interface: Interface. (line 6)

	10556 * internationalization: Internationalization.

	10557 (line 6)

	10558 * introduction: Introduction. (line 6)

	10559 * invoking Bison: Invocation. (line 6)

	10560 * item: Understanding. (line 107)

	10561 * item set core: Understanding. (line 129)

	10562 * kernel, item set: Understanding. (line 129)

	10563 * LALR(1): Mystery Conflicts. (line 36)

	10564 * LALR(1) grammars: Language and Grammar.

	10565 (line 22)

	10566 * language semantics, defining: Semantics. (line 6)

	10567 * layout of Bison grammar: Grammar Layout. (line 6)

	10568 * left recursion: Recursion. (line 16)

	10569 * lex-param: Pure Calling. (line 31)

	10570 * lexical analyzer: Lexical. (line 6)

	10571 * lexical analyzer, purpose: Bison Parser. (line 6)

	10572 * lexical analyzer, writing: Rpcalc Lexer. (line 6)

	10573 * lexical tie-in: Lexical Tie-ins. (line 6)

	10574 * line on position: C++ Location Values. (line 19)

	10575 * lines on location: C++ Location Values. (line 49)

	10576 * lines on position: C++ Location Values. (line 22)

	10577 * literal string token: Symbols. (line 53)

	10578 * literal token: Symbols. (line 31)

	10579 * location <1>: Locations Overview. (line 6)

	10580 * location: Locations. (line 6)

	10581 * location actions: Actions and Locations.

	10582 (line 6)

	10583 * Location on Location: Java Location Values.

	10584 (line 25)

	10585 * location tracking calculator: Location Tracking Calc.

	10586 (line 6)

	10587 * location, textual <1>: Locations. (line 6)

	10588 * location, textual: Locations Overview. (line 6)

	10589 * location_value_type: C++ Parser Interface.

	10590 (line 16)

	10591 * lookahead token: Lookahead. (line 6)

	10592 * LR(1): Mystery Conflicts. (line 36)

	10593 * LR(1) grammars: Language and Grammar.

	10594 (line 22)

	10595 * ltcalc: Location Tracking Calc.

	10596 (line 6)

	10597 * main function in simple example: Rpcalc Main. (line 6)

	10598 * memory exhaustion: Memory Management. (line 6)

	10599 * memory management: Memory Management. (line 6)

	10600 * mfcalc: Multi-function Calc. (line 6)

	10601 * mid-rule actions <1>: Destructor Decl. (line 88)

	10602 * mid-rule actions: Mid-Rule Actions. (line 6)

	10603 * multi-function calculator: Multi-function Calc. (line 6)

	10604 * multicharacter literal: Symbols. (line 53)

	10605 * mutual recursion: Recursion. (line 32)

	10606 * NLS: Internationalization.

	10607 (line 6)

	10608 * nondeterministic parsing <1>: Generalized LR Parsing.

	10609 (line 6)

	10610 * nondeterministic parsing: Language and Grammar.

	10611 (line 33)

	10612 * nonterminal symbol: Symbols. (line 6)

	10613 * nonterminal, useless: Understanding. (line 62)

	10614 * operator precedence: Precedence. (line 6)

	10615 * operator precedence, declaring: Precedence Decl. (line 6)

	10616 * operator+ on location: C++ Location Values. (line 53)

	10617 * operator+ on position: C++ Location Values. (line 33)

	10618 * operator+= on location: C++ Location Values. (line 57)

	10619 * operator+= on position: C++ Location Values. (line 31)

	10620 * operator- on position: C++ Location Values. (line 36)

	10621 * operator-= on position: C++ Location Values. (line 35)

	10622 * operator<< on position: C++ Location Values. (line 40)

	10623 * options for invoking Bison: Invocation. (line 6)

	10624 * overflow of parser stack: Memory Management. (line 6)

	10625 * parse error: Error Reporting. (line 6)

	10626 * parse on parser: C++ Parser Interface.

	10627 (line 23)

	10628 * parse on YYParser: Java Parser Interface.

	10629 (line 54)

	10630 * parser: Bison Parser. (line 6)

	10631 * parser on parser: C++ Parser Interface.

	10632 (line 19)

	10633 * parser stack: Algorithm. (line 6)

	10634 * parser stack overflow: Memory Management. (line 6)

	10635 * parser state: Parser States. (line 6)

	10636 * pointed rule: Understanding. (line 107)

	10637 * polish notation calculator: RPN Calc. (line 6)

	10638 * precedence declarations: Precedence Decl. (line 6)

	10639 * precedence of operators: Precedence. (line 6)

	10640 * precedence, context-dependent: Contextual Precedence.

	10641 (line 6)

	10642 * precedence, unary operator: Contextual Precedence.

	10643 (line 6)

	10644 * preventing warnings about conflicts: Expect Decl. (line 6)

	10645 * Prologue <1>: Decl Summary. (line 129)

	10646 * Prologue <2>: Prologue. (line 6)

	10647 * Prologue: Decl Summary. (line 50)

	10648 * Prologue Alternatives: Prologue Alternatives.

	10649 (line 6)

	10650 * pure parser: Pure Decl. (line 6)

	10651 * push parser: Push Decl. (line 6)

	10652 * questions: FAQ. (line 6)

	10653 * recovering: Java Action Features.

	10654 (line 59)

	10655 * recovering on YYParser: Java Parser Interface.

	10656 (line 58)

	10657 * recovery from errors: Error Recovery. (line 6)

	10658 * recursive rule: Recursion. (line 6)

	10659 * reduce/reduce conflict: Reduce/Reduce. (line 6)

	10660 * reduce/reduce conflicts <1>: GLR Parsers. (line 6)

	10661 * reduce/reduce conflicts <2>: Simple GLR Parsers. (line 6)

	10662 * reduce/reduce conflicts: Merging GLR Parses. (line 6)

	10663 * reduction: Algorithm. (line 6)

	10664 * reentrant parser: Pure Decl. (line 6)

	10665 * requiring a version of Bison: Require Decl. (line 6)

	10666 * return YYABORT;: Java Action Features.

	10667 (line 43)

	10668 * return YYACCEPT;: Java Action Features.

	10669 (line 47)

	10670 * return YYERROR;: Java Action Features.

	10671 (line 51)

	10672 * return YYFAIL;: Java Action Features.

	10673 (line 55)

	10674 * reverse polish notation: RPN Calc. (line 6)

	10675 * right recursion: Recursion. (line 16)

	10676 * rpcalc: RPN Calc. (line 6)

	10677 * rule syntax: Rules. (line 6)

	10678 * rule, pointed: Understanding. (line 107)

	10679 * rule, useless: Understanding. (line 62)

	10680 * rules section for grammar: Grammar Rules. (line 6)

	10681 * running Bison (introduction): Rpcalc Generate. (line 6)

	10682 * semantic actions: Semantic Actions. (line 6)

	10683 * semantic value: Semantic Values. (line 6)

	10684 * semantic value type: Value Type. (line 6)

	10685 * semantic_value_type: C++ Parser Interface.

	10686 (line 15)

	10687 * set_debug_level on parser: C++ Parser Interface.

	10688 (line 32)

	10689 * set_debug_stream on parser: C++ Parser Interface.

	10690 (line 27)

	10691 * setDebugLevel on YYParser: Java Parser Interface.

	10692 (line 68)

	10693 * setDebugStream on YYParser: Java Parser Interface.

	10694 (line 63)

	10695 * shift/reduce conflicts <1>: Simple GLR Parsers. (line 6)

	10696 * shift/reduce conflicts <2>: Shift/Reduce. (line 6)

	10697 * shift/reduce conflicts: GLR Parsers. (line 6)

	10698 * shifting: Algorithm. (line 6)

	10699 * simple examples: Examples. (line 6)

	10700 * single-character literal: Symbols. (line 31)

	10701 * stack overflow: Memory Management. (line 6)

	10702 * stack, parser: Algorithm. (line 6)

	10703 * stages in using Bison: Stages. (line 6)

	10704 * start symbol: Language and Grammar.

	10705 (line 96)

	10706 * start symbol, declaring: Start Decl. (line 6)

	10707 * state (of parser): Parser States. (line 6)

	10708 * step on location: C++ Location Values. (line 60)

	10709 * string token: Symbols. (line 53)

	10710 * summary, action features: Action Features. (line 6)

	10711 * summary, Bison declaration: Decl Summary. (line 6)

	10712 * suppressing conflict warnings: Expect Decl. (line 6)

	10713 * symbol: Symbols. (line 6)

	10714 * symbol table example: Mfcalc Symbol Table. (line 6)

	10715 * symbols (abstract): Language and Grammar.

	10716 (line 47)

	10717 * symbols in Bison, table of: Table of Symbols. (line 6)

	10718 * syntactic grouping: Language and Grammar.

	10719 (line 47)

	10720 * syntax error: Error Reporting. (line 6)

	10721 * syntax of grammar rules: Rules. (line 6)

	10722 * terminal symbol: Symbols. (line 6)

	10723 * textual location <1>: Locations Overview. (line 6)

	10724 * textual location: Locations. (line 6)

	10725 * token: Language and Grammar.

	10726 (line 47)

	10727 * token type: Symbols. (line 6)

	10728 * token type names, declaring: Token Decl. (line 6)

	10729 * token, useless: Understanding. (line 62)

	10730 * toString on Location: Java Location Values.

	10731 (line 32)

	10732 * tracing the parser: Tracing. (line 6)

	10733 * unary operator precedence: Contextual Precedence.

	10734 (line 6)

	10735 * useless nonterminal: Understanding. (line 62)

	10736 * useless rule: Understanding. (line 62)

	10737 * useless token: Understanding. (line 62)

	10738 * using Bison: Stages. (line 6)

	10739 * value type, semantic: Value Type. (line 6)

	10740 * value types, declaring: Union Decl. (line 6)

	10741 * value types, nonterminals, declaring: Type Decl. (line 6)

	10742 * value, semantic: Semantic Values. (line 6)

	10743 * version requirement: Require Decl. (line 6)

	10744 * warnings, preventing: Expect Decl. (line 6)

	10745 * writing a lexical analyzer: Rpcalc Lexer. (line 6)

	10746 * YYABORT <1>: Table of Symbols. (line 221)

	10747 * YYABORT: Parser Function. (line 29)

	10748 * YYABORT;: Action Features. (line 28)

	10749 * YYACCEPT <1>: Table of Symbols. (line 230)

	10750 * YYACCEPT: Parser Function. (line 26)

	10751 * YYACCEPT;: Action Features. (line 32)

	10752 * YYBACKUP <1>: Table of Symbols. (line 238)

	10753 * YYBACKUP: Action Features. (line 36)

	10754 * yychar <1>: Action Features. (line 69)

	10755 * yychar <2>: Lookahead. (line 47)

	10756 * yychar <3>: Table of Symbols. (line 242)

	10757 * yychar: GLR Semantic Actions.

	10758 (line 10)

	10759 * yyclearin <1>: GLR Semantic Actions.

	10760 (line 18)

	10761 * yyclearin <2>: Table of Symbols. (line 248)

	10762 * yyclearin: Error Recovery. (line 97)

	10763 * yyclearin;: Action Features. (line 76)

	10764 * yydebug <1>: Tracing. (line 6)

	10765 * yydebug: Table of Symbols. (line 256)

	10766 * YYDEBUG <1>: Table of Symbols. (line 252)

	10767 * YYDEBUG: Tracing. (line 12)

	10768 * YYEMPTY: Action Features. (line 49)

	10769 * YYENABLE_NLS: Internationalization.

	10770 (line 27)

	10771 * YYEOF: Action Features. (line 52)

	10772 * yyerrok <1>: Table of Symbols. (line 261)

	10773 * yyerrok: Error Recovery. (line 92)

	10774 * yyerrok;: Action Features. (line 81)

	10775 * YYERROR: Action Features. (line 56)

	10776 * yyerror: Java Action Features.

	10777 (line 64)

	10778 * YYERROR: Table of Symbols. (line 265)

	10779 * yyerror <1>: Table of Symbols. (line 274)

	10780 * yyerror: Error Reporting. (line 6)

	10781 * YYERROR: GLR Semantic Actions.

	10782 (line 28)

	10783 * yyerror on Lexer: Java Scanner Interface.

	10784 (line 25)

	10785 * YYERROR;: Action Features. (line 56)

	10786 * YYERROR_VERBOSE: Table of Symbols. (line 278)

	10787 * YYINITDEPTH <1>: Table of Symbols. (line 285)

	10788 * YYINITDEPTH: Memory Management. (line 32)

	10789 * yylex <1>: Table of Symbols. (line 289)

	10790 * yylex: Lexical. (line 6)

	10791 * yylex on Lexer: Java Scanner Interface.

	10792 (line 30)

	10793 * yylex on parser: C++ Scanner Interface.

	10794 (line 12)

	10795 * YYLEX_PARAM: Table of Symbols. (line 294)

	10796 * yylloc <1>: Token Locations. (line 6)

	10797 * yylloc <2>: Table of Symbols. (line 300)

	10798 * yylloc <3>: GLR Semantic Actions.

	10799 (line 10)

	10800 * yylloc <4>: Action Features. (line 86)

	10801 * yylloc <5>: Lookahead. (line 47)

	10802 * yylloc: Actions and Locations.

	10803 (line 60)

	10804 * YYLLOC_DEFAULT: Location Default Action.

	10805 (line 6)

	10806 * YYLTYPE <1>: Table of Symbols. (line 310)

	10807 * YYLTYPE: Token Locations. (line 19)

	10808 * yylval <1>: Actions. (line 74)

	10809 * yylval <2>: Action Features. (line 92)

	10810 * yylval <3>: Table of Symbols. (line 314)

	10811 * yylval <4>: GLR Semantic Actions.

	10812 (line 10)

	10813 * yylval <5>: Lookahead. (line 47)

	10814 * yylval: Token Values. (line 6)

	10815 * YYMAXDEPTH <1>: Table of Symbols. (line 322)

	10816 * YYMAXDEPTH: Memory Management. (line 14)

	10817 * yynerrs <1>: Error Reporting. (line 92)

	10818 * yynerrs: Table of Symbols. (line 326)

	10819 * yyparse <1>: Table of Symbols. (line 332)

	10820 * yyparse: Parser Function. (line 6)

	10821 * YYPARSE_PARAM: Table of Symbols. (line 365)

	10822 * YYParser on YYParser: Java Parser Interface.

	10823 (line 41)

	10824 * YYPRINT: Tracing. (line 71)

	10825 * yypstate_delete <1>: Table of Symbols. (line 336)

	10826 * yypstate_delete: Parser Delete Function.

	10827 (line 6)

	10828 * yypstate_new <1>: Parser Create Function.

	10829 (line 6)

	10830 * yypstate_new: Table of Symbols. (line 344)

	10831 * yypull_parse <1>: Pull Parser Function.

	10832 (line 6)

	10833 * yypull_parse <2>: Table of Symbols. (line 351)

	10834 * yypull_parse: Pull Parser Function.

	10835 (line 14)

	10836 * yypush_parse <1>: Push Parser Function.

	10837 (line 15)

	10838 * yypush_parse: Table of Symbols. (line 358)

	10839 * YYRECOVERING <1>: Action Features. (line 64)

	10840 * YYRECOVERING <2>: Error Recovery. (line 109)

	10841 * YYRECOVERING <3>: Action Features. (line 64)

	10842 * YYRECOVERING: Table of Symbols. (line 371)

	10843 * YYSTACK_USE_ALLOCA: Table of Symbols. (line 376)

	10844 * YYSTYPE: Table of Symbols. (line 392)

	10845 * \| <1>: Table of Symbols. (line 43)

	10846 * \|: Rules. (line 49)

	10847

	10848

	10849

	10850 Tag Table:

	10851 Node: Top1174

	10852 Node: Introduction13739

	10853 Node: Conditions15002

	10854 Node: Copying16893

	10855 Node: Concepts54431

	10856 Node: Language and Grammar55612

	10857 Node: Grammar in Bison61501

	10858 Node: Semantic Values63430

	10859 Node: Semantic Actions65536

	10860 Node: GLR Parsers66723

	10861 Node: Simple GLR Parsers69470

	10862 Node: Merging GLR Parses76122

	10863 Node: GLR Semantic Actions80691

	10864 Node: Compiler Requirements82581

	10865 Node: Locations Overview83317

	10866 Node: Bison Parser84770

	10867 Node: Stages87710

	10868 Node: Grammar Layout88998

	10869 Node: Examples90330

	10870 Node: RPN Calc91533

	10871 Node: Rpcalc Declarations92533

	10872 Node: Rpcalc Rules94461

	10873 Node: Rpcalc Input96277

	10874 Node: Rpcalc Line97752

	10875 Node: Rpcalc Expr98880

	10876 Node: Rpcalc Lexer100847

	10877 Node: Rpcalc Main103441

	10878 Node: Rpcalc Error103848

	10879 Node: Rpcalc Generate104881

	10880 Node: Rpcalc Compile106016

	10881 Node: Infix Calc106895

	10882 Node: Simple Error Recovery109658

	10883 Node: Location Tracking Calc111553

	10884 Node: Ltcalc Declarations112249

	10885 Node: Ltcalc Rules113338

	10886 Node: Ltcalc Lexer115354

	10887 Node: Multi-function Calc117677

	10888 Node: Mfcalc Declarations119253

	10889 Node: Mfcalc Rules121300

	10890 Node: Mfcalc Symbol Table122695

	10891 Node: Exercises128871

	10892 Node: Grammar File129385

	10893 Node: Grammar Outline130234

	10894 Node: Prologue131084

	10895 Node: Prologue Alternatives132873

	10896 Node: Bison Declarations142558

	10897 Node: Grammar Rules142986

	10898 Node: Epilogue143457

	10899 Node: Symbols144473

	10900 Node: Rules151176

	10901 Node: Recursion153655

	10902 Node: Semantics155373

	10903 Node: Value Type156472

	10904 Node: Multiple Types157307

	10905 Node: Actions158474

	10906 Node: Action Types161889

	10907 Node: Mid-Rule Actions163201

	10908 Node: Locations169666

	10909 Node: Location Type170317

	10910 Node: Actions and Locations171103

	10911 Node: Location Default Action173564

	10912 Node: Declarations177284

	10913 Node: Require Decl178811

	10914 Node: Token Decl179130

	10915 Node: Precedence Decl181556

	10916 Node: Union Decl183566

	10917 Node: Type Decl185340

	10918 Node: Initial Action Decl186266

	10919 Node: Destructor Decl187037

	10920 Node: Expect Decl192501

	10921 Node: Start Decl194494

	10922 Node: Pure Decl194882

	10923 Node: Push Decl196632

	10924 Node: Decl Summary201131

	10925 Ref: Decl Summary-Footnote-1218017

	10926 Node: Multiple Parsers218221

	10927 Node: Interface219860

	10928 Node: Parser Function221178

	10929 Node: Push Parser Function223194

	10930 Node: Pull Parser Function224004

	10931 Node: Parser Create Function224655

	10932 Node: Parser Delete Function225478

	10933 Node: Lexical226249

	10934 Node: Calling Convention227681

	10935 Node: Token Values230641

	10936 Node: Token Locations231805

	10937 Node: Pure Calling232699

	10938 Node: Error Reporting234580

	10939 Node: Action Features238710

	10940 Node: Internationalization243012

	10941 Node: Algorithm245553

	10942 Node: Lookahead247919

	10943 Node: Shift/Reduce250128

	10944 Node: Precedence253023

	10945 Node: Why Precedence253679

	10946 Node: Using Precedence255552

	10947 Node: Precedence Examples256529

	10948 Node: How Precedence257239

	10949 Node: Contextual Precedence258396

	10950 Node: Parser States260192

	10951 Node: Reduce/Reduce261436

	10952 Node: Mystery Conflicts264977

	10953 Node: Generalized LR Parsing268684

	10954 Node: Memory Management273303

	10955 Node: Error Recovery275516

	10956 Node: Context Dependency280819

	10957 Node: Semantic Tokens281668

	10958 Node: Lexical Tie-ins284738

	10959 Node: Tie-in Recovery286315

	10960 Node: Debugging288492

	10961 Node: Understanding289158

	10962 Node: Tracing300317

	10963 Node: Invocation304419

	10964 Node: Bison Options305818

	10965 Node: Option Cross Key312822

	10966 Node: Yacc Library313874

	10967 Node: Other Languages314699

	10968 Node: C++ Parsers315026

	10969 Node: C++ Bison Interface315523

	10970 Node: C++ Semantic Values316791

	10971 Ref: C++ Semantic Values-Footnote-1317733

	10972 Node: C++ Location Values317886

	10973 Node: C++ Parser Interface320259

	10974 Node: C++ Scanner Interface321976

	10975 Node: A Complete C++ Example322678

	10976 Node: Calc++ --- C++ Calculator323620

	10977 Node: Calc++ Parsing Driver324134

	10978 Node: Calc++ Parser327915

	10979 Node: Calc++ Scanner331705

	10980 Node: Calc++ Top Level335131

	10981 Node: Java Parsers335780

	10982 Node: Java Bison Interface336457

	10983 Node: Java Semantic Values338420

	10984 Node: Java Location Values340034

	10985 Node: Java Parser Interface341590

	10986 Node: Java Scanner Interface344828

	10987 Node: Java Action Features347013

	10988 Node: Java Differences349740

	10989 Ref: Java Differences-Footnote-1352315

	10990 Node: Java Declarations Summary352465

	10991 Node: FAQ356713

	10992 Node: Memory Exhausted357660

	10993 Node: How Can I Reset the Parser357970

	10994 Node: Strings are Destroyed360239

	10995 Node: Implementing Gotos/Loops361828

	10996 Node: Multiple start-symbols363111

	10997 Node: Secure? Conform?364656

	10998 Node: I can't build Bison365104

	10999 Node: Where can I find help?365822

	11000 Node: Bug Reports366615

	11001 Node: More Languages368076

	11002 Node: Beta Testing368434

	11003 Node: Mailing Lists369308

	11004 Node: Table of Symbols369519

	11005 Node: Glossary384901

	11006 Node: Copying This Manual391798

	11007 Node: Index414191

	11008

	11009 End Tag Table

