Side by Side Diff: bison/src/bison/2.4.1/bison-2.4.1-src/doc/bison.info

Issue 10807020: Add native Windows binary for bison. (Closed) Base URL: svn://chrome-svn/chrome/trunk/deps/third_party/
Patch Set: Created 8 years, 5 months ago
1 This is ../../bison-2.4.1-src/doc/bison.info, produced by makeinfo
2 version 4.8 from ../../bison-2.4.1-src/doc/bison.texinfo.
3
4 This manual (19 November 2008) is for GNU Bison (version 2.4.1), the
5 GNU parser generator.
6
7 Copyright (C) 1988, 1989, 1990, 1991, 1992, 1993, 1995, 1998, 1999,
8 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008 Free Software
9 Foundation, Inc.
10
11 Permission is granted to copy, distribute and/or modify this
12 document under the terms of the GNU Free Documentation License,
13 Version 1.2 or any later version published by the Free Software
14 Foundation; with no Invariant Sections, with the Front-Cover texts
15 being "A GNU Manual," and with the Back-Cover Texts as in (a)
16 below. A copy of the license is included in the section entitled
17 "GNU Free Documentation License."
18
19 (a) The FSF's Back-Cover Text is: "You have the freedom to copy and
20 modify this GNU manual. Buying copies from the FSF supports it in
21 developing GNU and promoting software freedom."
22
23 INFO-DIR-SECTION Software development
24 START-INFO-DIR-ENTRY
25 * bison: (bison). GNU parser generator (Yacc replacement).
26 END-INFO-DIR-ENTRY
27
28 
29 File: bison.info, Node: Top, Next: Introduction, Up: (dir)
30
31 Bison
32 *****
33
34 This manual (19 November 2008) is for GNU Bison (version 2.4.1), the
35 GNU parser generator.
36
37 Copyright (C) 1988, 1989, 1990, 1991, 1992, 1993, 1995, 1998, 1999,
38 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008 Free Software
39 Foundation, Inc.
40
41 Permission is granted to copy, distribute and/or modify this
42 document under the terms of the GNU Free Documentation License,
43 Version 1.2 or any later version published by the Free Software
44 Foundation; with no Invariant Sections, with the Front-Cover texts
45 being "A GNU Manual," and with the Back-Cover Texts as in (a)
46 below. A copy of the license is included in the section entitled
47 "GNU Free Documentation License."
48
49 (a) The FSF's Back-Cover Text is: "You have the freedom to copy and
50 modify this GNU manual. Buying copies from the FSF supports it in
51 developing GNU and promoting software freedom."
52
53 * Menu:
54
55 * Introduction::
56 * Conditions::
57 * Copying:: The GNU General Public License says
58 how you can copy and share Bison.
59
60 Tutorial sections:
61 * Concepts:: Basic concepts for understanding Bison.
62 * Examples:: Three simple explained examples of using Bison.
63
64 Reference sections:
65 * Grammar File:: Writing Bison declarations and rules.
66 * Interface:: C-language interface to the parser function `yyparse'.
67 * Algorithm:: How the Bison parser works at run-time.
68 * Error Recovery:: Writing rules for error recovery.
69 * Context Dependency:: What to do if your language syntax is too
70 messy for Bison to handle straightforwardly.
71 * Debugging:: Understanding or debugging Bison parsers.
72 * Invocation:: How to run Bison (to produce the parser source file).
73 * Other Languages:: Creating C++ and Java parsers.
74 * FAQ:: Frequently Asked Questions
75 * Table of Symbols:: All the keywords of the Bison language are explained.
76 * Glossary:: Basic concepts are explained.
77 * Copying This Manual:: License for copying this manual.
78 * Index:: Cross-references to the text.
79
80 --- The Detailed Node Listing ---
81
82 The Concepts of Bison
83
84 * Language and Grammar:: Languages and context-free grammars,
85 as mathematical ideas.
86 * Grammar in Bison:: How we represent grammars for Bison's sake.
87 * Semantic Values:: Each token or syntactic grouping can have
88 a semantic value (the value of an integer,
89 the name of an identifier, etc.).
90 * Semantic Actions:: Each rule can have an action containing C code.
91 * GLR Parsers:: Writing parsers for general context-free languages.
92 * Locations Overview:: Tracking Locations.
93 * Bison Parser:: What are Bison's input and output,
94 how is the output used?
95 * Stages:: Stages in writing and running Bison grammars.
96 * Grammar Layout:: Overall structure of a Bison grammar file.
97
98 Writing GLR Parsers
99
100 * Simple GLR Parsers:: Using GLR parsers on unambiguous grammars.
101 * Merging GLR Parses:: Using GLR parsers to resolve ambiguities.
102 * GLR Semantic Actions:: Deferred semantic actions have special concerns.
103 * Compiler Requirements:: GLR parsers require a modern C compiler.
104
105 Examples
106
107 * RPN Calc:: Reverse polish notation calculator;
108 a first example with no operator precedence.
109 * Infix Calc:: Infix (algebraic) notation calculator.
110 Operator precedence is introduced.
111 * Simple Error Recovery:: Continuing after syntax errors.
112 * Location Tracking Calc:: Demonstrating the use of @N and @$.
113 * Multi-function Calc:: Calculator with memory and trig functions.
114 It uses multiple data-types for semantic values.
115 * Exercises:: Ideas for improving the multi-function calculator.
116
117 Reverse Polish Notation Calculator
118
119 * Rpcalc Declarations:: Prologue (declarations) for rpcalc.
120 * Rpcalc Rules:: Grammar Rules for rpcalc, with explanation.
121 * Rpcalc Lexer:: The lexical analyzer.
122 * Rpcalc Main:: The controlling function.
123 * Rpcalc Error:: The error reporting function.
124 * Rpcalc Generate:: Running Bison on the grammar file.
125 * Rpcalc Compile:: Run the C compiler on the output code.
126
127 Grammar Rules for `rpcalc'
128
129 * Rpcalc Input::
130 * Rpcalc Line::
131 * Rpcalc Expr::
132
133 Location Tracking Calculator: `ltcalc'
134
135 * Ltcalc Declarations:: Bison and C declarations for ltcalc.
136 * Ltcalc Rules:: Grammar rules for ltcalc, with explanations.
137 * Ltcalc Lexer:: The lexical analyzer.
138
139 Multi-Function Calculator: `mfcalc'
140
141 * Mfcalc Declarations:: Bison declarations for multi-function calculator.
142 * Mfcalc Rules:: Grammar rules for the calculator.
143 * Mfcalc Symbol Table:: Symbol table management subroutines.
144
145 Bison Grammar Files
146
147 * Grammar Outline:: Overall layout of the grammar file.
148 * Symbols:: Terminal and nonterminal symbols.
149 * Rules:: How to write grammar rules.
150 * Recursion:: Writing recursive rules.
151 * Semantics:: Semantic values and actions.
152 * Locations:: Locations and actions.
153 * Declarations:: All kinds of Bison declarations are described here.
154 * Multiple Parsers:: Putting more than one Bison parser in one program.
155
156 Outline of a Bison Grammar
157
158 * Prologue:: Syntax and usage of the prologue.
159 * Prologue Alternatives:: Syntax and usage of alternatives to the prologue.
160 * Bison Declarations:: Syntax and usage of the Bison declarations section.
161 * Grammar Rules:: Syntax and usage of the grammar rules section.
162 * Epilogue:: Syntax and usage of the epilogue.
163
164 Defining Language Semantics
165
166 * Value Type:: Specifying one data type for all semantic values.
167 * Multiple Types:: Specifying several alternative data types.
168 * Actions:: An action is the semantic definition of a grammar rule.
169 * Action Types:: Specifying data types for actions to operate on.
170 * Mid-Rule Actions:: Most actions go at the end of a rule.
171 This says when, why and how to use the exceptional
172 action in the middle of a rule.
173
174 Tracking Locations
175
176 * Location Type:: Specifying a data type for locations.
177 * Actions and Locations:: Using locations in actions.
178 * Location Default Action:: Defining a general way to compute locations.
179
180 Bison Declarations
181
182 * Require Decl:: Requiring a Bison version.
183 * Token Decl:: Declaring terminal symbols.
184 * Precedence Decl:: Declaring terminals with precedence and associativity.
185 * Union Decl:: Declaring the set of all semantic value types.
186 * Type Decl:: Declaring the choice of type for a nonterminal symbol.
187 * Initial Action Decl:: Code run before parsing starts.
188 * Destructor Decl:: Declaring how symbols are freed.
189 * Expect Decl:: Suppressing warnings about parsing conflicts.
190 * Start Decl:: Specifying the start symbol.
191 * Pure Decl:: Requesting a reentrant parser.
192 * Push Decl:: Requesting a push parser.
193 * Decl Summary:: Table of all Bison declarations.
194
195 Parser C-Language Interface
196
197 * Parser Function:: How to call `yyparse' and what it returns.
198 * Push Parser Function:: How to call `yypush_parse' and what it returns.
199 * Pull Parser Function:: How to call `yypull_parse' and what it returns.
200 * Parser Create Function:: How to call `yypstate_new' and what it returns.
201 * Parser Delete Function:: How to call `yypstate_delete' and what it returns.
202 * Lexical:: You must supply a function `yylex'
203 which reads tokens.
204 * Error Reporting:: You must supply a function `yyerror'.
205 * Action Features:: Special features for use in actions.
206 * Internationalization:: How to let the parser speak in the user's
207 native language.
208
209 The Lexical Analyzer Function `yylex'
210
211 * Calling Convention:: How `yyparse' calls `yylex'.
212 * Token Values:: How `yylex' must return the semantic value
213 of the token it has read.
214 * Token Locations:: How `yylex' must return the text location
215 (line number, etc.) of the token, if the
216 actions want that.
217 * Pure Calling:: How the calling convention differs in a pure parser
218 (*note A Pure (Reentrant) Parser: Pure Decl.).
219
220 The Bison Parser Algorithm
221
222 * Lookahead:: Parser looks one token ahead when deciding what to do.
223 * Shift/Reduce:: Conflicts: when either shifting or reduction is valid.
224 * Precedence:: Operator precedence works by resolving conflicts.
225 * Contextual Precedence:: When an operator's precedence depends on context.
226 * Parser States:: The parser is a finite-state-machine with stack.
227 * Reduce/Reduce:: When two rules are applicable in the same situation.
228 * Mystery Conflicts:: Reduce/reduce conflicts that look unjustified.
229 * Generalized LR Parsing:: Parsing arbitrary context-free grammars.
230 * Memory Management:: What happens when memory is exhausted. How to avoid it.
231
232 Operator Precedence
233
234 * Why Precedence:: An example showing why precedence is needed.
235 * Using Precedence:: How to specify precedence in Bison grammars.
236 * Precedence Examples:: How these features are used in the previous example.
237 * How Precedence:: How they work.
238
239 Handling Context Dependencies
240
241 * Semantic Tokens:: Token parsing can depend on the semantic context.
242 * Lexical Tie-ins:: Token parsing can depend on the syntactic context.
243 * Tie-in Recovery:: Lexical tie-ins have implications for how
244 error recovery rules must be written.
245
246 Debugging Your Parser
247
248 * Understanding:: Understanding the structure of your parser.
249 * Tracing:: Tracing the execution of your parser.
250
251 Invoking Bison
252
253 * Bison Options:: All the options described in detail,
254 in alphabetical order by short options.
255 * Option Cross Key:: Alphabetical list of long options.
256 * Yacc Library:: Yacc-compatible `yylex' and `main'.
257
258 Parsers Written In Other Languages
259
260 * C++ Parsers:: The interface to generate C++ parser classes
261 * Java Parsers:: The interface to generate Java parser classes
262
263 C++ Parsers
264
265 * C++ Bison Interface:: Asking for C++ parser generation
266 * C++ Semantic Values:: %union vs. C++
267 * C++ Location Values:: The position and location classes
268 * C++ Parser Interface:: Instantiating and running the parser
269 * C++ Scanner Interface:: Exchanges between yylex and parse
270 * A Complete C++ Example:: Demonstrating their use
271
272 A Complete C++ Example
273
274 * Calc++ --- C++ Calculator:: The specifications
275 * Calc++ Parsing Driver:: An active parsing context
276 * Calc++ Parser:: A parser class
277 * Calc++ Scanner:: A pure C++ Flex scanner
278 * Calc++ Top Level:: Conducting the band
279
280 Java Parsers
281
282 * Java Bison Interface:: Asking for Java parser generation
283 * Java Semantic Values:: %type and %token vs. Java
284 * Java Location Values:: The position and location classes
285 * Java Parser Interface:: Instantiating and running the parser
286 * Java Scanner Interface:: Specifying the scanner for the parser
287 * Java Action Features:: Special features for use in actions
288 * Java Differences:: Differences between C/C++ and Java Grammars
289 * Java Declarations Summary:: List of Bison declarations used with Java
290
291 Frequently Asked Questions
292
293 * Memory Exhausted:: Breaking the Stack Limits
294 * How Can I Reset the Parser:: `yyparse' Keeps some State
295 * Strings are Destroyed:: `yylval' Loses Track of Strings
296 * Implementing Gotos/Loops:: Control Flow in the Calculator
297 * Multiple start-symbols:: Factoring closely related grammars
298 * Secure? Conform?:: Is Bison POSIX safe?
299 * I can't build Bison:: Troubleshooting
300 * Where can I find help?:: Troubleshooting
301 * Bug Reports:: Troublereporting
302 * More Languages:: Parsers in C++, Java, and so on
303 * Beta Testing:: Experimenting with development versions
304 * Mailing Lists:: Meeting other Bison users
305
306 Copying This Manual
307
308 * Copying This Manual:: License for copying this manual.
309
310 
311 File: bison.info, Node: Introduction, Next: Conditions, Prev: Top, Up: Top
312
313 Introduction
314 ************
315
316 "Bison" is a general-purpose parser generator that converts an
317 annotated context-free grammar into an LALR(1) or GLR parser for that
318 grammar. Once you are proficient with Bison, you can use it to develop
319 a wide range of language parsers, from those used in simple desk
320 calculators to complex programming languages.
321
322 Bison is upward compatible with Yacc: all properly-written Yacc
323 grammars ought to work with Bison with no change. Anyone familiar with
324 Yacc should be able to use Bison with little trouble. You need to be
325 fluent in C or C++ programming in order to use Bison or to understand
326 this manual.
327
328 We begin with tutorial chapters that explain the basic concepts of
329 using Bison and show three explained examples, each building on the
330 last. If you don't know Bison or Yacc, start by reading these
331 chapters. Reference chapters follow which describe specific aspects of
332 Bison in detail.
333
334 Bison was written primarily by Robert Corbett; Richard Stallman made
335 it Yacc-compatible. Wilfred Hansen of Carnegie Mellon University added
336 multi-character string literals and other features.
337
338 This edition corresponds to version 2.4.1 of Bison.
339
340 
341 File: bison.info, Node: Conditions, Next: Copying, Prev: Introduction, Up: Top
342
343 Conditions for Using Bison
344 **************************
345
346 The distribution terms for Bison-generated parsers permit using the
347 parsers in nonfree programs. Before Bison version 2.2, these extra
348 permissions applied only when Bison was generating LALR(1) parsers in
349 C. And before Bison version 1.24, Bison-generated parsers could be
350 used only in programs that were free software.
351
352 The other GNU programming tools, such as the GNU C compiler, have
353 never had such a requirement. They could always be used for nonfree
354 software. The reason Bison was different was not due to a special
355 policy decision; it resulted from applying the usual General Public
356 License to all of the Bison source code.
357
358 The output of the Bison utility--the Bison parser file--contains a
359 verbatim copy of a sizable piece of Bison, which is the code for the
360 parser's implementation. (The actions from your grammar are inserted
361 into this implementation at one point, but most of the rest of the
362 implementation is not changed.) When we applied the GPL terms to the
363 skeleton code for the parser's implementation, the effect was to
364 restrict the use of Bison output to free software.
365
366 We didn't change the terms because of sympathy for people who want to
367 make software proprietary. *Software should be free.* But we
368 concluded that limiting Bison's use to free software was doing little to
369 encourage people to make other software free. So we decided to make the
370 practical conditions for using Bison match the practical conditions for
371 using the other GNU tools.
372
373 This exception applies when Bison is generating code for a parser.
374 You can tell whether the exception applies to a Bison output file by
375 inspecting the file for text beginning with "As a special
376 exception...". The text spells out the exact terms of the exception.
377
378 
379 File: bison.info, Node: Copying, Next: Concepts, Prev: Conditions, Up: Top
380
381 GNU GENERAL PUBLIC LICENSE
382 **************************
383
384 Version 3, 29 June 2007
385
386 Copyright (C) 2007 Free Software Foundation, Inc. `http://fsf.org/'
387
388 Everyone is permitted to copy and distribute verbatim copies of this
389 license document, but changing it is not allowed.
390
391 Preamble
392 ========
393
394 The GNU General Public License is a free, copyleft license for software
395 and other kinds of works.
396
397 The licenses for most software and other practical works are designed
398 to take away your freedom to share and change the works. By contrast,
399 the GNU General Public License is intended to guarantee your freedom to
400 share and change all versions of a program--to make sure it remains
401 free software for all its users. We, the Free Software Foundation, use
402 the GNU General Public License for most of our software; it applies
403 also to any other work released this way by its authors. You can apply
404 it to your programs, too.
405
406 When we speak of free software, we are referring to freedom, not
407 price. Our General Public Licenses are designed to make sure that you
408 have the freedom to distribute copies of free software (and charge for
409 them if you wish), that you receive source code or can get it if you
410 want it, that you can change the software or use pieces of it in new
411 free programs, and that you know you can do these things.
412
413 To protect your rights, we need to prevent others from denying you
414 these rights or asking you to surrender the rights. Therefore, you
415 have certain responsibilities if you distribute copies of the software,
416 or if you modify it: responsibilities to respect the freedom of others.
417
418 For example, if you distribute copies of such a program, whether
419 gratis or for a fee, you must pass on to the recipients the same
420 freedoms that you received. You must make sure that they, too, receive
421 or can get the source code. And you must show them these terms so they
422 know their rights.
423
424 Developers that use the GNU GPL protect your rights with two steps:
425 (1) assert copyright on the software, and (2) offer you this License
426 giving you legal permission to copy, distribute and/or modify it.
427
428 For the developers' and authors' protection, the GPL clearly explains
429 that there is no warranty for this free software. For both users' and
430 authors' sake, the GPL requires that modified versions be marked as
431 changed, so that their problems will not be attributed erroneously to
432 authors of previous versions.
433
434 Some devices are designed to deny users access to install or run
435 modified versions of the software inside them, although the
436 manufacturer can do so. This is fundamentally incompatible with the
437 aim of protecting users' freedom to change the software. The
438 systematic pattern of such abuse occurs in the area of products for
439 individuals to use, which is precisely where it is most unacceptable.
440 Therefore, we have designed this version of the GPL to prohibit the
441 practice for those products. If such problems arise substantially in
442 other domains, we stand ready to extend this provision to those domains
443 in future versions of the GPL, as needed to protect the freedom of
444 users.
445
446 Finally, every program is threatened constantly by software patents.
447 States should not allow patents to restrict development and use of
448 software on general-purpose computers, but in those that do, we wish to
449 avoid the special danger that patents applied to a free program could
450 make it effectively proprietary. To prevent this, the GPL assures that
451 patents cannot be used to render the program non-free.
452
453 The precise terms and conditions for copying, distribution and
454 modification follow.
455
456 TERMS AND CONDITIONS
457 ====================
458
459 0. Definitions.
460
461 "This License" refers to version 3 of the GNU General Public
462 License.
463
464 "Copyright" also means copyright-like laws that apply to other
465 kinds of works, such as semiconductor masks.
466
467 "The Program" refers to any copyrightable work licensed under this
468 License. Each licensee is addressed as "you". "Licensees" and
469 "recipients" may be individuals or organizations.
470
471 To "modify" a work means to copy from or adapt all or part of the
472 work in a fashion requiring copyright permission, other than the
473 making of an exact copy. The resulting work is called a "modified
474 version" of the earlier work or a work "based on" the earlier work.
475
476 A "covered work" means either the unmodified Program or a work
477 based on the Program.
478
479 To "propagate" a work means to do anything with it that, without
480 permission, would make you directly or secondarily liable for
481 infringement under applicable copyright law, except executing it
482 on a computer or modifying a private copy. Propagation includes
483 copying, distribution (with or without modification), making
484 available to the public, and in some countries other activities as
485 well.
486
487 To "convey" a work means any kind of propagation that enables other
488 parties to make or receive copies. Mere interaction with a user
489 through a computer network, with no transfer of a copy, is not
490 conveying.
491
492 An interactive user interface displays "Appropriate Legal Notices"
493 to the extent that it includes a convenient and prominently visible
494 feature that (1) displays an appropriate copyright notice, and (2)
495 tells the user that there is no warranty for the work (except to
496 the extent that warranties are provided), that licensees may
497 convey the work under this License, and how to view a copy of this
498 License. If the interface presents a list of user commands or
499 options, such as a menu, a prominent item in the list meets this
500 criterion.
501
502 1. Source Code.
503
504 The "source code" for a work means the preferred form of the work
505 for making modifications to it. "Object code" means any
506 non-source form of a work.
507
508 A "Standard Interface" means an interface that either is an
509 official standard defined by a recognized standards body, or, in
510 the case of interfaces specified for a particular programming
511 language, one that is widely used among developers working in that
512 language.
513
514 The "System Libraries" of an executable work include anything,
515 other than the work as a whole, that (a) is included in the normal
516 form of packaging a Major Component, but which is not part of that
517 Major Component, and (b) serves only to enable use of the work
518 with that Major Component, or to implement a Standard Interface
519 for which an implementation is available to the public in source
520 code form. A "Major Component", in this context, means a major
521 essential component (kernel, window system, and so on) of the
522 specific operating system (if any) on which the executable work
523 runs, or a compiler used to produce the work, or an object code
524 interpreter used to run it.
525
526 The "Corresponding Source" for a work in object code form means all
527 the source code needed to generate, install, and (for an executable
528 work) run the object code and to modify the work, including
529 scripts to control those activities. However, it does not include
530 the work's System Libraries, or general-purpose tools or generally
531 available free programs which are used unmodified in performing
532 those activities but which are not part of the work. For example,
533 Corresponding Source includes interface definition files
534 associated with source files for the work, and the source code for
535 shared libraries and dynamically linked subprograms that the work
536 is specifically designed to require, such as by intimate data
537 communication or control flow between those subprograms and other
538 parts of the work.
539
540 The Corresponding Source need not include anything that users can
541 regenerate automatically from other parts of the Corresponding
542 Source.
543
544 The Corresponding Source for a work in source code form is that
545 same work.
546
547 2. Basic Permissions.
548
549 All rights granted under this License are granted for the term of
550 copyright on the Program, and are irrevocable provided the stated
551 conditions are met. This License explicitly affirms your unlimited
552 permission to run the unmodified Program. The output from running
553 a covered work is covered by this License only if the output,
554 given its content, constitutes a covered work. This License
555 acknowledges your rights of fair use or other equivalent, as
556 provided by copyright law.
557
558 You may make, run and propagate covered works that you do not
559 convey, without conditions so long as your license otherwise
560 remains in force. You may convey covered works to others for the
561 sole purpose of having them make modifications exclusively for
562 you, or provide you with facilities for running those works,
563 provided that you comply with the terms of this License in
564 conveying all material for which you do not control copyright.
565 Those thus making or running the covered works for you must do so
566 exclusively on your behalf, under your direction and control, on
567 terms that prohibit them from making any copies of your
568 copyrighted material outside their relationship with you.
569
570 Conveying under any other circumstances is permitted solely under
571 the conditions stated below. Sublicensing is not allowed; section
572 10 makes it unnecessary.
573
574 3. Protecting Users' Legal Rights From Anti-Circumvention Law.
575
576 No covered work shall be deemed part of an effective technological
577 measure under any applicable law fulfilling obligations under
578 article 11 of the WIPO copyright treaty adopted on 20 December
579 1996, or similar laws prohibiting or restricting circumvention of
580 such measures.
581
582 When you convey a covered work, you waive any legal power to forbid
583 circumvention of technological measures to the extent such
584 circumvention is effected by exercising rights under this License
585 with respect to the covered work, and you disclaim any intention
586 to limit operation or modification of the work as a means of
587 enforcing, against the work's users, your or third parties' legal
588 rights to forbid circumvention of technological measures.
589
590 4. Conveying Verbatim Copies.
591
592 You may convey verbatim copies of the Program's source code as you
593 receive it, in any medium, provided that you conspicuously and
594 appropriately publish on each copy an appropriate copyright notice;
595 keep intact all notices stating that this License and any
596 non-permissive terms added in accord with section 7 apply to the
597 code; keep intact all notices of the absence of any warranty; and
598 give all recipients a copy of this License along with the Program.
599
600 You may charge any price or no price for each copy that you convey,
601 and you may offer support or warranty protection for a fee.
602
603 5. Conveying Modified Source Versions.
604
605 You may convey a work based on the Program, or the modifications to
606 produce it from the Program, in the form of source code under the
607 terms of section 4, provided that you also meet all of these
608 conditions:
609
610 a. The work must carry prominent notices stating that you
611 modified it, and giving a relevant date.
612
613 b. The work must carry prominent notices stating that it is
614 released under this License and any conditions added under
615 section 7. This requirement modifies the requirement in
616 section 4 to "keep intact all notices".
617
618 c. You must license the entire work, as a whole, under this
619 License to anyone who comes into possession of a copy. This
620 License will therefore apply, along with any applicable
621 section 7 additional terms, to the whole of the work, and all
622 its parts, regardless of how they are packaged. This License
623 gives no permission to license the work in any other way, but
624 it does not invalidate such permission if you have separately
625 received it.
626
627 d. If the work has interactive user interfaces, each must display
628 Appropriate Legal Notices; however, if the Program has
629 interactive interfaces that do not display Appropriate Legal
630 Notices, your work need not make them do so.
631
632 A compilation of a covered work with other separate and independent
633 works, which are not by their nature extensions of the covered
634 work, and which are not combined with it such as to form a larger
635 program, in or on a volume of a storage or distribution medium, is
636 called an "aggregate" if the compilation and its resulting
637 copyright are not used to limit the access or legal rights of the
638 compilation's users beyond what the individual works permit.
639 Inclusion of a covered work in an aggregate does not cause this
640 License to apply to the other parts of the aggregate.
641
642 6. Conveying Non-Source Forms.
643
644 You may convey a covered work in object code form under the terms
645 of sections 4 and 5, provided that you also convey the
646 machine-readable Corresponding Source under the terms of this
647 License, in one of these ways:
648
649 a. Convey the object code in, or embodied in, a physical product
650 (including a physical distribution medium), accompanied by the
651 Corresponding Source fixed on a durable physical medium
652 customarily used for software interchange.
653
654 b. Convey the object code in, or embodied in, a physical product
655 (including a physical distribution medium), accompanied by a
656 written offer, valid for at least three years and valid for
657 as long as you offer spare parts or customer support for that
658 product model, to give anyone who possesses the object code
659 either (1) a copy of the Corresponding Source for all the
660 software in the product that is covered by this License, on a
661 durable physical medium customarily used for software
662 interchange, for a price no more than your reasonable cost of
663 physically performing this conveying of source, or (2) access
664 to copy the Corresponding Source from a network server at no
665 charge.
666
667 c. Convey individual copies of the object code with a copy of
668 the written offer to provide the Corresponding Source. This
669 alternative is allowed only occasionally and noncommercially,
670 and only if you received the object code with such an offer,
671 in accord with subsection 6b.
672
673 d. Convey the object code by offering access from a designated
674 place (gratis or for a charge), and offer equivalent access
675 to the Corresponding Source in the same way through the same
676 place at no further charge. You need not require recipients
677 to copy the Corresponding Source along with the object code.
678 If the place to copy the object code is a network server, the
679 Corresponding Source may be on a different server (operated
680 by you or a third party) that supports equivalent copying
681 facilities, provided you maintain clear directions next to
682 the object code saying where to find the Corresponding Source.
683 Regardless of what server hosts the Corresponding Source, you
684 remain obligated to ensure that it is available for as long
685 as needed to satisfy these requirements.
686
687 e. Convey the object code using peer-to-peer transmission,
688 provided you inform other peers where the object code and
689 Corresponding Source of the work are being offered to the
690 general public at no charge under subsection 6d.
691
692
693 A separable portion of the object code, whose source code is
694 excluded from the Corresponding Source as a System Library, need
695 not be included in conveying the object code work.
696
697 A "User Product" is either (1) a "consumer product", which means
698 any tangible personal property which is normally used for personal,
699 family, or household purposes, or (2) anything designed or sold for
700 incorporation into a dwelling. In determining whether a product
701 is a consumer product, doubtful cases shall be resolved in favor of
702 coverage. For a particular product received by a particular user,
703 "normally used" refers to a typical or common use of that class of
704 product, regardless of the status of the particular user or of the
705 way in which the particular user actually uses, or expects or is
706 expected to use, the product. A product is a consumer product
707 regardless of whether the product has substantial commercial,
708 industrial or non-consumer uses, unless such uses represent the
709 only significant mode of use of the product.
710
711 "Installation Information" for a User Product means any methods,
712 procedures, authorization keys, or other information required to
713 install and execute modified versions of a covered work in that
714 User Product from a modified version of its Corresponding Source.
715 The information must suffice to ensure that the continued
716 functioning of the modified object code is in no case prevented or
717 interfered with solely because modification has been made.
718
719 If you convey an object code work under this section in, or with,
720 or specifically for use in, a User Product, and the conveying
721 occurs as part of a transaction in which the right of possession
722 and use of the User Product is transferred to the recipient in
723 perpetuity or for a fixed term (regardless of how the transaction
724 is characterized), the Corresponding Source conveyed under this
725 section must be accompanied by the Installation Information. But
726 this requirement does not apply if neither you nor any third party
727 retains the ability to install modified object code on the User
728 Product (for example, the work has been installed in ROM).
729
730 The requirement to provide Installation Information does not
731 include a requirement to continue to provide support service,
732 warranty, or updates for a work that has been modified or
733 installed by the recipient, or for the User Product in which it
734 has been modified or installed. Access to a network may be denied
735 when the modification itself materially and adversely affects the
736 operation of the network or violates the rules and protocols for
737 communication across the network.
738
739 Corresponding Source conveyed, and Installation Information
740 provided, in accord with this section must be in a format that is
741 publicly documented (and with an implementation available to the
742 public in source code form), and must require no special password
743 or key for unpacking, reading or copying.
744
745 7. Additional Terms.
746
747 "Additional permissions" are terms that supplement the terms of
748 this License by making exceptions from one or more of its
749 conditions. Additional permissions that are applicable to the
750 entire Program shall be treated as though they were included in
751 this License, to the extent that they are valid under applicable
752 law. If additional permissions apply only to part of the Program,
753 that part may be used separately under those permissions, but the
754 entire Program remains governed by this License without regard to
755 the additional permissions.
756
757 When you convey a copy of a covered work, you may at your option
758 remove any additional permissions from that copy, or from any part
759 of it. (Additional permissions may be written to require their own
760 removal in certain cases when you modify the work.) You may place
761 additional permissions on material, added by you to a covered work,
762 for which you have or can give appropriate copyright permission.
763
764 Notwithstanding any other provision of this License, for material
765 you add to a covered work, you may (if authorized by the copyright
766 holders of that material) supplement the terms of this License
767 with terms:
768
769 a. Disclaiming warranty or limiting liability differently from
770 the terms of sections 15 and 16 of this License; or
771
772 b. Requiring preservation of specified reasonable legal notices
773 or author attributions in that material or in the Appropriate
774 Legal Notices displayed by works containing it; or
775
776 c. Prohibiting misrepresentation of the origin of that material,
777 or requiring that modified versions of such material be
778 marked in reasonable ways as different from the original
779 version; or
780
781 d. Limiting the use for publicity purposes of names of licensors
782 or authors of the material; or
783
784 e. Declining to grant rights under trademark law for use of some
785 trade names, trademarks, or service marks; or
786
787 f. Requiring indemnification of licensors and authors of that
788 material by anyone who conveys the material (or modified
789 versions of it) with contractual assumptions of liability to
790 the recipient, for any liability that these contractual
791 assumptions directly impose on those licensors and authors.
792
793 All other non-permissive additional terms are considered "further
794 restrictions" within the meaning of section 10. If the Program as
795 you received it, or any part of it, contains a notice stating that
796 it is governed by this License along with a term that is a further
797 restriction, you may remove that term. If a license document
798 contains a further restriction but permits relicensing or
799 conveying under this License, you may add to a covered work
800 material governed by the terms of that license document, provided
801 that the further restriction does not survive such relicensing or
802 conveying.
803
804 If you add terms to a covered work in accord with this section, you
805 must place, in the relevant source files, a statement of the
806 additional terms that apply to those files, or a notice indicating
807 where to find the applicable terms.
808
809 Additional terms, permissive or non-permissive, may be stated in
810 the form of a separately written license, or stated as exceptions;
811 the above requirements apply either way.
812
813 8. Termination.
814
815 You may not propagate or modify a covered work except as expressly
816 provided under this License. Any attempt otherwise to propagate or
817 modify it is void, and will automatically terminate your rights
818 under this License (including any patent licenses granted under
819 the third paragraph of section 11).
820
821 However, if you cease all violation of this License, then your
822 license from a particular copyright holder is reinstated (a)
823 provisionally, unless and until the copyright holder explicitly
824 and finally terminates your license, and (b) permanently, if the
825 copyright holder fails to notify you of the violation by some
826 reasonable means prior to 60 days after the cessation.
827
828 Moreover, your license from a particular copyright holder is
829 reinstated permanently if the copyright holder notifies you of the
830 violation by some reasonable means, this is the first time you have
831 received notice of violation of this License (for any work) from
832 that copyright holder, and you cure the violation prior to 30 days
833 after your receipt of the notice.
834
835 Termination of your rights under this section does not terminate
836 the licenses of parties who have received copies or rights from
837 you under this License. If your rights have been terminated and
838 not permanently reinstated, you do not qualify to receive new
839 licenses for the same material under section 10.
840
841 9. Acceptance Not Required for Having Copies.
842
843 You are not required to accept this License in order to receive or
844 run a copy of the Program. Ancillary propagation of a covered work
845 occurring solely as a consequence of using peer-to-peer
846 transmission to receive a copy likewise does not require
847 acceptance. However, nothing other than this License grants you
848 permission to propagate or modify any covered work. These actions
849 infringe copyright if you do not accept this License. Therefore,
850 by modifying or propagating a covered work, you indicate your
851 acceptance of this License to do so.
852
853 10. Automatic Licensing of Downstream Recipients.
854
855 Each time you convey a covered work, the recipient automatically
856 receives a license from the original licensors, to run, modify and
857 propagate that work, subject to this License. You are not
858 responsible for enforcing compliance by third parties with this
859 License.
860
861 An "entity transaction" is a transaction transferring control of an
862 organization, or substantially all assets of one, or subdividing an
863 organization, or merging organizations. If propagation of a
864 covered work results from an entity transaction, each party to that
865 transaction who receives a copy of the work also receives whatever
866 licenses to the work the party's predecessor in interest had or
867 could give under the previous paragraph, plus a right to
868 possession of the Corresponding Source of the work from the
869 predecessor in interest, if the predecessor has it or can get it
870 with reasonable efforts.
871
872 You may not impose any further restrictions on the exercise of the
873 rights granted or affirmed under this License. For example, you
874 may not impose a license fee, royalty, or other charge for
875 exercise of rights granted under this License, and you may not
876 initiate litigation (including a cross-claim or counterclaim in a
877 lawsuit) alleging that any patent claim is infringed by making,
878 using, selling, offering for sale, or importing the Program or any
879 portion of it.
880
881 11. Patents.
882
883 A "contributor" is a copyright holder who authorizes use under this
884 License of the Program or a work on which the Program is based.
885 The work thus licensed is called the contributor's "contributor
886 version".
887
888 A contributor's "essential patent claims" are all patent claims
889 owned or controlled by the contributor, whether already acquired or
890 hereafter acquired, that would be infringed by some manner,
891 permitted by this License, of making, using, or selling its
892 contributor version, but do not include claims that would be
893 infringed only as a consequence of further modification of the
894 contributor version. For purposes of this definition, "control"
895 includes the right to grant patent sublicenses in a manner
896 consistent with the requirements of this License.
897
898 Each contributor grants you a non-exclusive, worldwide,
899 royalty-free patent license under the contributor's essential
900 patent claims, to make, use, sell, offer for sale, import and
901 otherwise run, modify and propagate the contents of its
902 contributor version.
903
904 In the following three paragraphs, a "patent license" is any
905 express agreement or commitment, however denominated, not to
906 enforce a patent (such as an express permission to practice a
907 patent or covenant not to sue for patent infringement). To
908 "grant" such a patent license to a party means to make such an
909 agreement or commitment not to enforce a patent against the party.
910
911 If you convey a covered work, knowingly relying on a patent
912 license, and the Corresponding Source of the work is not available
913 for anyone to copy, free of charge and under the terms of this
914 License, through a publicly available network server or other
915 readily accessible means, then you must either (1) cause the
916 Corresponding Source to be so available, or (2) arrange to deprive
917 yourself of the benefit of the patent license for this particular
918 work, or (3) arrange, in a manner consistent with the requirements
919 of this License, to extend the patent license to downstream
920 recipients. "Knowingly relying" means you have actual knowledge
921 that, but for the patent license, your conveying the covered work
922 in a country, or your recipient's use of the covered work in a
923 country, would infringe one or more identifiable patents in that
924 country that you have reason to believe are valid.
925
926 If, pursuant to or in connection with a single transaction or
927 arrangement, you convey, or propagate by procuring conveyance of, a
928 covered work, and grant a patent license to some of the parties
929 receiving the covered work authorizing them to use, propagate,
930 modify or convey a specific copy of the covered work, then the
931 patent license you grant is automatically extended to all
932 recipients of the covered work and works based on it.
933
934 A patent license is "discriminatory" if it does not include within
935 the scope of its coverage, prohibits the exercise of, or is
936 conditioned on the non-exercise of one or more of the rights that
937 are specifically granted under this License. You may not convey a
938 covered work if you are a party to an arrangement with a third
939 party that is in the business of distributing software, under
940 which you make payment to the third party based on the extent of
941 your activity of conveying the work, and under which the third
942 party grants, to any of the parties who would receive the covered
943 work from you, a discriminatory patent license (a) in connection
944 with copies of the covered work conveyed by you (or copies made
945 from those copies), or (b) primarily for and in connection with
946 specific products or compilations that contain the covered work,
947 unless you entered into that arrangement, or that patent license
948 was granted, prior to 28 March 2007.
949
950 Nothing in this License shall be construed as excluding or limiting
951 any implied license or other defenses to infringement that may
952 otherwise be available to you under applicable patent law.
953
954 12. No Surrender of Others' Freedom.
955
956 If conditions are imposed on you (whether by court order,
957 agreement or otherwise) that contradict the conditions of this
958 License, they do not excuse you from the conditions of this
959 License. If you cannot convey a covered work so as to satisfy
960 simultaneously your obligations under this License and any other
961 pertinent obligations, then as a consequence you may not convey it
962 at all. For example, if you agree to terms that obligate you to
963 collect a royalty for further conveying from those to whom you
964 convey the Program, the only way you could satisfy both those
965 terms and this License would be to refrain entirely from conveying
966 the Program.
967
968 13. Use with the GNU Affero General Public License.
969
970 Notwithstanding any other provision of this License, you have
971 permission to link or combine any covered work with a work licensed
972 under version 3 of the GNU Affero General Public License into a
973 single combined work, and to convey the resulting work. The terms
974 of this License will continue to apply to the part which is the
975 covered work, but the special requirements of the GNU Affero
976 General Public License, section 13, concerning interaction through
977 a network will apply to the combination as such.
978
979 14. Revised Versions of this License.
980
981 The Free Software Foundation may publish revised and/or new
982 versions of the GNU General Public License from time to time.
983 Such new versions will be similar in spirit to the present
984 version, but may differ in detail to address new problems or
985 concerns.
986
987 Each version is given a distinguishing version number. If the
988 Program specifies that a certain numbered version of the GNU
989 General Public License "or any later version" applies to it, you
990 have the option of following the terms and conditions either of
991 that numbered version or of any later version published by the
992 Free Software Foundation. If the Program does not specify a
993 version number of the GNU General Public License, you may choose
994 any version ever published by the Free Software Foundation.
995
996 If the Program specifies that a proxy can decide which future
997 versions of the GNU General Public License can be used, that
998 proxy's public statement of acceptance of a version permanently
999 authorizes you to choose that version for the Program.
1000
1001 Later license versions may give you additional or different
1002 permissions. However, no additional obligations are imposed on any
1003 author or copyright holder as a result of your choosing to follow a
1004 later version.
1005
1006 15. Disclaimer of Warranty.
1007
1008 THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
1009 APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE
1010 COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS"
1011 WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED,
1012 INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
1013 MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE
1014 RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU.
1015 SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL
1016 NECESSARY SERVICING, REPAIR OR CORRECTION.
1017
1018 16. Limitation of Liability.
1019
1020 IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN
1021 WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES
1022 AND/OR CONVEYS THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU
1023 FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR
1024 CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE
1025 THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA
1026 BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD
1027 PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
1028 PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF
1029 THE POSSIBILITY OF SUCH DAMAGES.
1030
1031 17. Interpretation of Sections 15 and 16.
1032
1033 If the disclaimer of warranty and limitation of liability provided
1034 above cannot be given local legal effect according to their terms,
1035 reviewing courts shall apply local law that most closely
1036 approximates an absolute waiver of all civil liability in
1037 connection with the Program, unless a warranty or assumption of
1038 liability accompanies a copy of the Program in return for a fee.
1039
1040
1041 END OF TERMS AND CONDITIONS
1042 ===========================
1043
1044 How to Apply These Terms to Your New Programs
1045 =============================================
1046
1047 If you develop a new program, and you want it to be of the greatest
1048 possible use to the public, the best way to achieve this is to make it
1049 free software which everyone can redistribute and change under these
1050 terms.
1051
1052 To do so, attach the following notices to the program. It is safest
1053 to attach them to the start of each source file to most effectively
1054 state the exclusion of warranty; and each file should have at least the
1055 "copyright" line and a pointer to where the full notice is found.
1056
1057 ONE LINE TO GIVE THE PROGRAM'S NAME AND A BRIEF IDEA OF WHAT IT DOES.
1058 Copyright (C) YEAR NAME OF AUTHOR
1059
1060 This program is free software: you can redistribute it and/or modify
1061 it under the terms of the GNU General Public License as published by
1062 the Free Software Foundation, either version 3 of the License, or (at
1063 your option) any later version.
1064
1065 This program is distributed in the hope that it will be useful, but
1066 WITHOUT ANY WARRANTY; without even the implied warranty of
1067 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
1068 General Public License for more details.
1069
1070 You should have received a copy of the GNU General Public License
1071 along with this program. If not, see `http://www.gnu.org/licenses/'.
1072
1073 Also add information on how to contact you by electronic and paper
1074 mail.
1075
1076 If the program does terminal interaction, make it output a short
1077 notice like this when it starts in an interactive mode:
1078
1079 PROGRAM Copyright (C) YEAR NAME OF AUTHOR
1080 This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
1081 This is free software, and you are welcome to redistribute it
1082 under certain conditions; type `show c' for details.
1083
1084 The hypothetical commands `show w' and `show c' should show the
1085 appropriate parts of the General Public License. Of course, your
1086 program's commands might be different; for a GUI interface, you would
1087 use an "about box".
1088
1089 You should also get your employer (if you work as a programmer) or
1090 school, if any, to sign a "copyright disclaimer" for the program, if
1091 necessary. For more information on this, and how to apply and follow
1092 the GNU GPL, see `http://www.gnu.org/licenses/'.
1093
1094 The GNU General Public License does not permit incorporating your
1095 program into proprietary programs. If your program is a subroutine
1096 library, you may consider it more useful to permit linking proprietary
1097 applications with the library. If this is what you want to do, use the
1098 GNU Lesser General Public License instead of this License. But first,
1099 please read `http://www.gnu.org/philosophy/why-not-lgpl.html'.
1100
1101 
1102 File: bison.info, Node: Concepts, Next: Examples, Prev: Copying, Up: Top
1103
1104 1 The Concepts of Bison
1105 ***********************
1106
1107 This chapter introduces many of the basic concepts without which the
1108 details of Bison will not make sense. If you do not already know how to
1109 use Bison or Yacc, we suggest you start by reading this chapter
1110 carefully.
1111
1112 * Menu:
1113
1114 * Language and Grammar:: Languages and context-free grammars,
1115 as mathematical ideas.
1116 * Grammar in Bison:: How we represent grammars for Bison's sake.
1117 * Semantic Values:: Each token or syntactic grouping can have
1118 a semantic value (the value of an integer,
1119 the name of an identifier, etc.).
1120 * Semantic Actions:: Each rule can have an action containing C code.
1121 * GLR Parsers:: Writing parsers for general context-free languages.
1122 * Locations Overview:: Tracking Locations.
1123 * Bison Parser:: What are Bison's input and output,
1124 how is the output used?
1125 * Stages:: Stages in writing and running Bison grammars.
1126 * Grammar Layout:: Overall structure of a Bison grammar file.
1127
1128 
1129 File: bison.info, Node: Language and Grammar, Next: Grammar in Bison, Up: Concepts
1130
1131 1.1 Languages and Context-Free Grammars
1132 =======================================
1133
1134 In order for Bison to parse a language, it must be described by a
1135 "context-free grammar". This means that you specify one or more
1136 "syntactic groupings" and give rules for constructing them from their
1137 parts. For example, in the C language, one kind of grouping is called
1138 an `expression'. One rule for making an expression might be, "An
1139 expression can be made of a minus sign and another expression".
1140 Another would be, "An expression can be an integer". As you can see,
1141 rules are often recursive, but there must be at least one rule which
1142 leads out of the recursion.
1143
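As an illustration only, the two informal rules above can be written as a
small hand-coded recognizer in C. This is a recursive-descent sketch, not
the table-driven method Bison itself generates; the function names
`match_expression' and `is_expression' are hypothetical, and an "integer"
is assumed to be a run of decimal digits.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: tries to match one `expression' at *p.
   Returns 1 and advances *p past the expression on success,
   returns 0 if neither rule applies. */
static int match_expression(const char **p)
{
    assert(p != NULL && *p != NULL);

    /* Rule 1: an expression can be a minus sign and another expression. */
    if (**p == '-') {
        ++*p;
        return match_expression(p);   /* the recursive rule */
    }

    /* Rule 2: an expression can be an integer.  This is the rule that
       leads out of the recursion. */
    if (isdigit((unsigned char)**p)) {
        while (isdigit((unsigned char)**p))
            ++*p;
        return 1;
    }

    return 0;
}

/* Returns 1 if the whole string S is exactly one expression. */
int is_expression(const char *s)
{
    const char *p = s;
    return match_expression(&p) && *p == '\0';
}
```

Without the second rule the recursion in `match_expression' could never
end successfully, which is why at least one non-recursive rule is needed.
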
1144 The most common formal system for presenting such rules for humans
1145 to read is "Backus-Naur Form" or "BNF", which was developed in order to
1146 specify the language Algol 60. Any grammar expressed in BNF is a
1147 context-free grammar. The input to Bison is essentially
1148 machine-readable BNF.
1149
1150 There are various important subclasses of context-free grammar.
1151 Although it can handle almost all context-free grammars, Bison is
1152 optimized for what are called LALR(1) grammars. In brief, in these
1153 grammars, it must be possible to tell how to parse any portion of an
1154 input string with just a single token of lookahead. Strictly speaking,
1155 that is a description of an LR(1) grammar, and LALR(1) involves
1156 additional restrictions that are hard to explain simply; but it is rare
1157 in actual practice to find an LR(1) grammar that fails to be LALR(1).
1158 *Note Mysterious Reduce/Reduce Conflicts: Mystery Conflicts, for more
1159 information on this.
1160
1161 Parsers for LALR(1) grammars are "deterministic", meaning roughly
1162 that the next grammar rule to apply at any point in the input is
1163 uniquely determined by the preceding input and a fixed, finite portion
1164 (called a "lookahead") of the remaining input. A context-free grammar
1165 can be "ambiguous", meaning that there are multiple ways to apply the
1166 grammar rules to get the same inputs. Even unambiguous grammars can be
1167 "nondeterministic", meaning that no fixed lookahead always suffices to
1168 determine the next grammar rule to apply. With the proper
1169 declarations, Bison is also able to parse these more general
1170 context-free grammars, using a technique known as GLR parsing (for
1171 Generalized LR). Bison's GLR parsers are able to handle any
1172 context-free grammar for which the number of possible parses of any
1173 given string is finite.
1174
1175 In the formal grammatical rules for a language, each kind of
1176 syntactic unit or grouping is named by a "symbol". Those which are
1177 built by grouping smaller constructs according to grammatical rules are
1178 called "nonterminal symbols"; those which can't be subdivided are called
1179 "terminal symbols" or "token types". We call a piece of input
1180 corresponding to a single terminal symbol a "token", and a piece
1181 corresponding to a single nonterminal symbol a "grouping".
1182
1183 We can use the C language as an example of what symbols, terminal and
1184 nonterminal, mean. The tokens of C are identifiers, constants (numeric
1185 and string), and the various keywords, arithmetic operators and
1186 punctuation marks. So the terminal symbols of a grammar for C include
1187 `identifier', `number', `string', plus one symbol for each keyword,
1188 operator or punctuation mark: `if', `return', `const', `static', `int',
1189 `char', `plus-sign', `open-brace', `close-brace', `comma' and many more.
1190 (These tokens can be subdivided into characters, but that is a matter of
1191 lexicography, not grammar.)
1192
1193 Here is a simple C function subdivided into tokens:
1194
1195 int /* keyword `int' */
1196 square (int x) /* identifier, open-paren, keyword `int',
1197 identifier, close-paren */
1198 { /* open-brace */
1199 return x * x; /* keyword `return', identifier, asterisk,
1200 identifier, semicolon */
1201 } /* close-brace */
1202
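The subdivision shown above could be carried out by a small scanner along
the following lines. This is a minimal sketch under stated assumptions:
the names `token_kind' and `next_token' are hypothetical, the keyword
list is abbreviated, every operator or punctuation mark is taken to be a
single character, and comments and string constants are not handled.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

enum token_kind { TOK_END, TOK_KEYWORD, TOK_IDENTIFIER, TOK_NUMBER, TOK_PUNCT };

/* Reads the next token starting at *p, copies its text into BUF
   (truncating if it does not fit), advances *p, and returns its kind. */
enum token_kind next_token(const char **p, char *buf, size_t size)
{
    /* Abbreviated keyword list, for illustration only. */
    static const char *const keywords[] = { "int", "char", "if", "return" };
    size_t n = 0, i;

    assert(p != NULL && *p != NULL && buf != NULL && size >= 2);

    while (isspace((unsigned char)**p))      /* skip white space */
        ++*p;
    if (**p == '\0') {
        buf[0] = '\0';
        return TOK_END;
    }

    if (isalpha((unsigned char)**p) || **p == '_') {
        while (isalnum((unsigned char)**p) || **p == '_') {
            if (n + 1 < size)
                buf[n++] = **p;
            ++*p;
        }
        buf[n] = '\0';
        for (i = 0; i < sizeof keywords / sizeof keywords[0]; i++)
            if (strcmp(buf, keywords[i]) == 0)
                return TOK_KEYWORD;
        return TOK_IDENTIFIER;
    }

    if (isdigit((unsigned char)**p)) {
        while (isdigit((unsigned char)**p)) {
            if (n + 1 < size)
                buf[n++] = **p;
            ++*p;
        }
        buf[n] = '\0';
        return TOK_NUMBER;
    }

    buf[0] = *(*p)++;                        /* single-character token */
    buf[1] = '\0';
    return TOK_PUNCT;
}
```

Calling `next_token' repeatedly on the text of the function above yields
the same sequence of keywords, identifiers and punctuation marks that the
comments list.
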
1203 The syntactic groupings of C include the expression, the statement,
1204 the declaration, and the function definition. These are represented in
1205 the grammar of C by nonterminal symbols `expression', `statement',
1206 `declaration' and `function definition'. The full grammar uses dozens
1207 of additional language constructs, each with its own nonterminal
1208 symbol, in order to express the meanings of these four. The example
1209 above is a function definition; it contains one declaration, and one
1210 statement. In the statement, each `x' is an expression and so is `x *
1211 x'.
1212
1213 Each nonterminal symbol must have grammatical rules showing how it
1214 is made out of simpler constructs. For example, one kind of C
1215 statement is the `return' statement; this would be described with a
1216 grammar rule which reads informally as follows:
1217
1218 A `statement' can be made of a `return' keyword, an `expression'
1219 and a `semicolon'.
1220
1221 There would be many other rules for `statement', one for each kind of
1222 statement in C.
1223
1224 One nonterminal symbol must be distinguished as the special one which
1225 defines a complete utterance in the language. It is called the "start
1226 symbol". In a compiler, this means a complete input program. In the C
1227 language, the nonterminal symbol `sequence of definitions and
1228 declarations' plays this role.
1229
1230 For example, `1 + 2' is a valid C expression--a valid part of a C
1231 program--but it is not valid as an _entire_ C program. In the
1232 context-free grammar of C, this follows from the fact that `expression'
1233 is not the start symbol.
1234
1235 The Bison parser reads a sequence of tokens as its input, and groups
1236 the tokens using the grammar rules. If the input is valid, the end
1237 result is that the entire token sequence reduces to a single grouping
1238 whose symbol is the grammar's start symbol. If we use a grammar for C,
1239 the entire input must be a `sequence of definitions and declarations'.
1240 If not, the parser reports a syntax error.
1241
1242 
1243 File: bison.info, Node: Grammar in Bison, Next: Semantic Values, Prev: Language and Grammar, Up: Concepts
1244
1245 1.2 From Formal Rules to Bison Input
1246 ====================================
1247
1248 A formal grammar is a mathematical construct. To define the language
1249 for Bison, you must write a file expressing the grammar in Bison syntax:
1250 a "Bison grammar" file. *Note Bison Grammar Files: Grammar File.
1251
1252 A nonterminal symbol in the formal grammar is represented in Bison
1253 input as an identifier, like an identifier in C. By convention, it
1254 should be in lower case, such as `expr', `stmt' or `declaration'.
1255
1256 The Bison representation for a terminal symbol is also called a
1257 "token type". Token types as well can be represented as C-like
1258 identifiers. By convention, these identifiers should be upper case to
1259 distinguish them from nonterminals: for example, `INTEGER',
1260 `IDENTIFIER', `IF' or `RETURN'. A terminal symbol that stands for a
1261 particular keyword in the language should be named after that keyword
1262 converted to upper case. The terminal symbol `error' is reserved for
1263 error recovery. *Note Symbols::.
1264
1265 A terminal symbol can also be represented as a character literal,
1266 just like a C character constant. You should do this whenever a token
1267 is just a single character (parenthesis, plus-sign, etc.): use that
1268 same character in a literal as the terminal symbol for that token.
1269
1270 A third way to represent a terminal symbol is with a C string
1271 constant containing several characters. *Note Symbols::, for more
1272 information.
1273
1274 The grammar rules also have an expression in Bison syntax. For
1275 example, here is the Bison rule for a C `return' statement. The
1276 semicolon in quotes is a literal character token, representing part of
1277 the C syntax for the statement; the naked semicolon, and the colon, are
1278 Bison punctuation used in every rule.
1279
1280      stmt:   RETURN expr ';'
1281              ;
1282
1283 *Note Syntax of Grammar Rules: Rules.
1284
1285 
1286 File: bison.info, Node: Semantic Values, Next: Semantic Actions, Prev: Grammar in Bison, Up: Concepts
1287
1288 1.3 Semantic Values
1289 ===================
1290
1291 A formal grammar selects tokens only by their classifications: for
1292 example, if a rule mentions the terminal symbol `integer constant', it
1293 means that _any_ integer constant is grammatically valid in that
1294 position. The precise value of the constant is irrelevant to how to
1295 parse the input: if `x+4' is grammatical then `x+1' or `x+3989' is
1296 equally grammatical.
1297
1298 But the precise value is very important for what the input means
1299 once it is parsed. A compiler is useless if it fails to distinguish
1300 between 4, 1 and 3989 as constants in the program! Therefore, each
1301 token in a Bison grammar has both a token type and a "semantic value".
1302 *Note Defining Language Semantics: Semantics, for details.
1303
1304 The token type is a terminal symbol defined in the grammar, such as
1305 `INTEGER', `IDENTIFIER' or `',''. It tells everything you need to know
1306 to decide where the token may validly appear and how to group it with
1307 other tokens. The grammar rules know nothing about tokens except their
1308 types.
1309
1310 The semantic value has all the rest of the information about the
1311 meaning of the token, such as the value of an integer, or the name of an
1312 identifier. (A token such as `','' which is just punctuation doesn't
1313 need to have any semantic value.)
1314
1315 For example, an input token might be classified as token type
1316 `INTEGER' and have the semantic value 4. Another input token might
1317 have the same token type `INTEGER' but value 3989. When a grammar rule
1318 says that `INTEGER' is allowed, either of these tokens is acceptable
1319 because each is an `INTEGER'. When the parser accepts the token, it
1320 keeps track of the token's semantic value.
1321
1322 Each grouping can also have a semantic value as well as its
1323 nonterminal symbol. For example, in a calculator, an expression
1324 typically has a semantic value that is a number. In a compiler for a
1325 programming language, an expression typically has a semantic value that
1326 is a tree structure describing the meaning of the expression.
1327
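The split between token type and semantic value can be pictured with a
small C structure. This is only an illustration under our own
assumptions: `struct token', `make_integer', `same_for_grammar' and the
numeric codes are hypothetical names, not what a Bison parser defines.

```c
/* Hypothetical token type codes for this sketch.  */
enum token_type { INTEGER = 258, IDENTIFIER = 259 };

/* A token as described above: a type plus a semantic value.  */
struct token
{
  enum token_type type;     /* all that the grammar rules look at */
  union
  {
    int number;             /* semantic value of an INTEGER */
    char const *name;       /* semantic value of an IDENTIFIER */
  } value;
};

/* Build an INTEGER token whose semantic value is N.  */
static struct token
make_integer (int n)
{
  struct token t;
  t.type = INTEGER;
  t.value.number = n;
  return t;
}

/* Two tokens are interchangeable for parsing when their types match,
   whatever their semantic values are.  */
static int
same_for_grammar (struct token a, struct token b)
{
  return a.type == b.type;
}
```

With these definitions, the tokens for 4 and 3989 compare equal as far
as the grammar is concerned, yet carry different values.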
1328 
1329 File: bison.info, Node: Semantic Actions, Next: GLR Parsers, Prev: Semantic Values, Up: Concepts
1330
1331 1.4 Semantic Actions
1332 ====================
1333
1334 In order to be useful, a program must do more than parse input; it must
1335 also produce some output based on the input. In a Bison grammar, a
1336 grammar rule can have an "action" made up of C statements. Each time
1337 the parser recognizes a match for that rule, the action is executed.
1338 *Note Actions::.
1339
1340 Most of the time, the purpose of an action is to compute the
1341 semantic value of the whole construct from the semantic values of its
1342 parts. For example, suppose we have a rule which says an expression
1343 can be the sum of two expressions. When the parser recognizes such a
1344 sum, each of the subexpressions has a semantic value which describes
1345 how it was built up. The action for this rule should create a similar
1346 sort of value for the newly recognized larger expression.
1347
1348 For example, here is a rule that says an expression can be the sum of
1349 two subexpressions:
1350
1351      expr: expr '+' expr   { $$ = $1 + $3; }
1352              ;
1353
1354 The action says how to produce the semantic value of the sum expression
1355 from the values of the two subexpressions.
1356
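One way to picture what the action does is to model the semantic values
as a stack. The sketch below is a simplification, assuming `double'
values and a fixed-size array; `struct value_stack', `push' and
`reduce_sum' are hypothetical names and do not reflect the generated
parser's internals.

```c
/* Assumption: semantic values are doubles, as in a simple calculator.  */
typedef double value;

/* Hypothetical stand-in for the parser's stack of semantic values.  */
struct value_stack
{
  value items[16];
  int depth;
};

static void
push (struct value_stack *s, value v)
{
  s->items[s->depth++] = v;
}

/* Reduce by `expr: expr '+' expr'.  The three right-hand-side values
   are on top of the stack and play the roles of $1, $2 and $3.  Run
   the action, pop them, and push the result as $$.  */
static void
reduce_sum (struct value_stack *s)
{
  value *rhs = &s->items[s->depth - 3];
  value result = rhs[0] + rhs[2];       /* $$ = $1 + $3; */
  s->depth -= 3;
  push (s, result);
}
```

After the reduction, a single value remains where three were: the value
of the newly recognized larger expression.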
1357 
1358 File: bison.info, Node: GLR Parsers, Next: Locations Overview, Prev: Semantic Actions, Up: Concepts
1359
1360 1.5 Writing GLR Parsers
1361 =======================
1362
1363 In some grammars, Bison's standard LALR(1) parsing algorithm cannot
1364 decide whether to apply a certain grammar rule at a given point. That
1365 is, it may not be able to decide (on the basis of the input read so
1366 far) which of two possible reductions (applications of a grammar rule)
1367 applies, or whether to apply a reduction or read more of the input and
1368 apply a reduction later in the input. These are known respectively as
1369 "reduce/reduce" conflicts (*note Reduce/Reduce::), and "shift/reduce"
1370 conflicts (*note Shift/Reduce::).
1371
1372 To use a grammar that is not easily modified to be LALR(1), a more
1373 general parsing algorithm is sometimes necessary. If you include
1374 `%glr-parser' among the Bison declarations in your file (*note Grammar
1375 Outline::), the result is a Generalized LR (GLR) parser. These parsers
1376 handle Bison grammars that contain no unresolved conflicts (i.e., after
1377 applying precedence declarations) identically to LALR(1) parsers.
1378 However, when faced with unresolved shift/reduce and reduce/reduce
1379 conflicts, GLR parsers use the simple expedient of doing both,
1380 effectively cloning the parser to follow both possibilities. Each of
1381 the resulting parsers can again split, so that at any given time, there
1382 can be any number of possible parses being explored. The parsers
1383 proceed in lockstep; that is, all of them consume (shift) a given input
1384 symbol before any of them proceed to the next. Each of the cloned
1385 parsers eventually meets one of two possible fates: either it runs into
1386 a parsing error, in which case it simply vanishes, or it merges with
1387 another parser, because the two of them have reduced the input to an
1388 identical set of symbols.
1389
1390 During the time that there are multiple parsers, semantic actions are
1391 recorded, but not performed. When a parser disappears, its recorded
1392 semantic actions disappear as well, and are never performed. When a
1393 reduction makes two parsers identical, causing them to merge, Bison
1394 records both sets of semantic actions. Whenever the last two parsers
1395 merge, reverting to the single-parser case, Bison resolves all the
1396 outstanding actions either by precedences given to the grammar rules
1397 involved, or by performing both actions, and then calling a designated
1398 user-defined function on the resulting values to produce an arbitrary
1399 merged result.
1400
1401 * Menu:
1402
1403 * Simple GLR Parsers:: Using GLR parsers on unambiguous grammars.
1404 * Merging GLR Parses:: Using GLR parsers to resolve ambiguities.
1405 * GLR Semantic Actions:: Deferred semantic actions have special concerns.
1406 * Compiler Requirements:: GLR parsers require a modern C compiler.
1407
1408 
1409 File: bison.info, Node: Simple GLR Parsers, Next: Merging GLR Parses, Up: GLR Parsers
1410
1411 1.5.1 Using GLR on Unambiguous Grammars
1412 ---------------------------------------
1413
1414 In the simplest cases, you can use the GLR algorithm to parse grammars
1415 that are unambiguous, but fail to be LALR(1). Such grammars typically
1416 require more than one symbol of lookahead, or (in rare cases) fall into
1417 the category of grammars in which the LALR(1) algorithm throws away too
1418 much information (they are in LR(1), but not LALR(1), *Note Mystery
1419 Conflicts::).
1420
1421 Consider a problem that arises in the declaration of enumerated and
1422 subrange types in the programming language Pascal. Here are some
1423 examples:
1424
1425 type subrange = lo .. hi;
1426 type enum = (a, b, c);
1427
1428 The original language standard allows only numeric literals and
1429 constant identifiers for the subrange bounds (`lo' and `hi'), but
1430 Extended Pascal (ISO/IEC 10206) and many other Pascal implementations
1431 allow arbitrary expressions there. This gives rise to the following
1432 situation, containing a superfluous pair of parentheses:
1433
1434 type subrange = (a) .. b;
1435
1436 Compare this to the following declaration of an enumerated type with
1437 only one value:
1438
1439 type enum = (a);
1440
1441 (These declarations are contrived, but they are syntactically valid,
1442 and more-complicated cases can come up in practical programs.)
1443
1444 These two declarations look identical until the `..' token. With
1445 normal LALR(1) one-token lookahead it is not possible to decide between
1446 the two forms when the identifier `a' is parsed. It is, however,
1447 desirable for a parser to decide this, since in the latter case `a'
1448 must become a new identifier to represent the enumeration value, while
1449 in the former case `a' must be evaluated with its current meaning,
1450 which may be a constant or even a function call.
1451
1452 You could parse `(a)' as an "unspecified identifier in parentheses",
1453 to be resolved later, but this typically requires substantial
1454 contortions in both semantic actions and large parts of the grammar,
1455 where the parentheses are nested in the recursive rules for expressions.
1456
1457 You might think of using the lexer to distinguish between the two
1458 forms by returning different tokens for currently defined and undefined
1459 identifiers. But if these declarations occur in a local scope, and `a'
1460 is defined in an outer scope, then both forms are possible--either
1461 locally redefining `a', or using the value of `a' from the outer scope.
1462 So this approach cannot work.
1463
1464 A simple solution to this problem is to declare the parser to use
1465 the GLR algorithm. When the GLR parser reaches the critical state, it
1466 merely splits into two branches and pursues both syntax rules
1467 simultaneously. Sooner or later, one of them runs into a parsing
1468 error. If there is a `..' token before the next `;', the rule for
1469 enumerated types fails since it cannot accept `..' anywhere; otherwise,
1470 the subrange type rule fails since it requires a `..' token. So one of
1471 the branches fails silently, and the other one continues normally,
1472 performing all the intermediate actions that were postponed during the
1473 split.
1474
1475 If the input is syntactically incorrect, both branches fail and the
1476 parser reports a syntax error as usual.
1477
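The outcome of the split can be sketched in C. The function below is
not how a GLR parser works internally: it only reports, for
syntactically valid input, which of the two branches would still be
alive at the `;'. The token codes and the names `surviving_branch',
`ENUM_BRANCH' and `SUBRANGE_BRANCH' are assumptions made for this
example.

```c
/* Hypothetical token codes for this sketch only.  */
enum tok
{
  TOK_END = 0, TOK_ID, TOK_DOTDOT, TOK_LPAREN, TOK_RPAREN,
  TOK_COMMA, TOK_SEMI
};

enum branch { ENUM_BRANCH, SUBRANGE_BRANCH };

/* TOKS holds the tokens that follow `=' in a type declaration.
   Assuming the input is valid, return the branch that survives.  */
static enum branch
surviving_branch (enum tok const *toks)
{
  int i;
  for (i = 0; toks[i] != TOK_SEMI && toks[i] != TOK_END; i++)
    if (toks[i] == TOK_DOTDOT)
      /* The enumerated-type rule cannot accept `..' anywhere.  */
      return SUBRANGE_BRANCH;
  /* The subrange rule never saw the `..' it requires.  */
  return ENUM_BRANCH;
}
```

A real GLR parser reaches the same decision without scanning ahead: it
keeps both branches until one of them hits a parsing error.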
1478 The effect of all this is that the parser seems to "guess" the
1479 correct branch to take, or in other words, it seems to use more
1480 lookahead than the underlying LALR(1) algorithm actually allows for.
1481 In this example, LALR(2) would suffice, but some cases that are
1482 not LALR(k) for any k can also be handled this way.
1483
1484 In general, a GLR parser can take quadratic or cubic worst-case time,
1485 and the current Bison parser even takes exponential time and space for
1486 some grammars. In practice, this rarely happens, and for many grammars
1487 it is possible to prove that it cannot happen. The present example
1488 contains only one conflict between two rules, and the type-declaration
1489 context containing the conflict cannot be nested. So the number of
1490 branches that can exist at any time is limited by the constant 2, and
1491 the parsing time is still linear.
1492
1493 Here is a Bison grammar corresponding to the example above. It
1494 parses a vastly simplified form of Pascal type declarations.
1495
1496      %token TYPE DOTDOT ID
1497
1498      %left '+' '-'
1499      %left '*' '/'
1500
1501      %%
1502
1503      type_decl : TYPE ID '=' type ';'
1504           ;
1505
1506      type : '(' id_list ')'
1507           | expr DOTDOT expr
1508           ;
1509
1510      id_list : ID
1511           | id_list ',' ID
1512           ;
1513
1514      expr : '(' expr ')'
1515           | expr '+' expr
1516           | expr '-' expr
1517           | expr '*' expr
1518           | expr '/' expr
1519           | ID
1520           ;
1521
1522 When used as a normal LALR(1) grammar, Bison correctly complains
1523 about one reduce/reduce conflict. In the conflicting situation the
1524 parser chooses one of the alternatives, arbitrarily the one declared
1525 first. Therefore the following correct input is not recognized:
1526
1527 type t = (a) .. b;
1528
1529 The parser can be turned into a GLR parser, while also telling Bison
1530 to be silent about the one known reduce/reduce conflict, by adding
1531 these two declarations to the Bison input file (before the first `%%'):
1532
1533 %glr-parser
1534 %expect-rr 1
1535
1536 No change in the grammar itself is required. Now the parser recognizes
1537 all valid declarations, according to the limited syntax above,
1538 transparently. In fact, the user does not even notice when the parser
1539 splits.
1540
1541 So here we have a case where we can use the benefits of GLR, almost
1542 without disadvantages. Even in simple cases like this, however, there
1543 are at least two potential problems to beware of. First, always analyze
1544 the conflicts reported by Bison to make sure that GLR splitting is only
1545 done where it is intended. A GLR parser splitting inadvertently may
1546 cause problems less obvious than an LALR parser statically choosing the
1547 wrong alternative in a conflict. Second, consider interactions with
1548 the lexer (*note Semantic Tokens::) with great care. Since a split
1549 parser consumes tokens without performing any actions during the split,
1550 the lexer cannot obtain information via parser actions. Some cases of
1551 lexer interactions can be eliminated by using GLR to shift the
1552 complications from the lexer to the parser. You must check the
1553 remaining cases for correctness.
1554
1555 In our example, it would be safe for the lexer to return tokens
1556 based on their current meanings in some symbol table, because no new
1557 symbols are defined in the middle of a type declaration. Though it is
1558 possible for a parser to define the enumeration constants as they are
1559 parsed, before the type declaration is completed, it actually makes no
1560 difference since they cannot be used within the same enumerated type
1561 declaration.
1562
1563 
1564 File: bison.info, Node: Merging GLR Parses, Next: GLR Semantic Actions, Prev: Simple GLR Parsers, Up: GLR Parsers
1565
1566 1.5.2 Using GLR to Resolve Ambiguities
1567 --------------------------------------
1568
1569 Let's consider an example, vastly simplified from a C++ grammar.
1570
1571      %{
1572        #include <stdio.h>
1573        #define YYSTYPE char const *
1574        int yylex (void);
1575        void yyerror (char const *);
1576      %}
1577
1578      %token TYPENAME ID
1579
1580      %right '='
1581      %left '+'
1582
1583      %glr-parser
1584
1585      %%
1586
1587      prog :
1588           | prog stmt   { printf ("\n"); }
1589           ;
1590
1591      stmt : expr ';'  %dprec 1
1592           | decl      %dprec 2
1593           ;
1594
1595      expr : ID               { printf ("%s ", $$); }
1596           | TYPENAME '(' expr ')'
1597                              { printf ("%s <cast> ", $1); }
1598           | expr '+' expr    { printf ("+ "); }
1599           | expr '=' expr    { printf ("= "); }
1600           ;
1601
1602      decl : TYPENAME declarator ';'
1603                              { printf ("%s <declare> ", $1); }
1604           | TYPENAME declarator '=' expr ';'
1605                              { printf ("%s <init-declare> ", $1); }
1606           ;
1607
1608      declarator : ID         { printf ("\"%s\" ", $1); }
1609           | '(' declarator ')'
1610           ;
1611
1612 This models a problematic part of the C++ grammar--the ambiguity between
1613 certain declarations and statements. For example,
1614
1615 T (x) = y+z;
1616
1617 parses as either an `expr' or a `decl' (assuming that `T' is recognized
1618 as a `TYPENAME' and `x' as an `ID'). Bison detects this as a
1619 reduce/reduce conflict between the rules `expr : ID' and `declarator :
1620 ID', which it cannot resolve at the time it encounters `x' in the
1621 example above. Since this is a GLR parser, it therefore splits the
1622 problem into two parses, one for each choice of resolving the
1623 reduce/reduce conflict. Unlike the example from the previous section
1624 (*note Simple GLR Parsers::), however, neither of these parses "dies,"
1625 because the grammar as it stands is ambiguous. One of the parsers
1626 eventually reduces `stmt : expr ';'' and the other reduces `stmt :
1627 decl', after which both parsers are in an identical state: they've seen
1628 `prog stmt' and have the same unprocessed input remaining. We say that
1629 these parses have "merged."
1630
1631 At this point, the GLR parser requires a specification in the
1632 grammar of how to choose between the competing parses. In the example
1633 above, the two `%dprec' declarations specify that Bison is to give
1634 precedence to the parse that interprets the example as a `decl', which
1635 implies that `x' is a declarator. The parser therefore prints
1636
1637 "x" y z + T <init-declare>
1638
1639 The `%dprec' declarations only come into play when more than one
1640 parse survives. Consider a different input string for this parser:
1641
1642 T (x) + y;
1643
1644 This is another example of using GLR to parse an unambiguous construct,
1645 as shown in the previous section (*note Simple GLR Parsers::). Here,
1646 there is no ambiguity (this cannot be parsed as a declaration).
1647 However, at the time the Bison parser encounters `x', it does not have
1648 enough information to resolve the reduce/reduce conflict (again,
1649 between `x' as an `expr' or a `declarator'). In this case, no
1650 precedence declaration is used. Again, the parser splits into two, one
1651 assuming that `x' is an `expr', and the other assuming `x' is a
1652 `declarator'. The second of these parsers then vanishes when it sees
1653 `+', and the parser prints
1654
1655 x T <cast> y +
1656
1657 Suppose that instead of resolving the ambiguity, you wanted to see
1658 all the possibilities. For this purpose, you must merge the semantic
1659 actions of the two possible parsers, rather than choosing one over the
1660 other. To do so, you could change the declaration of `stmt' as follows:
1661
1662      stmt : expr ';'  %merge <stmtMerge>
1663           | decl      %merge <stmtMerge>
1664           ;
1665
1666 and define the `stmtMerge' function as:
1667
1668      static YYSTYPE
1669      stmtMerge (YYSTYPE x0, YYSTYPE x1)
1670      {
1671        printf ("<OR> ");
1672        return "";
1673      }
1674
1675 with an accompanying forward declaration in the C declarations at the
1676 beginning of the file:
1677
1678      %{
1679        #define YYSTYPE char const *
1680        static YYSTYPE stmtMerge (YYSTYPE x0, YYSTYPE x1);
1681      %}
1682
1683 With these declarations, the resulting parser parses the first example
1684 as both an `expr' and a `decl', and prints
1685
1686 "x" y z + T <init-declare> x T <cast> y z + = <OR>
1687
1688 Bison requires that all of the productions that participate in any
1689 particular merge have identical `%merge' clauses. Otherwise, the
1690 ambiguity is unresolvable, and the parser reports an error
1691 during any parse that results in the offending merge.
1692
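A merge function is free to combine its two arguments rather than
discard them. As a hedged variation on `stmtMerge', the sketch below
returns a new string holding both alternatives. The name
`stmtMergeBoth' and the `<x0|x1>' output format are our own inventions,
and the sketch makes no claim about which parse Bison passes as `x0' or
`x1'.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define YYSTYPE char const *

/* Hypothetical merge function: keep both alternatives in one string
   of the form "<x0|x1>".  The result is allocated with malloc, so
   the caller is responsible for freeing it.  */
static YYSTYPE
stmtMergeBoth (YYSTYPE x0, YYSTYPE x1)
{
  /* Room for '<', '|', '>' and the terminating null byte.  */
  size_t size = strlen (x0) + strlen (x1) + 4;
  char *merged = malloc (size);
  if (!merged)
    abort ();
  sprintf (merged, "<%s|%s>", x0, x1);
  return merged;
}
```

To use it, both productions of `stmt' would name it in their `%merge'
clauses, in the same way as they name `stmtMerge' above.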
1693 
1694 File: bison.info, Node: GLR Semantic Actions, Next: Compiler Requirements, Prev: Merging GLR Parses, Up: GLR Parsers
1695
1696 1.5.3 GLR Semantic Actions
1697 --------------------------
1698
1699 By definition, a deferred semantic action is not performed at the same
1700 time as the associated reduction. This raises caveats for several
1701 Bison features you might use in a semantic action in a GLR parser.
1702
1703 In any semantic action, you can examine `yychar' to determine the
1704 type of the lookahead token present at the time of the associated
1705 reduction. After checking that `yychar' is not set to `YYEMPTY' or
1706 `YYEOF', you can then examine `yylval' and `yylloc' to determine the
1707 lookahead token's semantic value and location, if any. In a
1708 nondeferred semantic action, you can also modify any of these variables
1709 to influence syntax analysis. *Note Lookahead Tokens: Lookahead.
1710
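The check described above might look as follows. So that the sketch
compiles on its own, it supplies stand-ins for `yychar', `yylval',
`YYEMPTY' and `YYEOF'; in a real parser these come from the
Bison-generated file, and the values and the `double' value type used
here are assumptions. The helper name `lookahead_value' is
hypothetical.

```c
/* Stand-ins for names that the generated parser normally provides.  */
#define YYEMPTY (-2)
#define YYEOF 0
typedef double YYSTYPE;
static int yychar = YYEMPTY;
static YYSTYPE yylval;

/* If there is a lookahead token that carries a semantic value, store
   that value in *OUT and return 1.  Otherwise return 0 and leave
   *OUT alone.  The lookahead variables are only read, never
   modified.  */
static int
lookahead_value (YYSTYPE *out)
{
  if (yychar == YYEMPTY || yychar == YYEOF)
    return 0;
  *out = yylval;
  return 1;
}
```

Reading the variables in this way is the safe part; changing them is
the part that calls for care in a GLR parser.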
1711 In a deferred semantic action, it's too late to influence syntax
1712 analysis. In this case, `yychar', `yylval', and `yylloc' are set to
1713 shallow copies of the values they had at the time of the associated
1714 reduction. For this reason alone, modifying them is dangerous.
1715 Moreover, the result of modifying them is undefined and subject to
1716 change with future versions of Bison. For example, if a semantic
1717 action might be deferred, you should never write it to invoke
1718 `yyclearin' (*note Action Features::) or to attempt to free memory
1719 referenced by `yylval'.
1720
1721 Another Bison feature requiring special consideration is `YYERROR'
1722 (*note Action Features::), which you can invoke in a semantic action to
1723 initiate error recovery. During deterministic GLR operation, the
1724 effect of `YYERROR' is the same as its effect in an LALR(1) parser. In
1725 a deferred semantic action, its effect is undefined.
1726
1727 Also, see *Note Default Action for Locations: Location Default
1728 Action, which describes a special usage of `YYLLOC_DEFAULT' in GLR
1729 parsers.
1730
1731 
1732 File: bison.info, Node: Compiler Requirements, Prev: GLR Semantic Actions, Up: GLR Parsers
1733
1734 1.5.4 Considerations when Compiling GLR Parsers
1735 -----------------------------------------------
1736
1737 The GLR parsers require a compiler for ISO C89 or later. In addition,
1738 they use the `inline' keyword, which is not C89, but is C99 and is a
1739 common extension in pre-C99 compilers. It is up to the user of these
1740 parsers to handle portability issues. For instance, if using Autoconf
1741 and the Autoconf macro `AC_C_INLINE', a mere
1742
1743      %{
1744        #include <config.h>
1745      %}
1746
1747 will suffice. Otherwise, we suggest
1748
1749      %{
1750        #if __STDC_VERSION__ < 199901 && ! defined __GNUC__ && ! defined inline
1751         #define inline
1752        #endif
1753      %}
1754
1755 
1756 File: bison.info, Node: Locations Overview, Next: Bison Parser, Prev: GLR Parsers, Up: Concepts
1757
1758 1.6 Locations
1759 =============
1760
1761 Many applications, like interpreters or compilers, have to produce
1762 verbose and useful error messages. To achieve this, one must be able
1763 to keep track of the "textual location", or "location", of each
1764 syntactic construct. Bison provides a mechanism for handling these
1765 locations.
1766
1767 Each token has a semantic value. In a similar fashion, each token
1768 has an associated location, but the type of locations is the same for
1769 all tokens and groupings. Moreover, the output parser is equipped with
1770 a default data structure for storing locations (*note Locations::, for
1771 more details).
1772
1773 Like semantic values, locations can be reached in actions using a
1774 dedicated set of constructs. In the sum rule `expr: expr '+' expr'
1775 (*note Semantic Actions::), the location of the whole grouping is `@$',
1776 while the locations of the subexpressions are `@1' and `@3'.
1777
1778 When a rule is matched, a default action is used to compute the
1779 semantic value of its left hand side (*note Actions::). In the same
1780 way, another default action is used for locations. However, the action
1781 for locations is general enough for most cases, meaning there is
1782 usually no need to describe for each rule how `@$' should be formed.
1783 When building a new location for a given grouping, the default behavior
1784 of the output parser is to take the beginning of the first symbol, and
1785 the end of the last symbol.
1786
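The default behavior just described can be sketched in C. The structure
below has the same four fields as the default location type, but the
names `struct location' and `default_location' are ours, and the sketch
covers only rules with at least one symbol on the right-hand side.

```c
/* Stand-in for the default location type, with the same fields.  */
struct location
{
  int first_line;
  int first_column;
  int last_line;
  int last_column;
};

/* RHS[0] .. RHS[N-1] are the locations of the N symbols on the
   right-hand side of a rule, with N >= 1.  Return the location of
   the whole grouping: it begins where the first symbol begins and
   ends where the last symbol ends.  */
static struct location
default_location (struct location const *rhs, int n)
{
  struct location loc;
  loc.first_line = rhs[0].first_line;
  loc.first_column = rhs[0].first_column;
  loc.last_line = rhs[n - 1].last_line;
  loc.last_column = rhs[n - 1].last_column;
  return loc;
}
```

For a sum that spans columns 1 to 8 of a line, the result covers
columns 1 to 8, whatever the operator's own location is.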
1787 
1788 File: bison.info, Node: Bison Parser, Next: Stages, Prev: Locations Overview, Up: Concepts
1789
1790 1.7 Bison Output: the Parser File
1791 =================================
1792
1793 When you run Bison, you give it a Bison grammar file as input. The
1794 output is a C source file that parses the language described by the
1795 grammar. This file is called a "Bison parser". Keep in mind that the
1796 Bison utility and the Bison parser are two distinct programs: the Bison
1797 utility is a program whose output is the Bison parser that becomes part
1798 of your program.
1799
1800 The job of the Bison parser is to group tokens into groupings
1801 according to the grammar rules--for example, to build identifiers and
1802 operators into expressions. As it does this, it runs the actions for
1803 the grammar rules it uses.
1804
1805 The tokens come from a function called the "lexical analyzer" that
1806 you must supply in some fashion (such as by writing it in C). The Bison
1807 parser calls the lexical analyzer each time it wants a new token. It
1808 doesn't know what is "inside" the tokens (though their semantic values
1809 may reflect this). Typically the lexical analyzer makes the tokens by
1810 parsing characters of text, but Bison does not depend on this. *Note
1811 The Lexical Analyzer Function `yylex': Lexical.
1812
1813 The Bison parser file is C code which defines a function named
1814 `yyparse' which implements that grammar. This function does not make a
1815 complete C program: you must supply some additional functions. One is
1816 the lexical analyzer. Another is an error-reporting function which the
1817 parser calls to report an error. In addition, a complete C program must
1818 start with a function called `main'; you have to provide this, and
1819 arrange for it to call `yyparse' or the parser will never run. *Note
1820 Parser C-Language Interface: Interface.
1821
1822 Aside from the token type names and the symbols in the actions you
1823 write, all symbols defined in the Bison parser file itself begin with
1824 `yy' or `YY'. This includes interface functions such as the lexical
1825 analyzer function `yylex', the error reporting function `yyerror' and
1826 the parser function `yyparse' itself. This also includes numerous
1827 identifiers used for internal purposes. Therefore, you should avoid
1828 using C identifiers starting with `yy' or `YY' in the Bison grammar
1829 file except for the ones defined in this manual. Also, you should
1830 avoid using the C identifiers `malloc' and `free' for anything other
1831 than their usual meanings.
1832
1833 In some cases the Bison parser file includes system headers, and in
1834 those cases your code should respect the identifiers reserved by those
1835 headers. On some non-GNU hosts, `<alloca.h>', `<malloc.h>',
1836 `<stddef.h>', and `<stdlib.h>' are included as needed to declare memory
1837 allocators and related types. `<libintl.h>' is included if message
1838 translation is in use (*note Internationalization::). Other system
1839 headers may be included if you define `YYDEBUG' to a nonzero value
1840 (*note Tracing Your Parser: Tracing.).
1841
1842 
1843 File: bison.info, Node: Stages, Next: Grammar Layout, Prev: Bison Parser, Up: Concepts
1844
1845 1.8 Stages in Using Bison
1846 =========================
1847
1848 The actual language-design process using Bison, from grammar
1849 specification to a working compiler or interpreter, has these parts:
1850
1851 1. Formally specify the grammar in a form recognized by Bison (*note
1852 Bison Grammar Files: Grammar File.). For each grammatical rule in
1853 the language, describe the action that is to be taken when an
1854 instance of that rule is recognized. The action is described by a
1855 sequence of C statements.
1856
1857 2. Write a lexical analyzer to process input and pass tokens to the
1858 parser. The lexical analyzer may be written by hand in C (*note
1859 The Lexical Analyzer Function `yylex': Lexical.). It could also
1860 be produced using Lex, but the use of Lex is not discussed in this
1861 manual.
1862
1863 3. Write a controlling function that calls the Bison-produced parser.
1864
1865 4. Write error-reporting routines.
1866
1867 To turn this source code as written into a runnable program, you
1868 must follow these steps:
1869
1870 1. Run Bison on the grammar to produce the parser.
1871
1872 2. Compile the code output by Bison, as well as any other source
1873 files.
1874
1875 3. Link the object files to produce the finished product.
1876
1877 
1878 File: bison.info, Node: Grammar Layout, Prev: Stages, Up: Concepts
1879
1880 1.9 The Overall Layout of a Bison Grammar
1881 =========================================
1882
1883 The input file for the Bison utility is a "Bison grammar file". The
1884 general form of a Bison grammar file is as follows:
1885
1886      %{
1887      PROLOGUE
1888      %}
1889
1890      BISON DECLARATIONS
1891
1892      %%
1893      GRAMMAR RULES
1894      %%
1895      EPILOGUE
1896
1897 The `%%', `%{' and `%}' are punctuation that appears in every Bison
1898 grammar file to separate the sections.
1899
1900 The prologue may define types and variables used in the actions.
1901 You can also use preprocessor commands to define macros used there, and
1902 use `#include' to include header files that do any of these things.
1903 You need to declare the lexical analyzer `yylex' and the error printer
1904 `yyerror' here, along with any other global identifiers used by the
1905 actions in the grammar rules.
1906
1907 The Bison declarations declare the names of the terminal and
1908 nonterminal symbols, and may also describe operator precedence and the
1909 data types of semantic values of various symbols.
1910
1911 The grammar rules define how to construct each nonterminal symbol
1912 from its parts.
1913
1914 The epilogue can contain any code you want to use. Often the
1915 definitions of functions declared in the prologue go here. In a simple
1916 program, all the rest of the program can go here.
1917
1918 
1919 File: bison.info, Node: Examples, Next: Grammar File, Prev: Concepts, Up: Top
1920
1921 2 Examples
1922 **********
1923
1924 Now we show and explain three sample programs written using Bison: a
1925 reverse polish notation calculator, an algebraic (infix) notation
1926 calculator, and a multi-function calculator. All three have been tested
1927 under BSD Unix 4.3; each produces a usable, though limited, interactive
1928 desk-top calculator.
1929
1930 These examples are simple, but Bison grammars for real programming
1931 languages are written the same way. You can copy these examples into a
1932 source file to try them.
1933
1934 * Menu:
1935
1936 * RPN Calc:: Reverse polish notation calculator;
1937 a first example with no operator precedence.
1938 * Infix Calc:: Infix (algebraic) notation calculator.
1939 Operator precedence is introduced.
1940 * Simple Error Recovery:: Continuing after syntax errors.
1941 * Location Tracking Calc:: Demonstrating the use of @N and @$.
1942 * Multi-function Calc:: Calculator with memory and trig functions.
1943 It uses multiple data-types for semantic values.
1944 * Exercises:: Ideas for improving the multi-function calculator.
1945
1946 
1947 File: bison.info, Node: RPN Calc, Next: Infix Calc, Up: Examples
1948
1949 2.1 Reverse Polish Notation Calculator
1950 ======================================
1951
1952 The first example is that of a simple double-precision "reverse polish
1953 notation" calculator (a calculator using postfix operators). This
1954 example provides a good starting point, since operator precedence is
1955 not an issue. The second example will illustrate how operator
1956 precedence is handled.
1957
1958 The source code for this calculator is named `rpcalc.y'. The `.y'
1959 extension is a convention used for Bison input files.
1960
1961 * Menu:
1962
1963 * Rpcalc Declarations:: Prologue (declarations) for rpcalc.
1964 * Rpcalc Rules:: Grammar Rules for rpcalc, with explanation.
1965 * Rpcalc Lexer:: The lexical analyzer.
1966 * Rpcalc Main:: The controlling function.
1967 * Rpcalc Error:: The error reporting function.
1968 * Rpcalc Generate:: Running Bison on the grammar file.
1969 * Rpcalc Compile:: Run the C compiler on the output code.
1970
1971 
1972 File: bison.info, Node: Rpcalc Declarations, Next: Rpcalc Rules, Up: RPN Calc
1973
1974 2.1.1 Declarations for `rpcalc'
1975 -------------------------------
1976
1977 Here are the C and Bison declarations for the reverse polish notation
1978 calculator. As in C, comments are placed between `/*...*/'.
1979
1980      /* Reverse polish notation calculator.  */
1981
1982      %{
1983        #define YYSTYPE double
1984        #include <math.h>
1985        int yylex (void);
1986        void yyerror (char const *);
1987      %}
1988
1989      %token NUM
1990
1991      %% /* Grammar rules and actions follow.  */
1992
1993 The declarations section (*note The prologue: Prologue.) contains two
1994 preprocessor directives and two forward declarations.
1995
1996 The `#define' directive defines the macro `YYSTYPE', thus specifying
1997 the C data type for semantic values of both tokens and groupings (*note
1998 Data Types of Semantic Values: Value Type.). The Bison parser will use
1999 whatever type `YYSTYPE' is defined as; if you don't define it, `int' is
2000 the default. Because we specify `double', each token and each
2001 expression has an associated value, which is a floating point number.
2002
2003 The `#include' directive is used to declare the exponentiation
2004 function `pow'.
2005
2006 The forward declarations for `yylex' and `yyerror' are needed
2007 because the C language requires that functions be declared before they
2008 are used. These functions will be defined in the epilogue, but the
2009 parser calls them so they must be declared in the prologue.
2010
2011 The second section, Bison declarations, provides information to Bison
2012 about the token types (*note The Bison Declarations Section: Bison
2013 Declarations.). Each terminal symbol that is not a single-character
2014 literal must be declared here. (Single-character literals normally
2015 don't need to be declared.) In this example, all the arithmetic
2016 operators are designated by single-character literals, so the only
2017 terminal symbol that needs to be declared is `NUM', the token type for
2018 numeric constants.
2019
2020 
2021 File: bison.info, Node: Rpcalc Rules, Next: Rpcalc Lexer, Prev: Rpcalc Declarations, Up: RPN Calc
2022
2023 2.1.2 Grammar Rules for `rpcalc'
2024 --------------------------------
2025
2026 Here are the grammar rules for the reverse polish notation calculator.
2027
2028 input: /* empty */
2029 | input line
2030 ;
2031
2032 line: '\n'
2033 | exp '\n' { printf ("\t%.10g\n", $1); }
2034 ;
2035
2036 exp: NUM { $$ = $1; }
2037 | exp exp '+' { $$ = $1 + $2; }
2038 | exp exp '-' { $$ = $1 - $2; }
2039 | exp exp '*' { $$ = $1 * $2; }
2040 | exp exp '/' { $$ = $1 / $2; }
2041 /* Exponentiation */
2042 | exp exp '^' { $$ = pow ($1, $2); }
2043 /* Unary minus */
2044 | exp 'n' { $$ = -$1; }
2045 ;
2046 %%
2047
2048 The groupings of the rpcalc "language" defined here are the
2049 expression (given the name `exp'), the line of input (`line'), and the
2050 complete input transcript (`input'). Each of these nonterminal symbols
2051 has several alternate rules, joined by the vertical bar `|' which is
2052 read as "or". The following sections explain what these rules mean.
2053
2054 The semantics of the language is determined by the actions taken
2055 when a grouping is recognized. The actions are the C code that appears
2056 inside braces. *Note Actions::.
2057
2058 You must specify these actions in C, but Bison provides the means for
2059 passing semantic values between the rules. In each action, the
2060 pseudo-variable `$$' stands for the semantic value for the grouping
2061 that the rule is going to construct. Assigning a value to `$$' is the
2062 main job of most actions. The semantic values of the components of the
2063 rule are referred to as `$1', `$2', and so on.
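
To see what these actions compute, it can help to compare them with a
hand-written evaluator. The following is only a sketch in plain C, not
code that Bison generates: the name `rpn_eval' and the fixed stack depth
are assumptions made for illustration, and exponentiation is left out so
that the sketch does not need the math library. Pushing a value plays
the role of assigning `$$', and the two popped operands correspond to
`$1' and `$2'.

```c
#include <ctype.h>
#include <stdlib.h>

#define RPN_STACK_MAX 64   /* arbitrary limit chosen for this sketch */

/* Evaluate one line of reverse polish notation.
   Return 0 and store the result in *OUT on success, -1 on error.  */
int
rpn_eval (char const *s, double *out)
{
  double stack[RPN_STACK_MAX];
  int top = 0;

  while (*s != '\0' && *s != '\n')
    {
      if (*s == ' ' || *s == '\t')
        {
          s++;
          continue;
        }
      if (*s == '.' || isdigit ((unsigned char) *s))
        {
          /* exp: NUM  { $$ = $1; }  */
          char *end;
          double v = strtod (s, &end);
          if (end == s || top == RPN_STACK_MAX)
            return -1;
          stack[top++] = v;
          s = end;
          continue;
        }
      if (*s == 'n')
        {
          /* exp: exp 'n'  { $$ = -$1; }  */
          if (top < 1)
            return -1;
          stack[top - 1] = -stack[top - 1];
          s++;
          continue;
        }
      /* exp: exp exp OP  { $$ = $1 OP $2; }  */
      if (top < 2)
        return -1;
      double rhs = stack[--top];   /* $2 */
      double lhs = stack[--top];   /* $1 */
      switch (*s++)
        {
        case '+': stack[top++] = lhs + rhs; break;
        case '-': stack[top++] = lhs - rhs; break;
        case '*': stack[top++] = lhs * rhs; break;
        case '/': stack[top++] = lhs / rhs; break;
        default: return -1;
        }
    }
  /* A complete line leaves exactly one value: the one `line' prints.  */
  if (top != 1)
    return -1;
  *out = stack[0];
  return 0;
}
```

Unlike the grammar, this sketch treats a blank line as an error, because
it has no counterpart to the first alternative of `line'.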
2064
2065 * Menu:
2066
2067 * Rpcalc Input::
2068 * Rpcalc Line::
2069 * Rpcalc Expr::
2070
2071 
2072 File: bison.info, Node: Rpcalc Input, Next: Rpcalc Line, Up: Rpcalc Rules
2073
2074 2.1.2.1 Explanation of `input'
2075 ..............................
2076
2077 Consider the definition of `input':
2078
2079 input: /* empty */
2080 | input line
2081 ;
2082
2083 This definition reads as follows: "A complete input is either an
2084 empty string, or a complete input followed by an input line". Notice
2085 that "complete input" is defined in terms of itself. This definition
2086 is said to be "left recursive" since `input' always appears as the
2087 leftmost symbol in the sequence. *Note Recursive Rules: Recursion.
2088
2089 The first alternative is empty because there are no symbols between
2090 the colon and the first `|'; this means that `input' can match an empty
2091 string of input (no tokens). We write the rules this way because it is
2092 legitimate to type `Ctrl-d' right after you start the calculator. It's
2093 conventional to put an empty alternative first and write the comment
2094 `/* empty */' in it.
2095
2096 The second alternate rule (`input line') handles all nontrivial
2097 input. It means, "After reading any number of lines, read one more
2098 line if possible." The left recursion makes this rule into a loop.
2099 Since the first alternative matches empty input, the loop can be
2100 executed zero or more times.
2101
2102 The parser function `yyparse' continues to process input until a
2103 grammatical error is seen or the lexical analyzer says there are no more
2104 input tokens; we will arrange for the latter to happen at end-of-input.
2105
2106 
2107 File: bison.info, Node: Rpcalc Line, Next: Rpcalc Expr, Prev: Rpcalc Input, Up: Rpcalc Rules
2108
2109 2.1.2.2 Explanation of `line'
2110 .............................
2111
2112 Now consider the definition of `line':
2113
2114 line: '\n'
2115 | exp '\n' { printf ("\t%.10g\n", $1); }
2116 ;
2117
2118 The first alternative is a token which is a newline character; this
2119 means that rpcalc accepts a blank line (and ignores it, since there is
2120 no action). The second alternative is an expression followed by a
2121 newline. This is the alternative that makes rpcalc useful. The
2122 semantic value of the `exp' grouping is available as `$1' because the
2123 `exp' in question is the first symbol in the alternative. The action
2124 prints this value, which is the result of the computation the user
2125 asked for.
2126
2127 This action is unusual because it does not assign a value to `$$'.
2128 As a consequence, the semantic value associated with the `line' is
2129 uninitialized (its value will be unpredictable). This would be a bug if
2130 that value were ever used, but we don't use it: once rpcalc has printed
2131 the value of the user's input line, that value is no longer needed.
2132
2133 
2134 File: bison.info, Node: Rpcalc Expr, Prev: Rpcalc Line, Up: Rpcalc Rules
2135
2136 2.1.2.3 Explanation of `exp'
2137 ............................
2138
2139 The `exp' grouping has several rules, one for each kind of expression.
2140 The first rule handles the simplest expressions: those that are just
2141 numbers. The second handles an addition-expression, which looks like
2142 two expressions followed by a plus-sign. The third handles
2143 subtraction, and so on.
2144
2145 exp: NUM
2146 | exp exp '+' { $$ = $1 + $2; }
2147 | exp exp '-' { $$ = $1 - $2; }
2148 ...
2149 ;
2150
2151 We have used `|' to join all the rules for `exp', but we could
2152 equally well have written them separately:
2153
2154 exp: NUM ;
2155 exp: exp exp '+' { $$ = $1 + $2; } ;
2156 exp: exp exp '-' { $$ = $1 - $2; } ;
2157 ...
2158
2159 Most of the rules have actions that compute the value of the
2160 expression in terms of the value of its parts. For example, in the
2161 rule for addition, `$1' refers to the first component `exp' and `$2'
2162 refers to the second one. The third component, `'+'', has no meaningful
2163 associated semantic value, but if it had one you could refer to it as
2164 `$3'. When `yyparse' recognizes a sum expression using this rule, the
2165 sum of the two subexpressions' values is produced as the value of the
2166 entire expression. *Note Actions::.
2167
2168 You don't have to give an action for every rule. When a rule has no
2169 action, Bison by default copies the value of `$1' into `$$'. This is
2170 what happens in the first rule (the one that uses `NUM').
2171
2172 The formatting shown here is the recommended convention, but Bison
2173 does not require it. You can add or change white space as much as you
2174 wish. For example, this:
2175
2176 exp : NUM | exp exp '+' {$$ = $1 + $2; } | ... ;
2177
2178 means the same thing as this:
2179
2180 exp: NUM
2181 | exp exp '+' { $$ = $1 + $2; }
2182 | ...
2183 ;
2184
2185 The latter, however, is much more readable.
2186
2187 
2188 File: bison.info, Node: Rpcalc Lexer, Next: Rpcalc Main, Prev: Rpcalc Rules, Up: RPN Calc
2189
2190 2.1.3 The `rpcalc' Lexical Analyzer
2191 -----------------------------------
2192
2193 The lexical analyzer's job is low-level parsing: converting characters
2194 or sequences of characters into tokens. The Bison parser gets its
2195 tokens by calling the lexical analyzer. *Note The Lexical Analyzer
2196 Function `yylex': Lexical.
2197
2198 Only a simple lexical analyzer is needed for the RPN calculator.
2199 This lexical analyzer skips blanks and tabs, then reads in numbers as
2200 `double' and returns them as `NUM' tokens. Any other character that
2201 isn't part of a number is a separate token. Note that the token-code
2202 for such a single-character token is the character itself.
2203
2204 The return value of the lexical analyzer function is a numeric code
2205 which represents a token type. The same text used in Bison rules to
2206 stand for this token type is also a C expression for the numeric code
2207 for the type. This works in two ways. If the token type is a
2208 character literal, then its numeric code is that of the character; you
2209 can use the same character literal in the lexical analyzer to express
2210 the number. If the token type is an identifier, that identifier is
2211 defined by Bison as a C macro whose definition is the appropriate
2212 number. In this example, therefore, `NUM' becomes a macro for `yylex'
2213 to use.
2214
2215 The semantic value of the token (if it has one) is stored into the
2216 global variable `yylval', which is where the Bison parser will look for
2217 it. (The C data type of `yylval' is `YYSTYPE', which was defined at
2218 the beginning of the grammar; *note Declarations for `rpcalc': Rpcalc
2219 Declarations.)
2220
2221 A token type code of zero is returned if the end-of-input is
2222 encountered. (Bison recognizes any nonpositive value as indicating
2223 end-of-input.)
2224
2225 Here is the code for the lexical analyzer:
2226
2227 /* The lexical analyzer stores a double floating point number
2228 in yylval and returns the token NUM, or returns the numeric code
2229 of the character read if not a number. It skips all blanks
2230 and tabs, and returns 0 for end-of-input. */
2231
2232 #include <ctype.h>
2233
2234 int
2235 yylex (void)
2236 {
2237 int c;
2238
2239 /* Skip white space. */
2240 while ((c = getchar ()) == ' ' || c == '\t')
2241 ;
2242 /* Process numbers. */
2243 if (c == '.' || isdigit (c))
2244 {
2245 ungetc (c, stdin);
2246 scanf ("%lf", &yylval);
2247 return NUM;
2248 }
2249 /* Return end-of-input. */
2250 if (c == EOF)
2251 return 0;
2252 /* Return a single char. */
2253 return c;
2254 }
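
The lexer above reads from standard input and does not check the result
of `scanf', which makes it awkward to exercise on its own. As a rough
sketch of a testable variant, the function below scans a string through
a cursor instead. Everything named here is an assumption made for
illustration: `lex_next' stands in for `yylex', `tok_value' for
`yylval', and `TOK_NUM' is a placeholder code for `NUM', which Bison
defines itself in a real parser.

```c
#include <ctype.h>
#include <stdlib.h>

enum { TOK_NUM = 258 };   /* placeholder; Bison defines the real NUM */

double tok_value;         /* stands in for yylval */

/* Return the next token found at *CURSOR and advance *CURSOR past it.
   Return 0 at end of input, as yylex does.  */
int
lex_next (char const **cursor)
{
  char const *p = *cursor;

  /* Skip white space.  */
  while (*p == ' ' || *p == '\t')
    p++;

  /* Process numbers.  */
  if (*p == '.' || isdigit ((unsigned char) *p))
    {
      char *end;
      tok_value = strtod (p, &end);
      if (end != p)
        {
          *cursor = end;
          return TOK_NUM;
        }
      /* A lone '.' is not a number: fall through and return it
         as a single-character token.  */
    }

  /* Return end-of-input without moving past the terminator.  */
  if (*p == '\0')
    {
      *cursor = p;
      return 0;
    }

  /* Return a single char.  */
  *cursor = p + 1;
  return (unsigned char) *p;
}
```

Calling `lex_next' repeatedly on "4 9.5 +\n" yields `TOK_NUM' twice,
then `'+'', then `'\n'', then 0.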
2255
2256 
2257 File: bison.info, Node: Rpcalc Main, Next: Rpcalc Error, Prev: Rpcalc Lexer, Up: RPN Calc
2258
2259 2.1.4 The Controlling Function
2260 ------------------------------
2261
2262 In keeping with the spirit of this example, the controlling function is
2263 kept to the bare minimum. The only requirement is that it call
2264 `yyparse' to start the process of parsing.
2265
2266 int
2267 main (void)
2268 {
2269 return yyparse ();
2270 }
2271
2272 
2273 File: bison.info, Node: Rpcalc Error, Next: Rpcalc Generate, Prev: Rpcalc Main, Up: RPN Calc
2274
2275 2.1.5 The Error Reporting Routine
2276 ---------------------------------
2277
2278 When `yyparse' detects a syntax error, it calls the error reporting
2279 function `yyerror' to print an error message (usually but not always
2280 `"syntax error"'). It is up to the programmer to supply `yyerror'
2281 (*note Parser C-Language Interface: Interface.), so here is the
2282 definition we will use:
2283
2284 #include <stdio.h>
2285
2286 /* Called by yyparse on error. */
2287 void
2288 yyerror (char const *s)
2289 {
2290 fprintf (stderr, "%s\n", s);
2291 }
2292
2293 After `yyerror' returns, the Bison parser may recover from the error
2294 and continue parsing if the grammar contains a suitable error rule
2295 (*note Error Recovery::). Otherwise, `yyparse' returns nonzero. We
2296 have not written any error rules in this example, so any invalid input
2297 will cause the calculator program to exit. This is not clean behavior
2298 for a real calculator, but it is adequate for the first example.
2299
2300 
2301 File: bison.info, Node: Rpcalc Generate, Next: Rpcalc Compile, Prev: Rpcalc Error, Up: RPN Calc
2302
2303 2.1.6 Running Bison to Make the Parser
2304 --------------------------------------
2305
2306 Before running Bison to produce a parser, we need to decide how to
2307 arrange all the source code in one or more source files. For such a
2308 simple example, the easiest thing is to put everything in one file. The
2309 definitions of `yylex', `yyerror' and `main' go at the end, in the
2310 epilogue of the file (*note The Overall Layout of a Bison Grammar:
2311 Grammar Layout.).
2312
2313 For a large project, you would probably have several source files,
2314 and use `make' to arrange to recompile them.
2315
2316 With all the source in a single file, you use the following command
2317 to convert it into a parser file:
2318
2319 bison FILE.y
2320
2321 In this example the file was called `rpcalc.y' (for "Reverse Polish
2322 CALCulator"). Bison produces a file named `FILE.tab.c', removing the
2323 `.y' from the original file name. The file output by Bison contains
2324 the source code for `yyparse'. The additional functions in the input
2325 file (`yylex', `yyerror' and `main') are copied verbatim to the output.
2326
2327 
2328 File: bison.info, Node: Rpcalc Compile, Prev: Rpcalc Generate, Up: RPN Calc
2329
2330 2.1.7 Compiling the Parser File
2331 -------------------------------
2332
2333 Here is how to compile and run the parser file:
2334
2335 # List files in current directory.
2336 $ ls
2337 rpcalc.tab.c rpcalc.y
2338
2339 # Compile the Bison parser.
2340 # `-lm' tells compiler to search math library for `pow'.
2341 $ cc -o rpcalc rpcalc.tab.c -lm
2342
2343 # List files again.
2344 $ ls
2345 rpcalc rpcalc.tab.c rpcalc.y
2346
2347 The file `rpcalc' now contains the executable code. Here is an
2348 example session using `rpcalc'.
2349
2350 $ rpcalc
2351 4 9 +
2352 13
2353 3 7 + 3 4 5 *+-
2354 -13
2355 3 7 + 3 4 5 * + - n Note the unary minus, `n'
2356 13
2357 5 6 / 4 n +
2358 -3.166666667
2359 3 4 ^ Exponentiation
2360 81
2361 ^D End-of-file indicator
2362 $
2363
2364 
2365 File: bison.info, Node: Infix Calc, Next: Simple Error Recovery, Prev: RPN Calc, Up: Examples
2366
2367 2.2 Infix Notation Calculator: `calc'
2368 =====================================
2369
2370 We now modify rpcalc to handle infix operators instead of postfix.
2371 Infix notation involves the concept of operator precedence and the need
2372 for parentheses nested to arbitrary depth. Here is the Bison code for
2373 `calc.y', an infix desk-top calculator.
2374
2375 /* Infix notation calculator. */
2376
2377 %{
2378 #define YYSTYPE double
2379 #include <math.h>
2380 #include <stdio.h>
2381 int yylex (void);
2382 void yyerror (char const *);
2383 %}
2384
2385 /* Bison declarations. */
2386 %token NUM
2387 %left '-' '+'
2388 %left '*' '/'
2389 %left NEG /* negation--unary minus */
2390 %right '^' /* exponentiation */
2391
2392 %% /* The grammar follows. */
2393 input: /* empty */
2394 | input line
2395 ;
2396
2397 line: '\n'
2398 | exp '\n' { printf ("\t%.10g\n", $1); }
2399 ;
2400
2401 exp: NUM { $$ = $1; }
2402 | exp '+' exp { $$ = $1 + $3; }
2403 | exp '-' exp { $$ = $1 - $3; }
2404 | exp '*' exp { $$ = $1 * $3; }
2405 | exp '/' exp { $$ = $1 / $3; }
2406 | '-' exp %prec NEG { $$ = -$2; }
2407 | exp '^' exp { $$ = pow ($1, $3); }
2408 | '(' exp ')' { $$ = $2; }
2409 ;
2410 %%
2411
2412 The functions `yylex', `yyerror' and `main' can be the same as before.
2413
2414 There are two important new features shown in this code.
2415
2416 In the second section (Bison declarations), `%left' declares token
2417 types and says they are left-associative operators. The declarations
2418 `%left' and `%right' (right associativity) take the place of `%token'
2419 which is used to declare a token type name without associativity.
2420 (These tokens are single-character literals, which ordinarily don't
2421 need to be declared. We declare them here to specify the
2422 associativity.)
2423
2424 Operator precedence is determined by the line ordering of the
2425 declarations; the higher the line number of the declaration (lower on
2426 the page or screen), the higher the precedence. Hence, exponentiation
2427 has the highest precedence, unary minus (`NEG') is next, followed by
2428 `*' and `/', and so on. *Note Operator Precedence: Precedence.
2429
2430 The other important new feature is the `%prec' in the grammar
2431 section for the unary minus operator. The `%prec' simply instructs
2432 Bison that the rule `| '-' exp' has the same precedence as `NEG'--in
2433 this case the next-to-highest. *Note Context-Dependent Precedence:
2434 Contextual Precedence.
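
One way to build intuition for these declarations is to write the same
precedence table by hand. The sketch below uses precedence climbing, a
different technique from the table-driven parser Bison generates, so it
is an analogy and not a description of Bison's output. All names
(`calc_eval', `parse_expr', the `PREC_*' levels) are assumptions, there
is no error reporting, and `int_pow' is a stand-in for `pow' that only
handles integer exponents so the sketch needs no math library.

```c
#include <stdlib.h>

/* Lowest to highest, in the order of the %left/%right lines.  */
enum { PREC_ADD = 1, PREC_MUL, PREC_NEG, PREC_POW };

static double parse_expr (char const **p, int min_prec);

static void
skip_blanks (char const **p)
{
  while (**p == ' ' || **p == '\t')
    ++*p;
}

/* Precedence of binary operator C, or 0 if C is not one.  */
static int
binary_prec (int c)
{
  switch (c)
    {
    case '+': case '-': return PREC_ADD;
    case '*': case '/': return PREC_MUL;
    case '^': return PREC_POW;
    default: return 0;
    }
}

/* Stand-in for pow: the exponent is truncated to an integer.  */
static double
int_pow (double base, double exponent)
{
  long n = (long) exponent;
  long count = n < 0 ? -n : n;
  double result = 1.0;
  while (count-- > 0)
    result *= base;
  return n < 0 ? 1.0 / result : result;
}

static double
parse_operand (char const **p)
{
  char *end;
  double v;
  skip_blanks (p);
  if (**p == '-')
    {
      /* Like `'-' exp %prec NEG': only `^' binds tighter.  */
      ++*p;
      return -parse_expr (p, PREC_NEG);
    }
  if (**p == '(')
    {
      ++*p;
      v = parse_expr (p, PREC_ADD);
      skip_blanks (p);
      if (**p == ')')
        ++*p;
      return v;
    }
  v = strtod (*p, &end);
  *p = end;
  return v;
}

static double
parse_expr (char const **p, int min_prec)
{
  double lhs = parse_operand (p);
  for (;;)
    {
      int op, prec;
      double rhs;
      skip_blanks (p);
      op = **p;
      prec = binary_prec (op);
      if (prec == 0 || prec < min_prec)
        return lhs;
      ++*p;
      /* %right: the operand may contain the same operator again.
         %left: it may only contain tighter-binding operators.  */
      rhs = parse_expr (p, op == '^' ? prec : prec + 1);
      switch (op)
        {
        case '+': lhs += rhs; break;
        case '-': lhs -= rhs; break;
        case '*': lhs *= rhs; break;
        case '/': lhs /= rhs; break;
        default: lhs = int_pow (lhs, rhs); break;   /* '^' */
        }
    }
}

double
calc_eval (char const *s)
{
  return parse_expr (&s, PREC_ADD);
}
```

With this table, `2 ^ 3 ^ 2' groups as `2 ^ (3 ^ 2)', `1 - 2 - 3' groups
as `(1 - 2) - 3', and `-2 ^ 2' negates the result of `2 ^ 2'.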
2435
2436 Here is a sample run of `calc':
2437
2438 $ calc
2439 4 + 4.5 - (34/(8*3+-3))
2440 6.880952381
2441 -56 + 2
2442 -54
2443 3 ^ 2
2444 9
2445
2446 
2447 File: bison.info, Node: Simple Error Recovery, Next: Location Tracking Calc, Prev: Infix Calc, Up: Examples
2448
2449 2.3 Simple Error Recovery
2450 =========================
2451
2452 Up to this point, this manual has not addressed the issue of "error
2453 recovery"--how to continue parsing after the parser detects a syntax
2454 error. All we have handled is error reporting with `yyerror'. Recall
2455 that by default `yyparse' returns after calling `yyerror'. This means
2456 that an erroneous input line causes the calculator program to exit.
2457 Now we show how to rectify this deficiency.
2458
2459 The Bison language itself includes the reserved word `error', which
2460 may be included in the grammar rules. In the example below it has been
2461 added to one of the alternatives for `line':
2462
2463 line: '\n'
2464 | exp '\n' { printf ("\t%.10g\n", $1); }
2465 | error '\n' { yyerrok; }
2466 ;
2467
2468 This addition to the grammar allows for simple error recovery in the
2469 event of a syntax error. If an expression that cannot be evaluated is
2470 read, the error will be recognized by the third rule for `line', and
2471 parsing will continue. (The `yyerror' function is still called upon to
2472 print its message as well.) The action executes the statement
2473 `yyerrok', a macro defined automatically by Bison; its meaning is that
2474 error recovery is complete (*note Error Recovery::). Note the
2475 difference between `yyerrok' and `yyerror'; neither one is a misprint.
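
The strategy behind `error '\n'' is sometimes called panic-mode
recovery: report the error, discard input up to a synchronizing token
(here the newline), then carry on. The sketch below shows only that
idea, for a hypothetical mini-language in which a valid line contains
nothing but digits and blanks. It does not model how a Bison parser
unwinds its stack, and the name `count_valid_lines' is an assumption.

```c
#include <ctype.h>
#include <stdio.h>

/* Return the number of valid lines in TEXT and store the number of
   rejected lines in *ERRORS.  */
int
count_valid_lines (char const *text, int *errors)
{
  int good = 0;
  *errors = 0;

  while (*text != '\0')
    {
      int ok = 1;
      while (*text != '\0' && *text != '\n')
        {
          if (!isdigit ((unsigned char) *text) && *text != ' ')
            {
              /* Report once, as yyerror would, then discard the
                 rest of the line.  */
              fprintf (stderr, "syntax error\n");
              ok = 0;
              while (*text != '\0' && *text != '\n')
                text++;
              break;
            }
          text++;
        }
      /* The newline ends the line and, after an error, completes
         the recovery.  */
      if (*text == '\n')
        text++;
      if (ok)
        good++;
      else
        ++*errors;
    }
  return good;
}
```

For the input "1 2\nx y\n3\n" the sketch reports one error and still
accepts the first and third lines.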
2476
2477 This form of error recovery deals with syntax errors. There are
2478 other kinds of errors; for example, division by zero, which raises an
2479 exception signal that is normally fatal. A real calculator program
2480 must handle this signal and use `longjmp' to return to `main' and
2481 resume parsing input lines; it would also have to discard the rest of
2482 the current line of input. We won't discuss this issue further because
2483 it is not specific to Bison programs.
2484
2485 
2486 File: bison.info, Node: Location Tracking Calc, Next: Multi-function Calc, Prev: Simple Error Recovery, Up: Examples
2487
2488 2.4 Location Tracking Calculator: `ltcalc'
2489 ==========================================
2490
2491 This example extends the infix notation calculator with location
2492 tracking. This feature will be used to improve the error messages. For
2493 the sake of clarity, this example is a simple integer calculator, since
2494 most of the work needed to use locations will be done in the lexical
2495 analyzer.
2496
2497 * Menu:
2498
2499 * Ltcalc Declarations:: Bison and C declarations for ltcalc.
2500 * Ltcalc Rules:: Grammar rules for ltcalc, with explanations.
2501 * Ltcalc Lexer:: The lexical analyzer.
2502
2503 
2504 File: bison.info, Node: Ltcalc Declarations, Next: Ltcalc Rules, Up: Location Tracking Calc
2505
2506 2.4.1 Declarations for `ltcalc'
2507 -------------------------------
2508
2509 The C and Bison declarations for the location tracking calculator match
2510 those of the infix notation calculator, except that `YYSTYPE' is `int'.
2511
2512 /* Location tracking calculator. */
2513
2514 %{
2515 #define YYSTYPE int
2516 #include <math.h>
2517 int yylex (void);
2518 void yyerror (char const *);
2519 %}
2520
2521 /* Bison declarations. */
2522 %token NUM
2523
2524 %left '-' '+'
2525 %left '*' '/'
2526 %left NEG
2527 %right '^'
2528
2529 %% /* The grammar follows. */
2530
2531 Note there are no declarations specific to locations. Defining a data
2532 type for storing locations is not needed: we will use the type provided
2533 by default (*note Data Types of Locations: Location Type.), which is a
2534 four member structure with the following integer fields: `first_line',
2535 `first_column', `last_line' and `last_column'. By convention, and in
2536 accordance with the GNU Coding Standards and common practice, the line
2537 and column count both start at 1.
2538
2539 
2540 File: bison.info, Node: Ltcalc Rules, Next: Ltcalc Lexer, Prev: Ltcalc Declarations, Up: Location Tracking Calc
2541
2542 2.4.2 Grammar Rules for `ltcalc'
2543 --------------------------------
2544
2545 Whether or not you handle locations has no effect on the syntax of your
2546 language. Therefore, grammar rules for this example will be very close
2547 to those of the previous example: we will only modify them to benefit
2548 from the new information.
2549
2550 Here, we will use locations to report divisions by zero, and locate
2551 the wrong expressions or subexpressions.
2552
2553 input : /* empty */
2554 | input line
2555 ;
2556
2557 line : '\n'
2558 | exp '\n' { printf ("%d\n", $1); }
2559 ;
2560
2561 exp : NUM { $$ = $1; }
2562 | exp '+' exp { $$ = $1 + $3; }
2563 | exp '-' exp { $$ = $1 - $3; }
2564 | exp '*' exp { $$ = $1 * $3; }
2565 | exp '/' exp
2566 {
2567 if ($3)
2568 $$ = $1 / $3;
2569 else
2570 {
2571 $$ = 1;
2572 fprintf (stderr, "%d.%d-%d.%d: division by zero\n",
2573 @3.first_line, @3.first_column,
2574 @3.last_line, @3.last_column);
2575 }
2576 }
2577 | '-' exp %prec NEG { $$ = -$2; }
2578 | exp '^' exp { $$ = pow ($1, $3); }
2579 | '(' exp ')' { $$ = $2; }
2580
2581 This code shows how to reach locations inside of semantic actions, by
2582 using the pseudo-variables `@N' for rule components, and the
2583 pseudo-variable `@$' for groupings.
2584
2585 We don't need to assign a value to `@$': the output parser does it
2586 automatically. By default, before executing the C code of each action,
2587 `@$' is set to range from the beginning of `@1' to the end of `@N', for
2588 a rule with N components. This behavior can be redefined (*note
2589 Default Action for Locations: Location Default Action.), and for very
2590 specific rules, `@$' can be computed by hand.
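
The default computation of `@$' described above can be pictured in
plain C. This is a sketch under assumptions, not the code Bison emits:
`struct loc' merely has the same four integer fields as the default
location type, and `loc_span' and `loc_format' are hypothetical helper
names.

```c
#include <stdio.h>

/* The same four integer fields as the default location type.
   The name `struct loc' is an assumption of this sketch.  */
struct loc
{
  int first_line;
  int first_column;
  int last_line;
  int last_column;
};

/* Span a rule: from the start of its first component (@1) to the
   end of its last component (@N).  */
struct loc
loc_span (struct loc first, struct loc last)
{
  struct loc span;
  span.first_line = first.first_line;
  span.first_column = first.first_column;
  span.last_line = last.last_line;
  span.last_column = last.last_column;
  return span;
}

/* Format a location in the style of the division-by-zero message.
   Return whatever snprintf returns.  */
int
loc_format (char *buf, size_t size, struct loc l)
{
  return snprintf (buf, size, "%d.%d-%d.%d",
                   l.first_line, l.first_column,
                   l.last_line, l.last_column);
}
```

For `exp '/' exp', calling `loc_span' with the locations of the two
operands gives a range covering the whole division.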
2591
2592 
2593 File: bison.info, Node: Ltcalc Lexer, Prev: Ltcalc Rules, Up: Location Tracking Calc
2594
2595 2.4.3 The `ltcalc' Lexical Analyzer.
2596 ------------------------------------
2597
2598 Until now, we relied on Bison's defaults to enable location tracking.
2599 The next step is to rewrite the lexical analyzer, and make it able to
2600 feed the parser with the token locations, as it already does for
2601 semantic values.
2602
2603 To this end, we must take into account every single character of the
2604 input text, so that the computed locations are not fuzzy or wrong:
2605
2606 int
2607 yylex (void)
2608 {
2609 int c;
2610
2611 /* Skip white space. */
2612 while ((c = getchar ()) == ' ' || c == '\t')
2613 ++yylloc.last_column;
2614
2615 /* Step. */
2616 yylloc.first_line = yylloc.last_line;
2617 yylloc.first_column = yylloc.last_column;
2618
2619 /* Process numbers. */
2620 if (isdigit (c))
2621 {
2622 yylval = c - '0';
2623 ++yylloc.last_column;
2624 while (isdigit (c = getchar ()))
2625 {
2626 ++yylloc.last_column;
2627 yylval = yylval * 10 + c - '0';
2628 }
2629 ungetc (c, stdin);
2630 return NUM;
2631 }
2632
2633 /* Return end-of-input. */
2634 if (c == EOF)
2635 return 0;
2636
2637 /* Return a single char, and update location. */
2638 if (c == '\n')
2639 {
2640 ++yylloc.last_line;
2641 yylloc.last_column = 0;
2642 }
2643 else
2644 ++yylloc.last_column;
2645 return c;
2646 }
2647
2648 Basically, the lexical analyzer performs the same processing as
2649 before: it skips blanks and tabs, and reads numbers or single-character
2650 tokens. In addition, it updates `yylloc', the global variable (of type
2651 `YYLTYPE') containing the token's location.
2652
2653 Now, each time this function returns a token, the parser has its
2654 number as well as its semantic value, and its location in the text.
2655 The last needed change is to initialize `yylloc', for example in the
2656 controlling function:
2657
2658 int
2659 main (void)
2660 {
2661 yylloc.first_line = yylloc.last_line = 1;
2662 yylloc.first_column = yylloc.last_column = 0;
2663 return yyparse ();
2664 }
2665
2666 Remember that computing locations is not a matter of syntax. Every
2667 character must be associated with a location update, whether it is in
2668 valid input, in comments, in literal strings, and so on.
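
One way to make sure no character is missed is to route every
character through a single update function. The sketch below follows
the same conventions as the lexer above (a newline bumps `last_line'
and resets `last_column' to 0). The names `struct loc', `loc_step',
`loc_advance' and `skip_comment' are assumptions, and the `#' comment
syntax is hypothetical: ltcalc itself has no comments.

```c
/* Same four fields as the default location type; the name is an
   assumption of this sketch.  */
struct loc
{
  int first_line;
  int first_column;
  int last_line;
  int last_column;
};

/* Start a new token where the previous one ended.  */
void
loc_step (struct loc *l)
{
  l->first_line = l->last_line;
  l->first_column = l->last_column;
}

/* Account for one input character C, whatever it is.  */
void
loc_advance (struct loc *l, int c)
{
  if (c == '\n')
    {
      ++l->last_line;
      l->last_column = 0;
    }
  else
    ++l->last_column;
}

/* Skip a hypothetical comment that runs to the end of the line,
   keeping the location up to date for every character skipped.
   Return a pointer to the first character after the comment.  */
char const *
skip_comment (char const *p, struct loc *l)
{
  while (*p != '\0')
    {
      int c = *p++;
      loc_advance (l, c);
      if (c == '\n')
        break;
    }
  return p;
}
```

After skipping "# note\n", a token that follows starts on line 2 at
column 0, just as it would if the comment had been ordinary input.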
2669
2670 
2671 File: bison.info, Node: Multi-function Calc, Next: Exercises, Prev: Location Tracking Calc, Up: Examples
2672
2673 2.5 Multi-Function Calculator: `mfcalc'
2674 =======================================
2675
2676 Now that the basics of Bison have been discussed, it is time to move on
2677 to a more advanced problem. The above calculators provided only five
2678 functions, `+', `-', `*', `/' and `^'. It would be nice to have a
2679 calculator that provides other mathematical functions such as `sin',
2680 `cos', etc.
2681
2682 It is easy to add new operators to the infix calculator as long as
2683 they are only single-character literals. The lexical analyzer `yylex'
2684 passes back all nonnumeric characters as tokens, so new grammar rules
2685 suffice for adding a new operator. But we want something more
2686 flexible: built-in functions whose syntax has this form:
2687
2688 FUNCTION_NAME (ARGUMENT)
2689
2690 At the same time, we will add memory to the calculator, by allowing you
2691 to create named variables, store values in them, and use them later.
2692 Here is a sample session with the multi-function calculator:
2693
2694 $ mfcalc
2695 pi = 3.141592653589
2696 3.1415926536
2697 sin(pi)
2698 0.0000000000
2699 alpha = beta1 = 2.3
2700 2.3000000000
2701 alpha
2702 2.3000000000
2703 ln(alpha)
2704 0.8329091229
2705 exp(ln(beta1))
2706 2.3000000000
2707 $
2708
2709 Note that multiple assignment and nested function calls are
2710 permitted.
2711
2712 * Menu:
2713
2714 * Mfcalc Declarations:: Bison declarations for multi-function calculator.
2715 * Mfcalc Rules:: Grammar rules for the calculator.
2716 * Mfcalc Symbol Table:: Symbol table management subroutines.
2717
2718 
2719 File: bison.info, Node: Mfcalc Declarations, Next: Mfcalc Rules, Up: Multi-function Calc
2720
2721 2.5.1 Declarations for `mfcalc'
2722 -------------------------------
2723
2724 Here are the C and Bison declarations for the multi-function calculator.
2725
2726 %{
2727 #include <math.h> /* For math functions, cos(), sin(), etc. */
2728 #include "calc.h" /* Contains definition of `symrec'. */
2729 int yylex (void);
2730 void yyerror (char const *);
2731 %}
2732 %union {
2733 double val; /* For returning numbers. */
2734 symrec *tptr; /* For returning symbol-table pointers. */
2735 }
2736 %token <val> NUM /* Simple double precision number. */
2737 %token <tptr> VAR FNCT /* Variable and Function. */
2738 %type <val> exp
2739
2740 %right '='
2741 %left '-' '+'
2742 %left '*' '/'
2743 %left NEG /* negation--unary minus */
2744 %right '^' /* exponentiation */
2745 %% /* The grammar follows. */
2746
2747 The above grammar introduces only two new features of the Bison
2748 language. These features allow semantic values to have various data
2749 types (*note More Than One Value Type: Multiple Types.).
2750
2751 The `%union' declaration specifies the entire list of possible types;
2752 this is instead of defining `YYSTYPE'. The allowable types are now
2753 double-floats (for `exp' and `NUM') and pointers to entries in the
2754 symbol table. *Note The Collection of Value Types: Union Decl.
2755
2756 Since values can now have various types, it is necessary to
2757 associate a type with each grammar symbol whose semantic value is used.
2758 These symbols are `NUM', `VAR', `FNCT', and `exp'. Their declarations
2759 are augmented with information about their data type (placed between
2760 angle brackets).
2761
2762 The Bison construct `%type' is used for declaring nonterminal
2763 symbols, just as `%token' is used for declaring token types. We have
2764 not used `%type' before because nonterminal symbols are normally
2765 declared implicitly by the rules that define them. But `exp' must be
2766 declared explicitly so we can specify its value type. *Note
2767 Nonterminal Symbols: Type Decl.
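
In C terms, the `%union' declaration amounts to a union type for
semantic values, and the type written between angle brackets selects
which member an action reads or writes. The sketch below spells out
what an action such as `$$ = $1->value.var' does once those members are
made explicit. The names `union semantic_value', `struct symrec_sketch'
and `var_to_exp' are assumptions made for illustration, and the record
is a cut-down stand-in for the real `symrec'.

```c
/* Function type, as used by the symbol table.  */
typedef double (*func_t) (double);

/* Cut-down stand-in for a symbol-table record.  */
struct symrec_sketch
{
  char const *name;
  union
  {
    double var;       /* value of a VAR */
    func_t fnctptr;   /* value of a FNCT */
  } value;
};

/* Roughly what `%union { double val; symrec *tptr; }' declares.  */
union semantic_value
{
  double val;                    /* <val>: NUM and exp */
  struct symrec_sketch *tptr;    /* <tptr>: VAR and FNCT */
};

/* The rule `exp: VAR { $$ = $1->value.var; }' with the members
   written out: $1 is read as <tptr>, $$ is written as <val>.  */
union semantic_value
var_to_exp (union semantic_value v1)
{
  union semantic_value result;
  result.val = v1.tptr->value.var;
  return result;
}
```

Declaring the type of each symbol is what lets Bison pick the member;
without it, an action that uses the symbol's value is rejected.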
2768
2769 
2770 File: bison.info, Node: Mfcalc Rules, Next: Mfcalc Symbol Table, Prev: Mfcalc Declarations, Up: Multi-function Calc
2771
2772 2.5.2 Grammar Rules for `mfcalc'
2773 --------------------------------
2774
2775 Here are the grammar rules for the multi-function calculator. Most of
2776 them are copied directly from `calc'; three rules, those which mention
2777 `VAR' or `FNCT', are new.
2778
2779 input: /* empty */
2780 | input line
2781 ;
2782
2783 line:
2784 '\n'
2785 | exp '\n' { printf ("\t%.10g\n", $1); }
2786 | error '\n' { yyerrok; }
2787 ;
2788
2789 exp: NUM { $$ = $1; }
2790 | VAR { $$ = $1->value.var; }
2791 | VAR '=' exp { $$ = $3; $1->value.var = $3; }
2792 | FNCT '(' exp ')' { $$ = (*($1->value.fnctptr))($3); }
2793 | exp '+' exp { $$ = $1 + $3; }
2794 | exp '-' exp { $$ = $1 - $3; }
2795 | exp '*' exp { $$ = $1 * $3; }
2796 | exp '/' exp { $$ = $1 / $3; }
2797 | '-' exp %prec NEG { $$ = -$2; }
2798 | exp '^' exp { $$ = pow ($1, $3); }
2799 | '(' exp ')' { $$ = $2; }
2800 ;
2801 /* End of grammar. */
2802 %%
2803
2804 
2805 File: bison.info, Node: Mfcalc Symbol Table, Prev: Mfcalc Rules, Up: Multi-function Calc
2806
2807 2.5.3 The `mfcalc' Symbol Table
2808 -------------------------------
2809
2810 The multi-function calculator requires a symbol table to keep track of
2811 the names and meanings of variables and functions. This doesn't affect
2812 the grammar rules (except for the actions) or the Bison declarations,
2813 but it requires some additional C functions for support.
2814
2815 The symbol table itself consists of a linked list of records. Its
2816 definition, which is kept in the header `calc.h', is as follows. It
2817 provides for either functions or variables to be placed in the table.
2818
2819 /* Function type. */
2820 typedef double (*func_t) (double);
2821
2822 /* Data type for links in the chain of symbols. */
2823 struct symrec
2824 {
2825 char *name; /* name of symbol */
2826 int type; /* type of symbol: either VAR or FNCT */
2827 union
2828 {
2829 double var; /* value of a VAR */
2830 func_t fnctptr; /* value of a FNCT */
2831 } value;
2832 struct symrec *next; /* link field */
2833 };
2834
2835 typedef struct symrec symrec;
2836
2837 /* The symbol table: a chain of `struct symrec'. */
2838 extern symrec *sym_table;
2839
2840 symrec *putsym (char const *, int);
2841 symrec *getsym (char const *);
2842
2843 The new version of `main' includes a call to `init_table', a
2844 function that initializes the symbol table. Here it is, and
2845 `init_table' as well:
2846
2847 #include <stdio.h>
2848
2849 /* Called by yyparse on error. */
2850 void
2851 yyerror (char const *s)
2852 {
2853 fprintf (stderr, "%s\n", s);
2854 }
2855
2856 struct init
2857 {
2858 char const *fname;
2859 double (*fnct) (double);
2860 };
2861
2862 struct init const arith_fncts[] =
2863 {
2864 { "sin", sin },
2865 { "cos", cos },
2866 { "atan", atan },
2867 { "ln", log },
2868 { "exp", exp },
2869 { "sqrt", sqrt },
2870 { 0, 0 }
2871 };
2872
2873 /* The symbol table: a chain of `struct symrec'. */
2874 symrec *sym_table;
2875
2876 /* Put arithmetic functions in table. */
2877 void
2878 init_table (void)
2879 {
2880 int i;
2881 symrec *ptr;
2882 for (i = 0; arith_fncts[i].fname != 0; i++)
2883 {
2884 ptr = putsym (arith_fncts[i].fname, FNCT);
2885 ptr->value.fnctptr = arith_fncts[i].fnct;
2886 }
2887 }
2888
2889 int
2890 main (void)
2891 {
2892 init_table ();
2893 return yyparse ();
2894 }
2895
2896 By simply editing the initialization list and adding the necessary
2897 include files, you can add more functions to the calculator.
2898
2899 Two important functions allow look-up and installation of symbols in
2900 the symbol table. The function `putsym' is passed a name and the type
2901 (`VAR' or `FNCT') of the object to be installed. The object is linked
2902 to the front of the list, and a pointer to the object is returned. The
2903 function `getsym' is passed the name of the symbol to look up. If
2904 found, a pointer to that symbol is returned; otherwise zero is returned.
2905
2906 symrec *
2907 putsym (char const *sym_name, int sym_type)
2908 {
2909 symrec *ptr;
2910 ptr = (symrec *) malloc (sizeof (symrec));
2911 ptr->name = (char *) malloc (strlen (sym_name) + 1);
2912 strcpy (ptr->name,sym_name);
2913 ptr->type = sym_type;
2914 ptr->value.var = 0; /* Set value to 0 even if fctn. */
2915 ptr->next = (struct symrec *)sym_table;
2916 sym_table = ptr;
2917 return ptr;
2918 }
2919
2920 symrec *
2921 getsym (char const *sym_name)
2922 {
2923 symrec *ptr;
2924 for (ptr = sym_table; ptr != (symrec *) 0;
2925 ptr = (symrec *)ptr->next)
2926 if (strcmp (ptr->name,sym_name) == 0)
2927 return ptr;
2928 return 0;
2929 }
2930
2931 The function `yylex' must now recognize variables, numeric values,
2932 and the single-character arithmetic operators. Strings of alphanumeric
2933 characters with a leading letter are recognized as either variables or
2934 functions depending on what the symbol table says about them.
2935
2936 The string is passed to `getsym' for lookup in the symbol table. If
2937 the name appears in the table, a pointer to its location and its type
2938 (`VAR' or `FNCT') are returned to `yyparse'. If it is not already in
2939 the table, then it is installed as a `VAR' using `putsym'. Again, a
2940 pointer and its type (which must be `VAR') are returned to `yyparse'.
2941
2942 No change is needed in the handling of numeric values and arithmetic
2943 operators in `yylex'.
2944
2945 #include <ctype.h>
2946
2947 int
2948 yylex (void)
2949 {
2950 int c;
2951
2952 /* Ignore white space, get first nonwhite character. */
2953 while ((c = getchar ()) == ' ' || c == '\t');
2954
2955 if (c == EOF)
2956 return 0;
2957
2958 /* Char starts a number => parse the number. */
2959 if (c == '.' || isdigit (c))
2960 {
2961 ungetc (c, stdin);
2962 scanf ("%lf", &yylval.val);
2963 return NUM;
2964 }
2965
2966 /* Char starts an identifier => read the name. */
2967 if (isalpha (c))
2968 {
2969 symrec *s;
2970 static char *symbuf = 0;
2971 static int length = 0;
2972 int i;
2973
2974 /* Initially make the buffer long enough
2975 for a 40-character symbol name. */
2976 if (length == 0)
2977 length = 40, symbuf = (char *)malloc (length + 1);
2978
2979 i = 0;
2980 do
2981 {
2982 /* If buffer is full, make it bigger. */
2983 if (i == length)
2984 {
2985 length *= 2;
2986 symbuf = (char *) realloc (symbuf, length + 1);
2987 }
2988 /* Add this character to the buffer. */
2989 symbuf[i++] = c;
2990 /* Get another character. */
2991 c = getchar ();
2992 }
2993 while (isalnum (c));
2994
2995 ungetc (c, stdin);
2996 symbuf[i] = '\0';
2997
2998 s = getsym (symbuf);
2999 if (s == 0)
3000 s = putsym (symbuf, VAR);
3001 yylval.tptr = s;
3002 return s->type;
3003 }
3004
3005 /* Any other character is a token by itself. */
3006 return c;
3007 }
3008
3009 This program is both powerful and flexible. You may easily add new
3010 functions, and it is a simple job to modify this code to install
3011 predefined variables such as `pi' or `e' as well.
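
As a minimal sketch of that last remark, the following self-contained
code installs `pi' and `e' as variables. It assumes the `symrec'
layout shown above. The names `init_const', `arith_consts' and
`init_consts' are hypothetical, and the value given to `VAR' is only a
placeholder for the code that the Bison-generated parser normally
defines.

```c
#include <stdlib.h>
#include <string.h>

/* Placeholder token code (assumption): in the real calculator this
   value comes from the Bison-generated parser.  */
enum { VAR = 258 };

typedef struct symrec
{
  char *name;                   /* name of symbol */
  int type;                     /* type of symbol */
  union
  {
    double var;                 /* value of a VAR */
    double (*fnctptr) (double); /* value of a FNCT */
  } value;
  struct symrec *next;          /* link field */
} symrec;

symrec *sym_table = 0;

/* Link a new record at the front of the list, as `putsym' does.  */
symrec *
putsym (char const *sym_name, int sym_type)
{
  symrec *ptr = malloc (sizeof *ptr);
  if (ptr == 0)
    abort ();
  ptr->name = malloc (strlen (sym_name) + 1);
  if (ptr->name == 0)
    abort ();
  strcpy (ptr->name, sym_name);
  ptr->type = sym_type;
  ptr->value.var = 0;
  ptr->next = sym_table;
  sym_table = ptr;
  return ptr;
}

/* Return the record named SYM_NAME, or 0 if there is none.  */
symrec *
getsym (char const *sym_name)
{
  symrec *ptr;
  for (ptr = sym_table; ptr != 0; ptr = ptr->next)
    if (strcmp (ptr->name, sym_name) == 0)
      return ptr;
  return 0;
}

/* Hypothetical table of predefined constants.  */
struct init_const
{
  char const *cname;
  double cvalue;
};

static struct init_const const arith_consts[] =
{
  { "pi", 3.14159265358979323846 },
  { "e", 2.7182818284590452354 },
  { 0, 0 }
};

/* Put the constants in the table as ordinary variables.  */
void
init_consts (void)
{
  int i;
  for (i = 0; arith_consts[i].cname != 0; i++)
    {
      symrec *ptr = putsym (arith_consts[i].cname, VAR);
      ptr->value.var = arith_consts[i].cvalue;
    }
}
```

Calling `init_consts' next to `init_table' in `main' would make both
names available before parsing starts. Because they are ordinary
`VAR' entries, the user can still assign new values to them.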
3012
3013 
3014 File: bison.info, Node: Exercises, Prev: Multi-function Calc, Up: Examples
3015
3016 2.6 Exercises
3017 =============
3018
3019 1. Add some new functions from `math.h' to the initialization list.
3020
3021 2. Add another array that contains constants and their values. Then
3022 modify `init_table' to add these constants to the symbol table.
3023 It will be easiest to give the constants type `VAR'.
3024
3025 3. Make the program report an error if the user refers to an
3026 uninitialized variable in any way except to store a value in it.
3027
3028 
3029 File: bison.info, Node: Grammar File, Next: Interface, Prev: Examples, Up: Top
3030
3031 3 Bison Grammar Files
3032 *********************
3033
3034 Bison takes as input a context-free grammar specification and produces a
3035 C-language function that recognizes correct instances of the grammar.
3036
3037 The Bison grammar input file conventionally has a name ending in
3038 `.y'. *Note Invoking Bison: Invocation.
3039
3040 * Menu:
3041
3042 * Grammar Outline:: Overall layout of the grammar file.
3043 * Symbols:: Terminal and nonterminal symbols.
3044 * Rules:: How to write grammar rules.
3045 * Recursion:: Writing recursive rules.
3046 * Semantics:: Semantic values and actions.
3047 * Locations:: Locations and actions.
3048 * Declarations:: All kinds of Bison declarations are described here.
3049 * Multiple Parsers:: Putting more than one Bison parser in one program.
3050
3051 
3052 File: bison.info, Node: Grammar Outline, Next: Symbols, Up: Grammar File
3053
3054 3.1 Outline of a Bison Grammar
3055 ==============================
3056
3057 A Bison grammar file has four main sections, shown here with the
3058 appropriate delimiters:
3059
3060 %{
3061 PROLOGUE
3062 %}
3063
3064 BISON DECLARATIONS
3065
3066 %%
3067 GRAMMAR RULES
3068 %%
3069
3070 EPILOGUE
3071
3072 Comments enclosed in `/* ... */' may appear in any of the sections.
3073 As a GNU extension, `//' introduces a comment that continues until end
3074 of line.
3075
3076 * Menu:
3077
3078 * Prologue:: Syntax and usage of the prologue.
3079 * Prologue Alternatives:: Syntax and usage of alternatives to the prologue.
3080 * Bison Declarations:: Syntax and usage of the Bison declarations section.
3081 * Grammar Rules:: Syntax and usage of the grammar rules section.
3082 * Epilogue:: Syntax and usage of the epilogue.
3083
3084 
3085 File: bison.info, Node: Prologue, Next: Prologue Alternatives, Up: Grammar Outline
3086
3087 3.1.1 The prologue
3088 ------------------
3089
3090 The PROLOGUE section contains macro definitions and declarations of
3091 functions and variables that are used in the actions in the grammar
3092 rules. These are copied to the beginning of the parser file so that
3093 they precede the definition of `yyparse'. You can use `#include' to
3094 get the declarations from a header file. If you don't need any C
3095 declarations, you may omit the `%{' and `%}' delimiters that bracket
3096 this section.
3097
3098 The PROLOGUE section is terminated by the first occurrence of `%}'
3099 that is outside a comment, a string literal, or a character constant.
3100
3101 You may have more than one PROLOGUE section, intermixed with the
3102 BISON DECLARATIONS. This allows you to have C and Bison declarations
3103 that refer to each other. For example, the `%union' declaration may
3104 use types defined in a header file, and you may wish to prototype
3105 functions that take arguments of type `YYSTYPE'. This can be done with
3106 two PROLOGUE blocks, one before and one after the `%union' declaration.
3107
3108 %{
3109 #define _GNU_SOURCE
3110 #include <stdio.h>
3111 #include "ptypes.h"
3112 %}
3113
3114 %union {
3115 long int n;
3116 tree t; /* `tree' is defined in `ptypes.h'. */
3117 }
3118
3119 %{
3120 static void print_token_value (FILE *, int, YYSTYPE);
3121 #define YYPRINT(F, N, L) print_token_value (F, N, L)
3122 %}
3123
3124 ...
3125
3126 When in doubt, it is usually safer to put prologue code before all
3127 Bison declarations, rather than after. For example, any definitions of
3128 feature test macros like `_GNU_SOURCE' or `_POSIX_C_SOURCE' should
3129 appear before all Bison declarations, as feature test macros can affect
3130 the behavior of Bison-generated `#include' directives.
3131
3132 
3133 File: bison.info, Node: Prologue Alternatives, Next: Bison Declarations, Prev: Prologue, Up: Grammar Outline
3134
3135 3.1.2 Prologue Alternatives
3136 ---------------------------
3137
3138 (The prologue alternatives described here are experimental. More user
3139 feedback will help to determine whether they should become permanent
3140 features.)
3141
3142 The functionality of PROLOGUE sections can often be subtle and
3143 inflexible. As an alternative, Bison provides a `%code' directive with
3144 an explicit qualifier field, which identifies the purpose of the code
3145 and thus the location(s) where Bison should generate it. For C/C++,
3146 the qualifier can be omitted for the default location, or it can be one
3147 of `requires', `provides', or `top'. *Note %code: Decl Summary.
3148
3149 Look again at the example of the previous section:
3150
3151 %{
3152 #define _GNU_SOURCE
3153 #include <stdio.h>
3154 #include "ptypes.h"
3155 %}
3156
3157 %union {
3158 long int n;
3159 tree t; /* `tree' is defined in `ptypes.h'. */
3160 }
3161
3162 %{
3163 static void print_token_value (FILE *, int, YYSTYPE);
3164 #define YYPRINT(F, N, L) print_token_value (F, N, L)
3165 %}
3166
3167 ...
3168
3169 Notice that there are two PROLOGUE sections here, but there's a subtle
3170 distinction between their functionality. For example, if you decide to
3171 override Bison's default definition for `YYLTYPE', in which PROLOGUE
3172 section should you write your new definition? You should write it in
3173 the first since Bison will insert that code into the parser source code
3174 file _before_ the default `YYLTYPE' definition. In which PROLOGUE
3175 section should you prototype an internal function, `trace_token', that
3176 accepts `YYLTYPE' and `yytokentype' as arguments? You should prototype
3177 it in the second since Bison will insert that code _after_ the
3178 `YYLTYPE' and `yytokentype' definitions.
3179
3180 This distinction in functionality between the two PROLOGUE sections
3181 is established by the appearance of the `%union' between them. This
3182 behavior raises a few questions. First, why should the position of a
3183 `%union' affect definitions related to `YYLTYPE' and `yytokentype'?
3184 Second, what if there is no `%union'? In that case, the second kind of
3185 PROLOGUE section is not available. This behavior is not intuitive.
3186
3187 To avoid this subtle `%union' dependency, rewrite the example using a
3188 `%code top' and an unqualified `%code'. Let's go ahead and add the new
3189 `YYLTYPE' definition and the `trace_token' prototype at the same time:
3190
3191 %code top {
3192 #define _GNU_SOURCE
3193 #include <stdio.h>
3194
3195 /* WARNING: The following code really belongs
3196 * in a `%code requires'; see below. */
3197
3198 #include "ptypes.h"
3199 #define YYLTYPE YYLTYPE
3200 typedef struct YYLTYPE
3201 {
3202 int first_line;
3203 int first_column;
3204 int last_line;
3205 int last_column;
3206 char *filename;
3207 } YYLTYPE;
3208 }
3209
3210 %union {
3211 long int n;
3212 tree t; /* `tree' is defined in `ptypes.h'. */
3213 }
3214
3215 %code {
3216 static void print_token_value (FILE *, int, YYSTYPE);
3217 #define YYPRINT(F, N, L) print_token_value (F, N, L)
3218 static void trace_token (enum yytokentype token, YYLTYPE loc);
3219 }
3220
3221 ...
3222
3223 In this way, `%code top' and the unqualified `%code' achieve the same
3224 functionality as the two kinds of PROLOGUE sections, but it's always
3225 explicit which kind you intend. Moreover, both kinds are always
3226 available even in the absence of `%union'.
3227
3228 The `%code top' block above logically contains two parts. The first
3229 two lines before the warning need to appear near the top of the parser
3230 source code file. The first line after the warning is required by
3231 `YYSTYPE' and thus also needs to appear in the parser source code file.
3232 However, if you've instructed Bison to generate a parser header file
3233 (*note %defines: Decl Summary.), you probably want that line to appear
3234 before the `YYSTYPE' definition in that header file as well. The
3235 `YYLTYPE' definition should also appear in the parser header file to
3236 override the default `YYLTYPE' definition there.
3237
3238 In other words, in the `%code top' block above, all but the first two
3239 lines are dependency code required by the `YYSTYPE' and `YYLTYPE'
3240 definitions. Thus, they belong in one or more `%code requires':
3241
3242 %code top {
3243 #define _GNU_SOURCE
3244 #include <stdio.h>
3245 }
3246
3247 %code requires {
3248 #include "ptypes.h"
3249 }
3250 %union {
3251 long int n;
3252 tree t; /* `tree' is defined in `ptypes.h'. */
3253 }
3254
3255 %code requires {
3256 #define YYLTYPE YYLTYPE
3257 typedef struct YYLTYPE
3258 {
3259 int first_line;
3260 int first_column;
3261 int last_line;
3262 int last_column;
3263 char *filename;
3264 } YYLTYPE;
3265 }
3266
3267 %code {
3268 static void print_token_value (FILE *, int, YYSTYPE);
3269 #define YYPRINT(F, N, L) print_token_value (F, N, L)
3270 static void trace_token (enum yytokentype token, YYLTYPE loc);
3271 }
3272
3273 ...
3274
3275 Now Bison will insert `#include "ptypes.h"' and the new `YYLTYPE'
3276 definition before the Bison-generated `YYSTYPE' and `YYLTYPE'
3277 definitions in both the parser source code file and the parser header
3278 file. (By the same reasoning, `%code requires' would also be the
3279 appropriate place to write your own definition for `YYSTYPE'.)
3280
3281 When you are writing dependency code for `YYSTYPE' and `YYLTYPE', you
3282 should prefer `%code requires' over `%code top' regardless of whether
3283 you instruct Bison to generate a parser header file. When you are
3284 writing code that you need Bison to insert only into the parser source
3285 code file and that has no special need to appear at the top of that
3286 file, you should prefer the unqualified `%code' over `%code top'.
3287 These practices will make the purpose of each block of your code
3288 explicit to Bison and to other developers reading your grammar file.
3289 Following these practices, we expect the unqualified `%code' and `%code
3290 requires' to be the most important of the four PROLOGUE alternatives.
3291
3292 At some point while developing your parser, you might decide to
3293 provide `trace_token' to modules that are external to your parser.
3294 Thus, you might wish for Bison to insert the prototype into both the
3295 parser header file and the parser source code file. Since this
3296 function is not a dependency required by `YYSTYPE' or `YYLTYPE', it
3297 doesn't make sense to move its prototype to a `%code requires'. More
3298 importantly, since it depends upon `YYLTYPE' and `yytokentype', `%code
3299 requires' is not sufficient. Instead, move its prototype from the
3300 unqualified `%code' to a `%code provides':
3301
3302 %code top {
3303 #define _GNU_SOURCE
3304 #include <stdio.h>
3305 }
3306
3307 %code requires {
3308 #include "ptypes.h"
3309 }
3310 %union {
3311 long int n;
3312 tree t; /* `tree' is defined in `ptypes.h'. */
3313 }
3314
3315 %code requires {
3316 #define YYLTYPE YYLTYPE
3317 typedef struct YYLTYPE
3318 {
3319 int first_line;
3320 int first_column;
3321 int last_line;
3322 int last_column;
3323 char *filename;
3324 } YYLTYPE;
3325 }
3326
3327 %code provides {
3328 void trace_token (enum yytokentype token, YYLTYPE loc);
3329 }
3330
3331 %code {
3332 static void print_token_value (FILE *, int, YYSTYPE);
3333 #define YYPRINT(F, N, L) print_token_value (F, N, L)
3334 }
3335
3336 ...
3337
3338 Bison will insert the `trace_token' prototype into both the parser
3339 header file and the parser source code file after the definitions for
3340 `yytokentype', `YYLTYPE', and `YYSTYPE'.
3341
3342 The above examples are careful to write directives in an order that
3343 reflects the layout of the generated parser source code and header
3344 files: `%code top', `%code requires', `%code provides', and then
3345 `%code'. While your grammar files may generally be easier to read if
3346 you also follow this order, Bison does not require it. Instead, Bison
3347 lets you choose an organization that makes sense to you.
3348
3349 You may declare any of these directives multiple times in the
3350 grammar file. In that case, Bison concatenates the contained code in
3351 declaration order. This is the only way in which the position of one
3352 of these directives within the grammar file affects its functionality.
3353
3354 The result of the previous two properties is greater flexibility in
3355 how you may organize your grammar file. For example, you may organize
3356 semantic-type-related directives by semantic type:
3357
3358 %code requires { #include "type1.h" }
3359 %union { type1 field1; }
3360 %destructor { type1_free ($$); } <field1>
3361 %printer { type1_print ($$); } <field1>
3362
3363 %code requires { #include "type2.h" }
3364 %union { type2 field2; }
3365 %destructor { type2_free ($$); } <field2>
3366 %printer { type2_print ($$); } <field2>
3367
3368 You could even place each of the above directive groups in the rules
3369 section of the grammar file next to the set of rules that uses the
3370 associated semantic type. (In the rules section, you must terminate
3371 each of those directives with a semicolon.) And you don't have to
3372 worry that some directive (like a `%union') in the definitions section
3373 is going to adversely affect their functionality in some
3374 counter-intuitive manner just because it comes first. Such an
3375 organization is not possible using PROLOGUE sections.
3376
3377 This section has been concerned with explaining the advantages of
3378 the four PROLOGUE alternatives over the original Yacc PROLOGUE.
3379 However, in most cases when using these directives, you shouldn't need
3380 to think about all the low-level ordering issues discussed here.
3381 Instead, you should simply use these directives to label each block of
3382 your code according to its purpose and let Bison handle the ordering.
3383 `%code' is the most generic label. Move code to `%code requires',
3384 `%code provides', or `%code top' as needed.
3385
3386 
3387 File: bison.info, Node: Bison Declarations, Next: Grammar Rules, Prev: Prologue Alternatives, Up: Grammar Outline
3388
3389 3.1.3 The Bison Declarations Section
3390 ------------------------------------
3391
3392 The BISON DECLARATIONS section contains declarations that define
3393 terminal and nonterminal symbols, specify precedence, and so on. In
3394 some simple grammars you may not need any declarations. *Note Bison
3395 Declarations: Declarations.
3396
3397 
3398 File: bison.info, Node: Grammar Rules, Next: Epilogue, Prev: Bison Declarations, Up: Grammar Outline
3399
3400 3.1.4 The Grammar Rules Section
3401 -------------------------------
3402
3403 The "grammar rules" section contains one or more Bison grammar rules,
3404 and nothing else. *Note Syntax of Grammar Rules: Rules.
3405
3406 There must always be at least one grammar rule, and the first `%%'
3407 (which precedes the grammar rules) may never be omitted even if it is
3408 the first thing in the file.
3409
3410 
3411 File: bison.info, Node: Epilogue, Prev: Grammar Rules, Up: Grammar Outline
3412
3413 3.1.5 The epilogue
3414 ------------------
3415
3416 The EPILOGUE is copied verbatim to the end of the parser file, just as
3417 the PROLOGUE is copied to the beginning. This is the most convenient
3418 place to put anything that you want to have in the parser file but
3419 which need not come before the definition of `yyparse'. For example,
3420 the definitions of `yylex' and `yyerror' often go here. Because C
3421 requires functions to be declared before being used, you often need to
3422 declare functions like `yylex' and `yyerror' in the Prologue, even if
3423 you define them in the Epilogue. *Note Parser C-Language Interface:
3424 Interface.
3425
3426 If the last section is empty, you may omit the `%%' that separates it
3427 from the grammar rules.
3428
3429 The Bison parser itself contains many macros and identifiers whose
3430 names start with `yy' or `YY', so it is a good idea to avoid using any
3431 such names (except those documented in this manual) in the epilogue of
3432 the grammar file.
3433
3434 
3435 File: bison.info, Node: Symbols, Next: Rules, Prev: Grammar Outline, Up: Grammar File
3436
3437 3.2 Symbols, Terminal and Nonterminal
3438 =====================================
3439
3440 "Symbols" in Bison grammars represent the grammatical classifications
3441 of the language.
3442
3443 A "terminal symbol" (also known as a "token type") represents a
3444 class of syntactically equivalent tokens. You use the symbol in grammar
3445 rules to mean that a token in that class is allowed. The symbol is
3446 represented in the Bison parser by a numeric code, and the `yylex'
3447 function returns a token type code to indicate what kind of token has
3448 been read. You don't need to know what the code value is; you can use
3449 the symbol to stand for it.
3450
3451 A "nonterminal symbol" stands for a class of syntactically
3452 equivalent groupings. The symbol name is used in writing grammar rules.
3453 By convention, it should be all lower case.
3454
3455 Symbol names can contain letters, digits (not at the beginning),
3456 underscores and periods. Periods make sense only in nonterminals.
3457
3458 There are three ways of writing terminal symbols in the grammar:
3459
3460 * A "named token type" is written with an identifier, like an
3461 identifier in C. By convention, it should be all upper case. Each
3462 such name must be defined with a Bison declaration such as
3463 `%token'. *Note Token Type Names: Token Decl.
3464
3465 * A "character token type" (or "literal character token") is written
3466 in the grammar using the same syntax used in C for character
3467 constants; for example, `'+'' is a character token type. A
3468 character token type doesn't need to be declared unless you need to
3469 specify its semantic value data type (*note Data Types of Semantic
3470 Values: Value Type.), associativity, or precedence (*note Operator
3471 Precedence: Precedence.).
3472
3473 By convention, a character token type is used only to represent a
3474 token that consists of that particular character. Thus, the token
3475 type `'+'' is used to represent the character `+' as a token.
3476 Nothing enforces this convention, but if you depart from it, your
3477 program will confuse other readers.
3478
3479 All the usual escape sequences used in character literals in C can
3480 be used in Bison as well, but you must not use the null character
3481 as a character literal because its numeric code, zero, signifies
3482 end-of-input (*note Calling Convention for `yylex': Calling
3483 Convention.). Also, unlike standard C, trigraphs have no special
3484 meaning in Bison character literals, nor is backslash-newline
3485 allowed.
3486
3487 * A "literal string token" is written like a C string constant; for
3488 example, `"<="' is a literal string token. A literal string token
3489 doesn't need to be declared unless you need to specify its semantic
3490 value data type (*note Value Type::), associativity, or precedence
3491 (*note Precedence::).
3492
3493 You can associate the literal string token with a symbolic name as
3494 an alias, using the `%token' declaration (*note Token
3495 Declarations: Token Decl.). If you don't do that, the lexical
3496 analyzer has to retrieve the token number for the literal string
3497 token from the `yytname' table (*note Calling Convention::).
3498
3499 *Warning*: literal string tokens do not work in Yacc.
3500
3501 By convention, a literal string token is used only to represent a
3502 token that consists of that particular string. Thus, you should
3503 use the token type `"<="' to represent the string `<=' as a token.
3504 Bison does not enforce this convention, but if you depart from
3505 it, people who read your program will be confused.
3506
3507 All the escape sequences used in string literals in C can be used
3508 in Bison as well, except that you must not use a null character
3509 within a string literal. Also, unlike Standard C, trigraphs have
3510 no special meaning in Bison string literals, nor is
3511 backslash-newline allowed. A literal string token must contain
3512 two or more characters; for a token containing just one character,
3513 use a character token (see above).
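
The `yytname' lookup mentioned for literal string tokens can be
sketched as follows. This is only an illustration under stated
assumptions: `example_tname' is a hypothetical stand-in for the real
table, whose contents depend on your grammar, and `find_string_token'
is not a Bison function. The sketch relies on a literal string token
being recorded with its surrounding double-quote characters.

```c
#include <string.h>

/* Hypothetical stand-in for the `yytname' table (assumption): a
   zero-terminated array of symbol names, where literal string tokens
   keep their double quotes.  */
static char const *const example_tname[] =
{
  "$end", "error", "$undefined", "NUM", "\"<=\"", "\">=\"", "'+'", 0
};

/* Return the index of the entry that names the literal string token
   TEXT (given without quotes), or -1 if there is no such entry.  */
int
find_string_token (char const *text)
{
  size_t len = strlen (text);
  int i;
  for (i = 0; example_tname[i] != 0; i++)
    {
      char const *name = example_tname[i];
      if (name[0] == '"'
          && strncmp (name + 1, text, len) == 0
          && name[len + 1] == '"'
          && name[len + 2] == '\0')
        return i;
    }
  return -1;
}
```

Giving the token a symbolic alias with `%token' avoids this search
entirely, which is usually the simpler choice.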
3514
3515 How you choose to write a terminal symbol has no effect on its
3516 grammatical meaning. That depends only on where it appears in rules and
3517 on when the lexical analyzer function `yylex' returns that symbol.
3518
3519 The value returned by `yylex' is always one of the terminal symbols,
3520 except that a zero or negative value signifies end-of-input. Whichever
3521 way you write the token type in the grammar rules, you write it the
3522 same way in the definition of `yylex'. The numeric code for a
3523 character token type is simply the positive numeric code of the
3524 character, so `yylex' can use the identical value to generate the
3525 requisite code, though you may need to convert it to `unsigned char' to
3526 avoid sign-extension on hosts where `char' is signed. Each named token
3527 type becomes a C macro in the parser file, so `yylex' can use the name
3528 to stand for the code. (This is why periods don't make sense in
3529 terminal symbols.) *Note Calling Convention for `yylex': Calling
3530 Convention.
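
The conversion to `unsigned char' matters when the lexical analyzer
takes characters from a `char' buffer rather than from `getchar',
which already returns the character as a nonnegative `int'. A minimal
sketch, where `char_token' is a hypothetical helper and not part of
Bison:

```c
/* Hypothetical helper: return the token code for the character C.
   Converting through `unsigned char' keeps the result positive even
   on hosts where plain `char' is signed.  */
int
char_token (char c)
{
  return (unsigned char) c;
}
```

Without the cast, a character whose code is above 127 could become a
negative `int' on such hosts, and the parser would take it for
end-of-input.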
3531
3532 If `yylex' is defined in a separate file, you need to arrange for the
3533 token-type macro definitions to be available there. Use the `-d'
3534 option when you run Bison, so that it will write these macro definitions
3535 into a separate header file `NAME.tab.h' which you can include in the
3536 other source files that need it. *Note Invoking Bison: Invocation.
3537
3538 If you want to write a grammar that is portable to any Standard C
3539 host, you must use only nonnull character tokens taken from the basic
3540 execution character set of Standard C. This set consists of the ten
3541 digits, the 52 lower- and upper-case English letters, and the
3542 characters in the following C-language string:
3543
3544 "\a\b\t\n\v\f\r !\"#%&'()*+,-./:;<=>?[\\]^_{|}~"
3545
3546 The `yylex' function and Bison must use a consistent character set
3547 and encoding for character tokens. For example, if you run Bison in an
3548 ASCII environment, but then compile and run the resulting program in an
3549 environment that uses an incompatible character set like EBCDIC, the
3550 resulting program may not work because the tables generated by Bison
3551 will assume ASCII numeric values for character tokens. It is standard
3552 practice for software distributions to contain C source files that were
3553 generated by Bison in an ASCII environment, so installers on platforms
3554 that are incompatible with ASCII must rebuild those files before
3555 compiling them.
3556
3557 The symbol `error' is a terminal symbol reserved for error recovery
3558 (*note Error Recovery::); you shouldn't use it for any other purpose.
3559 In particular, `yylex' should never return this value. The default
3560 value of the error token is 256, unless you explicitly assigned 256 to
3561 one of your tokens with a `%token' declaration.
3562
3563 
3564 File: bison.info, Node: Rules, Next: Recursion, Prev: Symbols, Up: Grammar File
3565
3566 3.3 Syntax of Grammar Rules
3567 ===========================
3568
3569 A Bison grammar rule has the following general form:
3570
3571 RESULT: COMPONENTS...
3572 ;
3573
3574 where RESULT is the nonterminal symbol that this rule describes, and
3575 COMPONENTS are various terminal and nonterminal symbols that are put
3576 together by this rule (*note Symbols::).
3577
3578 For example,
3579
3580 exp: exp '+' exp
3581 ;
3582
3583 says that two groupings of type `exp', with a `+' token in between, can
3584 be combined into a larger grouping of type `exp'.
3585
3586 White space in rules is significant only to separate symbols. You
3587 can add extra white space as you wish.
3588
3589 Scattered among the components can be ACTIONS that determine the
3590 semantics of the rule. An action looks like this:
3591
3592 {C STATEMENTS}
3593
3594 This is an example of "braced code", that is, C code surrounded by
3595 braces, much like a compound statement in C. Braced code can contain
3596 any sequence of C tokens, so long as its braces are balanced. Bison
3597 does not check the braced code for correctness directly; it merely
3598 copies the code to the output file, where the C compiler can check it.
3599
3600 Within braced code, the balanced-brace count is not affected by
3601 braces within comments, string literals, or character constants, but it
3602 is affected by the C digraphs `<%' and `%>' that represent braces. At
3603 the top level braced code must be terminated by `}' and not by a
3604 digraph. Bison does not look for trigraphs, so if braced code uses
3605 trigraphs you should ensure that they do not affect the nesting of
3606 braces or the boundaries of comments, string literals, or character
3607 constants.
3608
3609 Usually there is only one action and it follows the components.
3610 *Note Actions::.
3611
3612 Multiple rules for the same RESULT can be written separately or can
3613 be joined with the vertical-bar character `|' as follows:
3614
3615 RESULT: RULE1-COMPONENTS...
3616 | RULE2-COMPONENTS...
3617 ...
3618 ;
3619
3620 They are still considered distinct rules even when joined in this way.
3621
3622 If COMPONENTS in a rule is empty, it means that RESULT can match the
3623 empty string. For example, here is how to define a comma-separated
3624 sequence of zero or more `exp' groupings:
3625
3626 expseq: /* empty */
3627 | expseq1
3628 ;
3629
3630 expseq1: exp
3631 | expseq1 ',' exp
3632 ;
3633
3634 It is customary to write a comment `/* empty */' in each rule with no
3635 components.
3636
3637 
3638 File: bison.info, Node: Recursion, Next: Semantics, Prev: Rules, Up: Grammar File
3639
3640 3.4 Recursive Rules
3641 ===================
3642
3643 A rule is called "recursive" when its RESULT nonterminal appears also
3644 on its right hand side. Nearly all Bison grammars need to use
3645 recursion, because that is the only way to define a sequence of any
3646 number of a particular thing. Consider this recursive definition of a
3647 comma-separated sequence of one or more expressions:
3648
3649 expseq1: exp
3650 | expseq1 ',' exp
3651 ;
3652
3653 Since the recursive use of `expseq1' is the leftmost symbol in the
3654 right hand side, we call this "left recursion". By contrast, here the
3655 same construct is defined using "right recursion":
3656
3657 expseq1: exp
3658 | exp ',' expseq1
3659 ;
3660
3661 Any kind of sequence can be defined using either left recursion or right
3662 recursion, but you should always use left recursion, because it can
3663 parse a sequence of any number of elements with bounded stack space.
3664 Right recursion uses up space on the Bison stack in proportion to the
3665 number of elements in the sequence, because all the elements must be
3666 shifted onto the stack before the rule can be applied even once. *Note
3667 The Bison Parser Algorithm: Algorithm, for further explanation of this.
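
The difference in stack use can be illustrated with a small
simulation. This is a simplified sketch, not the Bison parser itself:
it counts one stack entry per shifted symbol, treats each `exp' as a
single entry, and ignores parser states and look-ahead. The function
names `max_depth_left' and `max_depth_right' are hypothetical.

```c
#include <stddef.h>

/* Deepest stack reached while parsing N comma-separated items with
     expseq1: exp | expseq1 ',' exp
   Each item is reduced as soon as it has been shifted.  */
size_t
max_depth_left (size_t n)
{
  size_t depth = 0, max = 0, i;
  for (i = 0; i < n; i++)
    {
      if (i > 0)
        depth++;                /* shift ',' */
      depth++;                  /* shift the item */
      if (depth > max)
        max = depth;
      depth = 1;                /* reduce to a single expseq1 */
    }
  return max;
}

/* Deepest stack reached while parsing N comma-separated items with
     expseq1: exp | exp ',' expseq1
   No reduction can happen until the last item has been shifted.  */
size_t
max_depth_right (size_t n)
{
  size_t depth = 0, max = 0, i;
  for (i = 0; i < n; i++)
    {
      if (i > 0)
        depth++;                /* shift ',' */
      depth++;                  /* shift the item */
      if (depth > max)
        max = depth;
    }
  /* Only now do the reductions unwind the stack: each
     `exp ',' expseq1' pops three entries and pushes one.  */
  while (depth > 1)
    depth -= 2;
  return max;
}
```

Under these assumptions the left-recursive rule never needs more than
three entries, while the right-recursive rule needs 2N - 1 entries for
N items.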
3668
3669 "Indirect" or "mutual" recursion occurs when the result of the rule
3670 does not appear directly on its right hand side, but does appear in
3671 rules for other nonterminals which do appear on its right hand side.
3672
3673 For example:
3674
3675 expr: primary
3676 | primary '+' primary
3677 ;
3678
3679 primary: constant
3680 | '(' expr ')'
3681 ;
3682
3683 defines two mutually-recursive nonterminals, since each refers to the
3684 other.
3685
3686 
3687 File: bison.info, Node: Semantics, Next: Locations, Prev: Recursion, Up: Grammar File
3688
3689 3.5 Defining Language Semantics
3690 ===============================
3691
3692 The grammar rules for a language determine only the syntax. The
3693 semantics are determined by the semantic values associated with various
3694 tokens and groupings, and by the actions taken when various groupings
3695 are recognized.
3696
3697 For example, the calculator calculates properly because the value
3698 associated with each expression is the proper number; it adds properly
3699 because the action for the grouping `X + Y' is to add the numbers
3700 associated with X and Y.
3701
3702 * Menu:
3703
3704 * Value Type:: Specifying one data type for all semantic values.
3705 * Multiple Types:: Specifying several alternative data types.
3706 * Actions:: An action is the semantic definition of a grammar rule.
3707 * Action Types:: Specifying data types for actions to operate on.
3708 * Mid-Rule Actions:: Most actions go at the end of a rule.
3709 This says when, why and how to use the exceptional
3710 action in the middle of a rule.
3711
3712 
3713 File: bison.info, Node: Value Type, Next: Multiple Types, Up: Semantics
3714
3715 3.5.1 Data Types of Semantic Values
3716 -----------------------------------
3717
3718 In a simple program it may be sufficient to use the same data type for
3719 the semantic values of all language constructs. This was true in the
3720 RPN and infix calculator examples (*note Reverse Polish Notation
3721 Calculator: RPN Calc.).
3722
3723 Bison normally uses the type `int' for semantic values if your
3724 program uses the same data type for all language constructs. To
3725 specify some other type, define `YYSTYPE' as a macro, like this:
3726
3727 #define YYSTYPE double
3728
3729 `YYSTYPE''s replacement list should be a type name that does not
3730 contain parentheses or square brackets. This macro definition must go
3731 in the prologue of the grammar file (*note Outline of a Bison Grammar:
3732 Grammar Outline.).
3733
3734 
3735 File: bison.info, Node: Multiple Types, Next: Actions, Prev: Value Type, Up: Semantics
3736
3737 3.5.2 More Than One Value Type
3738 ------------------------------
3739
3740 In most programs, you will need different data types for different kinds
3741 of tokens and groupings. For example, a numeric constant may need type
3742 `int' or `long int', while a string constant needs type `char *', and
3743 an identifier might need a pointer to an entry in the symbol table.
3744
3745 To use more than one data type for semantic values in one parser,
3746 Bison requires you to do two things:
3747
3748 * Specify the entire collection of possible data types, either by
3749 using the `%union' Bison declaration (*note The Collection of
3750 Value Types: Union Decl.), or by using a `typedef' or a `#define'
3751 to define `YYSTYPE' to be a union type whose member names are the
3752 type tags.
3753
3754 * Choose one of those types for each symbol (terminal or
3755 nonterminal) for which semantic values are used. This is done for
3756 tokens with the `%token' Bison declaration (*note Token Type
3757 Names: Token Decl.) and for groupings with the `%type' Bison
3758 declaration (*note Nonterminal Symbols: Type Decl.).
3759
3760 
3761 File: bison.info, Node: Actions, Next: Action Types, Prev: Multiple Types, Up: Semantics
3762
3763 3.5.3 Actions
3764 -------------
3765
3766 An action accompanies a syntactic rule and contains C code to be
3767 executed each time an instance of that rule is recognized. The task of
3768 most actions is to compute a semantic value for the grouping built by
3769 the rule from the semantic values associated with tokens or smaller
3770 groupings.
3771
3772 An action consists of braced code containing C statements, and can be
3773 placed at any position in the rule; it is executed at that position.
3774 Most rules have just one action at the end of the rule, following all
3775 the components. Actions in the middle of a rule are tricky and used
3776 only for special purposes (*note Actions in Mid-Rule: Mid-Rule
3777 Actions.).
3778
3779 The C code in an action can refer to the semantic values of the
3780 components matched by the rule with the construct `$N', which stands for
3781 the value of the Nth component. The semantic value for the grouping
3782 being constructed is `$$'. Bison translates both of these constructs
3783 into expressions of the appropriate type when it copies the actions
3784 into the parser file. `$$' is translated to a modifiable lvalue, so it
3785 can be assigned to.
3786
3787 Here is a typical example:
3788
3789 exp: ...
3790 | exp '+' exp
3791 { $$ = $1 + $3; }
3792
3793 This rule constructs an `exp' from two smaller `exp' groupings
3794 connected by a plus-sign token. In the action, `$1' and `$3' refer to
3795 the semantic values of the two component `exp' groupings, which are the
3796 first and third symbols on the right hand side of the rule. The sum is
3797 stored into `$$' so that it becomes the semantic value of the
3798 addition-expression just recognized by the rule. If there were a
3799 useful semantic value associated with the `+' token, it could be
3800 referred to as `$2'.
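
To make the roles of `$1', `$3' and `$$' concrete, here is a rough
model in plain C of what happens when this rule is reduced. It is a
sketch under simplifying assumptions (a single `double' value type and
a hypothetical array-based value stack named `vstack'), not the code
that Bison generates.

```c
/* Hypothetical value stack; Bison's real stack is managed by yyparse.
   This sketch does no bounds checking. */
#define VSTACK_MAX 64
static double vstack[VSTACK_MAX];
static int vtop = 0;                 /* number of values on the stack */

void push_value(double v) { vstack[vtop++] = v; }

/* Model of reducing  exp: exp '+' exp  { $$ = $1 + $3; }
   The three right-hand-side values are the top three stack slots. */
double reduce_add(void)
{
    double *rhs = &vstack[vtop - 3]; /* rhs[0] is $1, rhs[2] is $3 */
    double result = rhs[0] + rhs[2]; /* $$ = $1 + $3 */
    vtop -= 3;                       /* pop the three components */
    push_value(result);              /* push $$ as the new exp */
    return result;
}
```

The slot for the `+' token (`$2') is present on the stack but unused,
which matches the description above.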
3801
3802 Note that the vertical-bar character `|' is really a rule separator,
3803 and each action is attached to a single rule. This differs from
3804 tools like Flex, in which `|' stands for either "or" or "the same
3805 action as that of the next rule". In the following example, the action
3806 is triggered only when `b' is found:
3807
3808 a-or-b: 'a'|'b' { a_or_b_found = 1; };
3809
3810 If you don't specify an action for a rule, Bison supplies a default:
3811 `$$ = $1'. Thus, the value of the first symbol in the rule becomes the
3812 value of the whole rule. Of course, the default action is valid only
3813 if the two data types match. There is no meaningful default action for
3814 an empty rule; every empty rule must have an explicit action unless the
3815 rule's value does not matter.
3816
3817 `$N' with N zero or negative is allowed for reference to tokens and
3818 groupings on the stack _before_ those that match the current rule.
3819 This is a very risky practice, and to use it reliably you must be
3820 certain of the context in which the rule is applied. Here is a case in
3821 which you can use this reliably:
3822
3823 foo: expr bar '+' expr { ... }
3824 | expr bar '-' expr { ... }
3825 ;
3826
3827 bar: /* empty */
3828 { previous_expr = $0; }
3829 ;
3830
3831 As long as `bar' is used only in the fashion shown here, `$0' always
3832 refers to the `expr' which precedes `bar' in the definition of `foo'.
3833
3834 It is also possible to access the semantic value of the lookahead
3835 token, if any, from a semantic action. This semantic value is stored
3836 in `yylval'. *Note Special Features for Use in Actions: Action
3837 Features.
3838
3839 
3840 File: bison.info, Node: Action Types, Next: Mid-Rule Actions, Prev: Actions, Up: Semantics
3841
3842 3.5.4 Data Types of Values in Actions
3843 -------------------------------------
3844
3845 If you have chosen a single data type for semantic values, the `$$' and
3846 `$N' constructs always have that data type.
3847
3848 If you have used `%union' to specify a variety of data types, then
3849 you must declare a choice among these types for each terminal or
3850 nonterminal symbol that can have a semantic value. Then each time you
3851 use `$$' or `$N', its data type is determined by which symbol it refers
3852 to in the rule. In this example,
3853
3854 exp: ...
3855 | exp '+' exp
3856 { $$ = $1 + $3; }
3857
3858 `$1' and `$3' refer to instances of `exp', so they both have the data
3859 type declared for the nonterminal symbol `exp'. If `$2' were used, it
3860 would have the data type declared for the terminal symbol `'+'',
3861 whatever that might be.
3862
3863 Alternatively, you can specify the data type when you refer to the
3864 value, by inserting `<TYPE>' after the `$' at the beginning of the
3865 reference. For example, if you have defined types as shown here:
3866
3867 %union {
3868 int itype;
3869 double dtype;
3870 }
3871
3872 then you can write `$<itype>1' to refer to the first subunit of the
3873 rule as an integer, or `$<dtype>1' to refer to it as a double.
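
In C terms, the `<TYPE>' tag simply chooses which union member is read.
The following sketch assumes the `%union' above; the name `YYSTYPE'
follows Bison's convention, while the two accessor functions are
hypothetical and exist only to show the two readings of one stack slot.

```c
/* The C union that corresponds to the %union shown above. */
typedef union YYSTYPE {
    int itype;
    double dtype;
} YYSTYPE;

/* $<itype>1 reads the slot through the int member ... */
int as_itype(const YYSTYPE *slot) { return slot->itype; }

/* ... while $<dtype>1 reads the same slot through the double member. */
double as_dtype(const YYSTYPE *slot) { return slot->dtype; }
```

As with any C union, only the member that was stored most recently
holds a meaningful value, so the tag must match what was stored.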
3874
3875 
3876 File: bison.info, Node: Mid-Rule Actions, Prev: Action Types, Up: Semantics
3877
3878 3.5.5 Actions in Mid-Rule
3879 -------------------------
3880
3881 Occasionally it is useful to put an action in the middle of a rule.
3882 These actions are written just like usual end-of-rule actions, but they
3883 are executed before the parser even recognizes the following components.
3884
3885 A mid-rule action may refer to the components preceding it using
3886 `$N', but it may not refer to subsequent components because it is run
3887 before they are parsed.
3888
3889 The mid-rule action itself counts as one of the components of the
3890 rule. This makes a difference when there is another action later in
3891 the same rule (and usually there is another at the end): you have to
3892 count the actions along with the symbols when working out which number
3893 N to use in `$N'.
3894
3895 The mid-rule action can also have a semantic value. The action can
3896 set its value with an assignment to `$$', and actions later in the rule
3897 can refer to the value using `$N'. Since there is no symbol to name
3898 the action, there is no way to declare a data type for the value in
3899 advance, so you must use the `$<...>N' construct to specify a data type
3900 each time you refer to this value.
3901
3902 There is no way to set the value of the entire rule with a mid-rule
3903 action, because assignments to `$$' do not have that effect. The only
3904 way to set the value for the entire rule is with an ordinary action at
3905 the end of the rule.
3906
3907 Here is an example from a hypothetical compiler, handling a `let'
3908 statement that looks like `let (VARIABLE) STATEMENT' and serves to
3909 create a variable named VARIABLE temporarily for the duration of
3910 STATEMENT. To parse this construct, we must put VARIABLE into the
3911 symbol table while STATEMENT is parsed, then remove it afterward. Here
3912 is how it is done:
3913
3914 stmt: LET '(' var ')'
3915 { $<context>$ = push_context ();
3916 declare_variable ($3); }
3917 stmt { $$ = $6;
3918 pop_context ($<context>5); }
3919
3920 As soon as `let (VARIABLE)' has been recognized, the first action is
3921 run. It saves a copy of the current semantic context (the list of
3922 accessible variables) as its semantic value, using alternative
3923 `context' in the data-type union. Then it calls `declare_variable' to
3924 add the new variable to that list. Once the first action is finished,
3925 the embedded statement `stmt' can be parsed. Note that the mid-rule
3926 action is component number 5, so the `stmt' is component number 6.
3927
3928 After the embedded statement is parsed, its semantic value becomes
3929 the value of the entire `let'-statement. Then the semantic value from
3930 the earlier action is used to restore the prior list of variables. This
3931 removes the temporary `let'-variable from the list so that it won't
3932 appear to exist while the rest of the program is parsed.
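
The manual leaves `push_context', `declare_variable' and `pop_context'
undefined. One way they could be implemented is as a linked list of
visible variables, where the saved context is the list head at the time
of the push. The following is only a sketch under that assumption; the
`struct var' layout, the `char' string argument and the `is_declared'
helper are invented here.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical symbol table: a singly linked list, newest first. */
struct var {
    const char *name;
    struct var *next;
};

typedef struct var *context;        /* a context is a list head */
static struct var *visible = NULL;  /* currently accessible variables */

/* Save the current list head so it can be restored later. */
context push_context(void) { return visible; }

/* Add a variable in front of the current list. */
void declare_variable(const char *name)
{
    struct var *v = malloc(sizeof *v);
    if (v == NULL)
        abort();
    v->name = name;
    v->next = visible;
    visible = v;
}

/* Drop every variable declared since the matching push_context. */
void pop_context(context saved)
{
    while (visible != saved) {
        struct var *dead = visible;
        visible = visible->next;
        free(dead);
    }
}

/* Return 1 if name is currently visible, 0 otherwise. */
int is_declared(const char *name)
{
    for (struct var *v = visible; v != NULL; v = v->next)
        if (strcmp(v->name, name) == 0)
            return 1;
    return 0;
}
```

With these definitions, a variable declared after `push_context' stops
being visible once the saved context is passed to `pop_context'.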
3933
3934 In the above example, if the parser initiates error recovery (*note
3935 Error Recovery::) while parsing the tokens in the embedded statement
3936 `stmt', it might discard the previous semantic context `$<context>5'
3937 without restoring it. Thus, `$<context>5' needs a destructor (*note
3938 Freeing Discarded Symbols: Destructor Decl.). However, Bison currently
3939 provides no means to declare a destructor specific to a particular
3940 mid-rule action's semantic value.
3941
3942 One solution is to bury the mid-rule action inside a nonterminal
3943 symbol and to declare a destructor for that symbol:
3944
3945 %type <context> let
3946 %destructor { pop_context ($$); } let
3947
3948 %%
3949
3950 stmt: let stmt
3951 { $$ = $2;
3952 pop_context ($1); }
3953 ;
3954
3955 let: LET '(' var ')'
3956 { $$ = push_context ();
3957 declare_variable ($3); }
3958 ;
3959
3960 Note that the action is now at the end of its rule. Any mid-rule
3961 action can be converted to an end-of-rule action in this way, and this
3962 is what Bison actually does to implement mid-rule actions.
3963
3964 Taking action before a rule is completely recognized often leads to
3965 conflicts since the parser must commit to a parse in order to execute
3966 the action. For example, the following two rules, without mid-rule
3967 actions, can coexist in a working parser because the parser can shift
3968 the open-brace token and look at what follows before deciding whether
3969 there is a declaration or not:
3970
3971 compound: '{' declarations statements '}'
3972 | '{' statements '}'
3973 ;
3974
3975 But when we add a mid-rule action as follows, the rules become
3976 nonfunctional:
3977
3978 compound: { prepare_for_local_variables (); }
3979 '{' declarations statements '}'
3980 | '{' statements '}'
3981 ;
3982
3983 Now the parser is forced to decide whether to run the mid-rule action
3984 when it has read no farther than the open-brace. In other words, it
3985 must commit to using one rule or the other, without sufficient
3986 information to do it correctly. (The open-brace token is what is called
3987 the "lookahead" token at this time, since the parser is still deciding
3988 what to do about it. *Note Lookahead Tokens: Lookahead.)
3989
3990 You might think that you could correct the problem by putting
3991 identical actions into the two rules, like this:
3992
3993 compound: { prepare_for_local_variables (); }
3994 '{' declarations statements '}'
3995 | { prepare_for_local_variables (); }
3996 '{' statements '}'
3997 ;
3998
3999 But this does not help, because Bison does not realize that the two
4000 actions are identical. (Bison never tries to understand the C code in
4001 an action.)
4002
4003 If the grammar is such that a declaration can be distinguished from a
4004 statement by the first token (which is true in C), then one solution
4005 which does work is to put the action after the open-brace, like this:
4006
4007 compound: '{' { prepare_for_local_variables (); }
4008 declarations statements '}'
4009 | '{' statements '}'
4010 ;
4011
4012 Now the first token of the following declaration or statement, which
4013 would in any case tell Bison which rule to use, can still do so.
4014
4015 Another solution is to bury the action inside a nonterminal symbol
4016 which serves as a subroutine:
4017
4018 subroutine: /* empty */
4019 { prepare_for_local_variables (); }
4020 ;
4021
4022 compound: subroutine
4023 '{' declarations statements '}'
4024 | subroutine
4025 '{' statements '}'
4026 ;
4027
4028 Now Bison can execute the action in the rule for `subroutine' without
4029 deciding which rule for `compound' it will eventually use.
4030
4031 
4032 File: bison.info, Node: Locations, Next: Declarations, Prev: Semantics, Up: Grammar File
4033
4034 3.6 Tracking Locations
4035 ======================
4036
4037 Though grammar rules and semantic actions are enough to write a fully
4038 functional parser, it can be useful to process some additional
4039 information, especially symbol locations.
4040
4041 The way locations are handled is defined by providing a data type,
4042 and actions to take when rules are matched.
4043
4044 * Menu:
4045
4046 * Location Type:: Specifying a data type for locations.
4047 * Actions and Locations:: Using locations in actions.
4048 * Location Default Action:: Defining a general way to compute locations.
4049
4050 
4051 File: bison.info, Node: Location Type, Next: Actions and Locations, Up: Locations
4052
4053 3.6.1 Data Type of Locations
4054 ----------------------------
4055
4056 Defining a data type for locations is much simpler than for semantic
4057 values, since all tokens and groupings always use the same type.
4058
4059 You can specify the type of locations by defining a macro called
4060 `YYLTYPE', just as you can specify the semantic value type by defining
4061 a `YYSTYPE' macro (*note Value Type::). When `YYLTYPE' is not defined,
4062 Bison uses a default structure type with four members:
4063
4064 typedef struct YYLTYPE
4065 {
4066 int first_line;
4067 int first_column;
4068 int last_line;
4069 int last_column;
4070 } YYLTYPE;
4071
4072 At the beginning of parsing, Bison initializes all these fields
4073 to 1 for `yylloc'.
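
Bison only initializes `yylloc'; keeping it up to date is the scanner's
job. A common approach, sketched here with a hypothetical helper named
`advance_location', is to move the start of the location to the
previous end and then step the end over the text of the token just
scanned. The struct is repeated so that the fragment stands alone.

```c
typedef struct YYLTYPE {
    int first_line;
    int first_column;
    int last_line;
    int last_column;
} YYLTYPE;

/* Hypothetical scanner helper: make *loc cover `text`, which starts
   where the previous token ended. */
void advance_location(YYLTYPE *loc, const char *text)
{
    loc->first_line = loc->last_line;
    loc->first_column = loc->last_column;
    for (const char *p = text; *p != '\0'; p++) {
        if (*p == '\n') {
            loc->last_line++;
            loc->last_column = 1;   /* columns restart on a new line */
        } else {
            loc->last_column++;
        }
    }
}
```

In this sketch the end position is exclusive: `last_column' names the
column just past the token. Whether to use inclusive or exclusive ends
is a design choice left to the scanner.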
4074
4075 
4076 File: bison.info, Node: Actions and Locations, Next: Location Default Action, Prev: Location Type, Up: Locations
4077
4078 3.6.2 Actions and Locations
4079 ---------------------------
4080
4081 Actions are not only useful for defining language semantics, but also
4082 for describing the behavior of the output parser with locations.
4083
4084 The most obvious way to build locations of syntactic groupings
4085 is very similar to the way semantic values are computed. In a given
4086 rule, several constructs can be used to access the locations of the
4087 elements being matched. The location of the Nth component of the right
4088 hand side is `@N', while the location of the left hand side grouping is
4089 `@$'.
4090
4091 Here is a basic example using the default data type for locations:
4092
4093      exp:    ...
4094              | exp '/' exp
4095                  {
4096                    @$.first_column = @1.first_column;
4097                    @$.first_line = @1.first_line;
4098                    @$.last_column = @3.last_column;
4099                    @$.last_line = @3.last_line;
4100                    if ($3)
4101                      $$ = $1 / $3;
4102                    else
4103                      {
4104                        $$ = 1;
4105                        fprintf (stderr,
4106                                 "Division by zero, l%d,c%d-l%d,c%d",
4107                                 @3.first_line, @3.first_column,
4108                                 @3.last_line, @3.last_column);
4109                      }
4110                  }
4111
4112 As for semantic values, there is a default action for locations that
4113 is run each time a rule is matched. It sets the beginning of `@$' to
4114 the beginning of the first symbol, and the end of `@$' to the end of the
4115 last symbol.
4116
4117 With this default action, location tracking can be fully
4118 automatic. The example above can simply be rewritten this way:
4119
4120      exp:    ...
4121              | exp '/' exp
4122                  {
4123                    if ($3)
4124                      $$ = $1 / $3;
4125                    else
4126                      {
4127                        $$ = 1;
4128                        fprintf (stderr,
4129                                 "Division by zero, l%d,c%d-l%d,c%d",
4130                                 @3.first_line, @3.first_column,
4131                                 @3.last_line, @3.last_column);
4132                      }
4133                  }
4134
4135 It is also possible to access the location of the lookahead token,
4136 if any, from a semantic action. This location is stored in `yylloc'.
4137 *Note Special Features for Use in Actions: Action Features.
4138
4139 
4140 File: bison.info, Node: Location Default Action, Prev: Actions and Locations, Up: Locations
4141
4142 3.6.3 Default Action for Locations
4143 ----------------------------------
4144
4145 Actually, actions are not the best place to compute locations. Since
4146 locations are much more general than semantic values, there is room in
4147 the output parser to redefine the default action to take for each rule.
4148 The `YYLLOC_DEFAULT' macro is invoked each time a rule is matched,
4149 before the associated action is run. It is also invoked while
4150 processing a syntax error, to compute the error's location. Before
4151 reporting an unresolvable syntactic ambiguity, a GLR parser invokes
4152 `YYLLOC_DEFAULT' recursively to compute the location of that ambiguity.
4153
4154 Most of the time, this macro is general enough to eliminate
4155 location-specific code from semantic actions.
4156
4157 The `YYLLOC_DEFAULT' macro takes three parameters. The first one is
4158 the location of the grouping (the result of the computation). When a
4159 rule is matched, the second parameter identifies locations of all right
4160 hand side elements of the rule being matched, and the third parameter
4161 is the size of the rule's right hand side. When a GLR parser reports
4162 an ambiguity, which of multiple candidate right hand sides it passes to
4163 `YYLLOC_DEFAULT' is undefined. When processing a syntax error, the
4164 second parameter identifies locations of the symbols that were
4165 discarded during error processing, and the third parameter is the
4166 number of discarded symbols.
4167
4168 By default, `YYLLOC_DEFAULT' is defined this way:
4169
4170      # define YYLLOC_DEFAULT(Current, Rhs, N)                                \
4171          do                                                                  \
4172            if (N)                                                            \
4173              {                                                               \
4174                (Current).first_line   = YYRHSLOC(Rhs, 1).first_line;         \
4175                (Current).first_column = YYRHSLOC(Rhs, 1).first_column;       \
4176                (Current).last_line    = YYRHSLOC(Rhs, N).last_line;          \
4177                (Current).last_column  = YYRHSLOC(Rhs, N).last_column;        \
4178              }                                                               \
4179            else                                                              \
4180              {                                                               \
4181                (Current).first_line   = (Current).last_line   =              \
4182                  YYRHSLOC(Rhs, 0).last_line;                                 \
4183                (Current).first_column = (Current).last_column =              \
4184                  YYRHSLOC(Rhs, 0).last_column;                               \
4185              }                                                               \
4186          while (0)
4187
4188 where `YYRHSLOC (rhs, k)' is the location of the Kth symbol in RHS
4189 when K is positive, and the location of the symbol just before the
4190 reduction when K and N are both zero.
4191
4192 When defining `YYLLOC_DEFAULT', you should consider that:
4193
4194 * All arguments are free of side-effects. However, only the first
4195 one (the result) should be modified by `YYLLOC_DEFAULT'.
4196
4197 * For consistency with semantic actions, valid indexes within the
4198 right hand side range from 1 to N. When N is zero, only 0 is a
4199 valid index, and it refers to the symbol just before the reduction.
4200 During error processing N is always positive.
4201
4202 * Your macro should parenthesize its arguments, if need be, since the
4203 actual arguments may not be surrounded by parentheses. Also, your
4204 macro should expand to something that can be used as a single
4205 statement when it is followed by a semicolon.
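
The points above can be exercised outside a parser. The sketch below
copies the default macro from this section, supplies a plain-array
`YYRHSLOC' (an assumption made for this example: index 0 holds the
symbol just before the reduction), and wraps the macro in a
hypothetical function `span' so that both the nonzero-N and the zero-N
branches can be checked.

```c
typedef struct YYLTYPE {
    int first_line, first_column, last_line, last_column;
} YYLTYPE;

/* Assumed for this sketch: Rhs is an array whose element 0 is the
   symbol before the reduction and elements 1..N are the rule's RHS. */
#define YYRHSLOC(Rhs, K) ((Rhs)[K])

#define YYLLOC_DEFAULT(Current, Rhs, N)                               \
    do                                                                \
      if (N)                                                          \
        {                                                             \
          (Current).first_line   = YYRHSLOC(Rhs, 1).first_line;       \
          (Current).first_column = YYRHSLOC(Rhs, 1).first_column;     \
          (Current).last_line    = YYRHSLOC(Rhs, N).last_line;        \
          (Current).last_column  = YYRHSLOC(Rhs, N).last_column;      \
        }                                                             \
      else                                                            \
        {                                                             \
          (Current).first_line   = (Current).last_line   =            \
            YYRHSLOC(Rhs, 0).last_line;                               \
          (Current).first_column = (Current).last_column =            \
            YYRHSLOC(Rhs, 0).last_column;                             \
        }                                                             \
    while (0)

/* Hypothetical wrapper: location of a grouping built from rhs[1..n],
   or an empty location after rhs[0] when n is zero. */
YYLTYPE span(const YYLTYPE *rhs, int n)
{
    YYLTYPE result;
    YYLLOC_DEFAULT(result, rhs, n);
    return result;
}
```

Because the macro body is a single `do ... while (0)' statement, the
call inside `span' can be followed by a semicolon like any other
statement.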
4206
4207 
4208 File: bison.info, Node: Declarations, Next: Multiple Parsers, Prev: Locations, Up: Grammar File
4209
4210 3.7 Bison Declarations
4211 ======================
4212
4213 The "Bison declarations" section of a Bison grammar defines the symbols
4214 used in formulating the grammar and the data types of semantic values.
4215 *Note Symbols::.
4216
4217 All token type names (but not single-character literal tokens such as
4218 `'+'' and `'*'') must be declared. Nonterminal symbols must be
4219 declared if you need to specify which data type to use for the semantic
4220 value (*note More Than One Value Type: Multiple Types.).
4221
4222 The first rule in the file also specifies the start symbol, by
4223 default. If you want some other symbol to be the start symbol, you
4224 must declare it explicitly (*note Languages and Context-Free Grammars:
4225 Language and Grammar.).
4226
4227 * Menu:
4228
4229 * Require Decl:: Requiring a Bison version.
4230 * Token Decl:: Declaring terminal symbols.
4231 * Precedence Decl:: Declaring terminals with precedence and associativity.
4232 * Union Decl:: Declaring the set of all semantic value types.
4233 * Type Decl:: Declaring the choice of type for a nonterminal symbol.
4234 * Initial Action Decl:: Code run before parsing starts.
4235 * Destructor Decl:: Declaring how symbols are freed.
4236 * Expect Decl:: Suppressing warnings about parsing conflicts.
4237 * Start Decl:: Specifying the start symbol.
4238 * Pure Decl:: Requesting a reentrant parser.
4239 * Push Decl:: Requesting a push parser.
4240 * Decl Summary:: Table of all Bison declarations.
4241
4242 
4243 File: bison.info, Node: Require Decl, Next: Token Decl, Up: Declarations
4244
4245 3.7.1 Require a Version of Bison
4246 --------------------------------
4247
4248 You may require a minimum version of Bison to process the grammar. If
4249 the requirement is not met, `bison' exits with an error (exit status
4250 63).
4251
4252 %require "VERSION"
4253
4254 
4255 File: bison.info, Node: Token Decl, Next: Precedence Decl, Prev: Require Decl, Up: Declarations
4256
4257 3.7.2 Token Type Names
4258 ----------------------
4259
4260 The basic way to declare a token type name (terminal symbol) is as
4261 follows:
4262
4263 %token NAME
4264
4265 Bison will convert this into a `#define' directive in the parser, so
4266 that the function `yylex' (if it is in this file) can use the name NAME
4267 to stand for this token type's code.
4268
4269 Alternatively, you can use `%left', `%right', or `%nonassoc' instead
4270 of `%token', if you wish to specify associativity and precedence.
4271 *Note Operator Precedence: Precedence Decl.
4272
4273 You can explicitly specify the numeric code for a token type by
4274 appending a nonnegative decimal or hexadecimal integer value in the
4275 field immediately following the token name:
4276
4277 %token NUM 300
4278 %token XNUM 0x12d // a GNU extension
4279
4280 It is generally best, however, to let Bison choose the numeric codes for
4281 all token types. Bison will automatically select codes that don't
4282 conflict with each other or with normal characters.
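
To show how a token name is used from C, here is a minimal hand-written
scanner. It is a sketch, not part of Bison: the `#define' stands in for
the one Bison would generate from `%token NUM 300', the input comes
from a hypothetical global string `input', and `yylval' is declared as
a plain `int' for simplicity.

```c
#include <ctype.h>

#define NUM 300                 /* stand-in for Bison's generated code */

int yylval;                     /* semantic value of the last token */
const char *input = "";         /* hypothetical input buffer */

/* Return the next token code: NUM for a run of digits, the character
   itself for anything else, and 0 at end of input. */
int yylex(void)
{
    while (*input == ' ' || *input == '\t')
        input++;
    if (*input == '\0')
        return 0;
    if (isdigit((unsigned char) *input)) {
        yylval = 0;
        while (isdigit((unsigned char) *input))
            yylval = yylval * 10 + (*input++ - '0');
        return NUM;
    }
    return (unsigned char) *input++;
}
```

Single-character tokens are returned as their own character codes,
which is why a named token such as `NUM' needs a code that cannot
collide with any character.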
4283
4284 In the event that the stack type is a union, you must augment the
4285 `%token' or other token declaration to include the data type
4286 alternative delimited by angle-brackets (*note More Than One Value
4287 Type: Multiple Types.).
4288
4289 For example:
4290
4291 %union { /* define stack type */
4292 double val;
4293 symrec *tptr;
4294 }
4295 %token <val> NUM /* define token NUM and its type */
4296
4297 You can associate a literal string token with a token type name by
4298 writing the literal string at the end of a `%token' declaration which
4299 declares the name. For example:
4300
4301 %token arrow "=>"
4302
4303 For example, a grammar for the C language might specify these names with
4304 equivalent literal string tokens:
4305
4306 %token <operator> OR "||"
4307 %token <operator> LE 134 "<="
4308 %left OR "<="
4309
4310 Once you equate the literal string and the token name, you can use them
4311 interchangeably in further declarations or the grammar rules. The
4312 `yylex' function can use the token name or the literal string to obtain
4313 the token type code number (*note Calling Convention::). Syntax error
4314 messages passed to `yyerror' from the parser will reference the literal
4315 string instead of the token name.
4316
4317 The token numbered as 0 corresponds to end of file; the following
4318 line allows for nicer error messages referring to "end of file" instead
4319 of "$end":
4320
4321 %token END 0 "end of file"
4322
4323 
4324 File: bison.info, Node: Precedence Decl, Next: Union Decl, Prev: Token Decl, Up: Declarations
4325
4326 3.7.3 Operator Precedence
4327 -------------------------
4328
4329 Use the `%left', `%right' or `%nonassoc' declaration to declare a token
4330 and specify its precedence and associativity, all at once. These are
4331 called "precedence declarations". *Note Operator Precedence:
4332 Precedence, for general information on operator precedence.
4333
4334 The syntax of a precedence declaration is nearly the same as that of
4335 `%token': either
4336
4337 %left SYMBOLS...
4338
4339 or
4340
4341 %left <TYPE> SYMBOLS...
4342
4343 And indeed any of these declarations serves the purposes of `%token'.
4344 But in addition, they specify the associativity and relative precedence
4345 for all the SYMBOLS:
4346
4347 * The associativity of an operator OP determines how repeated uses
4348 of the operator nest: whether `X OP Y OP Z' is parsed by grouping
4349 X with Y first or by grouping Y with Z first. `%left' specifies
4350 left-associativity (grouping X with Y first) and `%right'
4351 specifies right-associativity (grouping Y with Z first).
4352 `%nonassoc' specifies no associativity, which means that `X OP Y
4353 OP Z' is considered a syntax error.
4354
4355 * The precedence of an operator determines how it nests with other
4356 operators. All the tokens declared in a single precedence
4357 declaration have equal precedence and nest together according to
4358 their associativity. When two tokens declared in different
4359 precedence declarations associate, the one declared later has the
4360 higher precedence and is grouped first.
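
The effect of associativity is easy to see with subtraction. The two
hypothetical helper functions below evaluate the same operand list,
once grouping from the left as `%left' would and once grouping from the
right as `%right' would. They illustrate the grouping only and are not
how Bison resolves it internally.

```c
#include <stddef.h>

/* %left grouping: ((x0 - x1) - x2) - ...   Requires n >= 1. */
int eval_left(const int *x, size_t n)
{
    int acc = x[0];
    for (size_t i = 1; i < n; i++)
        acc = acc - x[i];
    return acc;
}

/* %right grouping: x0 - (x1 - (x2 - ...))   Requires n >= 1. */
int eval_right(const int *x, size_t n)
{
    int acc = x[n - 1];
    for (size_t i = n - 1; i-- > 0; )
        acc = x[i] - acc;
    return acc;
}
```

For the operands 8, 3 and 2, the left grouping computes (8 - 3) - 2 and
the right grouping computes 8 - (3 - 2), so the choice of declaration
changes the result.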
4361
4362 For backward compatibility, there is a confusing difference between
4363 the argument lists of `%token' and precedence declarations. Only a
4364 `%token' can associate a literal string with a token type name. A
4365 precedence declaration always interprets a literal string as a
4366 reference to a separate token. For example:
4367
4368 %left OR "<=" // Does not declare an alias.
4369 %left OR 134 "<=" 135 // Declares 134 for OR and 135 for "<=".
4370
4371 
4372 File: bison.info, Node: Union Decl, Next: Type Decl, Prev: Precedence Decl, Up: Declarations
4373
4374 3.7.4 The Collection of Value Types
4375 -----------------------------------
4376
4377 The `%union' declaration specifies the entire collection of possible
4378 data types for semantic values. The keyword `%union' is followed by
4379 braced code containing the same thing that goes inside a `union' in C.
4380
4381 For example:
4382
4383 %union {
4384 double val;
4385 symrec *tptr;
4386 }
4387
4388 This says that the two alternative types are `double' and `symrec *'.
4389 They are given names `val' and `tptr'; these names are used in the
4390 `%token' and `%type' declarations to pick one of the types for a
4391 terminal or nonterminal symbol (*note Nonterminal Symbols: Type Decl.).
4392
4393 As an extension to POSIX, a tag is allowed after the `union'. For
4394 example:
4395
4396 %union value {
4397 double val;
4398 symrec *tptr;
4399 }
4400
4401 specifies the union tag `value', so the corresponding C type is `union
4402 value'. If you do not specify a tag, it defaults to `YYSTYPE'.
4403
4404 As another extension to POSIX, you may specify multiple `%union'
4405 declarations; their contents are concatenated. However, only the first
4406 `%union' declaration can specify a tag.
4407
4408 Note that, unlike making a `union' declaration in C, you need not
4409 write a semicolon after the closing brace.
4410
4411 Instead of `%union', you can define and use your own union type
4412 `YYSTYPE' if your grammar contains at least one `<TYPE>' tag. For
4413 example, you can put the following into a header file `parser.h':
4414
4415 union YYSTYPE {
4416 double val;
4417 symrec *tptr;
4418 };
4419 typedef union YYSTYPE YYSTYPE;
4420
4421 and then your grammar can use the following instead of `%union':
4422
4423 %{
4424 #include "parser.h"
4425 %}
4426 %type <val> expr
4427 %token <tptr> ID
4428
4429 
4430 File: bison.info, Node: Type Decl, Next: Initial Action Decl, Prev: Union Decl, Up: Declarations
4431
4432 3.7.5 Nonterminal Symbols
4433 -------------------------
4434
4435 When you use `%union' to specify multiple value types, you must declare
4436 the value type of each nonterminal symbol for which values are used.
4437 This is done with a `%type' declaration, like this:
4438
4439 %type <TYPE> NONTERMINAL...
4440
4441 Here NONTERMINAL is the name of a nonterminal symbol, and TYPE is the
4442 name given in the `%union' to the alternative that you want (*note The
4443 Collection of Value Types: Union Decl.). You can give any number of
4444 nonterminal symbols in the same `%type' declaration, if they have the
4445 same value type. Use spaces to separate the symbol names.
4446
4447 You can also declare the value type of a terminal symbol. To do
4448 this, use the same `<TYPE>' construction in a declaration for the
4449 terminal symbol. All kinds of token declarations allow `<TYPE>'.
4450
4451 
4452 File: bison.info, Node: Initial Action Decl, Next: Destructor Decl, Prev: Type Decl, Up: Declarations
4453
4454 3.7.6 Performing Actions before Parsing
4455 ---------------------------------------
4456
4457 Sometimes your parser needs to perform some initializations before
4458 parsing. The `%initial-action' directive allows for such arbitrary
4459 code.
4460
4461 -- Directive: %initial-action { CODE }
4462 Declare that the braced CODE must be invoked before parsing each
4463 time `yyparse' is called. The CODE may use `$$' and `@$' --
4464 initial value and location of the lookahead -- and the
4465 `%parse-param'.
4466
4467 For instance, if your locations use a file name, you may use
4468
4469 %parse-param { char const *file_name };
4470 %initial-action
4471 {
4472 @$.initialize (file_name);
4473 };
4474
4475 
4476 File: bison.info, Node: Destructor Decl, Next: Expect Decl, Prev: Initial Action Decl, Up: Declarations
4477
4478 3.7.7 Freeing Discarded Symbols
4479 -------------------------------
4480
4481 During error recovery (*note Error Recovery::), symbols already pushed
4482 on the stack and tokens coming from the rest of the file are discarded
4483 until the parser falls on its feet. If the parser runs out of memory,
4484 or if it returns via `YYABORT' or `YYACCEPT', all the symbols on the
4485 stack must be discarded. Even if the parser succeeds, it must discard
4486 the start symbol.
4487
4488 When discarded symbols convey heap-based information, this memory is
4489 lost. While this behavior can be tolerable for batch parsers, such as
4490 in traditional compilers, it is unacceptable for programs like shells or
4491 protocol implementations that may parse and execute indefinitely.
4492
4493 The `%destructor' directive defines code that is called when a
4494 symbol is automatically discarded.
4495
4496 -- Directive: %destructor { CODE } SYMBOLS
4497 Invoke the braced CODE whenever the parser discards one of the
4498 SYMBOLS. Within CODE, `$$' designates the semantic value
4499 associated with the discarded symbol, and `@$' designates its
4500 location. The additional parser parameters are also available
4501 (*note The Parser Function `yyparse': Parser Function.).
4502
4503 When a symbol is listed among SYMBOLS, its `%destructor' is called
4504 a per-symbol `%destructor'. You may also define a per-type
4505 `%destructor' by listing a semantic type tag among SYMBOLS. In
4506 that case, the parser will invoke this CODE whenever it discards
4507 any grammar symbol that has that semantic type tag unless that
4508 symbol has its own per-symbol `%destructor'.
4509
4510 Finally, you can define two different kinds of default
4511 `%destructor's. (These default forms are experimental. More user
4512 feedback will help to determine whether they should become
4513 permanent features.) You can place each of `<*>' and `<>' in the
4514 SYMBOLS list of exactly one `%destructor' declaration in your
4515 grammar file. The parser will invoke the CODE associated with one
4516 of these whenever it discards any user-defined grammar symbol that
4517 has no per-symbol and no per-type `%destructor'. The parser uses
4518 the CODE for `<*>' in the case of such a grammar symbol for which
4519 you have formally declared a semantic type tag (`%type' counts as
4520 such a declaration, but `$<tag>$' does not). The parser uses the
4521 CODE for `<>' in the case of such a grammar symbol that has no
4522 declared semantic type tag.
4523
4524 For example:
4525
4526 %union { char *string; }
4527 %token <string> STRING1
4528 %token <string> STRING2
4529 %type <string> string1
4530 %type <string> string2
4531 %union { char character; }
4532 %token <character> CHR
4533 %type <character> chr
4534 %token TAGLESS
4535
4536 %destructor { } <character>
4537 %destructor { free ($$); } <*>
4538 %destructor { free ($$); printf ("%d", @$.first_line); } STRING1 string1
4539 %destructor { printf ("Discarding tagless symbol.\n"); } <>
4540
4541 guarantees that, when the parser discards any user-defined symbol that
4542 has a semantic type tag other than `<character>', it passes its
4543 semantic value to `free' by default. However, when the parser discards
4544 a `STRING1' or a `string1', it also prints its line number to `stdout'.
4545 It performs only the second `%destructor' in this case, so it invokes
4546 `free' only once. Finally, the parser merely prints a message whenever
4547 it discards any symbol, such as `TAGLESS', that has no semantic type
4548 tag.
4549
4550 A Bison-generated parser invokes the default `%destructor's only for
4551 user-defined as opposed to Bison-defined symbols. For example, the
4552 parser will not invoke either kind of default `%destructor' for the
4553 special Bison-defined symbols `$accept', `$undefined', or `$end' (*note
4554 Bison Symbols: Table of Symbols.), none of which you can reference in
4555 your grammar. It also will not invoke either for the `error' token
4556 (*note error: Table of Symbols.), which is always defined by Bison
4557 regardless of whether you reference it in your grammar. However, it
4558 may invoke one of them for the end token (token 0) if you redefine it
4559 from `$end' to, for example, `END':
4560
4561 %token END 0
4562
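The selection rules described so far can be modelled by the following sketch. It is only a model of the documented behaviour, not code taken from Bison; the type `struct symbol_info', the enumeration, and the function `select_destructor' are hypothetical names introduced for this example, and mid-rule values are not covered.

```c
#include <stddef.h>

enum destructor_kind
{
  DTOR_NONE,            /* nothing is invoked */
  DTOR_PER_SYMBOL,      /* %destructor { ... } SYMBOL */
  DTOR_PER_TYPE,        /* %destructor { ... } <tag> */
  DTOR_DEFAULT_TAGGED,  /* %destructor { ... } <*> */
  DTOR_DEFAULT_TAGLESS  /* %destructor { ... } <> */
};

/* Hypothetical description of one grammar symbol.  */
struct symbol_info
{
  int user_defined;       /* 0 for $accept, $undefined, $end, error */
  int has_per_symbol;     /* listed by name in some %destructor */
  const char *type_tag;   /* declared tag, or NULL if tagless */
  int tag_has_destructor; /* the tag is listed in some %destructor */
};

/* Return the kind of %destructor that would apply to SYM, given
   whether the grammar declares the <*> and <> default forms.  */
enum destructor_kind
select_destructor (const struct symbol_info *sym,
                   int have_star_default, int have_empty_default)
{
  if (sym->has_per_symbol)
    return DTOR_PER_SYMBOL;
  if (sym->type_tag != NULL && sym->tag_has_destructor)
    return DTOR_PER_TYPE;
  /* The default forms apply only to user-defined symbols.  */
  if (!sym->user_defined)
    return DTOR_NONE;
  if (sym->type_tag != NULL)
    return have_star_default ? DTOR_DEFAULT_TAGGED : DTOR_NONE;
  return have_empty_default ? DTOR_DEFAULT_TAGLESS : DTOR_NONE;
}
```

Applied to the example grammar above, this model picks the per-symbol form for `STRING1', the `<*>' form for `STRING2', the per-type form for `CHR', and the `<>' form for `TAGLESS'.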
4563 Finally, Bison will never invoke a `%destructor' for an unreferenced
4564 mid-rule semantic value (*note Actions in Mid-Rule: Mid-Rule Actions.).
4565 That is, Bison does not consider a mid-rule to have a semantic value if
4566 you do not reference `$$' in the mid-rule's action or `$N' (where N is
4567 the RHS symbol position of the mid-rule) in any later action in that
4568 rule. However, if you do reference either, the Bison-generated parser
4569 will invoke the `<>' `%destructor' whenever it discards the mid-rule
4570 symbol.
4571
4572
4573 "Discarded symbols" are the following:
4574
4575 * stacked symbols popped during the first phase of error recovery,
4576
4577 * incoming terminals during the second phase of error recovery,
4578
4579 * the current lookahead and the entire stack (except the current
4580 right-hand side symbols) when the parser returns immediately, and
4581
4582 * the start symbol, when the parser succeeds.
4583
4584 The parser can "return immediately" because of an explicit call to
4585 `YYABORT' or `YYACCEPT', or failed error recovery, or memory exhaustion.
4586
4587 Right-hand side symbols of a rule that explicitly triggers a syntax
4588 error via `YYERROR' are not discarded automatically. As a rule of
4589 thumb, destructors are invoked only when user actions cannot manage the
4590 memory.
4591
4592 
4593 File: bison.info, Node: Expect Decl, Next: Start Decl, Prev: Destructor Decl, Up: Declarations
4594
4595 3.7.8 Suppressing Conflict Warnings
4596 -----------------------------------
4597
4598 Bison normally warns if there are any conflicts in the grammar (*note
4599 Shift/Reduce Conflicts: Shift/Reduce.), but most real grammars have
4600 harmless shift/reduce conflicts which are resolved in a predictable way
4601 and would be difficult to eliminate. It is desirable to suppress the
4602 warning about these conflicts unless the number of conflicts changes.
4603 You can do this with the `%expect' declaration.
4604
4605 The declaration looks like this:
4606
4607 %expect N
4608
4609 Here N is a decimal integer. The declaration says there should be N
4610 shift/reduce conflicts and no reduce/reduce conflicts. Bison reports
4611 an error if the number of shift/reduce conflicts differs from N, or if
4612 there are any reduce/reduce conflicts.
4613
4614 For normal LALR(1) parsers, reduce/reduce conflicts are more
4615 serious, and should be eliminated entirely. Bison will always report
4616 reduce/reduce conflicts for these parsers. With GLR parsers, however,
4617 both kinds of conflicts are routine; otherwise, there would be no need
4618 to use GLR parsing. Therefore, it is also possible to specify an
4619 expected number of reduce/reduce conflicts in GLR parsers, using the
4620 declaration:
4621
4622 %expect-rr N
4623
4624 In general, using `%expect' involves these steps:
4625
4626 * Compile your grammar without `%expect'. Use the `-v' option to
4627 get a verbose list of where the conflicts occur. Bison will also
4628 print the number of conflicts.
4629
4630 * Check each of the conflicts to make sure that Bison's default
4631 resolution is what you really want. If not, rewrite the grammar
4632 and go back to the beginning.
4633
4634 * Add an `%expect' declaration, copying the number N from the number
4635 which Bison printed. With GLR parsers, add an `%expect-rr'
4636 declaration as well.
4637
4638 Now Bison will warn you if you introduce an unexpected conflict, but
4639 will keep silent otherwise.
4640
4641 
4642 File: bison.info, Node: Start Decl, Next: Pure Decl, Prev: Expect Decl, Up: Declarations
4643
4644 3.7.9 The Start-Symbol
4645 ----------------------
4646
4647 Bison assumes by default that the start symbol for the grammar is the
4648 first nonterminal specified in the grammar specification section. The
4649 programmer may override this default with the `%start' declaration
4650 as follows:
4651
4652 %start SYMBOL
4653
4654 
4655 File: bison.info, Node: Pure Decl, Next: Push Decl, Prev: Start Decl, Up: Declarations
4656
4657 3.7.10 A Pure (Reentrant) Parser
4658 --------------------------------
4659
4660 A "reentrant" program is one which does not alter in the course of
4661 execution; in other words, it consists entirely of "pure" (read-only)
4662 code. Reentrancy is important whenever asynchronous execution is
4663 possible; for example, a nonreentrant program may not be safe to call
4664 from a signal handler. In systems with multiple threads of control, a
4665 nonreentrant program must be called only within interlocks.
4666
4667 Normally, Bison generates a parser which is not reentrant. This is
4668 suitable for most uses, and it permits compatibility with Yacc. (The
4669 standard Yacc interfaces are inherently nonreentrant, because they use
4670 statically allocated variables for communication with `yylex',
4671 including `yylval' and `yylloc'.)
4672
4673 Alternatively, you can generate a pure, reentrant parser. The Bison
4674 declaration `%define api.pure' says that you want the parser to be
4675 reentrant. It looks like this:
4676
4677 %define api.pure
4678
4679 The result is that the communication variables `yylval' and `yylloc'
4680 become local variables in `yyparse', and a different calling convention
4681 is used for the lexical analyzer function `yylex'. *Note Calling
4682 Conventions for Pure Parsers: Pure Calling, for the details of this.
4683 The variable `yynerrs' becomes local in `yyparse' in pull mode, but it
4684 becomes a member of `yypstate' in push mode (*note The Error Reporting
4685 Function `yyerror': Error Reporting.). The convention for calling
4686 `yyparse' itself is unchanged.
4687
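To give a feel for the different calling convention, here is a minimal sketch of a lexer written in the pure style, where the semantic value is stored through a pointer argument instead of the global `yylval'. The `YYSTYPE' definition and the token code `NUM' are stand-ins for what a generated header would provide, and the extra `input' parameter is an assumption of this sketch (a real grammar would have to request such an argument, for instance with `%lex-param').

```c
#include <ctype.h>
#include <stdlib.h>

/* Stand-ins for declarations normally taken from the parser header.  */
typedef union YYSTYPE { double val; } YYSTYPE;
enum { NUM = 258 };   /* hypothetical token code */

/* Pure-style lexer: writes the semantic value through LVALP.
   INPUT points at a cursor into the text being scanned; it is
   advanced past the token that is returned.  */
int
yylex (YYSTYPE *lvalp, const char **input)
{
  const char *p = *input;

  while (*p == ' ' || *p == '\t')
    ++p;

  if (*p == '\0')
    {
      *input = p;
      return 0;               /* end of input */
    }

  if (isdigit ((unsigned char) *p))
    {
      char *end;
      lvalp->val = strtod (p, &end);
      *input = end;
      return NUM;
    }

  *input = p + 1;
  return (unsigned char) *p;  /* single-character token */
}
```

Because nothing here is stored in a global variable, two independent scans can be interleaved without disturbing each other.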
4688 Whether the parser is pure has nothing to do with the grammar rules.
4689 You can generate either a pure parser or a nonreentrant parser from any
4690 valid grammar.
4691
4692 
4693 File: bison.info, Node: Push Decl, Next: Decl Summary, Prev: Pure Decl, Up: Declarations
4694
4695 3.7.11 A Push Parser
4696 --------------------
4697
4698 (The current push parsing interface is experimental and may evolve.
4699 More user feedback will help to stabilize it.)
4700
4701 A pull parser is called once and it takes control until all its input
4702 is completely parsed. A push parser, on the other hand, is called each
4703 time a new token is made available.
4704
4705 A push parser is typically useful when the parser is part of a main
4706 event loop in the client's application. This is typically a
4707 requirement of a GUI, when the main event loop needs to be triggered
4708 within a certain time period.
4709
4710 Normally, Bison generates a pull parser. The following Bison
4711 declaration says that you want the parser to be a push parser (*note
4712 %define api.push_pull: Decl Summary.):
4713
4714 %define api.push_pull "push"
4715
4716 In almost all cases, you want to ensure that your push parser is also
4717 a pure parser (*note A Pure (Reentrant) Parser: Pure Decl.). The only
4718 time you should create an impure push parser is to have backwards
4719 compatibility with the impure Yacc pull mode interface. Unless you know
4720 what you are doing, your declarations should look like this:
4721
4722 %define api.pure
4723 %define api.push_pull "push"
4724
4725 There is one major functional difference between the pure push
4726 parser and the impure push parser. It is acceptable for a pure push
4727 parser to have many parser instances, of the same type of parser, in
4728 memory at the same time. An impure push parser should only use one
4729 parser at a time.
4730
4731 When a push parser is selected, Bison will generate some new symbols
4732 in the generated parser. `yypstate' is a structure that the generated
4733 parser uses to store the parser's state. `yypstate_new' is the
4734 function that will create a new parser instance. `yypstate_delete'
4735 will free the resources associated with the corresponding parser
4736 instance. Finally, `yypush_parse' is the function that should be
4737 called whenever a token is available to provide the parser. A trivial
4738 example of using a pure push parser would look like this:
4739
4740 int status;
4741 yypstate *ps = yypstate_new ();
4742 do {
4743 status = yypush_parse (ps, yylex (), NULL);
4744 } while (status == YYPUSH_MORE);
4745 yypstate_delete (ps);
4746
4747 If the user decides to use an impure push parser, a few things about
4748 the generated parser will change. The `yychar' variable becomes a
4749 global variable instead of a variable in the `yypush_parse' function.
4750 For this reason, the signature of the `yypush_parse' function is
4751 changed to remove the token as a parameter. A nonreentrant push parser
4752 example would thus look like this:
4753
4754 extern int yychar;
4755 int status;
4756 yypstate *ps = yypstate_new ();
4757 do {
4758 yychar = yylex ();
4759 status = yypush_parse (ps);
4760 } while (status == YYPUSH_MORE);
4761 yypstate_delete (ps);
4762
4763 That's it. Notice the next token is put into the global variable
4764 `yychar' for use by the next invocation of the `yypush_parse' function.
4765
4766 Bison also supports both the push parser interface and the
4767 pull parser interface in the same generated parser. In order to get
4768 this functionality, you should replace the `%define api.push_pull
4769 "push"' declaration with the `%define api.push_pull "both"'
4770 declaration. Doing this will create all of the symbols mentioned
4771 earlier along with the two extra symbols, `yyparse' and `yypull_parse'.
4772 `yyparse' can be used exactly as it normally would be used. However,
4773 the user should note that it is implemented in the generated parser by
4774 calling `yypull_parse'. This makes the `yyparse' function that is
4775 generated with the `%define api.push_pull "both"' declaration slower
4776 than the normal `yyparse' function. If the user calls the
4777 `yypull_parse' function it will parse the rest of the input stream. It
4778 is possible to `yypush_parse' tokens to select a subgrammar and then
4779 `yypull_parse' the rest of the input stream. If you would like to
4780 switch back and forth between parsing styles, you would have to
4781 write your own `yypull_parse' function that knows when to quit looking
4782 for input. An example of using the `yypull_parse' function would look
4783 like this:
4784
4785 yypstate *ps = yypstate_new ();
4786 yypull_parse (ps); /* Will call the lexer */
4787 yypstate_delete (ps);
4788
4789 Adding the `%define api.pure' declaration does exactly the same
4790 thing to the generated parser with `%define api.push_pull "both"' as it
4791 did for `%define api.push_pull "push"'.
4792
4793 
4794 File: bison.info, Node: Decl Summary, Prev: Push Decl, Up: Declarations
4795
4796 3.7.12 Bison Declaration Summary
4797 --------------------------------
4798
4799 Here is a summary of the declarations used to define a grammar:
4800
4801 -- Directive: %union
4802 Declare the collection of data types that semantic values may have
4803 (*note The Collection of Value Types: Union Decl.).
4804
4805 -- Directive: %token
4806 Declare a terminal symbol (token type name) with no precedence or
4807 associativity specified (*note Token Type Names: Token Decl.).
4808
4809 -- Directive: %right
4810 Declare a terminal symbol (token type name) that is
4811 right-associative (*note Operator Precedence: Precedence Decl.).
4812
4813 -- Directive: %left
4814 Declare a terminal symbol (token type name) that is
4815 left-associative (*note Operator Precedence: Precedence Decl.).
4816
4817 -- Directive: %nonassoc
4818 Declare a terminal symbol (token type name) that is nonassociative
4819 (*note Operator Precedence: Precedence Decl.). Using it in a way
4820 that would be associative is a syntax error.
4821
4822 -- Directive: %type
4823 Declare the type of semantic values for a nonterminal symbol
4824 (*note Nonterminal Symbols: Type Decl.).
4825
4826 -- Directive: %start
4827 Specify the grammar's start symbol (*note The Start-Symbol: Start
4828 Decl.).
4829
4830 -- Directive: %expect
4831 Declare the expected number of shift/reduce conflicts (*note
4832 Suppressing Conflict Warnings: Expect Decl.).
4833
4834
4835 In order to change the behavior of `bison', use the following
4836 directives:
4837
4838 -- Directive: %code {CODE}
4839 This is the unqualified form of the `%code' directive. It inserts
4840 CODE verbatim at a language-dependent default location in the
4841 output(1).
4842
4843 For C/C++, the default location is the parser source code file
4844 after the usual contents of the parser header file. Thus, `%code'
4845 replaces the traditional Yacc prologue, `%{CODE%}', for most
4846 purposes. For a detailed discussion, see *Note Prologue
4847 Alternatives::.
4848
4849 For Java, the default location is inside the parser class.
4850
4851 (Like all the Yacc prologue alternatives, this directive is
4852 experimental. More user feedback will help to determine whether
4853 it should become a permanent feature.)
4854
4855 -- Directive: %code QUALIFIER {CODE}
4856 This is the qualified form of the `%code' directive. If you need
4857 to specify location-sensitive verbatim CODE that does not belong
4858 at the default location selected by the unqualified `%code' form,
4859 use this form instead.
4860
4861 QUALIFIER identifies the purpose of CODE and thus the location(s)
4862 where Bison should generate it. Not all values of QUALIFIER are
4863 available for all target languages:
4864
4865 * requires
4866
4867 * Language(s): C, C++
4868
4869 * Purpose: This is the best place to write dependency code
4870 required for `YYSTYPE' and `YYLTYPE'. In other words,
4871 it's the best place to define types referenced in
4872 `%union' directives, and it's the best place to override
4873 Bison's default `YYSTYPE' and `YYLTYPE' definitions.
4874
4875 * Location(s): The parser header file and the parser
4876 source code file before the Bison-generated `YYSTYPE'
4877 and `YYLTYPE' definitions.
4878
4879 * provides
4880
4881 * Language(s): C, C++
4882
4883 * Purpose: This is the best place to write additional
4884 definitions and declarations that should be provided to
4885 other modules.
4886
4887 * Location(s): The parser header file and the parser
4888 source code file after the Bison-generated `YYSTYPE',
4889 `YYLTYPE', and token definitions.
4890
4891 * top
4892
4893 * Language(s): C, C++
4894
4895 * Purpose: The unqualified `%code' or `%code requires'
4896 should usually be more appropriate than `%code top'.
4897 However, occasionally it is necessary to insert code
4898 much nearer the top of the parser source code file. For
4899 example:
4900
4901 %code top {
4902 #define _GNU_SOURCE
4903 #include <stdio.h>
4904 }
4905
4906 * Location(s): Near the top of the parser source code file.
4907
4908 * imports
4909
4910 * Language(s): Java
4911
4912 * Purpose: This is the best place to write Java import
4913 directives.
4914
4915 * Location(s): The parser Java file after any Java package
4916 directive and before any class definitions.
4917
4918 (Like all the Yacc prologue alternatives, this directive is
4919 experimental. More user feedback will help to determine whether
4920 it should become a permanent feature.)
4921
4922 For a detailed discussion of how to use `%code' in place of the
4923 traditional Yacc prologue for C/C++, see *Note Prologue
4924 Alternatives::.
4925
4926 -- Directive: %debug
4927 In the parser file, define the macro `YYDEBUG' to 1 if it is not
4928 already defined, so that the debugging facilities are compiled.
4929 *Note Tracing Your Parser: Tracing.
4930
4931 -- Directive: %define VARIABLE
4932 -- Directive: %define VARIABLE "VALUE"
4933 Define a variable to adjust Bison's behavior. The possible
4934 choices for VARIABLE, as well as their meanings, depend on the
4935 selected target language and/or the parser skeleton (*note
4936 %language: Decl Summary, *note %skeleton: Decl Summary.).
4937
4938 Bison will warn if a VARIABLE is defined multiple times.
4939
4940 Omitting `"VALUE"' is always equivalent to specifying it as `""'.
4941
4942 Some VARIABLEs may be used as Booleans. In this case, Bison will
4943 complain if the variable definition does not meet one of the
4944 following four conditions:
4945
4946 1. `"VALUE"' is `"true"'.
4947
4948 2. `"VALUE"' is omitted (or is `""'). This is equivalent to
4949 `"true"'.
4950
4951 3. `"VALUE"' is `"false"'.
4952
4953 4. VARIABLE is never defined. In this case, Bison selects a
4954 default value, which may depend on the selected target
4955 language and/or parser skeleton.
4956
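The four conditions for Boolean variables can be summarised by a small classifier. This is only a sketch of the rules as stated above, not Bison's own code; the enumeration and the function name `classify_boolean_define' are hypothetical.

```c
#include <stddef.h>
#include <string.h>

enum define_bool
{
  DEFINE_TRUE,     /* "true", or value omitted (same as "") */
  DEFINE_FALSE,    /* "false" */
  DEFINE_DEFAULT,  /* variable never defined: Bison picks a default */
  DEFINE_INVALID   /* anything else: Bison complains */
};

/* VALUE is the text between the quotes, "" when the value was
   omitted, or NULL when the variable was never defined.  */
enum define_bool
classify_boolean_define (const char *value)
{
  if (value == NULL)
    return DEFINE_DEFAULT;
  if (value[0] == '\0' || strcmp (value, "true") == 0)
    return DEFINE_TRUE;
  if (strcmp (value, "false") == 0)
    return DEFINE_FALSE;
  return DEFINE_INVALID;
}
```

Under this model, `%define api.pure' and `%define api.pure "true"' fall into the same case, while a value such as `"yes"' is rejected.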
4957 Some of the accepted VARIABLEs are:
4958
4959 * api.pure
4960
4961 * Language(s): C
4962
4963 * Purpose: Request a pure (reentrant) parser program.
4964 *Note A Pure (Reentrant) Parser: Pure Decl.
4965
4966 * Accepted Values: Boolean
4967
4968 * Default Value: `"false"'
4969
4970 * api.push_pull
4971
4972 * Language(s): C (LALR(1) only)
4973
4974 * Purpose: Requests a pull parser, a push parser, or both.
4975 *Note A Push Parser: Push Decl. (The current push
4976 parsing interface is experimental and may evolve. More
4977 user feedback will help to stabilize it.)
4978
4979 * Accepted Values: `"pull"', `"push"', `"both"'
4980
4981 * Default Value: `"pull"'
4982
4983 * lr.keep_unreachable_states
4984
4985 * Language(s): all
4986
4987 * Purpose: Requests that Bison allow unreachable parser
4988 states to remain in the parser tables. Bison considers
4989 a state to be unreachable if there exists no sequence of
4990 transitions from the start state to that state. A state
4991 can become unreachable during conflict resolution if
4992 Bison disables a shift action leading to it from a
4993 predecessor state. Keeping unreachable states is
4994 sometimes useful for analysis purposes, but they are
4995 useless in the generated parser.
4996
4997 * Accepted Values: Boolean
4998
4999 * Default Value: `"false"'
5000
5001 * Caveats:
5002
5003 * Unreachable states may contain conflicts and may
5004 use rules not used in any other state. Thus,
5005 keeping unreachable states may induce warnings that
5006 are irrelevant to your parser's behavior, and it
5007 may eliminate warnings that are relevant. Of
5008 course, the change in warnings may actually be
5009 relevant to a parser table analysis that wants to
5010 keep unreachable states, so this behavior will
5011 likely remain in future Bison releases.
5012
5013 * While Bison is able to remove unreachable states,
5014 it is not guaranteed to remove other kinds of
5015 useless states. Specifically, when Bison disables
5016 reduce actions during conflict resolution, some
5017 goto actions may become useless, and thus some
5018 additional states may become useless. If Bison
5019 were to compute which goto actions were useless and
5020 then disable those actions, it could identify such
5021 states as unreachable and then remove those states.
5022 However, Bison does not compute which goto actions
5023 are useless.
5024
5025 * namespace
5026
5027 * Language(s): C++
5028
5029 * Purpose: Specifies the namespace for the parser class.
5030 For example, if you specify:
5031
5032 %define namespace "foo::bar"
5033
5034 Bison uses `foo::bar' verbatim in references such as:
5035
5036 foo::bar::parser::semantic_type
5037
5038 However, to open a namespace, Bison removes any leading
5039 `::' and then splits on any remaining occurrences:
5040
5041 namespace foo { namespace bar {
5042 class position;
5043 class location;
5044 } }
5045
5046 * Accepted Values: Any absolute or relative C++ namespace
5047 reference without a trailing `"::"'. For example,
5048 `"foo"' or `"::foo::bar"'.
5049
5050 * Default Value: The value specified by `%name-prefix',
5051 which defaults to `yy'. This usage of `%name-prefix' is
5052 for backward compatibility and can be confusing since
5053 `%name-prefix' also specifies the textual prefix for the
5054 lexical analyzer function. Thus, if you specify
5055 `%name-prefix', it is best to also specify `%define
5056 namespace' so that `%name-prefix' _only_ affects the
5057 lexical analyzer function. For example, if you specify:
5058
5059 %define namespace "foo"
5060 %name-prefix "bar::"
5061
5062 The parser namespace is `foo' and `yylex' is referenced
5063 as `bar::lex'.
5064
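The way the `namespace' value is turned into nested namespace openings (drop a leading `::', then split on the remaining `::' separators) can be sketched as follows. The function `open_namespace' is a hypothetical helper written for this illustration; it is not part of Bison or of its skeletons.

```c
#include <stdio.h>
#include <string.h>

/* Write the text that opens the namespace SPEC into OUT, which has
   room for OUTSZ bytes.  Return 0 on success, or -1 if OUT is too
   small.  */
int
open_namespace (const char *spec, char *out, size_t outsz)
{
  size_t used = 0;

  if (outsz == 0)
    return -1;
  out[0] = '\0';

  /* Remove any leading "::".  */
  if (strncmp (spec, "::", 2) == 0)
    spec += 2;

  /* Split on the remaining occurrences of "::".  */
  while (*spec != '\0')
    {
      const char *sep = strstr (spec, "::");
      size_t len = sep ? (size_t) (sep - spec) : strlen (spec);
      int n = snprintf (out + used, outsz - used, "%snamespace %.*s {",
                        used ? " " : "", (int) len, spec);
      if (n < 0 || (size_t) n >= outsz - used)
        return -1;
      used += (size_t) n;
      spec += len;
      if (sep)
        spec += 2;
    }
  return 0;
}
```

For the value `"::foo::bar"' this produces the text `namespace foo { namespace bar {', matching the opening line shown in the example above.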
5065
5066 -- Directive: %defines
5067 Write a header file containing macro definitions for the token type
5068 names defined in the grammar as well as a few other declarations.
5069 If the parser output file is named `NAME.c' then this file is
5070 named `NAME.h'.
5071
5072 For C parsers, the output header declares `YYSTYPE' unless
5073 `YYSTYPE' is already defined as a macro or you have used a
5074 `<TYPE>' tag without using `%union'. Therefore, if you are using
5075 a `%union' (*note More Than One Value Type: Multiple Types.) with
5076 components that require other definitions, or if you have defined
5077 a `YYSTYPE' macro or type definition (*note Data Types of Semantic
5078 Values: Value Type.), you need to arrange for these definitions to
5079 be propagated to all modules, e.g., by putting them in a
5080 prerequisite header that is included both by your parser and by
5081 any other module that needs `YYSTYPE'.
5082
5083 Unless your parser is pure, the output header declares `yylval' as
5084 an external variable. *Note A Pure (Reentrant) Parser: Pure Decl.
5085
5086 If you have also used locations, the output header declares
5087 `YYLTYPE' and `yylloc' using a protocol similar to that of the
5088 `YYSTYPE' macro and `yylval'. *Note Tracking Locations: Locations.
5089
5090 This output file is normally essential if you wish to put the
5091 definition of `yylex' in a separate source file, because `yylex'
5092 typically needs to be able to refer to the above-mentioned
5093 declarations and to the token type codes. *Note Semantic Values
5094 of Tokens: Token Values.
5095
5096 If you have declared `%code requires' or `%code provides', the
5097 output header also contains their code. *Note %code: Decl Summary.
5098
5099 -- Directive: %defines DEFINES-FILE
5100 Same as above, but save in the file DEFINES-FILE.
5101
5102 -- Directive: %destructor
5103 Specify how the parser should reclaim the memory associated with
5104 discarded symbols. *Note Freeing Discarded Symbols: Destructor
5105 Decl.
5106
5107 -- Directive: %file-prefix "PREFIX"
5108 Specify a prefix to use for all Bison output file names. The
5109 names are chosen as if the input file were named `PREFIX.y'.
5110
5111 -- Directive: %language "LANGUAGE"
5112 Specify the programming language for the generated parser.
5113 Currently supported languages include C, C++, and Java. LANGUAGE
5114 is case-insensitive.
5115
5116 This directive is experimental and its effect may be modified in
5117 future releases.
5118
5119 -- Directive: %locations
5120 Generate the code processing the locations (*note Special Features
5121 for Use in Actions: Action Features.). This mode is enabled as
5122 soon as the grammar uses the special `@N' tokens, but if your
5123 grammar does not use them, using `%locations' allows for more
5124 accurate syntax error messages.
5125
5126 -- Directive: %name-prefix "PREFIX"
5127 Rename the external symbols used in the parser so that they start
5128 with PREFIX instead of `yy'. The precise list of symbols renamed
5129 in C parsers is `yyparse', `yylex', `yyerror', `yynerrs',
5130 `yylval', `yychar', `yydebug', and (if locations are used)
5131 `yylloc'. If you use a push parser, `yypush_parse',
5132 `yypull_parse', `yypstate', `yypstate_new' and `yypstate_delete'
5133 will also be renamed. For example, if you use `%name-prefix
5134 "c_"', the names become `c_parse', `c_lex', and so on. For C++
5135 parsers, see the `%define namespace' documentation in this section.
5136 *Note Multiple Parsers in the Same Program: Multiple Parsers.
5137
5138 -- Directive: %no-lines
5139 Don't generate any `#line' preprocessor commands in the parser
5140 file. Ordinarily Bison writes these commands in the parser file
5141 so that the C compiler and debuggers will associate errors and
5142 object code with your source file (the grammar file). This
5143 directive causes them to associate errors with the parser file,
5144 treating it as an independent source file in its own right.
5145
5146 -- Directive: %output "FILE"
5147 Specify FILE for the parser file.
5148
5149 -- Directive: %pure-parser
5150 Deprecated version of `%define api.pure' (*note %define: Decl
5151 Summary.), for which Bison is more careful to warn about
5152 unreasonable usage.
5153
5154 -- Directive: %require "VERSION"
5155 Require version VERSION or higher of Bison. *Note Require a
5156 Version of Bison: Require Decl.
5157
5158 -- Directive: %skeleton "FILE"
5159 Specify the skeleton to use.
5160
5161 If FILE does not contain a `/', FILE is the name of a skeleton
5162 file in the Bison installation directory. If it does, FILE is an
5163 absolute file name or a file name relative to the directory of the
5164 grammar file. This is similar to how most shells resolve commands.
5165
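The lookup rule for `%skeleton' resembles the following sketch. The function `resolve_skeleton' and its parameters are hypothetical, and treating a leading `/' as the mark of an absolute file name is an assumption that holds on POSIX-like systems only.

```c
#include <stdio.h>
#include <string.h>

/* Compute the file name Bison would look for, given the FILE from
   `%skeleton', the installation's skeleton directory INSTALL_DIR,
   and the directory of the grammar file GRAMMAR_DIR.  The result is
   written to OUT (OUTSZ bytes).  Return 0 on success, -1 if OUT is
   too small.  */
int
resolve_skeleton (const char *file, const char *install_dir,
                  const char *grammar_dir, char *out, size_t outsz)
{
  int n;

  if (strchr (file, '/') == NULL)
    /* No slash: a skeleton shipped with Bison.  */
    n = snprintf (out, outsz, "%s/%s", install_dir, file);
  else if (file[0] == '/')
    /* Absolute file name: used as is.  */
    n = snprintf (out, outsz, "%s", file);
  else
    /* Relative file name: relative to the grammar file.  */
    n = snprintf (out, outsz, "%s/%s", grammar_dir, file);

  return (n >= 0 && (size_t) n < outsz) ? 0 : -1;
}
```

With this model, `%skeleton "lalr1.cc"' is searched for in the installation directory, whereas `%skeleton "./lalr1.cc"' names a file next to the grammar.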
5166 -- Directive: %token-table
5167 Generate an array of token names in the parser file. The name of
5168 the array is `yytname'; `yytname[I]' is the name of the token
5169 whose internal Bison token code number is I. The first three
5170 elements of `yytname' correspond to the predefined tokens `"$end"',
5171 `"error"', and `"$undefined"'; after these come the symbols
5172 defined in the grammar file.
5173
5174 The name in the table includes all the characters needed to
5175 represent the token in Bison. For single-character literals and
5176 literal strings, this includes the surrounding quoting characters
5177 and any escape sequences. For example, the Bison single-character
5178 literal `'+'' corresponds to a three-character name, represented
5179 in C as `"'+'"'; and the Bison two-character literal string `"\\/"'
5180 corresponds to a five-character name, represented in C as
5181 `"\"\\\\/\""'.
5182
5183 When you specify `%token-table', Bison also generates macro
5184 definitions for macros `YYNTOKENS', `YYNNTS', `YYNRULES', and
5185 `YYNSTATES':
5186
5187 `YYNTOKENS'
5188 The highest token number, plus one.
5189
5190 `YYNNTS'
5191 The number of nonterminal symbols.
5192
5193 `YYNRULES'
5194 The number of grammar rules.
5195
5196 `YYNSTATES'
5197 The number of parser states (*note Parser States::).
5198
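One possible use of the token table is to map the text of a multicharacter literal token back to its internal token number. The sketch below assumes a small hand-written stand-in for `yytname' and `YYNTOKENS' (in a real parser both are generated by Bison when `%token-table' is given); the function name `find_string_token' is hypothetical.

```c
#include <string.h>

/* Hand-written stand-in for the generated table.  The first three
   entries are the predefined tokens; the rest are made up.  */
static const char *const yytname[] =
{
  "$end", "error", "$undefined", "NUM", "'+'", "\"<=\"", 0
};
#define YYNTOKENS 6

/* Return the internal token number of the literal string token whose
   contents are TEXT, or -1 if there is none.  Names of literal
   strings include their surrounding double quotes, so the comparison
   has to skip the opening quote and check for the closing one.  */
int
find_string_token (const char *text)
{
  size_t len = strlen (text);
  int i;

  for (i = 0; i < YYNTOKENS; i++)
    {
      const char *name = yytname[i];
      if (name != 0
          && name[0] == '"'
          && strncmp (name + 1, text, len) == 0
          && name[len + 1] == '"'
          && name[len + 2] == '\0')
        return i;
    }
  return -1;
}
```

Note that a single-character literal such as `'+'' is not found by this function, because its name is quoted with single quotes rather than double quotes.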
5199 -- Directive: %verbose
5200 Write an extra output file containing verbose descriptions of the
5201 parser states and what is done for each type of lookahead token in
5202 that state. *Note Understanding Your Parser: Understanding, for
5203 more information.
5204
5205 -- Directive: %yacc
5206 Pretend the option `--yacc' was given, i.e., imitate Yacc,
5207 including its naming conventions. *Note Bison Options::, for more.
5208
5209 ---------- Footnotes ----------
5210
5211 (1) The default location is actually skeleton-dependent; writers
5212 of non-standard skeletons however should choose the default location
5213 consistently with the behavior of the standard Bison skeletons.
5214
5215 
5216 File: bison.info, Node: Multiple Parsers, Prev: Declarations, Up: Grammar File
5217
5218 3.8 Multiple Parsers in the Same Program
5219 ========================================
5220
5221 Most programs that use Bison parse only one language and therefore
5222 contain only one Bison parser. But what if you want to parse more than
5223 one language with the same program? Then you need to avoid a name
5224 conflict between different definitions of `yyparse', `yylval', and so
5225 on.
5226
5227 The easy way to do this is to use the option `-p PREFIX' (*note
5228 Invoking Bison: Invocation.). This renames the interface functions and
5229 variables of the Bison parser to start with PREFIX instead of `yy'.
5230 You can use this to give each parser distinct names that do not
5231 conflict.
5232
5233 The precise list of symbols renamed is `yyparse', `yylex',
5234 `yyerror', `yynerrs', `yylval', `yylloc', `yychar' and `yydebug'. If
5235 you use a push parser, `yypush_parse', `yypull_parse', `yypstate',
5236 `yypstate_new' and `yypstate_delete' will also be renamed. For
5237 example, if you use `-p c', the names become `cparse', `clex', and so
5238 on.
5239
5240 *All the other variables and macros associated with Bison are not
5241 renamed.* These others are not global; there is no conflict if the same
5242 name is used in different parsers. For example, `YYSTYPE' is not
5243 renamed, but defining this in different ways in different parsers causes
5244 no trouble (*note Data Types of Semantic Values: Value Type.).
5245
5246 The `-p' option works by adding macro definitions to the beginning
5247 of the parser source file, defining `yyparse' as `PREFIXparse', and so
5248 on. This effectively substitutes one name for the other in the entire
5249 parser file.
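
The renaming mechanism can be imitated by hand. The sketch below assumes `-p c' and shows only two of the renamed symbols; the function body is a stand-in for generated parser code, and `parse_c_source' is a hypothetical caller.

```c
#include <assert.h>

/* With `-p c', definitions along these lines are placed at the top of
   the parser source file (only two are shown here).  */
#define yyparse cparse
#define yynerrs cnerrs

/* Stand-in for the generated parser code, which is written in terms
   of the `yy' names.  After preprocessing, the external symbols are
   really `cparse' and `cnerrs'.  */
int yynerrs = 0;

int
yyparse (void)
{
  return yynerrs == 0 ? 0 : 1;
}

#undef yyparse
#undef yynerrs

/* Hypothetical caller: code outside the parser file uses the
   prefixed names.  */
int
parse_c_source (void)
{
  assert (cnerrs == 0);
  return cparse ();
}
```

Because the substitution happens in the preprocessor, a second parser in the same program can keep the `yy' names or use another prefix without a link-time clash.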
5250
5251 
5252 File: bison.info, Node: Interface, Next: Algorithm, Prev: Grammar File, Up: Top
5253
5254 4 Parser C-Language Interface
5255 *****************************
5256
5257 The Bison parser is actually a C function named `yyparse'. Here we
5258 describe the interface conventions of `yyparse' and the other functions
5259 that it needs to use.
5260
5261 Keep in mind that the parser uses many C identifiers starting with
5262 `yy' and `YY' for internal purposes. If you use such an identifier
5263 (aside from those in this manual) in an action or in the epilogue of
5264 the grammar file, you are likely to run into trouble.
5265
5266 * Menu:
5267
5268 * Parser Function:: How to call `yyparse' and what it returns.
5269 * Push Parser Function:: How to call `yypush_parse' and what it returns.
5270 * Pull Parser Function:: How to call `yypull_parse' and what it returns.
5271 * Parser Create Function:: How to call `yypstate_new' and what it returns.
5272 * Parser Delete Function:: How to call `yypstate_delete' and what it returns.
5273 * Lexical:: You must supply a function `yylex'
5274 which reads tokens.
5275 * Error Reporting:: You must supply a function `yyerror'.
5276 * Action Features:: Special features for use in actions.
5277 * Internationalization:: How to let the parser speak in the user's
5278 native language.
5279
5280 
5281 File: bison.info, Node: Parser Function, Next: Push Parser Function, Up: Interface
5282
5283 4.1 The Parser Function `yyparse'
5284 =================================
5285
5286 You call the function `yyparse' to cause parsing to occur. This
5287 function reads tokens, executes actions, and ultimately returns when it
5288 encounters end-of-input or an unrecoverable syntax error. You can also
5289 write an action which directs `yyparse' to return immediately without
5290 reading further.
5291
5292 -- Function: int yyparse (void)
5293 The value returned by `yyparse' is 0 if parsing was successful
5294 (return is due to end-of-input).
5295
5296 The value is 1 if parsing failed because of invalid input, i.e.,
5297 input that contains a syntax error or that causes `YYABORT' to be
5298 invoked.
5299
5300 The value is 2 if parsing failed due to memory exhaustion.
5301
5302 In an action, you can cause immediate return from `yyparse' by using
5303 these macros:
5304
5305 -- Macro: YYACCEPT
5306 Return immediately with value 0 (to report success).
5307
5308 -- Macro: YYABORT
5309 Return immediately with value 1 (to report failure).
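
A caller usually wants to turn the three documented return values into something readable. The following sketch does that; the helper name `describe_parse_result' and its message strings are invented for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: turn the value returned by `yyparse' into a
   human-readable description, following the three documented cases.  */
static char const *
describe_parse_result (int status)
{
  switch (status)
    {
    case 0:
      return "success";            /* End-of-input or YYACCEPT.  */
    case 1:
      return "invalid input";      /* Syntax error or YYABORT.  */
    case 2:
      return "memory exhausted";
    default:
      return "unexpected status";  /* Not documented for `yyparse'.  */
    }
}
```

A driver might then write `puts (describe_parse_result (yyparse ()));' once a real parser is linked in.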
5310
5311 If you use a reentrant parser, you can optionally pass additional
5312 parameter information to it in a reentrant way. To do so, use the
5313 declaration `%parse-param':
5314
5315 -- Directive: %parse-param {ARGUMENT-DECLARATION}
5316 Declare that an argument declared by the braced-code
5317 ARGUMENT-DECLARATION is an additional `yyparse' argument. The
5318 ARGUMENT-DECLARATION is used when declaring functions or
5319 prototypes. The last identifier in ARGUMENT-DECLARATION must be
5320 the argument name.
5321
5322 Here's an example. Write this in the parser:
5323
5324 %parse-param {int *nastiness}
5325 %parse-param {int *randomness}
5326
5327 Then call the parser like this:
5328
5329 {
5330 int nastiness, randomness;
5331 ... /* Store proper data in `nastiness' and `randomness'. */
5332 value = yyparse (&nastiness, &randomness);
5333 ...
5334 }
5335
5336 In the grammar actions, use expressions like this to refer to the data:
5337
5338 exp: ... { ...; *randomness += 1; ... }
5339
5340 
5341 File: bison.info, Node: Push Parser Function, Next: Pull Parser Function, Prev: Parser Function, Up: Interface
5342
5343 4.2 The Push Parser Function `yypush_parse'
5344 ===========================================
5345
5346 (The current push parsing interface is experimental and may evolve.
5347 More user feedback will help to stabilize it.)
5348
5349 You call the function `yypush_parse' to parse a single token. This
5350 function is available if either the `%define api.push_pull "push"' or
5351 `%define api.push_pull "both"' declaration is used. *Note A Push
5352 Parser: Push Decl.
5353
5354 -- Function: int yypush_parse (yypstate *yyps)
5355 The value returned by `yypush_parse' is the same as for `yyparse',
5356 with the following exception: `yypush_parse' returns
5357 `YYPUSH_MORE' if more input is required to complete the
5358 parse.
5359
5360 
5361 File: bison.info, Node: Pull Parser Function, Next: Parser Create Function, Prev: Push Parser Function, Up: Interface
5362
5363 4.3 The Pull Parser Function `yypull_parse'
5364 ===========================================
5365
5366 (The current push parsing interface is experimental and may evolve.
5367 More user feedback will help to stabilize it.)
5368
5369 You call the function `yypull_parse' to parse the rest of the input
5370 stream. This function is available if the `%define api.push_pull
5371 "both"' declaration is used. *Note A Push Parser: Push Decl.
5372
5373 -- Function: int yypull_parse (yypstate *yyps)
5374 The value returned by `yypull_parse' is the same as for `yyparse'.
5375
5376 
5377 File: bison.info, Node: Parser Create Function, Next: Parser Delete Function, Prev: Pull Parser Function, Up: Interface
5378
5379 4.4 The Parser Create Function `yypstate_new'
5380 =============================================
5381
5382 (The current push parsing interface is experimental and may evolve.
5383 More user feedback will help to stabilize it.)
5384
5385 You call the function `yypstate_new' to create a new parser instance.
5386 This function is available if either the `%define api.push_pull "push"'
5387 or `%define api.push_pull "both"' declaration is used. *Note A Push
5388 Parser: Push Decl.
5389
5390 -- Function: yypstate *yypstate_new (void)
5391 The function returns a valid parser instance if memory was
5392 available, or 0 if it was not. In impure mode, it also returns 0
5393 if a parser instance is currently allocated.
5394
5395 
5396 File: bison.info, Node: Parser Delete Function, Next: Lexical, Prev: Parser Create Function, Up: Interface
5397
5398 4.5 The Parser Delete Function `yypstate_delete'
5399 ================================================
5400
5401 (The current push parsing interface is experimental and may evolve.
5402 More user feedback will help to stabilize it.)
5403
5404 You call the function `yypstate_delete' to delete a parser instance.
5405 This function is available if either the `%define api.push_pull "push"'
5406 or `%define api.push_pull "both"' declaration is used. *Note A Push
5407 Parser: Push Decl.
5408
5409 -- Function: void yypstate_delete (yypstate *yyps)
5410 This function will reclaim the memory associated with a parser
5411 instance. After this call, you should no longer attempt to use
5412 the parser instance.
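
The create, push and delete functions are normally used together. The sketch below shows one plausible calling pattern for an impure push parser, where the token is handed over through `yychar'. So that it compiles on its own, the first half supplies toy stand-ins for what a Bison-generated parser would provide; those bodies, the value chosen for `YYPUSH_MORE', and the helper `push_all' are assumptions of this sketch, not Bison's code.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- Stand-ins so that this sketch compiles on its own. ----
   In a real program, everything in this part comes from the
   Bison-generated parser; the bodies below are NOT Bison's.  */
#define YYPUSH_MORE 4            /* Assumed value, for the sketch only.  */
typedef struct yypstate { int tokens_seen; } yypstate;
int yychar;                      /* Lookahead token (impure mode).  */

yypstate *
yypstate_new (void)
{
  return calloc (1, sizeof (yypstate));
}

void
yypstate_delete (yypstate *yyps)
{
  free (yyps);
}

/* Toy behavior: accept any input, finishing when token 0 arrives.  */
int
yypush_parse (yypstate *yyps)
{
  yyps->tokens_seen++;
  return yychar == 0 ? 0 : YYPUSH_MORE;
}

/* ---- The part a caller would write. ----  */

/* Feed the zero-terminated token array TOKENS to a push parser, one
   token per call, and return the final parse status.  */
int
push_all (int const *tokens)
{
  int status;
  yypstate *ps = yypstate_new ();
  if (ps == 0)
    return 2;                    /* No parser instance available.  */
  do
    {
      yychar = *tokens++;        /* Normally: yychar = yylex ();  */
      status = yypush_parse (ps);
    }
  while (status == YYPUSH_MORE);
  yypstate_delete (ps);          /* PS must not be used after this.  */
  return status;
}
```

The point of the pattern is that the caller, not the parser, owns the loop, so tokens can arrive from an event handler or a network callback instead of a blocking `yylex'.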
5413
5414 
5415 File: bison.info, Node: Lexical, Next: Error Reporting, Prev: Parser Delete Function, Up: Interface
5416
5417 4.6 The Lexical Analyzer Function `yylex'
5418 =========================================
5419
5420 The "lexical analyzer" function, `yylex', recognizes tokens from the
5421 input stream and returns them to the parser. Bison does not create
5422 this function automatically; you must write it so that `yyparse' can
5423 call it. The function is sometimes referred to as a lexical scanner.
5424
5425 In simple programs, `yylex' is often defined at the end of the Bison
5426 grammar file. If `yylex' is defined in a separate source file, you
5427 need to arrange for the token-type macro definitions to be available
5428 there. To do this, use the `-d' option when you run Bison, so that it
5429 will write these macro definitions into a separate header file
5430 `NAME.tab.h' which you can include in the other source files that need
5431 it. *Note Invoking Bison: Invocation.
5432
5433 * Menu:
5434
5435 * Calling Convention:: How `yyparse' calls `yylex'.
5436 * Token Values:: How `yylex' must return the semantic value
5437 of the token it has read.
5438 * Token Locations:: How `yylex' must return the text location
5439 (line number, etc.) of the token, if the
5440 actions want that.
5441 * Pure Calling:: How the calling convention differs in a pure parser
5442 (*note A Pure (Reentrant) Parser: Pure Decl.).
5443
5444 
5445 File: bison.info, Node: Calling Convention, Next: Token Values, Up: Lexical
5446
5447 4.6.1 Calling Convention for `yylex'
5448 ------------------------------------
5449
5450 The value that `yylex' returns must be the positive numeric code for
5451 the type of token it has just found; a zero or negative value signifies
5452 end-of-input.
5453
5454 When a token is referred to in the grammar rules by a name, that name
5455 in the parser file becomes a C macro whose definition is the proper
5456 numeric code for that token type. So `yylex' can use the name to
5457 indicate that type. *Note Symbols::.
5458
5459 When a token is referred to in the grammar rules by a character
5460 literal, the numeric code for that character is also the code for the
5461 token type. So `yylex' can simply return that character code, possibly
5462 converted to `unsigned char' to avoid sign-extension. The null
5463 character must not be used this way, because its code is zero and that
5464 signifies end-of-input.
5465
5466 Here is an example showing these things:
5467
5468 int
5469 yylex (void)
5470 {
5471 ...
5472 if (c == EOF) /* Detect end-of-input. */
5473 return 0;
5474 ...
5475 if (c == '+' || c == '-')
5476 return c; /* Assume token type for `+' is '+'. */
5477 ...
5478 return INT; /* Return the type of the token. */
5479 ...
5480 }
5481
5482 This interface has been designed so that the output from the `lex'
5483 utility can be used without change as the definition of `yylex'.
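
For comparison with the outline above, here is a complete hand-written `yylex' that scans a string held in memory. It is a minimal sketch: the token code `NUM', the helper `set_input' and the `input' cursor are assumptions made for the example, and `yylval' is declared here only because no generated parser is present.

```c
#include <assert.h>
#include <ctype.h>

#define NUM 258                  /* Hypothetical token code; Bison would define it.  */

int yylval;                      /* Semantic value (default type `int').  */
static char const *input = "";   /* Hypothetical input cursor.  */

void
set_input (char const *text)
{
  input = text;
}

int
yylex (void)
{
  /* Skip blanks.  */
  while (*input == ' ' || *input == '\t')
    input++;

  /* End of the string is end-of-input.  */
  if (*input == '\0')
    return 0;

  /* A run of digits is a NUM token; its value goes in `yylval'.  */
  if (isdigit ((unsigned char) *input))
    {
      yylval = 0;
      while (isdigit ((unsigned char) *input))
        yylval = yylval * 10 + (*input++ - '0');
      return NUM;
    }

  /* Any other character is its own token type.  Converting through
     `unsigned char' keeps the code positive.  */
  return (unsigned char) *input++;
}
```

After `set_input ("12 + 7")', successive calls return `NUM' (with `yylval' set to 12), `'+'', `NUM' (with `yylval' set to 7), and finally 0.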
5484
5485 If the grammar uses literal string tokens, there are two ways that
5486 `yylex' can determine the token type codes for them:
5487
5488 * If the grammar defines symbolic token names as aliases for the
5489 literal string tokens, `yylex' can use these symbolic names like
5490 all others. In this case, the use of the literal string tokens in
5491 the grammar file has no effect on `yylex'.
5492
5493 * `yylex' can find the multicharacter token in the `yytname' table.
5494 The index of the token in the table is the token type's code. The
5495 name of a multicharacter token is recorded in `yytname' with a
5496 double-quote, the token's characters, and another double-quote.
5497 The token's characters are escaped as necessary to be suitable as
5498 input to Bison.
5499
5500 Here's code for looking up a multicharacter token in `yytname',
5501 assuming that the characters of the token are stored in
5502 `token_buffer', and assuming that the token does not contain any
5503 characters like `"' that require escaping.
5504
5505 for (i = 0; i < YYNTOKENS; i++)
5506 {
5507 if (yytname[i] != 0
5508 && yytname[i][0] == '"'
5509 && ! strncmp (yytname[i] + 1, token_buffer,
5510 strlen (token_buffer))
5511 && yytname[i][strlen (token_buffer) + 1] == '"'
5512 && yytname[i][strlen (token_buffer) + 2] == 0)
5513 break;
5514 }
5515
5516 The `yytname' table is generated only if you use the
5517 `%token-table' declaration. *Note Decl Summary::.
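
The lookup loop above can be wrapped in a function and tried against a small table. In this sketch the `yytname' contents and the `YYNTOKENS' value are invented stand-ins for what a parser built with `%token-table' would provide, and `find_string_token' is a hypothetical name.

```c
#include <assert.h>
#include <string.h>

/* Stand-ins for what a parser built with `%token-table' provides.
   The contents are invented for this sketch.  */
#define YYNTOKENS 6
static char const *const yytname[] =
{
  "$end", "error", "$undefined", "NUM", "\"<=\"", "\"<\"", 0
};

/* Return the index in `yytname' of the literal string token whose
   characters are TOKEN_BUFFER, or -1 if there is none.  As in the
   manual's loop, TOKEN_BUFFER must not need escaping.  */
int
find_string_token (char const *token_buffer)
{
  size_t len = strlen (token_buffer);
  int i;
  for (i = 0; i < YYNTOKENS; i++)
    {
      if (yytname[i] != 0
          && yytname[i][0] == '"'
          && ! strncmp (yytname[i] + 1, token_buffer, len)
          && yytname[i][len + 1] == '"'
          && yytname[i][len + 2] == 0)
        return i;
    }
  return -1;
}
```

The two checks after `strncmp' are what stop `<' from matching the entry for `<=': the character following the compared text must be the closing double-quote, and the name must end there.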
5518
5519 
5520 File: bison.info, Node: Token Values, Next: Token Locations, Prev: Calling Convention, Up: Lexical
5521
5522 4.6.2 Semantic Values of Tokens
5523 -------------------------------
5524
5525 In an ordinary (nonreentrant) parser, the semantic value of the token
5526 must be stored into the global variable `yylval'. When you are using
5527 just one data type for semantic values, `yylval' has that type. Thus,
5528 if the type is `int' (the default), you might write this in `yylex':
5529
5530 ...
5531 yylval = value; /* Put value onto Bison stack. */
5532 return INT; /* Return the type of the token. */
5533 ...
5534
5535 When you are using multiple data types, `yylval''s type is a union
5536 made from the `%union' declaration (*note The Collection of Value
5537 Types: Union Decl.). So when you store a token's value, you must use
5538 the proper member of the union. If the `%union' declaration looks like
5539 this:
5540
5541 %union {
5542 int intval;
5543 double val;
5544 symrec *tptr;
5545 }
5546
5547 then the code in `yylex' might look like this:
5548
5549 ...
5550 yylval.intval = value; /* Put value onto Bison stack. */
5551 return INT; /* Return the type of the token. */
5552 ...
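
A lexer that produces more than one kind of value has to pick the union member that matches the token type it returns. The sketch below shows this for integers and floating-point numbers. The union is a cut-down stand-in for the `%union' above (the `symrec *' member is left out because `symrec' is not defined here), and the token codes and the helper `number_token' are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-ins for definitions Bison would generate from `%union' and
   `%token'; the token codes are invented for this sketch.  */
typedef union { int intval; double val; } YYSTYPE;
#define INT 258
#define FLOAT 259
YYSTYPE yylval;

/* Hypothetical helper for `yylex': TEXT holds the characters of a
   number that has already been scanned.  Store its value in the union
   member that matches the token type, and return that type.  */
int
number_token (char const *text)
{
  if (strchr (text, '.') != NULL)
    {
      yylval.val = strtod (text, NULL);
      return FLOAT;
    }
  yylval.intval = atoi (text);
  return INT;
}
```

The grammar would declare the matching types, for instance with `%token <intval> INT' and `%token <val> FLOAT', so that actions read the same member the lexer wrote.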
5553
5554 
5555 File: bison.info, Node: Token Locations, Next: Pure Calling, Prev: Token Values, Up: Lexical
5556
5557 4.6.3 Textual Locations of Tokens
5558 ---------------------------------
5559
5560 If you are using the `@N'-feature (*note Tracking Locations:
5561 Locations.) in actions to keep track of the textual locations of tokens
5562 and groupings, then you must provide this information in `yylex'. The
5563 function `yyparse' expects to find the textual location of a token just
5564 parsed in the global variable `yylloc'. So `yylex' must store the
5565 proper data in that variable.
5566
5567 By default, the value of `yylloc' is a structure and you need only
5568 initialize the members that are going to be used by the actions. The
5569 four members are called `first_line', `first_column', `last_line' and
5570 `last_column'. Note that the use of this feature makes the parser
5571 noticeably slower.
5572
5573 The data type of `yylloc' has the name `YYLTYPE'.
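
As an illustration of filling in the four members, the sketch below keeps `yylloc' up to date as a lexer consumes text. The structure is a stand-in for the default `YYLTYPE'; the helper `update_location' and its numbering convention (lines and columns start at 1, and the `last_' members point just past the token) are assumptions of this example, not something Bison imposes.

```c
#include <assert.h>

/* Stand-in for the default location type, with the four members the
   manual names.  A Bison-generated parser defines this itself.  */
typedef struct
{
  int first_line;
  int first_column;
  int last_line;
  int last_column;
} YYLTYPE;

YYLTYPE yylloc = { 1, 1, 1, 1 };

/* Hypothetical helper for `yylex': record the location of the token
   TEXT of length LEN.  The token starts where the previous one ended;
   columns restart at 1 after each newline.  */
void
update_location (char const *text, int len)
{
  int i;
  yylloc.first_line = yylloc.last_line;
  yylloc.first_column = yylloc.last_column;
  for (i = 0; i < len; i++)
    {
      if (text[i] == '\n')
        {
          yylloc.last_line++;
          yylloc.last_column = 1;
        }
      else
        yylloc.last_column++;
    }
}
```

A lexer would call the helper once per token, just before returning, so that `yyparse' finds the location of the token it is about to receive.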
5574
5575 
5576 File: bison.info, Node: Pure Calling, Prev: Token Locations, Up: Lexical
5577
5578 4.6.4 Calling Conventions for Pure Parsers
5579 ------------------------------------------
5580
5581 When you use the Bison declaration `%define api.pure' to request a
5582 pure, reentrant parser, the global communication variables `yylval' and
5583 `yylloc' cannot be used. (*Note A Pure (Reentrant) Parser: Pure Decl.)
5584 In such parsers the two global variables are replaced by pointers
5585 passed as arguments to `yylex'. You must declare them as shown here,
5586 and pass the information back by storing it through those pointers.
5587
5588 int
5589 yylex (YYSTYPE *lvalp, YYLTYPE *llocp)
5590 {
5591 ...
5592 *lvalp = value; /* Put value onto Bison stack. */
5593 return INT; /* Return the type of the token. */
5594 ...
5595 }
5596
5597 If the grammar file does not use the `@' constructs to refer to
5598 textual locations, then the type `YYLTYPE' will not be defined. In
5599 this case, omit the second argument; `yylex' will be called with only
5600 one argument.
5601
5602 If you wish to pass the additional parameter data to `yylex', use
5603 `%lex-param' just like `%parse-param' (*note Parser Function::).
5604
5605 -- Directive: %lex-param {ARGUMENT-DECLARATION}
5606 Declare that the braced-code ARGUMENT-DECLARATION is an additional
5607 `yylex' argument declaration.
5608
5609 For instance:
5610
5611 %parse-param {int *nastiness}
5612 %lex-param {int *nastiness}
5613 %parse-param {int *randomness}
5614
5615 results in the following signature:
5616
5617 int yylex (int *nastiness);
5618 int yyparse (int *nastiness, int *randomness);
5619
5620 If `%define api.pure' is added:
5621
5622 int yylex (YYSTYPE *lvalp, int *nastiness);
5623 int yyparse (int *nastiness, int *randomness);
5624
5625 and finally, if both `%define api.pure' and `%locations' are used:
5626
5627 int yylex (YYSTYPE *lvalp, YYLTYPE *llocp, int *nastiness);
5628 int yyparse (int *nastiness, int *randomness);
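
To make the pure convention concrete, here is a sketch of a `yylex' that keeps no global state at all. It assumes `%define api.pure' without `%locations', plus a hypothetical `%lex-param {char const **cursor}' that carries the input position; `YYSTYPE' and `NUM' are stand-ins for what Bison would define.

```c
#include <assert.h>
#include <ctype.h>

typedef int YYSTYPE;             /* Stand-in: the default value type.  */
#define NUM 258                  /* Hypothetical token code.  */

/* Pure `yylex' as it might look with `%define api.pure' and a
   hypothetical `%lex-param {char const **cursor}'.  No global state:
   the value goes out through LVALP, the input position lives in
   *CURSOR.  */
int
yylex (YYSTYPE *lvalp, char const **cursor)
{
  char const *p = *cursor;
  int token;

  while (*p == ' ')
    p++;

  if (*p == '\0')
    token = 0;                   /* End-of-input.  */
  else if (isdigit ((unsigned char) *p))
    {
      *lvalp = 0;
      while (isdigit ((unsigned char) *p))
        *lvalp = *lvalp * 10 + (*p++ - '0');
      token = NUM;
    }
  else
    token = (unsigned char) *p++;

  *cursor = p;
  return token;
}
```

Two scans can be interleaved freely because each one owns its cursor and value. For `yyparse' to have a `cursor' to pass along, the grammar would also need a matching `%parse-param {char const **cursor}'.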
5629
5630 
5631 File: bison.info, Node: Error Reporting, Next: Action Features, Prev: Lexical, Up: Interface
5632
5633 4.7 The Error Reporting Function `yyerror'
5634 ==========================================
5635
5636 The Bison parser detects a "syntax error" or "parse error" whenever it
5637 reads a token which cannot satisfy any syntax rule. An action in the
5638 grammar can also explicitly proclaim an error, using the macro
5639 `YYERROR' (*note Special Features for Use in Actions: Action Features.).
5640
5641 The Bison parser expects to report the error by calling an error
5642 reporting function named `yyerror', which you must supply. It is
5643 called by `yyparse' whenever a syntax error is found, and it receives
5644 one argument, the message string. For a syntax error, the string is
5645 normally `"syntax error"'.
5646
5647 If you invoke the directive `%error-verbose' in the Bison
5648 declarations section (*note The Bison Declarations Section: Bison
5649 Declarations.), then Bison provides a more verbose and specific error
5650 message string instead of just plain `"syntax error"'.
5651
5652 The parser can detect one other kind of error: memory exhaustion.
5653 This can happen when the input contains constructions that are very
5654 deeply nested. It isn't likely you will encounter this, since the Bison
5655 parser normally extends its stack automatically up to a very large
5656 limit. But if memory is exhausted, `yyparse' calls `yyerror' in the
5657 usual fashion, except that the argument string is `"memory exhausted"'.
5658
5659 In some cases diagnostics like `"syntax error"' are translated
5660 automatically from English to some other language before they are
5661 passed to `yyerror'. *Note Internationalization::.
5662
5663 The following definition suffices in simple programs:
5664
5665 void
5666 yyerror (char const *s)
5667 {
5668 fprintf (stderr, "%s\n", s);
5669 }
5670
5671 After `yyerror' returns to `yyparse', the latter will attempt error
5672 recovery if you have written suitable error recovery grammar rules
5673 (*note Error Recovery::). If recovery is impossible, `yyparse' will
5674 immediately return 1.
5675
5676 In location-tracking pure parsers, `yyerror' should have access to
5677 the current location. This is indeed the case for the GLR parsers,
5678 but not for the Yacc parser, for historical reasons. That is, if
5679 `%locations' and `%define api.pure' are both used, then the prototypes
5680 for `yyerror' are:
5681
5682 void yyerror (char const *msg); /* Yacc parsers. */
5683 void yyerror (YYLTYPE *locp, char const *msg); /* GLR parsers. */
5684
5685 If `%parse-param {int *nastiness}' is used, then:
5686
5687 void yyerror (int *nastiness, char const *msg); /* Yacc parsers. */
5688 void yyerror (int *nastiness, char const *msg); /* GLR parsers. */
5689
5690 Finally, GLR and Yacc parsers share the same `yyerror' calling
5691 convention for absolutely pure parsers, i.e., when the calling
5692 convention of `yylex' _and_ the calling convention of `yyparse' are
5693 both pure. For example:
5694
5695 /* Location tracking. */
5696 %locations
5697 /* Pure yylex. */
5698 %define api.pure
5699 %lex-param {int *nastiness}
5700 /* Pure yyparse. */
5701 %parse-param {int *nastiness}
5702 %parse-param {int *randomness}
5703
5704 results in the following signatures for all the parser kinds:
5705
5706 int yylex (YYSTYPE *lvalp, YYLTYPE *llocp, int *nastiness);
5707 int yyparse (int *nastiness, int *randomness);
5708 void yyerror (YYLTYPE *locp,
5709 int *nastiness, int *randomness,
5710 char const *msg);
5711
5712 The prototypes are only indications of how the code produced by Bison
5713 uses `yyerror'. Bison-generated code always ignores the returned
5714 value, so `yyerror' can return any type, including `void'. Also,
5715 `yyerror' can be a variadic function; that is why the message is always
5716 passed last.
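
Since the message comes last, a variadic `yyerror' can serve both the parser and your own actions. The sketch below is one way to write it; the `last_error' buffer is a hypothetical addition that keeps a copy of the most recent message.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical: keep a copy of the most recent message so that other
   code (or a test) can inspect it.  */
static char last_error[256];

/* A variadic `yyerror'.  Bison-generated code passes the message as
   the last fixed argument, so a plain call such as
   yyerror ("syntax error") still works; actions may add arguments
   matching the conversions in FORMAT.  */
void
yyerror (char const *format, ...)
{
  va_list args;
  va_start (args, format);
  vsnprintf (last_error, sizeof last_error, format, args);
  va_end (args);
  fprintf (stderr, "%s\n", last_error);
}
```

One caveat with this design: the parser's own message is used as the format string, so a message that happens to contain a `%' character (for example a verbose message naming a `'%'' token) would be misread. A grammar with such tokens may be better served by a non-variadic `yyerror'.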
5717
5718 Traditionally `yyerror' returns an `int' that is always ignored, but
5719 this is purely for historical reasons, and `void' is preferable since
5720 it more accurately describes the return type for `yyerror'.
5721
5722 The variable `yynerrs' contains the number of syntax errors reported
5723 so far. Normally this variable is global; but if you request a pure
5724 parser (*note A Pure (Reentrant) Parser: Pure Decl.) then it is a
5725 local variable which only the actions can access.
5726
5727 
5728 File: bison.info, Node: Action Features, Next: Internationalization, Prev: Error Reporting, Up: Interface
5729
5730 4.8 Special Features for Use in Actions
5731 =======================================
5732
5733 Here is a table of Bison constructs, variables and macros that are
5734 useful in actions.
5735
5736 -- Variable: $$
5737 Acts like a variable that contains the semantic value for the
5738 grouping made by the current rule. *Note Actions::.
5739
5740 -- Variable: $N
5741 Acts like a variable that contains the semantic value for the Nth
5742 component of the current rule. *Note Actions::.
5743
5744 -- Variable: $<TYPEALT>$
5745 Like `$$' but specifies alternative TYPEALT in the union specified
5746 by the `%union' declaration. *Note Data Types of Values in
5747 Actions: Action Types.
5748
5749 -- Variable: $<TYPEALT>N
5750 Like `$N' but specifies alternative TYPEALT in the union specified
5751 by the `%union' declaration. *Note Data Types of Values in
5752 Actions: Action Types.
5753
5754 -- Macro: YYABORT;
5755 Return immediately from `yyparse', indicating failure. *Note The
5756 Parser Function `yyparse': Parser Function.
5757
5758 -- Macro: YYACCEPT;
5759 Return immediately from `yyparse', indicating success. *Note The
5760 Parser Function `yyparse': Parser Function.
5761
5762 -- Macro: YYBACKUP (TOKEN, VALUE);
5763 Unshift a token. This macro is allowed only for rules that reduce
5764 a single value, and only when there is no lookahead token. It is
5765 also disallowed in GLR parsers. It installs a lookahead token
5766 with token type TOKEN and semantic value VALUE; then it discards
5767 the value that was going to be reduced by this rule.
5768
5769 If the macro is used when it is not valid, such as when there is a
5770 lookahead token already, then it reports a syntax error with a
5771 message `cannot back up' and performs ordinary error recovery.
5772
5773 In either case, the rest of the action is not executed.
5774
5775 -- Macro: YYEMPTY
5776 Value stored in `yychar' when there is no lookahead token.
5777
5778 -- Macro: YYEOF
5779 Value stored in `yychar' when the lookahead is the end of the input
5780 stream.
5781
5782 -- Macro: YYERROR;
5783 Cause an immediate syntax error. This statement initiates error
5784 recovery just as if the parser itself had detected an error;
5785 however, it does not call `yyerror', and does not print any
5786 message. If you want to print an error message, call `yyerror'
5787 explicitly before the `YYERROR;' statement. *Note Error
5788 Recovery::.
5789
5790 -- Macro: YYRECOVERING
5791 The expression `YYRECOVERING ()' yields 1 when the parser is
5792 recovering from a syntax error, and 0 otherwise. *Note Error
5793 Recovery::.
5794
5795 -- Variable: yychar
5796 Variable containing either the lookahead token, or `YYEOF' when the
5797 lookahead is the end of the input stream, or `YYEMPTY' when no
5798 lookahead has been performed so the next token is not yet known.
5799 Do not modify `yychar' in a deferred semantic action (*note GLR
5800 Semantic Actions::). *Note Lookahead Tokens: Lookahead.
5801
5802 -- Macro: yyclearin;
5803 Discard the current lookahead token. This is useful primarily in
5804 error rules. Do not invoke `yyclearin' in a deferred semantic
5805 action (*note GLR Semantic Actions::). *Note Error Recovery::.
5806
5807 -- Macro: yyerrok;
5808 Resume generating error messages immediately for subsequent syntax
5809 errors. This is useful primarily in error rules. *Note Error
5810 Recovery::.
5811
5812 -- Variable: yylloc
5813 Variable containing the lookahead token location when `yychar' is
5814 not set to `YYEMPTY' or `YYEOF'. Do not modify `yylloc' in a
5815 deferred semantic action (*note GLR Semantic Actions::). *Note
5816 Actions and Locations: Actions and Locations.
5817
5818 -- Variable: yylval
5819 Variable containing the lookahead token semantic value when
5820 `yychar' is not set to `YYEMPTY' or `YYEOF'. Do not modify
5821 `yylval' in a deferred semantic action (*note GLR Semantic
5822 Actions::). *Note Actions: Actions.
5823
5824 -- Value: @$
5825 Acts like a structure variable containing information on the
5826 textual location of the grouping made by the current rule. *Note
5827 Tracking Locations: Locations.
5828
5829
5830 -- Value: @N
5831 Acts like a structure variable containing information on the
5832 textual location of the Nth component of the current rule. *Note
5833 Tracking Locations: Locations.
5834
5835 
5836 File: bison.info, Node: Internationalization, Prev: Action Features, Up: Interface
5837
5838 4.9 Parser Internationalization
5839 ===============================
5840
5841 A Bison-generated parser can print diagnostics, including error and
5842 tracing messages. By default, they appear in English. However, Bison
5843 also supports outputting diagnostics in the user's native language. To
5844 make this work, the user should set the usual environment variables.
5845 *Note The User's View: (gettext)Users. For example, the shell command
5846 `export LC_ALL=fr_CA.UTF-8' might set the user's locale to French
5847 Canadian using the UTF-8 encoding. The exact set of available locales
5848 depends on the user's installation.
5849
5850 The maintainer of a package that uses a Bison-generated parser
5851 enables the internationalization of the parser's output through the
5852 following steps. Here we assume a package that uses GNU Autoconf and
5853 GNU Automake.
5854
5855 1. Into the directory containing the GNU Autoconf macros used by the
5856 package--often called `m4'--copy the `bison-i18n.m4' file
5857 installed by Bison under `share/aclocal/bison-i18n.m4' in Bison's
5858 installation directory. For example:
5859
5860 cp /usr/local/share/aclocal/bison-i18n.m4 m4/bison-i18n.m4
5861
5862 2. In the top-level `configure.ac', after the `AM_GNU_GETTEXT'
5863 invocation, add an invocation of `BISON_I18N'. This macro is
5864 defined in the file `bison-i18n.m4' that you copied earlier. It
5865 causes `configure' to find the value of the `BISON_LOCALEDIR'
5866 variable, and it defines the source-language symbol `YYENABLE_NLS'
5867 to enable translations in the Bison-generated parser.
5868
5869 3. In the `main' function of your program, designate the directory
5870 containing Bison's runtime message catalog, through a call to
5871 `bindtextdomain' with domain name `bison-runtime'. For example:
5872
5873 bindtextdomain ("bison-runtime", BISON_LOCALEDIR);
5874
5875 Typically this appears after any other call `bindtextdomain
5876 (PACKAGE, LOCALEDIR)' that your package already has. Here we rely
5877 on `BISON_LOCALEDIR' to be defined as a string through the
5878 `Makefile'.
5879
5880 4. In the `Makefile.am' that controls the compilation of the `main'
5881 function, make `BISON_LOCALEDIR' available as a C preprocessor
5882 macro, either in `DEFS' or in `AM_CPPFLAGS'. For example:
5883
5884 DEFS = @DEFS@ -DBISON_LOCALEDIR='"$(BISON_LOCALEDIR)"'
5885
5886 or:
5887
5888 AM_CPPFLAGS = -DBISON_LOCALEDIR='"$(BISON_LOCALEDIR)"'
5889
5890 5. Finally, invoke the command `autoreconf' to generate the build
5891 infrastructure.
5892
5893 
5894 File: bison.info, Node: Algorithm, Next: Error Recovery, Prev: Interface, Up: Top
5895
5896 5 The Bison Parser Algorithm
5897 ****************************
5898
5899 As Bison reads tokens, it pushes them onto a stack along with their
5900 semantic values. The stack is called the "parser stack". Pushing a
5901 token is traditionally called "shifting".
5902
5903 For example, suppose the infix calculator has read `1 + 5 *', with a
5904 `3' to come. The stack will have four elements, one for each token
5905 that was shifted.
5906
5907 But the stack does not always have an element for each token read.
5908 When the last N tokens and groupings shifted match the components of a
5909 grammar rule, they can be combined according to that rule. This is
5910 called "reduction". Those tokens and groupings are replaced on the
5911 stack by a single grouping whose symbol is the result (left hand side)
5912 of that rule. Running the rule's action is part of the process of
5913 reduction, because this is what computes the semantic value of the
5914 resulting grouping.
5915
5916 For example, if the infix calculator's parser stack contains this:
5917
5918 1 + 5 * 3
5919
5920 and the next input token is a newline character, then the last three
5921 elements can be reduced to 15 via the rule:
5922
5923 expr: expr '*' expr;
5924
5925 Then the stack contains just these three elements:
5926
5927 1 + 15
5928
5929 At this point, another reduction can be made, resulting in the single
5930 value 16. Then the newline token can be shifted.
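
The shifts and reductions just described can be replayed with an explicit stack. The sketch below is a toy, not Bison's algorithm: the `element' type and the functions `shift', `reduce' and `replay_example' are invented for the illustration, and the points at which `reduce' is called are hard-coded, whereas a real parser decides them from its state and the lookahead token.

```c
#include <assert.h>

/* One stack element: either an operator character or a value.  */
typedef struct
{
  char op;      /* '+' or '*', or 0 when the element is a value.  */
  int value;    /* Semantic value when OP is 0.  */
} element;

static element stack[16];
static int depth = 0;

/* Shift: push one element onto the parser stack.  */
static void
shift (char op, int value)
{
  assert (depth < 16);
  stack[depth].op = op;
  stack[depth].value = value;
  depth++;
}

/* Reduce by `expr: expr OP expr': replace the top three elements
   with one value, computed the way the rule's action would.  */
static void
reduce (void)
{
  int left, right;
  char op;
  assert (depth >= 3);
  left = stack[depth - 3].value;
  op = stack[depth - 2].op;
  right = stack[depth - 1].value;
  depth -= 3;
  shift (0, op == '*' ? left * right : left + right);
}

/* Replay the manual's example and return the final value.  */
int
replay_example (void)
{
  depth = 0;
  shift (0, 1); shift ('+', 0); shift (0, 5); shift ('*', 0);
  shift (0, 3);                 /* Stack: 1 + 5 * 3 (five elements).  */
  reduce ();                    /* Stack: 1 + 15 (three elements).  */
  reduce ();                    /* Stack: 16 (one element).  */
  return depth == 1 ? stack[0].value : -1;
}
```

Each call to `reduce' shrinks the stack by two elements, which is why the stack does not always hold one element per token read.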
5931
5932 The parser tries, by shifts and reductions, to reduce the entire
5933 input down to a single grouping whose symbol is the grammar's
5934 start-symbol (*note Languages and Context-Free Grammars: Language and
5935 Grammar.).
5936
5937 This kind of parser is known in the literature as a bottom-up parser.
5938
5939 * Menu:
5940
5941 * Lookahead:: Parser looks one token ahead when deciding what to do.
5942 * Shift/Reduce:: Conflicts: when either shifting or reduction is valid.
5943 * Precedence:: Operator precedence works by resolving conflicts.
5944 * Contextual Precedence:: When an operator's precedence depends on context.
5945 * Parser States:: The parser is a finite-state-machine with stack.
5946 * Reduce/Reduce:: When two rules are applicable in the same situation.
5947 * Mystery Conflicts:: Reduce/reduce conflicts that look unjustified.
5948 * Generalized LR Parsing:: Parsing arbitrary context-free grammars.
5949 * Memory Management:: What happens when memory is exhausted. How to avoid it.
5950
5951 
5952 File: bison.info, Node: Lookahead, Next: Shift/Reduce, Up: Algorithm
5953
5954 5.1 Lookahead Tokens
5955 ====================
5956
5957 The Bison parser does _not_ always reduce immediately as soon as the
5958 last N tokens and groupings match a rule. This is because such a
5959 simple strategy is inadequate to handle most languages. Instead, when a
5960 reduction is possible, the parser sometimes "looks ahead" at the next
5961 token in order to decide what to do.
5962
5963 When a token is read, it is not immediately shifted; first it
5964 becomes the "lookahead token", which is not on the stack. Now the
5965 parser can perform one or more reductions of tokens and groupings on
5966 the stack, while the lookahead token remains off to the side. When no
5967 more reductions should take place, the lookahead token is shifted onto
5968 the stack. This does not mean that all possible reductions have been
5969 done; depending on the token type of the lookahead token, some rules
5970 may choose to delay their application.
5971
5972 Here is a simple case where lookahead is needed. These three rules
5973 define expressions which contain binary addition operators and postfix
5974 unary factorial operators (`!'), and allow parentheses for grouping.
5975
5976 expr: term '+' expr
5977 | term
5978 ;
5979
5980 term: '(' expr ')'
5981 | term '!'
5982 | NUMBER
5983 ;
5984
5985 Suppose that the tokens `1 + 2' have been read and shifted; what
5986 should be done? If the following token is `)', then the first three
5987 tokens must be reduced to form an `expr'. This is the only valid
5988 course, because shifting the `)' would produce a sequence of symbols
5989 `term ')'', and no rule allows this.
5990
5991 If the following token is `!', then it must be shifted immediately so
5992 that `2 !' can be reduced to make a `term'. If instead the parser were
5993 to reduce before shifting, `1 + 2' would become an `expr'. It would
5994 then be impossible to shift the `!' because doing so would produce on
5995 the stack the sequence of symbols `expr '!''. No rule allows that
5996 sequence.
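
The two cases above amount to a decision keyed on the lookahead token.
The following C sketch is illustrative only: the `enum action' codes and
the function `after_term' are hypothetical names invented for this
example, not part of the code Bison generates. It models the choice the
parser faces once `2' has been reduced to a `term' and that `term' is on
top of the stack.

```c
#include <assert.h>

/* Hypothetical action codes for this sketch; Bison encodes actions
   differently in its generated tables. */
enum action { ACT_SHIFT, ACT_REDUCE, ACT_ERROR };

/* Decide what to do with a `term' on top of the stack, for the grammar
     expr: term '+' expr | term ;
     term: '(' expr ')' | term '!' | NUMBER ;
   `lookahead' is a character code, or 0 at end of input. */
enum action after_term(int lookahead)
{
    assert(lookahead >= 0);
    switch (lookahead) {
    case '!': return ACT_SHIFT;   /* term '!' extends the term */
    case '+': return ACT_SHIFT;   /* term '+' expr: an operand follows */
    case ')': return ACT_REDUCE;  /* term -> expr, then ')' can match */
    case 0:   return ACT_REDUCE;  /* end of input: term -> expr */
    default:  return ACT_ERROR;   /* nothing else may follow a term */
    }
}
```

Without the lookahead argument the function could not distinguish the
`!' case from the `)' case, which is the point of this section.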
5997
5998 The lookahead token is stored in the variable `yychar'. Its
5999 semantic value and location, if any, are stored in the variables
6000 `yylval' and `yylloc'. *Note Special Features for Use in Actions:
6001 Action Features.
6002
6003 
6004 File: bison.info, Node: Shift/Reduce, Next: Precedence, Prev: Lookahead, Up: Algorithm
6005
6006 5.2 Shift/Reduce Conflicts
6007 ==========================
6008
6009 Suppose we are parsing a language which has if-then and if-then-else
6010 statements, with a pair of rules like this:
6011
6012 if_stmt:
6013 IF expr THEN stmt
6014 | IF expr THEN stmt ELSE stmt
6015 ;
6016
6017 Here we assume that `IF', `THEN' and `ELSE' are terminal symbols for
6018 specific keyword tokens.
6019
6020 When the `ELSE' token is read and becomes the lookahead token, the
6021 contents of the stack (assuming the input is valid) are just right for
6022 reduction by the first rule. But it is also legitimate to shift the
6023 `ELSE', because that would lead to eventual reduction by the second
6024 rule.
6025
6026 This situation, where either a shift or a reduction would be valid,
6027 is called a "shift/reduce conflict". Bison is designed to resolve
6028 these conflicts by choosing to shift, unless otherwise directed by
6029 operator precedence declarations. To see the reason for this, let's
6030 contrast it with the other alternative.
6031
6032 Since the parser prefers to shift the `ELSE', the result is to attach
6033 the else-clause to the innermost if-statement, making these two inputs
6034 equivalent:
6035
6036 if x then if y then win (); else lose;
6037
6038 if x then do; if y then win (); else lose; end;
6039
6040 But if the parser chose to reduce when possible rather than shift,
6041 the result would be to attach the else-clause to the outermost
6042 if-statement, making these two inputs equivalent:
6043
6044 if x then if y then win (); else lose;
6045
6046 if x then do; if y then win (); end; else lose;
6047
6048 The conflict exists because the grammar as written is ambiguous:
6049 either parsing of the simple nested if-statement is legitimate. The
6050 established convention is that these ambiguities are resolved by
6051 attaching the else-clause to the innermost if-statement; this is what
6052 Bison accomplishes by choosing to shift rather than reduce. (It would
6053 ideally be cleaner to write an unambiguous grammar, but that is very
6054 hard to do in this case.) This particular ambiguity was first
6055 encountered in the specifications of Algol 60 and is called the
6056 "dangling `else'" ambiguity.
6057
6058 To avoid warnings from Bison about predictable, legitimate
6059 shift/reduce conflicts, use the `%expect N' declaration. There will be
6060 no warning as long as the number of shift/reduce conflicts is exactly N.
6061 *Note Suppressing Conflict Warnings: Expect Decl.
6062
6063 The definition of `if_stmt' above is solely to blame for the
6064 conflict, but the conflict does not actually appear without additional
6065 rules. Here is a complete Bison input file that actually manifests the
6066 conflict:
6067
6068 %token IF THEN ELSE variable
6069 %%
6070 stmt: expr
6071 | if_stmt
6072 ;
6073
6074 if_stmt:
6075 IF expr THEN stmt
6076 | IF expr THEN stmt ELSE stmt
6077 ;
6078
6079 expr: variable
6080 ;
6081
6082 
6083 File: bison.info, Node: Precedence, Next: Contextual Precedence, Prev: Shift/Reduce, Up: Algorithm
6084
6085 5.3 Operator Precedence
6086 =======================
6087
6088 Another situation where shift/reduce conflicts appear is in arithmetic
6089 expressions. Here shifting is not always the preferred resolution; the
6090 Bison declarations for operator precedence allow you to specify when to
6091 shift and when to reduce.
6092
6093 * Menu:
6094
6095 * Why Precedence:: An example showing why precedence is needed.
6096 * Using Precedence:: How to specify precedence in Bison grammars.
6097 * Precedence Examples:: How these features are used in the previous example.
6098 * How Precedence:: How they work.
6099
6100 
6101 File: bison.info, Node: Why Precedence, Next: Using Precedence, Up: Precedence
6102
6103 5.3.1 When Precedence is Needed
6104 -------------------------------
6105
6106 Consider the following ambiguous grammar fragment (ambiguous because the
6107 input `1 - 2 * 3' can be parsed in two different ways):
6108
6109 expr: expr '-' expr
6110 | expr '*' expr
6111 | expr '<' expr
6112 | '(' expr ')'
6113 ...
6114 ;
6115
6116 Suppose the parser has seen the tokens `1', `-' and `2'; should it
6117 reduce them via the rule for the subtraction operator? It depends on
6118 the next token. Of course, if the next token is `)', we must reduce;
6119 shifting is invalid because no single rule can reduce the token
6120 sequence `- 2 )' or anything starting with that. But if the next token
6121 is `*' or `<', we have a choice: either shifting or reduction would
6122 allow the parse to complete, but with different results.
6123
6124 To decide which one Bison should do, we must consider the results.
6125 If the next operator token OP is shifted, then it must be reduced first
6126 in order to permit another opportunity to reduce the difference. The
6127 result is (in effect) `1 - (2 OP 3)'. On the other hand, if the
6128 subtraction is reduced before shifting OP, the result is
6129 `(1 - 2) OP 3'. Clearly, then, the choice of shift or reduce should
6130 depend on the relative precedence of the operators `-' and OP: `*'
6131 should be shifted first, but not `<'.
6132
6133 What about input such as `1 - 2 - 5'; should this be `(1 - 2) - 5'
6134 or should it be `1 - (2 - 5)'? For most operators we prefer the
6135 former, which is called "left association". The latter alternative,
6136 "right association", is desirable for assignment operators. The choice
6137 of left or right association is a matter of whether the parser chooses
6138 to shift or reduce when the stack contains `1 - 2' and the lookahead
6139 token is `-': shifting makes right-associativity.
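
To make the difference concrete, the following C sketch evaluates a
chain of subtractions both ways. The function names `subtract_left' and
`subtract_right' are hypothetical and exist only for this illustration;
they are not part of Bison.

```c
#include <assert.h>
#include <stddef.h>

/* Left association: ((v[0] - v[1]) - v[2]) - ...
   This is the grouping obtained by reducing before shifting. */
int subtract_left(const int *v, size_t n)
{
    assert(n > 0);
    int acc = v[0];
    for (size_t i = 1; i < n; i++)
        acc = acc - v[i];
    return acc;
}

/* Right association: v[0] - (v[1] - (v[2] - ...))
   This is the grouping obtained by shifting before reducing. */
int subtract_right(const int *v, size_t n)
{
    assert(n > 0);
    int acc = v[n - 1];
    for (size_t i = n - 1; i-- > 0; )
        acc = v[i] - acc;
    return acc;
}
```

For the input `1 - 2 - 5', left association gives -6 and right
association gives 4.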
6140
6141 
6142 File: bison.info, Node: Using Precedence, Next: Precedence Examples, Prev: Why Precedence, Up: Precedence
6143
6144 5.3.2 Specifying Operator Precedence
6145 ------------------------------------
6146
6147 Bison allows you to specify these choices with the operator precedence
6148 declarations `%left' and `%right'. Each such declaration contains a
6149 list of tokens, which are operators whose precedence and associativity
6150 are being declared. The `%left' declaration makes all those operators
6151 left-associative and the `%right' declaration makes them
6152 right-associative. A third alternative is `%nonassoc', which declares
6153 that it is a syntax error to find the same operator twice "in a row".
6154
6155 The relative precedence of different operators is controlled by the
6156 order in which they are declared. The first `%left' or `%right'
6157 declaration in the file declares the operators whose precedence is
6158 lowest, the next such declaration declares the operators whose
6159 precedence is a little higher, and so on.
6160
6161 
6162 File: bison.info, Node: Precedence Examples, Next: How Precedence, Prev: Using Precedence, Up: Precedence
6163
6164 5.3.3 Precedence Examples
6165 -------------------------
6166
6167 In our example, we would want the following declarations:
6168
6169 %left '<'
6170 %left '-'
6171 %left '*'
6172
6173 In a more complete example, which supports other operators as well,
6174 we would declare them in groups of equal precedence. For example,
6175 `'+'' is declared with `'-'':
6176
6177 %left '<' '>' '=' NE LE GE
6178 %left '+' '-'
6179 %left '*' '/'
6180
6181 (Here `NE' and so on stand for the operators for "not equal" and so on.
6182 We assume that these tokens are more than one character long and
6183 therefore are represented by names, not character literals.)
6184
6185 
6186 File: bison.info, Node: How Precedence, Prev: Precedence Examples, Up: Precedence
6187
6188 5.3.4 How Precedence Works
6189 --------------------------
6190
6191 The first effect of the precedence declarations is to assign precedence
6192 levels to the terminal symbols declared. The second effect is to assign
6193 precedence levels to certain rules: each rule gets its precedence from
6194 the last terminal symbol mentioned in the components. (You can also
6195 specify explicitly the precedence of a rule. *Note Context-Dependent
6196 Precedence: Contextual Precedence.)
6197
6198 Finally, the resolution of conflicts works by comparing the
6199 precedence of the rule being considered with that of the lookahead
6200 token. If the token's precedence is higher, the choice is to shift.
6201 If the rule's precedence is higher, the choice is to reduce. If they
6202 have equal precedence, the choice is made based on the associativity of
6203 that precedence level. The verbose output file made by `-v' (*note
6204 Invoking Bison: Invocation.) says how each conflict was resolved.
6205
6206 Not all rules and not all tokens have precedence. If either the
6207 rule or the lookahead token has no precedence, then the default is to
6208 shift.
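
The resolution procedure described above can be sketched as a small C
function. This is a model of the rules as stated in this section, not
Bison's actual implementation; the names `resolve', `enum assoc' and
`enum choice' are hypothetical, and representing "no precedence" as
level 0 is an assumption made for the sketch.

```c
#include <assert.h>

enum assoc { ASSOC_LEFT, ASSOC_RIGHT, ASSOC_NONE };
enum choice { CHOOSE_SHIFT, CHOOSE_REDUCE, CHOOSE_ERROR };

/* Precedence levels are positive integers, higher binding tighter.
   Level 0 stands for "no precedence declared" (an assumption of this
   sketch).  `level_assoc' is only consulted when the levels are equal. */
enum choice resolve(int rule_prec, int token_prec, enum assoc level_assoc)
{
    assert(rule_prec >= 0 && token_prec >= 0);

    if (rule_prec == 0 || token_prec == 0)
        return CHOOSE_SHIFT;            /* the default is to shift */
    if (token_prec > rule_prec)
        return CHOOSE_SHIFT;            /* lookahead binds tighter */
    if (rule_prec > token_prec)
        return CHOOSE_REDUCE;           /* rule binds tighter */

    switch (level_assoc) {              /* equal precedence */
    case ASSOC_LEFT:  return CHOOSE_REDUCE;
    case ASSOC_RIGHT: return CHOOSE_SHIFT;
    default:          return CHOOSE_ERROR;  /* %nonassoc */
    }
}
```

With `<' at level 1, `-' at level 2 and `*' at level 3, the rule for
subtraction (level 2) is reduced when the lookahead is `<' or `-' and
the lookahead is shifted when it is `*'.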
6209
6210 
6211 File: bison.info, Node: Contextual Precedence, Next: Parser States, Prev: Precedence, Up: Algorithm
6212
6213 5.4 Context-Dependent Precedence
6214 ================================
6215
6216 Often the precedence of an operator depends on the context. This sounds
6217 outlandish at first, but it is really very common. For example, a minus
6218 sign typically has a very high precedence as a unary operator, and a
6219 somewhat lower precedence (lower than multiplication) as a binary
6220 operator.
6221
6222 The Bison precedence declarations, `%left', `%right' and
6223 `%nonassoc', can only be used once for a given token; so a token has
6224 only one precedence declared in this way. For context-dependent
6225 precedence, you need to use an additional mechanism: the `%prec'
6226 modifier for rules.
6227
6228 The `%prec' modifier declares the precedence of a particular rule by
6229 specifying a terminal symbol whose precedence should be used for that
6230 rule. It's not necessary for that symbol to appear otherwise in the
6231 rule. The modifier's syntax is:
6232
6233 %prec TERMINAL-SYMBOL
6234
6235 and it is written after the components of the rule. Its effect is to
6236 assign the rule the precedence of TERMINAL-SYMBOL, overriding the
6237 precedence that would be deduced for it in the ordinary way. The
6238 altered rule precedence then affects how conflicts involving that rule
6239 are resolved (*note Operator Precedence: Precedence.).
6240
6241 Here is how `%prec' solves the problem of unary minus. First,
6242 declare a precedence for a fictitious terminal symbol named `UMINUS'.
6243 There are no tokens of this type, but the symbol serves to stand for its
6244 precedence:
6245
6246 ...
6247 %left '+' '-'
6248 %left '*'
6249 %left UMINUS
6250
6251 Now the precedence of `UMINUS' can be used in specific rules:
6252
6253 exp: ...
6254 | exp '-' exp
6255 ...
6256 | '-' exp %prec UMINUS
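
The effect of `%prec' on conflict resolution can be modelled in a few
lines of C. This is a sketch under stated assumptions: the numeric
levels and the names `rule_precedence' and `prefers_shift' are
hypothetical, chosen to mirror the three `%left' declarations above, and
ties are resolved as for `%left'.

```c
#include <assert.h>

/* Hypothetical levels matching the order of the declarations above:
   '+' and '-' lowest, then '*', then UMINUS. */
enum { PREC_ADDITIVE = 1, PREC_MULTIPLICATIVE = 2, PREC_UMINUS = 3 };

/* A rule takes the precedence named by %prec when one is given
   (nonzero here), otherwise that of its last terminal symbol. */
int rule_precedence(int last_terminal_prec, int prec_override)
{
    assert(last_terminal_prec >= 0 && prec_override >= 0);
    return prec_override != 0 ? prec_override : last_terminal_prec;
}

/* Nonzero if the lookahead should be shifted instead of reducing the
   rule.  Equal levels reduce, as for left-associative operators. */
int prefers_shift(int rule_prec, int token_prec)
{
    return token_prec > rule_prec;
}
```

For the rule `'-' exp' with lookahead `*': without `%prec' the rule has
the precedence of `-', so `*' is shifted and the input groups as
`- (a * b)'. With `%prec UMINUS' the rule outranks `*', so it is reduced
first and the input groups as `(- a) * b'.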
6257
6258 
6259 File: bison.info, Node: Parser States, Next: Reduce/Reduce, Prev: Contextual Precedence, Up: Algorithm
6260
6261 5.5 Parser States
6262 =================
6263
6264 The function `yyparse' is implemented using a finite-state machine.
6265 The values pushed on the parser stack are not simply token type codes;
6266 they represent the entire sequence of terminal and nonterminal symbols
6267 at or near the top of the stack. The current state collects all the
6268 information about previous input which is relevant to deciding what to
6269 do next.
6270
6271 Each time a lookahead token is read, the current parser state
6272 together with the type of lookahead token are looked up in a table.
6273 This table entry can say, "Shift the lookahead token." In this case,
6274 it also specifies the new parser state, which is pushed onto the top of
6275 the parser stack. Or it can say, "Reduce using rule number N." This
6276 means that a certain number of tokens or groupings are taken off the
6277 top of the stack, and replaced by one grouping. In other words, that
6278 number of states are popped from the stack, and one new state is pushed.
6279
6280 There is one other alternative: the table can say that the lookahead
6281 token is erroneous in the current state. This causes error processing
6282 to begin (*note Error Recovery::).
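
The shift, reduce and error alternatives can be seen in a hand-written
recognizer for a toy grammar. The sketch below is not output from Bison:
the grammar `S: '(' S ')' | 'x' ;', the state numbers, the function name
`parse' and the fixed `STACK_LIMIT' are all assumptions made for the
example, and the table is written out as a `switch' for readability.

```c
#include <assert.h>
#include <stddef.h>

enum { STACK_LIMIT = 64 };  /* fixed depth, chosen for this sketch */

/* Recognize the language of  S: '(' S ')' | 'x' ;
   Returns 1 if `input' is accepted, 0 on a syntax error or overflow.
   States: 0 start, 1 S seen at top level, 2 after '(', 3 after 'x',
           4 after '(' S, 5 after '(' S ')'. */
int parse(const char *input)
{
    int stack[STACK_LIMIT];     /* the stack holds states, not tokens */
    int top = 0;
    size_t pos = 0;

    stack[0] = 0;
    for (;;) {
        int state = stack[top];
        char lookahead = input[pos];   /* '\0' marks end of input */
        int rule_len;

        switch (state) {
        case 0:
        case 2:                        /* shift '(' or 'x' */
            if (lookahead != '(' && lookahead != 'x')
                return 0;              /* erroneous lookahead */
            if (top + 1 >= STACK_LIMIT)
                return 0;
            stack[++top] = (lookahead == '(') ? 2 : 3;
            pos++;
            continue;
        case 1:                        /* accept only at end of input */
            return lookahead == '\0';
        case 3:                        /* reduce by S: 'x' */
            rule_len = 1;
            break;
        case 4:                        /* shift ')' */
            if (lookahead != ')')
                return 0;
            if (top + 1 >= STACK_LIMIT)
                return 0;
            stack[++top] = 5;
            pos++;
            continue;
        case 5:                        /* reduce by S: '(' S ')' */
            rule_len = 3;
            break;
        default:
            return 0;
        }

        /* Reduce: pop one state per right-hand-side symbol, then push
           the state reached on S from the state now exposed. */
        top -= rule_len;
        assert(top >= 0);
        int exposed = stack[top];
        stack[++top] = (exposed == 0) ? 1 : 4;
    }
}
```

Under these assumptions, `parse ("((x))")' accepts, while `parse ("(x")'
and `parse ("x)")' both stop on an erroneous lookahead.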
6283
6284 
6285 File: bison.info, Node: Reduce/Reduce, Next: Mystery Conflicts, Prev: Parser States, Up: Algorithm
6286
6287 5.6 Reduce/Reduce Conflicts
6288 ===========================
6289
6290 A reduce/reduce conflict occurs if there are two or more rules that
6291 apply to the same sequence of input. This usually indicates a serious
6292 error in the grammar.
6293
6294 For example, here is an erroneous attempt to define a sequence of
6295 zero or more `word' groupings.
6296
6297 sequence: /* empty */
6298 { printf ("empty sequence\n"); }
6299 | maybeword
6300 | sequence word
6301 { printf ("added word %s\n", $2); }
6302 ;
6303
6304 maybeword: /* empty */
6305 { printf ("empty maybeword\n"); }
6306 | word
6307 { printf ("single word %s\n", $1); }
6308 ;
6309
6310 The error is an ambiguity: there is more than one way to parse a single
6311 `word' into a `sequence'. It could be reduced to a `maybeword' and
6312 then into a `sequence' via the second rule. Alternatively,
6313 nothing-at-all could be reduced into a `sequence' via the first rule,
6314 and this could be combined with the `word' using the third rule for
6315 `sequence'.
6316
6317 There is also more than one way to reduce nothing-at-all into a
6318 `sequence'. This can be done directly via the first rule, or
6319 indirectly via `maybeword' and then the second rule.
6320
6321 You might think that this is a distinction without a difference,
6322 because it does not change whether any particular input is valid or
6323 not. But it does affect which actions are run. One parsing order runs
6324 the second rule's action; the other runs the first rule's action and
6325 the third rule's action. In this example, the output of the program
6326 changes.
6327
6328 Bison resolves a reduce/reduce conflict by choosing to use the rule
6329 that appears first in the grammar, but it is very risky to rely on
6330 this. Every reduce/reduce conflict must be studied and usually
6331 eliminated. Here is the proper way to define `sequence':
6332
6333 sequence: /* empty */
6334 { printf ("empty sequence\n"); }
6335 | sequence word
6336 { printf ("added word %s\n", $2); }
6337 ;
6338
6339 Here is another common error that yields a reduce/reduce conflict:
6340
6341 sequence: /* empty */
6342 | sequence words
6343 | sequence redirects
6344 ;
6345
6346 words: /* empty */
6347 | words word
6348 ;
6349
6350 redirects:/* empty */
6351 | redirects redirect
6352 ;
6353
6354 The intention here is to define a sequence which can contain either
6355 `word' or `redirect' groupings. The individual definitions of
6356 `sequence', `words' and `redirects' are error-free, but the three
6357 together make a subtle ambiguity: even an empty input can be parsed in
6358 infinitely many ways!
6359
6360 Consider: nothing-at-all could be a `words'. Or it could be two
6361 `words' in a row, or three, or any number. It could equally well be a
6362 `redirects', or two, or any number. Or it could be a `words' followed
6363 by three `redirects' and another `words'. And so on.
6364
6365 Here are two ways to correct these rules. First, to make it a
6366 single level of sequence:
6367
6368 sequence: /* empty */
6369 | sequence word
6370 | sequence redirect
6371 ;
6372
6373 Second, to prevent either a `words' or a `redirects' from being
6374 empty:
6375
6376 sequence: /* empty */
6377 | sequence words
6378 | sequence redirects
6379 ;
6380
6381 words: word
6382 | words word
6383 ;
6384
6385 redirects:redirect
6386 | redirects redirect
6387 ;
6388
6389 
6390 File: bison.info, Node: Mystery Conflicts, Next: Generalized LR Parsing, Prev: Reduce/Reduce, Up: Algorithm
6391
6392 5.7 Mysterious Reduce/Reduce Conflicts
6393 ======================================
6394
6395 Sometimes reduce/reduce conflicts can occur that don't look warranted.
6396 Here is an example:
6397
6398 %token ID
6399
6400 %%
6401 def: param_spec return_spec ','
6402 ;
6403 param_spec:
6404 type
6405 | name_list ':' type
6406 ;
6407 return_spec:
6408 type
6409 | name ':' type
6410 ;
6411 type: ID
6412 ;
6413 name: ID
6414 ;
6415 name_list:
6416 name
6417 | name ',' name_list
6418 ;
6419
6420 It would seem that this grammar can be parsed with only a single
6421 token of lookahead: when a `param_spec' is being read, an `ID' is a
6422 `name' if a comma or colon follows, or a `type' if another `ID'
6423 follows. In other words, this grammar is LR(1).
6424
6425 However, Bison, like most parser generators, cannot actually handle
6426 all LR(1) grammars. In this grammar, two contexts, that after an `ID'
6427 at the beginning of a `param_spec' and likewise at the beginning of a
6428 `return_spec', are similar enough that Bison assumes they are the same.
6429 They appear similar because the same set of rules would be active--the
6430 rule for reducing to a `name' and that for reducing to a `type'. Bison
6431 is unable to determine at that stage of processing that the rules would
6432 require different lookahead tokens in the two contexts, so it makes a
6433 single parser state for them both. Combining the two contexts causes a
6434 conflict later. In parser terminology, this occurrence means that the
6435 grammar is not LALR(1).
6436
6437 In general, it is better to fix deficiencies than to document them.
6438 But this particular deficiency is intrinsically hard to fix; parser
6439 generators that can handle LR(1) grammars are hard to write and tend to
6440 produce parsers that are very large. In practice, Bison is more useful
6441 as it is now.
6442
6443 When the problem arises, you can often fix it by identifying the two
6444 parser states that are being confused, and adding something to make them
6445 look distinct. In the above example, adding one rule to `return_spec'
6446 as follows makes the problem go away:
6447
6448 %token BOGUS
6449 ...
6450 %%
6451 ...
6452 return_spec:
6453 type
6454 | name ':' type
6455 /* This rule is never used. */
6456 | ID BOGUS
6457 ;
6458
6459 This corrects the problem because it introduces the possibility of an
6460 additional active rule in the context after the `ID' at the beginning of
6461 `return_spec'. This rule is not active in the corresponding context in
6462 a `param_spec', so the two contexts receive distinct parser states. As
6463 long as the token `BOGUS' is never generated by `yylex', the added rule
6464 cannot alter the way actual input is parsed.
6465
6466 In this particular example, there is another way to solve the
6467 problem: rewrite the rule for `return_spec' to use `ID' directly
6468 instead of via `name'. This also causes the two confusing contexts to
6469 have different sets of active rules, because the one for `return_spec'
6470 activates the altered rule for `return_spec' rather than the one for
6471 `name'.
6472
6473 param_spec:
6474 type
6475 | name_list ':' type
6476 ;
6477 return_spec:
6478 type
6479 | ID ':' type
6480 ;
6481
6482 For a more detailed exposition of LALR(1) parsers and parser
6483 generators, please see: Frank DeRemer and Thomas Pennello, Efficient
6484 Computation of LALR(1) Look-Ahead Sets, `ACM Transactions on
6485 Programming Languages and Systems', Vol. 4, No. 4 (October 1982), pp.
6486 615-649 `http://doi.acm.org/10.1145/69622.357187'.
6487
6488 
6489 File: bison.info, Node: Generalized LR Parsing, Next: Memory Management, Prev: Mystery Conflicts, Up: Algorithm
6490
6491 5.8 Generalized LR (GLR) Parsing
6492 ================================
6493
6494 Bison produces _deterministic_ parsers that choose uniquely when to
6495 reduce and which reduction to apply based on a summary of the preceding
6496 input and on one extra token of lookahead. As a result, normal Bison
6497 handles a proper subset of the family of context-free languages.
6498 Ambiguous grammars, since they have strings with more than one possible
6499 sequence of reductions, cannot have deterministic parsers in this sense.
6500 The same is true of languages that require more than one symbol of
6501 lookahead, since the parser lacks the information necessary to make a
6502 decision at the point it must be made in a shift-reduce parser.
6503 Finally, as previously mentioned (*note Mystery Conflicts::), there are
6504 languages where Bison's particular choice of how to summarize the input
6505 seen so far loses necessary information.
6506
6507 When you use the `%glr-parser' declaration in your grammar file,
6508 Bison generates a parser that uses a different algorithm, called
6509 Generalized LR (or GLR). A Bison GLR parser uses the same basic
6510 algorithm for parsing as an ordinary Bison parser, but behaves
6511 differently in cases where there is a shift-reduce conflict that has not
6512 been resolved by precedence rules (*note Precedence::) or a
6513 reduce-reduce conflict. When a GLR parser encounters such a situation,
6514 it effectively _splits_ into several parsers, one for each possible
6515 shift or reduction. These parsers then proceed as usual, consuming
6516 tokens in lock-step. Some of the stacks may encounter other conflicts
6517 and split further, with the result that instead of a sequence of states,
6518 a Bison GLR parsing stack is in effect a tree of states.
6519
6520 In effect, each stack represents a guess as to what the proper parse
6521 is. Additional input may indicate that a guess was wrong, in which case
6522 the appropriate stack silently disappears. Otherwise, the semantic
6523 actions generated in each stack are saved, rather than being executed
6524 immediately. When a stack disappears, its saved semantic actions never
6525 get executed. When a reduction causes two stacks to become equivalent,
6526 their sets of semantic actions are both saved with the state that
6527 results from the reduction. We say that two stacks are equivalent when
6528 they both represent the same sequence of states, and each pair of
6529 corresponding states represents a grammar symbol that produces the same
6530 segment of the input token stream.
6531
6532 Whenever the parser makes a transition from having multiple states
6533 to having one, it reverts to the normal LALR(1) parsing algorithm,
6534 after resolving and executing the saved-up actions. At this
6535 transition, some of the states on the stack will have semantic values
6536 that are sets (actually multisets) of possible actions. The parser
6537 tries to pick one of the actions by first finding one whose rule has
6538 the highest dynamic precedence, as set by the `%dprec' declaration.
6539 Otherwise, if the alternative actions are not ordered by precedence,
6540 but the same merging function is declared for both rules by the
6541 `%merge' declaration, Bison resolves and evaluates both and then calls
6542 the merge function on the result. Otherwise, it reports an ambiguity.
6543
6544 It is possible to use a data structure for the GLR parsing tree that
6545 permits the processing of any LALR(1) grammar in linear time (in the
6546 size of the input), any unambiguous (not necessarily LALR(1)) grammar in
6547 quadratic worst-case time, and any general (possibly ambiguous)
6548 context-free grammar in cubic worst-case time. However, Bison currently
6549 uses a simpler data structure that requires time proportional to the
6550 length of the input times the maximum number of stacks required for any
6551 prefix of the input. Thus, really ambiguous or nondeterministic
6552 grammars can require exponential time and space to process. Such badly
6553 behaving examples, however, are not generally of practical interest.
6554 Usually, nondeterminism in a grammar is local--the parser is "in doubt"
6555 only for a few tokens at a time. Therefore, the current data structure
6556 should generally be adequate. On LALR(1) portions of a grammar, in
6557 particular, it is only slightly slower than with the default Bison
6558 parser.
6559
6560 For a more detailed exposition of GLR parsers, please see: Elizabeth
6561 Scott, Adrian Johnstone and Shamsa Sadaf Hussain, Tomita-Style
6562 Generalised LR Parsers, Royal Holloway, University of London,
6563 Department of Computer Science, TR-00-12,
6564 `http://www.cs.rhul.ac.uk/research/languages/publications/tomita_style_1.ps',
6565 (2000-12-24).
6566
6567 
6568 File: bison.info, Node: Memory Management, Prev: Generalized LR Parsing, Up: Algorithm
6569
6570 5.9 Memory Management, and How to Avoid Memory Exhaustion
6571 =========================================================
6572
6573 The Bison parser stack can run out of memory if too many tokens are
6574 shifted and not reduced. When this happens, the parser function
6575 `yyparse' calls `yyerror' and then returns 2.
6576
6577 Because Bison parsers have growing stacks, hitting the upper limit
6578 usually results from using a right recursion instead of a left
6579 recursion. *Note Recursive Rules: Recursion.
6580
6581 By defining the macro `YYMAXDEPTH', you can control how deep the
6582 parser stack can become before memory is exhausted. Define the macro
6583 with a value that is an integer. This value is the maximum number of
6584 tokens that can be shifted (and not reduced) before overflow.
6585
6586 The stack space allowed is not necessarily allocated. If you
6587 specify a large value for `YYMAXDEPTH', the parser normally allocates a
6588 small stack at first, and then makes it bigger by stages as needed.
6589 This increasing allocation happens automatically and silently.
6590 Therefore, you do not need to make `YYMAXDEPTH' painfully small merely
6591 to save space for ordinary inputs that do not need much stack.
6592
6593 However, do not allow `YYMAXDEPTH' to be a value so large that
6594 arithmetic overflow could occur when calculating the size of the stack
6595 space. Also, do not allow `YYMAXDEPTH' to be less than `YYINITDEPTH'.
6596
6597 The default value of `YYMAXDEPTH', if you do not define it, is 10000.
6598
6599 You can control how much stack is allocated initially by defining the
6600 macro `YYINITDEPTH' to a positive integer. For the C LALR(1) parser,
6601 this value must be a compile-time constant unless you are assuming C99
6602 or some other target language or compiler that allows variable-length
6603 arrays. The default is 200.
6604
6605 Do not allow `YYINITDEPTH' to be greater than `YYMAXDEPTH'.
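
The interplay of the two limits can be sketched in C as a stack that
starts small and grows by stages up to a ceiling. This is an
illustration only, not the code Bison generates: the names
`state_stack', `stack_init', `stack_push' and `stack_free' are
hypothetical, `INIT_DEPTH' and `MAX_DEPTH' merely stand in for
`YYINITDEPTH' and `YYMAXDEPTH' with their default values, and doubling
the capacity at each stage is an assumption of the sketch.

```c
#include <assert.h>
#include <stdlib.h>

#define INIT_DEPTH 200      /* stands in for YYINITDEPTH's default */
#define MAX_DEPTH  10000    /* stands in for YYMAXDEPTH's default */

struct state_stack {
    int *items;
    size_t size;
    size_t capacity;
};

/* Allocate the initial, small stack.  Returns 0 on success, 2 if
   memory is exhausted (mirroring the value `yyparse' returns). */
int stack_init(struct state_stack *s)
{
    s->items = malloc(INIT_DEPTH * sizeof *s->items);
    s->size = 0;
    s->capacity = s->items ? INIT_DEPTH : 0;
    return s->items ? 0 : 2;
}

/* Push one entry, growing silently up to MAX_DEPTH.  Returns 0 on
   success, 2 once the limit is reached or allocation fails. */
int stack_push(struct state_stack *s, int value)
{
    assert(s->items != NULL && s->size <= s->capacity);
    if (s->size == s->capacity) {
        if (s->capacity >= MAX_DEPTH)
            return 2;                       /* hit the upper limit */
        size_t new_capacity = s->capacity * 2;
        if (new_capacity > MAX_DEPTH)
            new_capacity = MAX_DEPTH;
        int *grown = realloc(s->items, new_capacity * sizeof *grown);
        if (grown == NULL)
            return 2;                       /* allocation failed */
        s->items = grown;
        s->capacity = new_capacity;
    }
    s->items[s->size++] = value;
    return 0;
}

void stack_free(struct state_stack *s)
{
    free(s->items);
    s->items = NULL;
    s->size = s->capacity = 0;
}
```

In this model only `INIT_DEPTH' entries are allocated up front, so a
generous `MAX_DEPTH' costs nothing for inputs that stay shallow.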
6606
6607 Because of semantic differences between C and C++, the LALR(1)
6608 parsers in C produced by Bison cannot grow when compiled by C++
6609 compilers. In this precise case (compiling a C parser as C++), you
6610 should increase `YYINITDEPTH'. The Bison maintainers hope to fix
6611 this deficiency in a future release.
6612
6613 
6614 File: bison.info, Node: Error Recovery, Next: Context Dependency, Prev: Algorithm, Up: Top
6615
6616 6 Error Recovery
6617 ****************
6618
6619 It is not usually acceptable to have a program terminate on a syntax
6620 error. For example, a compiler should recover sufficiently to parse the
6621 rest of the input file and check it for errors; a calculator should
6622 accept another expression.
6623
6624 In a simple interactive command parser where each input is one line,
6625 it may be sufficient to allow `yyparse' to return 1 on error and have
6626 the caller ignore the rest of the input line when that happens (and
6627 then call `yyparse' again). But this is inadequate for a compiler,
6628 because it forgets all the syntactic context leading up to the error.
6629 A syntax error deep within a function in the compiler input should not
6630 cause the compiler to treat the following line like the beginning of a
6631 source file.
6632
6633 You can define how to recover from a syntax error by writing rules to
6634 recognize the special token `error'. This is a terminal symbol that is
6635 always defined (you need not declare it) and reserved for error
6636 handling. The Bison parser generates an `error' token whenever a
6637 syntax error happens; if you have provided a rule to recognize this
6638 token in the current context, the parse can continue.
6639
6640 For example:
6641
6642 stmnts: /* empty string */
6643 | stmnts '\n'
6644 | stmnts exp '\n'
6645 | stmnts error '\n'
6646
6647 The fourth rule in this example says that an error followed by a
6648 newline makes a valid addition to any `stmnts'.
6649
6650 What happens if a syntax error occurs in the middle of an `exp'? The
6651 error recovery rule, interpreted strictly, applies to the precise
6652 sequence of a `stmnts', an `error' and a newline. If an error occurs in
6653 the middle of an `exp', there will probably be some additional tokens
6654 and subexpressions on the stack after the last `stmnts', and there will
6655 be tokens to read before the next newline. So the rule is not
6656 applicable in the ordinary way.
6657
6658 But Bison can force the situation to fit the rule, by discarding
6659 part of the semantic context and part of the input. First it discards
6660 states and objects from the stack until it gets back to a state in
6661 which the `error' token is acceptable. (This means that the
6662 subexpressions already parsed are discarded, back to the last complete
6663 `stmnts'.) At this point the `error' token can be shifted. Then, if
6664 the old lookahead token is not acceptable to be shifted next, the
6665 parser reads tokens and discards them until it finds a token which is
6666 acceptable. In this example, Bison reads and discards input until the
6667 next newline so that the fourth rule can apply. Note that discarded
6668 symbols are possible sources of memory leaks; see *Note Freeing
6669 Discarded Symbols: Destructor Decl, for a means to reclaim this memory.
6670
6671 The choice of error rules in the grammar is a choice of strategies
6672 for error recovery. A simple and useful strategy is simply to skip the
6673 rest of the current input line or current statement if an error is
6674 detected:
6675
6676 stmnt: error ';' /* On error, skip until ';' is read. */
6677
6678 It is also useful to recover to the matching close-delimiter of an
6679 opening-delimiter that has already been parsed. Otherwise the
6680 close-delimiter will probably appear to be unmatched, and generate
6681 another, spurious error message:
6682
6683 primary: '(' expr ')'
6684 | '(' error ')'
6685 ...
6686 ;
6687
6688 Error recovery strategies are necessarily guesses. When they guess
6689 wrong, one syntax error often leads to another. In the above example,
6690 the error recovery rule guesses that an error is due to bad input
6691 within one `stmnt'. Suppose that instead a spurious semicolon is
6692 inserted in the middle of a valid `stmnt'. After the error recovery
6693 rule recovers from the first error, another syntax error will be found
6694 straightaway, since the text following the spurious semicolon is also
6695 an invalid `stmnt'.
6696
6697 To prevent an outpouring of error messages, the parser will output
6698 no error message for another syntax error that happens shortly after
6699 the first; only after three consecutive input tokens have been
6700 successfully shifted will error messages resume.
6701
6702 Note that rules which accept the `error' token may have actions, just
6703 as any other rules can.
6704
6705 You can make error messages resume immediately by using the macro
6706 `yyerrok' in an action. If you do this in the error rule's action, no
6707 error messages will be suppressed. This macro requires no arguments;
6708 `yyerrok;' is a valid C statement.
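The suppression rule and the effect of `yyerrok' described above can be modeled with a small counter. The following C fragment is only a sketch of that bookkeeping, not code taken from the generated parser; the type `err_state' and the functions `on_shift', `on_syntax_error' and `on_yyerrok' are hypothetical names chosen for illustration.

```c
/* Hypothetical model of the error-message suppression described above.
   None of these names come from the generated parser.  */
struct err_state
{
  int quiet_shifts_left;  /* shifts still needed before messages resume */
  int reported;           /* number of messages actually printed */
};

/* A token was shifted successfully.  */
static void
on_shift (struct err_state *s)
{
  if (s->quiet_shifts_left > 0)
    s->quiet_shifts_left--;
}

/* A syntax error was detected: report it only if we are not still
   inside the quiet period of an earlier error.  */
static void
on_syntax_error (struct err_state *s)
{
  if (s->quiet_shifts_left == 0)
    s->reported++;
  s->quiet_shifts_left = 3;
}

/* Model of `yyerrok;': messages resume immediately.  */
static void
on_yyerrok (struct err_state *s)
{
  s->quiet_shifts_left = 0;
}
```

In this model a second error that arrives after only one shift is silent, while an error that follows three shifts, or a call to `on_yyerrok', is reported again.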
6709
6710 The previous lookahead token is reanalyzed immediately after an
6711 error. If this is unacceptable, then the macro `yyclearin' may be used
6712 to clear this token. Write the statement `yyclearin;' in the error
6713 rule's action. *Note Special Features for Use in Actions: Action
6714 Features.
6715
6716 For example, suppose that on a syntax error, an error handling
6717 routine is called that advances the input stream to some point where
6718 parsing should once again commence. The next symbol returned by the
6719 lexical scanner is probably correct. The previous lookahead token
6720 ought to be discarded with `yyclearin;'.
6721
6722 The expression `YYRECOVERING ()' yields 1 when the parser is
6723 recovering from a syntax error, and 0 otherwise. Syntax error
6724 diagnostics are suppressed while recovering from a syntax error.
6725
6726 
6727 File: bison.info, Node: Context Dependency, Next: Debugging, Prev: Error Recovery, Up: Top
6728
6729 7 Handling Context Dependencies
6730 *******************************
6731
6732 The Bison paradigm is to parse tokens first, then group them into larger
6733 syntactic units. In many languages, the meaning of a token is affected
6734 by its context. Although this violates the Bison paradigm, certain
6735 techniques (known as "kludges") may enable you to write Bison parsers
6736 for such languages.
6737
6738 * Menu:
6739
6740 * Semantic Tokens:: Token parsing can depend on the semantic context.
6741 * Lexical Tie-ins:: Token parsing can depend on the syntactic context.
6742 * Tie-in Recovery:: Lexical tie-ins have implications for how
6743 error recovery rules must be written.
6744
6745 (Actually, "kludge" means any technique that gets its job done but is
6746 neither clean nor robust.)
6747
6748 
6749 File: bison.info, Node: Semantic Tokens, Next: Lexical Tie-ins, Up: Context Dependency
6750
6751 7.1 Semantic Info in Token Types
6752 ================================
6753
6754 The C language has a context dependency: the way an identifier is used
6755 depends on what its current meaning is. For example, consider this:
6756
6757 foo (x);
6758
6759 This looks like a function call statement, but if `foo' is a typedef
6760 name, then this is actually a declaration of `x'. How can a Bison
6761 parser for C decide how to parse this input?
6762
6763 The method used in GNU C is to have two different token types,
6764 `IDENTIFIER' and `TYPENAME'. When `yylex' finds an identifier, it
6765 looks up the current declaration of the identifier in order to decide
6766 which token type to return: `TYPENAME' if the identifier is declared as
6767 a typedef, `IDENTIFIER' otherwise.
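The lookup that `yylex' performs can be sketched as follows. This is a minimal illustration that assumes a flat table of typedef names; it is not the symbol table GNU C uses. The token codes and the helpers `declare_typedef' and `classify_identifier' are hypothetical.

```c
#include <string.h>

/* Illustrative token codes; a real parser takes them from Bison.  */
enum token_type { IDENTIFIER = 258, TYPENAME = 259 };

#define MAX_TYPEDEFS 16
static const char *typedef_names[MAX_TYPEDEFS];
static int typedef_count;

/* Record NAME as a typedef name.  Return 0 if the table is full.  */
static int
declare_typedef (const char *name)
{
  if (typedef_count == MAX_TYPEDEFS)
    return 0;
  typedef_names[typedef_count++] = name;
  return 1;
}

/* Decide which token type the scanner should return for NAME.  */
static enum token_type
classify_identifier (const char *name)
{
  for (int i = 0; i < typedef_count; i++)
    if (strcmp (typedef_names[i], name) == 0)
      return TYPENAME;
  return IDENTIFIER;
}
```

An action for a typedef declaration would call `declare_typedef', and the scanner would call `classify_identifier' for every identifier it reads.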
6768
6769 The grammar rules can then express the context dependency by the
6770 choice of token type to recognize. `IDENTIFIER' is accepted as an
6771 expression, but `TYPENAME' is not. `TYPENAME' can start a declaration,
6772 but `IDENTIFIER' cannot. In contexts where the meaning of the
6773 identifier is _not_ significant, such as in declarations that can
6774 shadow a typedef name, either `TYPENAME' or `IDENTIFIER' is
6775 accepted--there is one rule for each of the two token types.
6776
6777 This technique is simple to use if the decision of which kinds of
6778 identifiers to allow is made at a place close to where the identifier is
6779 parsed. But in C this is not always so: C allows a declaration to
6780 redeclare a typedef name provided an explicit type has been specified
6781 earlier:
6782
6783 typedef int foo, bar;
6784 int baz (void)
6785 {
6786 static bar (bar); /* redeclare `bar' as static variable */
6787 extern foo foo (foo); /* redeclare `foo' as function */
6788 return foo (bar);
6789 }
6790
6791 Unfortunately, the name being declared is separated from the
6792 declaration construct itself by a complicated syntactic structure--the
6793 "declarator".
6794
6795 As a result, part of the Bison parser for C needs to be duplicated,
6796 with all the nonterminal names changed: once for parsing a declaration
6797 in which a typedef name can be redefined, and once for parsing a
6798 declaration in which that can't be done. Here is a part of the
6799 duplication, with actions omitted for brevity:
6800
6801 initdcl:
6802 declarator maybeasm '='
6803 init
6804 | declarator maybeasm
6805 ;
6806
6807 notype_initdcl:
6808 notype_declarator maybeasm '='
6809 init
6810 | notype_declarator maybeasm
6811 ;
6812
6813 Here `initdcl' can redeclare a typedef name, but `notype_initdcl'
6814 cannot. The distinction between `declarator' and `notype_declarator'
6815 is the same sort of thing.
6816
6817 There is some similarity between this technique and a lexical tie-in
6818 (described next), in that information which alters the lexical analysis
6819 is changed during parsing by other parts of the program. The
6820 difference is that here the information is global, and is used for other
6821 purposes in the program. A true lexical tie-in has a special-purpose
6822 flag controlled by the syntactic context.
6823
6824 
6825 File: bison.info, Node: Lexical Tie-ins, Next: Tie-in Recovery, Prev: Semantic Tokens, Up: Context Dependency
6826
6827 7.2 Lexical Tie-ins
6828 ===================
6829
6830 One way to handle context-dependency is the "lexical tie-in": a flag
6831 which is set by Bison actions, whose purpose is to alter the way tokens
6832 are parsed.
6833
6834 For example, suppose we have a language vaguely like C, but with a
6835 special construct `hex (HEX-EXPR)'. After the keyword `hex' comes an
6836 expression in parentheses in which all integers are hexadecimal. In
6837 particular, the token `a1b' must be treated as an integer rather than
6838 as an identifier if it appears in that context. Here is how you can do
6839 it:
6840
6841 %{
6842 int hexflag;
6843 int yylex (void);
6844 void yyerror (char const *);
6845 %}
6846 %%
6847 ...
6848 expr: IDENTIFIER
6849 | constant
6850 | HEX '('
6851 { hexflag = 1; }
6852 expr ')'
6853 { hexflag = 0;
6854 $$ = $4; }
6855 | expr '+' expr
6856 { $$ = make_sum ($1, $3); }
6857 ...
6858 ;
6859
6860 constant:
6861 INTEGER
6862 | STRING
6863 ;
6864
6865 Here we assume that `yylex' looks at the value of `hexflag'; when it is
6866 nonzero, all integers are parsed in hexadecimal, and tokens starting
6867 with letters are parsed as integers if possible.
6868
6869 The declaration of `hexflag' shown in the prologue of the parser file
6870 is needed to make it accessible to the actions (*note The Prologue:
6871 Prologue.). You must also write the code in `yylex' to obey the flag.
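One possible shape for the part of `yylex' that obeys the flag is sketched below. It assumes the scanner has already isolated a word made of letters and digits; the helper `classify_word' and the token codes are hypothetical, and `hexflag' is redefined here only so that the fragment stands alone.

```c
#include <stdlib.h>

/* Illustrative token codes; a real parser takes them from Bison.  */
enum { IDENTIFIER = 258, INTEGER = 259 };

/* In a real parser this is the variable declared in the prologue.  */
static int hexflag;

/* Classify WORD.  When it is an integer in the current base, store
   its value in *VALUE and return INTEGER; otherwise return
   IDENTIFIER.  */
static int
classify_word (const char *word, long *value)
{
  char *end;
  long v = strtol (word, &end, hexflag ? 16 : 10);

  /* Accept only if the whole word was consumed as a number.  */
  if (end != word && *end == '\0')
    {
      *value = v;
      return INTEGER;
    }
  return IDENTIFIER;
}
```

With `hexflag' clear, `a1b' is classified as an identifier; with `hexflag' set, the same word is read as the hexadecimal integer a1b.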
6872
6873 
6874 File: bison.info, Node: Tie-in Recovery, Prev: Lexical Tie-ins, Up: Context Dependency
6875
6876 7.3 Lexical Tie-ins and Error Recovery
6877 ======================================
6878
6879 Lexical tie-ins make strict demands on any error recovery rules you
6880 have. *Note Error Recovery::.
6881
6882 The reason for this is that the purpose of an error recovery rule is
6883 to abort the parsing of one construct and resume in some larger
6884 construct. For example, in C-like languages, a typical error recovery
6885 rule is to skip tokens until the next semicolon, and then start a new
6886 statement, like this:
6887
6888 stmt: expr ';'
6889 | IF '(' expr ')' stmt { ... }
6890 ...
6891 | error ';'
6892 { hexflag = 0; }
6893 ;
6894
6895 If there is a syntax error in the middle of a `hex (EXPR)'
6896 construct, this error rule will apply, and then the action for the
6897 completed `hex (EXPR)' will never run. So `hexflag' would remain set
6898 for the entire rest of the input, or until the next `hex' keyword,
6899 causing identifiers to be misinterpreted as integers.
6900
6901 To avoid this problem the error recovery rule itself clears
6902 `hexflag'.
6903
6904 There may also be an error recovery rule that works within
6905 expressions. For example, there could be a rule which applies within
6906 parentheses and skips to the close-parenthesis:
6907
6908 expr: ...
6909 | '(' expr ')'
6910 { $$ = $2; }
6911 | '(' error ')'
6912 ...
6913
6914 If this rule acts within the `hex' construct, it is not going to
6915 abort that construct (since it applies to an inner level of parentheses
6916 within the construct). Therefore, it should not clear the flag: the
6917 rest of the `hex' construct should be parsed with the flag still in
6918 effect.
6919
6920 What if there is an error recovery rule which might abort out of the
6921 `hex' construct or might not, depending on circumstances? There is no
6922 way you can write the action to determine whether a `hex' construct is
6923 being aborted or not. So if you are using a lexical tie-in, you had
6924 better make sure your error recovery rules are not of this kind. Each
6925 rule must be such that you can be sure that it always will, or always
6926 won't, have to clear the flag.
6927
6928 
6929 File: bison.info, Node: Debugging, Next: Invocation, Prev: Context Dependency, Up: Top
6930
6931 8 Debugging Your Parser
6932 ***********************
6933
6934 Developing a parser can be a challenge, especially if you don't
6935 understand the algorithm (*note The Bison Parser Algorithm:
6936 Algorithm.). Even so, sometimes a detailed description of the automaton
6937 can help (*note Understanding Your Parser: Understanding.), or tracing
6938 the execution of the parser can give some insight on why it behaves
6939 improperly (*note Tracing Your Parser: Tracing.).
6940
6941 * Menu:
6942
6943 * Understanding:: Understanding the structure of your parser.
6944 * Tracing:: Tracing the execution of your parser.
6945
6946 
6947 File: bison.info, Node: Understanding, Next: Tracing, Up: Debugging
6948
6949 8.1 Understanding Your Parser
6950 =============================
6951
6952 As documented elsewhere (*note The Bison Parser Algorithm: Algorithm.),
6953 Bison parsers are "shift/reduce automata". In some cases (much more
6954 frequent than one would hope), looking at this automaton is required to
6955 tune or simply fix a parser. Bison provides two different
6956 representations of it, either textually or graphically (as a DOT file).
6957
6958 The textual file is generated when the options `--report' or
6959 `--verbose' are specified, see *Note Invoking Bison: Invocation. Its
6960 name is made by removing `.tab.c' or `.c' from the parser output file
6961 name, and adding `.output' instead. Therefore, if the input file is
6962 `foo.y', then the parser file is called `foo.tab.c' by default. As a
6963 consequence, the verbose output file is called `foo.output'.
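The naming rule just described can be written out as a short C function. This is a sketch of the rule as stated in the text, not Bison's own code; `ends_with' and `report_file_name' are hypothetical helpers.

```c
#include <stdio.h>
#include <string.h>

/* Return nonzero if S ends with SUFFIX.  */
static int
ends_with (const char *s, const char *suffix)
{
  size_t ls = strlen (s), lx = strlen (suffix);
  return ls >= lx && strcmp (s + ls - lx, suffix) == 0;
}

/* Derive the verbose output file name from PARSER_FILE by removing
   a trailing `.tab.c' or `.c' and appending `.output'.  The result
   is written to OUT, which holds N bytes.  */
static void
report_file_name (const char *parser_file, char *out, size_t n)
{
  size_t len = strlen (parser_file);

  if (ends_with (parser_file, ".tab.c"))
    len -= strlen (".tab.c");
  else if (ends_with (parser_file, ".c"))
    len -= strlen (".c");
  snprintf (out, n, "%.*s.output", (int) len, parser_file);
}
```

Under this rule `foo.tab.c' maps to `foo.output', and a parser file named `parse.c' maps to `parse.output'.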
6964
6965 The following grammar file, `calc.y', will be used in the sequel:
6966
6967 %token NUM STR
6968 %left '+' '-'
6969 %left '*'
6970 %%
6971 exp: exp '+' exp
6972 | exp '-' exp
6973 | exp '*' exp
6974 | exp '/' exp
6975 | NUM
6976 ;
6977 useless: STR;
6978 %%
6979
6980 `bison' reports:
6981
6982 calc.y: warning: 1 nonterminal and 1 rule useless in grammar
6983 calc.y:11.1-7: warning: nonterminal useless in grammar: useless
6984 calc.y:11.10-12: warning: rule useless in grammar: useless: STR
6985 calc.y: conflicts: 7 shift/reduce
6986
6987 When given `--report=state', in addition to `calc.tab.c', it creates
6988 a file `calc.output' with contents detailed below. The order of the
6989 output and the exact presentation might vary, but the interpretation is
6990 the same.
6991
6992 The first section includes details on conflicts that were solved
6993 thanks to precedence and/or associativity:
6994
6995 Conflict in state 8 between rule 2 and token '+' resolved as reduce.
6996 Conflict in state 8 between rule 2 and token '-' resolved as reduce.
6997 Conflict in state 8 between rule 2 and token '*' resolved as shift.
6998 ...
6999
7000
7001 The next section lists states that still have conflicts.
7002
7003 State 8 conflicts: 1 shift/reduce
7004 State 9 conflicts: 1 shift/reduce
7005 State 10 conflicts: 1 shift/reduce
7006 State 11 conflicts: 4 shift/reduce
7007
7008 The next section reports useless tokens, nonterminals and rules. Useless
7009 nonterminals and rules are removed in order to produce a smaller parser,
7010 but useless tokens are preserved, since they might be used by the
7011 scanner (note the difference between "useless" and "unused" below):
7012
7013 Nonterminals useless in grammar:
7014 useless
7015
7016 Terminals unused in grammar:
7017 STR
7018
7019 Rules useless in grammar:
7020 #6 useless: STR;
7021
7022 The next section reproduces the exact grammar that Bison used:
7023
7024 Grammar
7025
7026 Number, Line, Rule
7027 0 5 $accept -> exp $end
7028 1 5 exp -> exp '+' exp
7029 2 6 exp -> exp '-' exp
7030 3 7 exp -> exp '*' exp
7031 4 8 exp -> exp '/' exp
7032 5 9 exp -> NUM
7033
7034 and reports the uses of the symbols:
7035
7036 Terminals, with rules where they appear
7037
7038 $end (0) 0
7039 '*' (42) 3
7040 '+' (43) 1
7041 '-' (45) 2
7042 '/' (47) 4
7043 error (256)
7044 NUM (258) 5
7045
7046 Nonterminals, with rules where they appear
7047
7048 $accept (8)
7049 on left: 0
7050 exp (9)
7051 on left: 1 2 3 4 5, on right: 0 1 2 3 4
7052
7053 Bison then proceeds onto the automaton itself, describing each state
7054 with its set of "items", also known as "pointed rules". Each item is a
7055 production rule together with a point (marked by `.') that marks the
7056 position of the input cursor.
7057
7058 state 0
7059
7060 $accept -> . exp $ (rule 0)
7061
7062 NUM shift, and go to state 1
7063
7064 exp go to state 2
7065
7066 This reads as follows: "state 0 corresponds to being at the very
7067 beginning of the parsing, in the initial rule, right before the start
7068 symbol (here, `exp'). When the parser returns to this state right
7069 after having reduced a rule that produced an `exp', the control flow
7070 jumps to state 2. If there is no such transition on a nonterminal
7071 symbol, and the lookahead is a `NUM', then this token is shifted on the
7072 parse stack, and the control flow jumps to state 1. Any other
7073 lookahead triggers a syntax error."
7074
7075 Even though the only active rule in state 0 seems to be rule 0, the
7076 report lists `NUM' as a lookahead token because `NUM' can be at the
7077 beginning of any rule deriving an `exp'. By default Bison reports the
7078 so-called "core" or "kernel" of the item set, but if you want to see
7079 more detail you can invoke `bison' with `--report=itemset' to list all
7080 the items, including those that can be derived:
7081
7082 state 0
7083
7084 $accept -> . exp $ (rule 0)
7085 exp -> . exp '+' exp (rule 1)
7086 exp -> . exp '-' exp (rule 2)
7087 exp -> . exp '*' exp (rule 3)
7088 exp -> . exp '/' exp (rule 4)
7089 exp -> . NUM (rule 5)
7090
7091 NUM shift, and go to state 1
7092
7093 exp go to state 2
7094
7095 In the state 1...
7096
7097 state 1
7098
7099 exp -> NUM . (rule 5)
7100
7101 $default reduce using rule 5 (exp)
7102
7103 the rule 5, `exp: NUM;', is completed. Whatever the lookahead token
7104 (`$default'), the parser will reduce it. If it was coming from state
7105 0, then, after this reduction it will return to state 0, and will jump
7106 to state 2 (`exp: go to state 2').
7107
7108 state 2
7109
7110 $accept -> exp . $ (rule 0)
7111 exp -> exp . '+' exp (rule 1)
7112 exp -> exp . '-' exp (rule 2)
7113 exp -> exp . '*' exp (rule 3)
7114 exp -> exp . '/' exp (rule 4)
7115
7116 $ shift, and go to state 3
7117 '+' shift, and go to state 4
7118 '-' shift, and go to state 5
7119 '*' shift, and go to state 6
7120 '/' shift, and go to state 7
7121
7122 In state 2, the automaton can only shift a symbol. For instance,
7123 because of the item `exp -> exp . '+' exp', if the lookahead is `+', it
7124 will be shifted on the parse stack, and the automaton control will jump
7125 to state 4, corresponding to the item `exp -> exp '+' . exp'. Since
7126 there is no default action, any other token than those listed above
7127 will trigger a syntax error.
7128
7129 The state 3 is named the "final state", or the "accepting state":
7130
7131 state 3
7132
7133 $accept -> exp $ . (rule 0)
7134
7135 $default accept
7136
7137 the initial rule is completed (the start symbol and the end of input
7138 were read), the parsing exits successfully.
7139
7140 The interpretation of states 4 to 7 is straightforward, and is left
7141 to the reader.
7142
7143 state 4
7144
7145 exp -> exp '+' . exp (rule 1)
7146
7147 NUM shift, and go to state 1
7148
7149 exp go to state 8
7150
7151 state 5
7152
7153 exp -> exp '-' . exp (rule 2)
7154
7155 NUM shift, and go to state 1
7156
7157 exp go to state 9
7158
7159 state 6
7160
7161 exp -> exp '*' . exp (rule 3)
7162
7163 NUM shift, and go to state 1
7164
7165 exp go to state 10
7166
7167 state 7
7168
7169 exp -> exp '/' . exp (rule 4)
7170
7171 NUM shift, and go to state 1
7172
7173 exp go to state 11
7174
7175 As was announced at the beginning of the report, `State 8 conflicts: 1
7176 shift/reduce':
7177
7178 state 8
7179
7180 exp -> exp . '+' exp (rule 1)
7181 exp -> exp '+' exp . (rule 1)
7182 exp -> exp . '-' exp (rule 2)
7183 exp -> exp . '*' exp (rule 3)
7184 exp -> exp . '/' exp (rule 4)
7185
7186 '*' shift, and go to state 6
7187 '/' shift, and go to state 7
7188
7189 '/' [reduce using rule 1 (exp)]
7190 $default reduce using rule 1 (exp)
7191
7192 Indeed, there are two actions associated with the lookahead `/':
7193 either shifting (and going to state 7), or reducing rule 1. The
7194 conflict means that either the grammar is ambiguous, or the parser lacks
7195 information to make the right decision. Indeed the grammar is
7196 ambiguous, as, since we did not specify the precedence of `/', the
7197 sentence `NUM + NUM / NUM' can be parsed as `NUM + (NUM / NUM)', which
7198 corresponds to shifting `/', or as `(NUM + NUM) / NUM', which
7199 corresponds to reducing rule 1.
7200
7201 Because in LALR(1) parsing only a single decision can be made, Bison
7202 arbitrarily chose to disable the reduction, see *Note Shift/Reduce
7203 Conflicts: Shift/Reduce. Discarded actions are reported in between
7204 square brackets.
7205
7206 Note that all the previous states had a single possible action:
7207 either shifting the next token and going to the corresponding state, or
7208 reducing a single rule. In the other cases, i.e., when shifting _and_
7209 reducing is possible or when _several_ reductions are possible, the
7210 lookahead is required to select the action. State 8 is one such state:
7211 if the lookahead is `*' or `/' then the action is shifting, otherwise
7212 the action is reducing rule 1. In other words, the first two items,
7213 corresponding to rule 1, are not eligible when the lookahead token is
7214 `*', since we specified that `*' has higher precedence than `+'. More
7215 generally, some items are eligible only with some set of possible
7216 lookahead tokens. When run with `--report=lookahead', Bison specifies
7217 these lookahead tokens:
7218
7219 state 8
7220
7221 exp -> exp . '+' exp (rule 1)
7222 exp -> exp '+' exp . [$, '+', '-', '/'] (rule 1)
7223 exp -> exp . '-' exp (rule 2)
7224 exp -> exp . '*' exp (rule 3)
7225 exp -> exp . '/' exp (rule 4)
7226
7227 '*' shift, and go to state 6
7228 '/' shift, and go to state 7
7229
7230 '/' [reduce using rule 1 (exp)]
7231 $default reduce using rule 1 (exp)
7232
7233 The remaining states are similar:
7234
7235 state 9
7236
7237 exp -> exp . '+' exp (rule 1)
7238 exp -> exp . '-' exp (rule 2)
7239 exp -> exp '-' exp . (rule 2)
7240 exp -> exp . '*' exp (rule 3)
7241 exp -> exp . '/' exp (rule 4)
7242
7243 '*' shift, and go to state 6
7244 '/' shift, and go to state 7
7245
7246 '/' [reduce using rule 2 (exp)]
7247 $default reduce using rule 2 (exp)
7248
7249 state 10
7250
7251 exp -> exp . '+' exp (rule 1)
7252 exp -> exp . '-' exp (rule 2)
7253 exp -> exp . '*' exp (rule 3)
7254 exp -> exp '*' exp . (rule 3)
7255 exp -> exp . '/' exp (rule 4)
7256
7257 '/' shift, and go to state 7
7258
7259 '/' [reduce using rule 3 (exp)]
7260 $default reduce using rule 3 (exp)
7261
7262 state 11
7263
7264 exp -> exp . '+' exp (rule 1)
7265 exp -> exp . '-' exp (rule 2)
7266 exp -> exp . '*' exp (rule 3)
7267 exp -> exp . '/' exp (rule 4)
7268 exp -> exp '/' exp . (rule 4)
7269
7270 '+' shift, and go to state 4
7271 '-' shift, and go to state 5
7272 '*' shift, and go to state 6
7273 '/' shift, and go to state 7
7274
7275 '+' [reduce using rule 4 (exp)]
7276 '-' [reduce using rule 4 (exp)]
7277 '*' [reduce using rule 4 (exp)]
7278 '/' [reduce using rule 4 (exp)]
7279 $default reduce using rule 4 (exp)
7280
7281 Observe that state 11 contains conflicts not only due to the lack of
7282 precedence of `/' with respect to `+', `-', and `*', but also because
7283 the associativity of `/' is not specified.
7284
7285 
7286 File: bison.info, Node: Tracing, Prev: Understanding, Up: Debugging
7287
7288 8.2 Tracing Your Parser
7289 =======================
7290
7291 If a Bison grammar compiles properly but doesn't do what you want when
7292 it runs, the `yydebug' parser-trace feature can help you figure out why.
7293
7294 There are several means to enable compilation of trace facilities:
7295
7296 the macro `YYDEBUG'
7297 Define the macro `YYDEBUG' to a nonzero value when you compile the
7298 parser. This is compliant with POSIX Yacc. You could use
7299 `-DYYDEBUG=1' as a compiler option or you could put `#define
7300 YYDEBUG 1' in the prologue of the grammar file (*note The
7301 Prologue: Prologue.).
7302
7303 the option `-t', `--debug'
7304 Use the `-t' option when you run Bison (*note Invoking Bison:
7305 Invocation.). This is POSIX compliant too.
7306
7307 the directive `%debug'
7308 Add the `%debug' directive (*note Bison Declaration Summary: Decl
7309 Summary.). This is a Bison extension, which will prove useful
7310 when Bison outputs parsers for languages that don't use a
7311 preprocessor. Unless POSIX and Yacc portability matter to you,
7312 this is the preferred solution.
7313
7314 We suggest that you always enable the debug option so that debugging
7315 is always possible.
7316
7317 The trace facility outputs messages with macro calls of the form
7318 `YYFPRINTF (stderr, FORMAT, ARGS)' where FORMAT and ARGS are the usual
7319 `printf' format and variadic arguments. If you define `YYDEBUG' to a
7320 nonzero value but do not define `YYFPRINTF', `<stdio.h>' is
7321 automatically included and `YYFPRINTF' is defined to `fprintf'.
7322
7323 Once you have compiled the program with trace facilities, the way to
7324 request a trace is to store a nonzero value in the variable `yydebug'.
7325 You can do this by making the C code do it (in `main', perhaps), or you
7326 can alter the value with a C debugger.
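As an illustration of setting `yydebug' from the C code, the fragment below turns tracing on when a command-line argument asks for it. It is a sketch under stated assumptions: the `--trace' argument and the helper `enable_trace_if_requested' are hypothetical, and `yydebug' is defined here only so that the fragment compiles on its own.

```c
#include <string.h>

/* Stand-in so the sketch compiles alone.  In a real program the
   generated parser defines `yydebug', and this line would read
   `extern int yydebug;'.  */
int yydebug;

/* Hypothetical helper: enable the parser trace when the program is
   run with a `--trace' argument.  */
static void
enable_trace_if_requested (int argc, char **argv)
{
  for (int i = 1; i < argc; i++)
    if (strcmp (argv[i], "--trace") == 0)
      yydebug = 1;
}
```

Calling this helper at the start of `main', before `yyparse', leaves tracing off by default and costs nothing when the argument is absent.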
7327
7328 Each step taken by the parser when `yydebug' is nonzero produces a
7329 line or two of trace information, written on `stderr'. The trace
7330 messages tell you these things:
7331
7332 * Each time the parser calls `yylex', what kind of token was read.
7333
7334 * Each time a token is shifted, the depth and complete contents of
7335 the state stack (*note Parser States::).
7336
7337 * Each time a rule is reduced, which rule it is, and the complete
7338 contents of the state stack afterward.
7339
7340 To make sense of this information, it helps to refer to the listing
7341 file produced by the Bison `-v' option (*note Invoking Bison:
7342 Invocation.). This file shows the meaning of each state in terms of
7343 positions in various rules, and also what each state will do with each
7344 possible input token. As you read the successive trace messages, you
7345 can see that the parser is functioning according to its specification in
7346 the listing file. Eventually you will arrive at the place where
7347 something undesirable happens, and you will see which parts of the
7348 grammar are to blame.
7349
7350 The parser file is a C program and you can use C debuggers on it,
7351 but it's not easy to interpret what it is doing. The parser function
7352 is a finite-state machine interpreter, and aside from the actions it
7353 executes the same code over and over. Only the values of variables
7354 show where in the grammar it is working.
7355
7356 The debugging information normally gives the token type of each token
7357 read, but not its semantic value. You can optionally define a macro
7358 named `YYPRINT' to provide a way to print the value. If you define
7359 `YYPRINT', it should take three arguments. The parser will pass a
7360 standard I/O stream, the numeric code for the token type, and the token
7361 value (from `yylval').
7362
7363 Here is an example of `YYPRINT' suitable for the multi-function
7364 calculator (*note Declarations for `mfcalc': Mfcalc Declarations.):
7365
7366 %{
7367 static void print_token_value (FILE *, int, YYSTYPE);
7368 #define YYPRINT(file, type, value) print_token_value (file, type, value)
7369 %}
7370
7371 ... %% ... %% ...
7372
7373 static void
7374 print_token_value (FILE *file, int type, YYSTYPE value)
7375 {
7376 if (type == VAR)
7377 fprintf (file, "%s", value.tptr->name);
7378 else if (type == NUM)
7379 fprintf (file, "%g", value.val);
7380 }
7381
7382 
7383 File: bison.info, Node: Invocation, Next: Other Languages, Prev: Debugging, Up: Top
7384
7385 9 Invoking Bison
7386 ****************
7387
7388 The usual way to invoke Bison is as follows:
7389
7390 bison INFILE
7391
7392 Here INFILE is the grammar file name, which usually ends in `.y'.
7393 The parser file's name is made by replacing the `.y' with `.tab.c' and
7394 removing any leading directory. Thus, `bison foo.y' yields
7395 `foo.tab.c', and `bison hack/foo.y' also yields
7396 `foo.tab.c'. If you are writing C++ code instead of C in your
7397 grammar file, you can also name it `foo.ypp' or `foo.y++'.
7398 Then the output files take an extension that mirrors the one given
7399 as input (respectively `foo.tab.cpp' and `foo.tab.c++'). This feature
7400 takes effect with all options that manipulate file names like `-o' or
7401 `-d'.
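The default naming rule for the parser file can be sketched in C as follows. This only restates the rule given in the text and is not Bison's implementation; the function `parser_file_name' is hypothetical, and it deliberately refuses names that do not end in a `.y'-style extension, a case the text does not describe.

```c
#include <stdio.h>
#include <string.h>

/* Compute the default parser file name for GRAMMAR: drop any
   leading directory, then replace the `.y' of the extension with
   `.tab.c', keeping whatever follows the `y' (`pp', `++', ...).
   Return 1 on success, 0 if GRAMMAR has no `.y'-style extension.  */
static int
parser_file_name (const char *grammar, char *out, size_t n)
{
  const char *base = strrchr (grammar, '/');
  base = base ? base + 1 : grammar;

  const char *dot = strrchr (base, '.');
  if (dot == NULL || dot[1] != 'y')
    return 0;

  snprintf (out, n, "%.*s.tab.c%s", (int) (dot - base), base, dot + 2);
  return 1;
}
```

With this sketch both `foo.y' and `hack/foo.y' give `foo.tab.c', while `foo.ypp' gives `foo.tab.cpp' and `foo.y++' gives `foo.tab.c++'.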
7402
7403 For example:
7404
7405 bison -d INFILE.YXX
7406 will produce `infile.tab.cxx' and `infile.tab.hxx', and
7407
7408 bison -d -o OUTPUT.C++ INFILE.Y
7409 will produce `output.c++' and `output.h++'.
7410
7411 For compatibility with POSIX, the standard Bison distribution also
7412 contains a shell script called `yacc' that invokes Bison with the `-y'
7413 option.
7414
7415 * Menu:
7416
7417 * Bison Options:: All the options described in detail,
7418 in alphabetical order by short options.
7419 * Option Cross Key:: Alphabetical list of long options.
7420 * Yacc Library:: Yacc-compatible `yylex' and `main'.
7421
7422 
7423 File: bison.info, Node: Bison Options, Next: Option Cross Key, Up: Invocation
7424
7425 9.1 Bison Options
7426 =================
7427
7428 Bison supports both traditional single-letter options and mnemonic long
7429 option names. Long option names are indicated with `--' instead of
7430 `-'. Abbreviations for option names are allowed as long as they are
7431 unique. When a long option takes an argument, like `--file-prefix',
7432 connect the option name and the argument with `='.
7433
7434 Here is a list of options that can be used with Bison, alphabetized
7435 by short option. It is followed by a cross key alphabetized by long
7436 option.
7437
7438 Operation modes:
7439 `-h'
7440 `--help'
7441 Print a summary of the command-line options to Bison and exit.
7442
7443 `-V'
7444 `--version'
7445 Print the version number of Bison and exit.
7446
7447 `--print-localedir'
7448 Print the name of the directory containing locale-dependent data.
7449
7450 `--print-datadir'
7451 Print the name of the directory containing skeletons and XSLT.
7452
7453 `-y'
7454 `--yacc'
7455 Act more like the traditional Yacc command. This can cause
7456 different diagnostics to be generated, and may change behavior in
7457 other minor ways. Most importantly, imitate Yacc's output file
7458 name conventions, so that the parser output file is called
7459 `y.tab.c', and the other outputs are called `y.output' and
7460 `y.tab.h'. Also, if generating an LALR(1) parser in C, generate
7461 `#define' statements in addition to an `enum' to associate token
7462 numbers with token names. Thus, the following shell script can
7463 substitute for Yacc, and the Bison distribution contains such a
7464 script for compatibility with POSIX:
7465
7466 #! /bin/sh
7467 bison -y "$@"
7468
7469 The `-y'/`--yacc' option is intended for use with traditional Yacc
7470 grammars. If your grammar uses a Bison extension like
7471 `%glr-parser', Bison might not be Yacc-compatible even if this
7472 option is specified.
7473
7474 `-W'
7475 `--warnings'
7476 Output warnings falling in CATEGORY. CATEGORY can be one of:
7477 `midrule-values'
7478 Warn about mid-rule values that are set but not used within
7479 any of the actions of the parent rule. For example, warn
7480 about unused `$2' in:
7481
7482 exp: '1' { $$ = 1; } '+' exp { $$ = $1 + $4; };
7483
7484 Also warn about mid-rule values that are used but not set.
7485 For example, warn about unset `$$' in the mid-rule action in:
7486
7487 exp: '1' { $1 = 1; } '+' exp { $$ = $2 + $4; };
7488
7489 These warnings are not enabled by default since they
7490 sometimes prove to be false alarms in existing grammars
7491 employing the Yacc constructs `$0' or `$-N' (where N is some
7492 positive integer).
7493
7494 `yacc'
7495 Incompatibilities with POSIX Yacc.
7496
7497 `all'
7498 All the warnings.
7499
7500 `none'
7501 Turn off all the warnings.
7502
7503 `error'
7504 Treat warnings as errors.
7505
7506 A category can be turned off by prefixing its name with `no-'. For
7507 instance, `-Wno-yacc' will hide the warnings about incompatibilities
7508 with POSIX Yacc.
7509
7510 Tuning the parser:
7511
7512 `-t'
7513 `--debug'
7514 In the parser file, define the macro `YYDEBUG' to 1 if it is not
7515 already defined, so that the debugging facilities are compiled.
7516 *Note Tracing Your Parser: Tracing.
7517
7518 `-L LANGUAGE'
7519 `--language=LANGUAGE'
7520 Specify the programming language for the generated parser, as if
7521 `%language' was specified (*note Bison Declaration Summary: Decl
7522 Summary.). Currently supported languages include C, C++, and Java.
7523 LANGUAGE is case-insensitive.
7524
7525 This option is experimental and its effect may be modified in
7526 future releases.
7527
7528 `--locations'
7529 Pretend that `%locations' was specified. *Note Decl Summary::.
7530
7531 `-p PREFIX'
7532 `--name-prefix=PREFIX'
7533 Pretend that `%name-prefix "PREFIX"' was specified. *Note Decl
7534 Summary::.
7535
7536 `-l'
7537 `--no-lines'
7538 Don't put any `#line' preprocessor commands in the parser file.
7539 Ordinarily Bison puts them in the parser file so that the C
7540 compiler and debuggers will associate errors with your source
7541 file, the grammar file. This option causes them to associate
7542 errors with the parser file, treating it as an independent source
7543 file in its own right.
7544
7545 `-S FILE'
7546 `--skeleton=FILE'
7547 Specify the skeleton to use, similar to `%skeleton' (*note Bison
7548 Declaration Summary: Decl Summary.).
7549
7550 If FILE does not contain a `/', FILE is the name of a skeleton
7551 file in the Bison installation directory. If it does, FILE is an
7552 absolute file name or a file name relative to the current working
7553 directory. This is similar to how most shells resolve commands.
7554
7555 `-k'
7556 `--token-table'
7557 Pretend that `%token-table' was specified. *Note Decl Summary::.
7558
7559 Adjust the output:
7560
7561 `--defines[=FILE]'
7562 Pretend that `%defines' was specified, i.e., write an extra output
7563 file containing macro definitions for the token type names defined
7564 in the grammar, as well as a few other declarations. *Note Decl
7565 Summary::.
7566
7567 `-d'
7568 This is the same as `--defines' except `-d' does not accept a FILE
7569 argument since POSIX Yacc requires that `-d' can be bundled with
7570 other short options.
7571
7572 `-b FILE-PREFIX'
7573 `--file-prefix=PREFIX'
7574 Pretend that `%file-prefix' was specified, i.e., specify prefix to
7575 use for all Bison output file names. *Note Decl Summary::.
7576
7577 `-r THINGS'
7578 `--report=THINGS'
7579 Write an extra output file containing verbose description of the
7580 comma separated list of THINGS among:
7581
7582 `state'
7583 Description of the grammar, conflicts (resolved and
7584 unresolved), and LALR automaton.
7585
7586 `lookahead'
7587 Implies `state' and augments the description of the automaton
7588 with each rule's lookahead set.
7589
7590 `itemset'
7591 Implies `state' and augments the description of the automaton
7592 with the full set of items for each state, instead of its
7593 core only.
7594
7595 `--report-file=FILE'
7596 Specify the FILE for the verbose description.
7597
7598 `-v'
7599 `--verbose'
7600 Pretend that `%verbose' was specified, i.e., write an extra output
7601 file containing verbose descriptions of the grammar and parser.
7602 *Note Decl Summary::.
7603
7604 `-o FILE'
7605 `--output=FILE'
7606 Specify the FILE for the parser file.
7607
7608 The other output files' names are constructed from FILE as
7609 described under the `-v' and `-d' options.
7610
7611 `-g[FILE]'
7612 `--graph[=FILE]'
7613 Output a graphical representation of the LALR(1) grammar automaton
7614 computed by Bison, in Graphviz (http://www.graphviz.org/) DOT
7615 (http://www.graphviz.org/doc/info/lang.html) format. `FILE' is
7616 optional. If omitted and the grammar file is `foo.y', the output
7617 file will be `foo.dot'.
7618
7619 `-x[FILE]'
7620 `--xml[=FILE]'
7621 Output an XML report of the LALR(1) automaton computed by Bison.
7622 `FILE' is optional. If omitted and the grammar file is `foo.y',
7623 the output file will be `foo.xml'. (The current XML schema is
7624 experimental and may evolve. More user feedback will help to
7625 stabilize it.)
7626
7627 
7628 File: bison.info, Node: Option Cross Key, Next: Yacc Library, Prev: Bison Options, Up: Invocation
7629
7630 9.2 Option Cross Key
7631 ====================
7632
7633 Here is a list of options, alphabetized by long option, to help you find
7634 the corresponding short option.
7635
7636 Long Option                Short Option
7637 -------------------------------------------------
7638 `--debug'                  `-t'
7639 `--defines=[FILE]'
7640 `--file-prefix=PREFIX'     `-b' PREFIX
7641 `--graph=[FILE]'           `-g' [FILE]
7642 `--help'                   `-h'
7643 `--language=LANGUAGE'      `-L' LANGUAGE
7644 `--locations'
7645 `--name-prefix=PREFIX'     `-p' PREFIX
7646 `--no-lines'               `-l'
7647 `--output=FILE'            `-o' FILE
7648 `--print-datadir'
7649 `--print-localedir'
7650 `--report-file=FILE'
7651 `--report=THINGS'          `-r' THINGS
7652 `--skeleton=FILE'          `-S' FILE
7653 `--token-table'            `-k'
7654 `--verbose'                `-v'
7655 `--version'                `-V'
7656 `--warnings'               `-W'
7657 `--xml=[FILE]'             `-x' [FILE]
7658 `--yacc'                   `-y'
7659
7660 
7661 File: bison.info, Node: Yacc Library, Prev: Option Cross Key, Up: Invocation
7662
7663 9.3 Yacc Library
7664 ================
7665
7666 The Yacc library contains default implementations of the `yyerror' and
7667 `main' functions. These default implementations are normally not
7668 useful, but POSIX requires them. To use the Yacc library, link your
7669 program with the `-ly' option. Note that Bison's implementation of the
7670 Yacc library is distributed under the terms of the GNU General Public
7671 License (*note Copying::).
7672
7673 If you use the Yacc library's `yyerror' function, you should declare
7674 `yyerror' as follows:
7675
7676 int yyerror (char const *);
7677
7678 Bison ignores the `int' value returned by this `yyerror'. If you
7679 use the Yacc library's `main' function, your `yyparse' function should
7680 have the following type signature:
7681
7682 int yyparse (void);
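If you prefer not to link with `-ly', you can supply a `yyerror' with the same signature yourself. The following is a minimal sketch, not the Yacc library's actual source: writing the message to `stderr' and returning 0 are assumptions about what a reasonable default would do.

```cpp
#include <cstdio>

// Hypothetical user-supplied replacement for the Yacc library's yyerror.
// The signature matches the declaration shown above; the body is an
// assumption, not the library's implementation.
int
yyerror (char const *msg)
{
  std::fprintf (stderr, "%s\n", msg);
  return 0;   // Bison ignores this value.
}
```

Compile such a replacement in the same language as the generated parser so that the two link together.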
7683
7684 
7685 File: bison.info, Node: Other Languages, Next: FAQ, Prev: Invocation, Up: Top
7686
7687 10 Parsers Written In Other Languages
7688 *************************************
7689
7690 * Menu:
7691
7692 * C++ Parsers:: The interface to generate C++ parser classes
7693 * Java Parsers:: The interface to generate Java parser classes
7694
7695 
7696 File: bison.info, Node: C++ Parsers, Next: Java Parsers, Up: Other Languages
7697
7698 10.1 C++ Parsers
7699 ================
7700
7701 * Menu:
7702
7703 * C++ Bison Interface:: Asking for C++ parser generation
7704 * C++ Semantic Values:: %union vs. C++
7705 * C++ Location Values:: The position and location classes
7706 * C++ Parser Interface:: Instantiating and running the parser
7707 * C++ Scanner Interface:: Exchanges between yylex and parse
7708 * A Complete C++ Example:: Demonstrating their use
7709
7710 
7711 File: bison.info, Node: C++ Bison Interface, Next: C++ Semantic Values, Up: C++ Parsers
7712
7713 10.1.1 C++ Bison Interface
7714 --------------------------
7715
7716 The C++ LALR(1) parser is selected using the skeleton directive,
7717 `%skeleton "lalr1.cc"', or the synonymous command-line option
7718 `--skeleton=lalr1.cc'. *Note Decl Summary::.
7719
7720 When run, `bison' will create several entities in the `yy' namespace. Use
7721 the `%define namespace' directive to change the namespace name, see
7722 *Note Decl Summary::. The various classes are generated in the
7723 following files:
7724
7725 `position.hh'
7726 `location.hh'
7727 The definition of the classes `position' and `location', used for
7728 location tracking. *Note C++ Location Values::.
7729
7730 `stack.hh'
7731 An auxiliary class `stack' used by the parser.
7732
7733 `FILE.hh'
7734 `FILE.cc'
7735 (Assuming the extension of the input file was `.yy'.) The
7736 declaration and implementation of the C++ parser class. The
7737 basename and extension of these two files follow the same rules as
7738 with regular C parsers (*note Invocation::).
7739
7740 The header is _mandatory_; you must either pass `-d'/`--defines'
7741 to `bison', or use the `%defines' directive.
7742
7743 All these files are documented using Doxygen; run `doxygen' for a
7744 complete and accurate documentation.
7745
7746 
7747 File: bison.info, Node: C++ Semantic Values, Next: C++ Location Values, Prev: C++ Bison Interface, Up: C++ Parsers
7748
7749 10.1.2 C++ Semantic Values
7750 --------------------------
7751
7752 The `%union' directive works as for C, see *Note The Collection of
7753 Value Types: Union Decl. In particular it produces a genuine
7754 `union'(1), which has a few specific features in C++.
7755 - The type `YYSTYPE' is defined but its use is discouraged: rather
7756 you should refer to the parser's encapsulated type
7757 `yy::parser::semantic_type'.
7758
7759 - Non POD (Plain Old Data) types cannot be used. C++ forbids any
7760 instance of classes with constructors in unions: only _pointers_
7761 to such objects are allowed.
7762
7763 Because objects have to be stored via pointers, memory is not
7764 reclaimed automatically: using the `%destructor' directive is the only
7765 means to avoid leaks. *Note Freeing Discarded Symbols: Destructor Decl.
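The pointer restriction and the need for explicit cleanup can be sketched without any generated code. In the following, `semantic_value', `make_identifier' and `discard_identifier' are hypothetical names chosen for illustration; they stand in for the generated union, a scanner action, and the effect of a `%destructor { delete $$; }' respectively.

```cpp
#include <string>

// Stand-in for the union generated from a %union with an int and a
// string member; the name semantic_value is hypothetical.
union semantic_value
{
  int ival;
  std::string *sval;   // a pointer, because std::string is not POD
};

// What a scanner action might do for an identifier token.
inline semantic_value
make_identifier (const char *text)
{
  semantic_value v;
  v.sval = new std::string (text);
  return v;
}

// What a `%destructor { delete $$; }' amounts to for a discarded symbol.
inline void
discard_identifier (semantic_value &v)
{
  delete v.sval;
  v.sval = 0;
}
```

Every value created by `make_identifier' must eventually reach either a grammar action that deletes it or a cleanup like `discard_identifier'; nothing else reclaims the string.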
7766
7767 ---------- Footnotes ----------
7768
7769 (1) In the future techniques to allow complex types within
7770 pseudo-unions (similar to Boost variants) might be implemented to
7771 alleviate these issues.
7772
7773 
7774 File: bison.info, Node: C++ Location Values, Next: C++ Parser Interface, Prev: C++ Semantic Values, Up: C++ Parsers
7775
7776 10.1.3 C++ Location Values
7777 --------------------------
7778
7779 When the directive `%locations' is used, the C++ parser supports
7780 location tracking, see *Note Locations Overview: Locations. Two
7781 auxiliary classes define a `position', a single point in a file, and a
7782 `location', a range composed of a pair of `position's (possibly
7783 spanning several files).
7784
7785 -- Method on position: std::string* file
7786 The name of the file. It will always be handled as a pointer, the
7787 parser will never duplicate nor deallocate it. As an experimental
7788 feature you may change it to `TYPE*' using `%define filename_type
7789 "TYPE"'.
7790
7791 -- Method on position: unsigned int line
7792 The line, starting at 1.
7793
7794 -- Method on position: unsigned int lines (int HEIGHT = 1)
7795 Advance by HEIGHT lines, resetting the column number.
7796
7797 -- Method on position: unsigned int column
7798 The column, starting at 0.
7799
7800 -- Method on position: unsigned int columns (int WIDTH = 1)
7801 Advance by WIDTH columns, without changing the line number.
7802
7803 -- Method on position: position& operator+= (position& POS, int WIDTH)
7804 -- Method on position: position operator+ (const position& POS, int
7805 WIDTH)
7806 -- Method on position: position& operator-= (const position& POS, int
7807 WIDTH)
7808 -- Method on position: position operator- (position& POS, int WIDTH)
7809 Various forms of syntactic sugar for `columns'.
7810
7811 -- Method on position: position operator<< (std::ostream O, const
7812 position& P)
7813 Report P on O like this: `FILE:LINE.COLUMN', or `LINE.COLUMN' if
7814 FILE is null.
7815
7816 -- Method on location: position begin
7817 -- Method on location: position end
7818 The first, inclusive, position of the range, and the first beyond.
7819
7820 -- Method on location: unsigned int columns (int WIDTH = 1)
7821 -- Method on location: unsigned int lines (int HEIGHT = 1)
7822 Advance the `end' position.
7823
7824 -- Method on location: location operator+ (const location& BEGIN,
7825 const location& END)
7826 -- Method on location: location operator+ (const location& BEGIN, int
7827 WIDTH)
7828 -- Method on location: location operator+= (const location& LOC, int
7829 WIDTH)
7830 Various forms of syntactic sugar.
7831
7832 -- Method on location: void step ()
7833 Move `begin' onto `end'.
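The behavior described above can be summarized with a pair of simplified stand-in classes. This is only a sketch: `pos' and `loc' are hypothetical names, the operators and stream output are omitted, and the starting values follow the descriptions above rather than the generated `position.hh' and `location.hh', which remain the reference.

```cpp
#include <string>

// Simplified stand-in for the generated `position' class.
struct pos
{
  std::string *file;     // never duplicated nor deallocated by the parser
  unsigned int line;     // starts at 1
  unsigned int column;   // starts at 0, as documented above

  pos () : file (0), line (1), column (0) {}

  // Advance by HEIGHT lines, resetting the column number.
  void lines (int height = 1) { line += height; column = 0; }

  // Advance by WIDTH columns, without changing the line number.
  void columns (int width = 1) { column += width; }
};

// Simplified stand-in for the generated `location' class.
struct loc
{
  pos begin;   // first position of the range, inclusive
  pos end;     // first position beyond the range

  // Both of these advance the `end' position only.
  void columns (int width = 1) { end.columns (width); }
  void lines (int height = 1) { end.lines (height); }

  // Move `begin' onto `end'.
  void step () { begin = end; }
};
```

A scanner typically calls `step' when it starts a new token and `columns' or `lines' as it consumes text, so that `begin' and `end' bracket the token just read.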
7834
7835 
7836 File: bison.info, Node: C++ Parser Interface, Next: C++ Scanner Interface, Prev: C++ Location Values, Up: C++ Parsers
7837
7838 10.1.4 C++ Parser Interface
7839 ---------------------------
7840
7841 The output files `OUTPUT.hh' and `OUTPUT.cc' declare and define the
7842 parser class in the namespace `yy'. The class name defaults to
7843 `parser', but may be changed using `%define parser_class_name "NAME"'.
7844 The interface of this class is detailed below. It can be extended
7845 using the `%parse-param' feature: its semantics is slightly changed
7846 since it describes an additional member of the parser class, and an
7847 additional argument for its constructor.
7848
7849 -- Type of parser: semantic_value_type
7850 -- Type of parser: location_value_type
7851 The types for semantic values and locations.
7852
7853 -- Method on parser: parser (TYPE1 ARG1, ...)
7854 Build a new parser object. There are no arguments by default,
7855 unless `%parse-param {TYPE1 ARG1}' was used.
7856
7857 -- Method on parser: int parse ()
7858 Run the syntactic analysis, and return 0 on success, 1 otherwise.
7859
7860 -- Method on parser: std::ostream& debug_stream ()
7861 -- Method on parser: void set_debug_stream (std::ostream& O)
7862 Get or set the stream used for tracing the parsing. It defaults to
7863 `std::cerr'.
7864
7865 -- Method on parser: debug_level_type debug_level ()
7866 -- Method on parser: void set_debug_level (debug_level L)
7867 Get or set the tracing level. Currently its value is either 0, no
7868 trace, or nonzero, full tracing.
7869
7870 -- Method on parser: void error (const location_type& L, const
7871 std::string& M)
7872 The definition for this member function must be supplied by the
7873 user: the parser uses it to report a parser error occurring at L,
7874 described by M.
7875
7876 
7877 File: bison.info, Node: C++ Scanner Interface, Next: A Complete C++ Example, Prev: C++ Parser Interface, Up: C++ Parsers
7878
7879 10.1.5 C++ Scanner Interface
7880 ----------------------------
7881
7882 The parser invokes the scanner by calling `yylex'. Contrary to C
7883 parsers, C++ parsers are always pure: there is no point in using the
7884 `%define api.pure' directive. Therefore the interface is as follows.
7885
7886 -- Method on parser: int yylex (semantic_value_type& YYLVAL,
7887 location_type& YYLLOC, TYPE1 ARG1, ...)
7888 Return the next token. Its type is the return value, its semantic
7889 value and location being YYLVAL and YYLLOC. Invocations of
7890 `%lex-param {TYPE1 ARG1}' yield additional arguments.
7891
7892 
7893 File: bison.info, Node: A Complete C++ Example, Prev: C++ Scanner Interface, Up: C++ Parsers
7894
7895 10.1.6 A Complete C++ Example
7896 -----------------------------
7897
7898 This section demonstrates the use of a C++ parser with a simple but
7899 complete example. This example should be available on your system,
7900 ready to compile, in the directory "../bison/examples/calc++". It
7901 focuses on the use of Bison, therefore the design of the various C++
7902 classes is very naive: no accessors, no encapsulation of members etc.
7903 We will use a Lex scanner, and more precisely, a Flex scanner, to
7904 demonstrate the various interactions. A hand-written scanner is
7905 actually easier to interface with.
7906
7907 * Menu:
7908
7909 * Calc++ --- C++ Calculator:: The specifications
7910 * Calc++ Parsing Driver:: An active parsing context
7911 * Calc++ Parser:: A parser class
7912 * Calc++ Scanner:: A pure C++ Flex scanner
7913 * Calc++ Top Level:: Conducting the band
7914
7915 
7916 File: bison.info, Node: Calc++ --- C++ Calculator, Next: Calc++ Parsing Driver, Up: A Complete C++ Example
7917
7918 10.1.6.1 Calc++ -- C++ Calculator
7919 .................................
7920
7921 Of course the grammar is dedicated to arithmetics, a single expression,
7922 possibly preceded by variable assignments. An environment containing
7923 possibly predefined variables such as `one' and `two', is exchanged
7924 with the parser. An example of valid input follows.
7925
7926 three := 3
7927 seven := one + two * three
7928 seven * seven
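To make the role of the environment concrete, the sample input can be evaluated by hand with a plain map. This is a sketch only: `evaluate_sample' is a hypothetical name, and it assumes that `one' and `two' are predefined as 1 and 2 and that `*' binds tighter than `+', as in ordinary arithmetic.

```cpp
#include <map>
#include <string>

// Hand-evaluates the sample input, using a map as the environment.
// The function name is hypothetical; a real parser performs these steps
// in its semantic actions.
inline int
evaluate_sample ()
{
  std::map<std::string, int> variables;
  variables["one"] = 1;   // predefined
  variables["two"] = 2;   // predefined

  // three := 3
  variables["three"] = 3;
  // seven := one + two * three
  variables["seven"]
    = variables["one"] + variables["two"] * variables["three"];
  // seven * seven
  return variables["seven"] * variables["seven"];
}
```

Under those assumptions the final expression evaluates to 49.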
7929
7930 
7931 File: bison.info, Node: Calc++ Parsing Driver, Next: Calc++ Parser, Prev: Calc++ --- C++ Calculator, Up: A Complete C++ Example
7932
7933 10.1.6.2 Calc++ Parsing Driver
7934 ..............................
7935
7936 To support a pure interface with the parser (and the scanner) the
7937 technique of the "parsing context" is convenient: a structure
7938 containing all the data to exchange. Since, in addition to simply
7939 launching the parsing, there are several auxiliary tasks to execute (open
7940 the file for parsing, instantiate the parser etc.), we recommend
7941 transforming the simple parsing context structure into a fully blown
7942 "parsing driver" class.
7943
7944 The declaration of this driver class, `calc++-driver.hh', is as
7945 follows. The first part includes the CPP guard and imports the
7946 required standard library components, and the declaration of the parser
7947 class.
7948
7949 #ifndef CALCXX_DRIVER_HH
7950 # define CALCXX_DRIVER_HH
7951 # include <string>
7952 # include <map>
7953 # include "calc++-parser.hh"
7954
7955 Then comes the declaration of the scanning function. Flex expects the
7956 signature of `yylex' to be defined in the macro `YY_DECL', and the C++
7957 parser expects it to be declared. We can factor both as follows.
7958
7959 // Tell Flex the lexer's prototype ...
7960 # define YY_DECL \
7961 yy::calcxx_parser::token_type \
7962 yylex (yy::calcxx_parser::semantic_type* yylval, \
7963 yy::calcxx_parser::location_type* yylloc, \
7964 calcxx_driver& driver)
7965 // ... and declare it for the parser's sake.
7966 YY_DECL;
7967
7968 The `calcxx_driver' class is then declared with its most obvious
7969 members.
7970
7971 // Conducting the whole scanning and parsing of Calc++.
7972 class calcxx_driver
7973 {
7974 public:
7975 calcxx_driver ();
7976 virtual ~calcxx_driver ();
7977
7978 std::map<std::string, int> variables;
7979
7980 int result;
7981
7982 To encapsulate the coordination with the Flex scanner, it is useful to
7983 have two member functions to open and close the scanning phase.
7984
7985 // Handling the scanner.
7986 void scan_begin ();
7987 void scan_end ();
7988 bool trace_scanning;
7989
7990 Similarly for the parser itself.
7991
7992 // Run the parser. Return 0 on success.
7993 int parse (const std::string& f);
7994 std::string file;
7995 bool trace_parsing;
7996
7997 To demonstrate pure handling of parse errors, instead of simply dumping
7998 them on the standard error output, we will pass them to the compiler
7999 driver using the following two member functions. Finally, we close the
8000 class declaration and CPP guard.
8001
8002 // Error handling.
8003 void error (const yy::location& l, const std::string& m);
8004 void error (const std::string& m);
8005 };
8006 #endif // ! CALCXX_DRIVER_HH
8007
8008 The implementation of the driver is straightforward. The `parse'
8009 member function deserves some attention. The `error' functions are
8010 simple stubs, they should actually register the located error messages
8011 and set error state.
8012
8013 #include "calc++-driver.hh"
8014 #include "calc++-parser.hh"
8015
8016 calcxx_driver::calcxx_driver ()
8017 : trace_scanning (false), trace_parsing (false)
8018 {
8019 variables["one"] = 1;
8020 variables["two"] = 2;
8021 }
8022
8023 calcxx_driver::~calcxx_driver ()
8024 {
8025 }
8026
8027 int
8028 calcxx_driver::parse (const std::string &f)
8029 {
8030 file = f;
8031 scan_begin ();
8032 yy::calcxx_parser parser (*this);
8033 parser.set_debug_level (trace_parsing);
8034 int res = parser.parse ();
8035 scan_end ();
8036 return res;
8037 }
8038
8039 void
8040 calcxx_driver::error (const yy::location& l, const std::string& m)
8041 {
8042 std::cerr << l << ": " << m << std::endl;
8043 }
8044
8045 void
8046 calcxx_driver::error (const std::string& m)
8047 {
8048 std::cerr << m << std::endl;
8049 }
8050
8051 
8052 File: bison.info, Node: Calc++ Parser, Next: Calc++ Scanner, Prev: Calc++ Parsing Driver, Up: A Complete C++ Example
8053
8054 10.1.6.3 Calc++ Parser
8055 ......................
8056
8057 The parser definition file `calc++-parser.yy' starts by asking for the
8058 C++ LALR(1) skeleton, the creation of the parser header file, and
8059 specifies the name of the parser class. Because the C++ skeleton
8060 changed several times, it is safer to require the version you designed
8061 the grammar for.
8062
8063 %skeleton "lalr1.cc" /* -*- C++ -*- */
8064 %require "2.4.1"
8065 %defines
8066 %define parser_class_name "calcxx_parser"
8067
8068 Then come the declarations/inclusions needed to define the `%union'.
8069 Because the parser uses the parsing driver and reciprocally, both
8070 cannot include the header of the other. Because the driver's header
8071 needs detailed knowledge about the parser class (in particular its
8072 inner types), it is the parser's header which will simply use a forward
8073 declaration of the driver. *Note %code: Decl Summary.
8074
8075 %code requires {
8076 # include <string>
8077 class calcxx_driver;
8078 }
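The same arrangement can be reproduced in a single file to see why a forward declaration is enough. In this sketch `parser_sketch' and `driver_sketch' are hypothetical stand-ins for the generated parser class and for `calcxx_driver'; only the ordering of declarations is the point.

```cpp
#include <string>

// Forward declaration: sufficient for declaring references and pointers.
class driver_sketch;

// Plays the role of the parser header: it only names the driver type.
class parser_sketch
{
public:
  explicit parser_sketch (driver_sketch &d) : driver (d) {}
  void report (const std::string &m);   // defined once the driver is complete
private:
  driver_sketch &driver;
};

// Plays the role of the driver header: it needs the complete parser type.
class driver_sketch
{
public:
  driver_sketch () : errors (0) {}
  int run () { parser_sketch p (*this); p.report ("oops"); return errors; }
  void error (const std::string &) { ++errors; }
  int errors;
};

// Plays the role of the parser's `.cc' file: both types are complete
// here, so members of the driver can be used.
inline void
parser_sketch::report (const std::string &m)
{
  driver.error (m);
}
```

The parser side never needs the driver's members until its implementation, which is why the full driver header can be deferred to the `.cc' file.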
8079
8080 The driver is passed by reference to the parser and to the scanner.
8081 This provides a simple but effective pure interface, not relying on
8082 global variables.
8083
8084 // The parsing context.
8085 %parse-param { calcxx_driver& driver }
8086 %lex-param { calcxx_driver& driver }
8087
8088 Then we request the location tracking feature, and initialize the first
8089 location's file name. Afterwards new locations are computed relatively
8090 to the previous locations: the file name will be automatically
8091 propagated.
8092
8093 %locations
8094 %initial-action
8095 {
8096 // Initialize the initial location.
8097 @$.begin.filename = @$.end.filename = &driver.file;
8098 };
8099
8100 Use the two following directives to enable parser tracing and verbose
8101 error messages.
8102
8103 %debug
8104 %error-verbose
8105
8106 Semantic values cannot use "real" objects, but only pointers to them.
8107
8108 // Symbols.
8109 %union
8110 {
8111 int ival;
8112 std::string *sval;
8113 };
8114
8115 The code between `%code {' and `}' is output in the `*.cc' file; it
8116 needs detailed knowledge about the driver.
8117
8118 %code {
8119 # include "calc++-driver.hh"
8120 }
8121
8122 The token numbered as 0 corresponds to end of file; the following line
8123 allows for nicer error messages referring to "end of file" instead of
8124 "$end". Similarly, user-friendly names are provided for each symbol.
8125 Note that the scanner refers to the token names through the parser's
8126 `token' type (for instance `token::END'), which avoids name clashes.
8127
8128 %token END 0 "end of file"
8129 %token ASSIGN ":="
8130 %token <sval> IDENTIFIER "identifier"
8131 %token <ival> NUMBER "number"
8132 %type <ival> exp
8133
8134 To enable memory deallocation during error recovery, use `%destructor'.
8135
8136 %printer { debug_stream () << *$$; } "identifier"
8137 %destructor { delete $$; } "identifier"
8138
8139 %printer { debug_stream () << $$; } <ival>
8140
8141 The grammar itself is straightforward.
8142
8143 %%
8144 %start unit;
8145 unit: assignments exp { driver.result = $2; };
8146
8147 assignments: assignments assignment {}
8148 | /* Nothing. */ {};
8149
8150 assignment:
8151 "identifier" ":=" exp
8152 { driver.variables[*$1] = $3; delete $1; };
8153
8154 %left '+' '-';
8155 %left '*' '/';
8156 exp: exp '+' exp { $$ = $1 + $3; }
8157 | exp '-' exp { $$ = $1 - $3; }
8158 | exp '*' exp { $$ = $1 * $3; }
8159 | exp '/' exp { $$ = $1 / $3; }
8160 | "identifier" { $$ = driver.variables[*$1]; delete $1; }
8161 | "number" { $$ = $1; };
8162 %%
8163
8164 Finally the `error' member function registers the errors to the driver.
8165
8166 void
8167 yy::calcxx_parser::error (const yy::calcxx_parser::location_type& l,
8168 const std::string& m)
8169 {
8170 driver.error (l, m);
8171 }
8172
8173 
8174 File: bison.info, Node: Calc++ Scanner, Next: Calc++ Top Level, Prev: Calc++ Parser, Up: A Complete C++ Example
8175
8176 10.1.6.4 Calc++ Scanner
8177 .......................
8178
8179 The Flex scanner first includes the driver declaration, then the
8180 parser's to get the set of defined tokens.
8181
8182 %{ /* -*- C++ -*- */
8183 # include <cstdlib>
8184 # include <errno.h>
8185 # include <limits.h>
8186 # include <string>
8187 # include "calc++-driver.hh"
8188 # include "calc++-parser.hh"
8189
8190 /* Work around an incompatibility in flex (at least versions
8191 2.5.31 through 2.5.33): it generates code that does
8192 not conform to C89. See Debian bug 333231
8193 <http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=333231>. */
8194 # undef yywrap
8195 # define yywrap() 1
8196
8197 /* By default yylex returns int, we use token_type.
8198 Unfortunately yyterminate by default returns 0, which is
8199 not of token_type. */
8200 #define yyterminate() return token::END
8201 %}
8202
8203 Because there is no `#include'-like feature we don't need `yywrap', we
8204 don't need `unput' either, and we parse an actual file, this is not an
8205 interactive session with the user. Finally we enable the scanner
8206 tracing features.
8207
8208 %option noyywrap nounput batch debug
8209
8210 Abbreviations allow for more readable rules.
8211
8212 id [a-zA-Z][a-zA-Z_0-9]*
8213 int [0-9]+
8214 blank [ \t]
8215
8216 The following paragraph suffices to track locations accurately. Each
8217 time `yylex' is invoked, the begin position is moved onto the end
8218 position. Then when a pattern is matched, the end position is advanced
8219 by its width. In case it matched ends of lines, the end cursor is
8220 adjusted, and each time blanks are matched, the begin cursor is moved
8221 onto the end cursor to effectively ignore the blanks preceding tokens.
8222 Comments would be treated equally.
8223
8224 %{
8225 # define YY_USER_ACTION yylloc->columns (yyleng);
8226 %}
8227 %%
8228 %{
8229 yylloc->step ();
8230 %}
8231 {blank}+ yylloc->step ();
8232 [\n]+ yylloc->lines (yyleng); yylloc->step ();
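The bookkeeping done by these rules can be simulated on a plain string. The following sketch is not Flex output: `span', `is_blank' and `track' are hypothetical names, tokens are simplified to runs of non-blank characters, and columns are assumed to start at 0.

```cpp
#include <string>
#include <vector>

// Extent of one token; a hypothetical stand-in for the generated location.
struct span
{
  unsigned int line, begin_col, end_col;
};

inline bool
is_blank (char c)
{
  return c == ' ' || c == '\t';
}

// Record where each token of INPUT begins and ends, mimicking the rules.
inline std::vector<span>
track (const std::string &input)
{
  std::vector<span> out;
  unsigned int line = 1, begin = 0, end = 0;
  std::string::size_type i = 0;
  while (i < input.size ())
    {
      begin = end;                                   // yylloc->step ()
      std::string::size_type j = i;
      if (input[i] == '\n')
        {
          while (j < input.size () && input[j] == '\n')
            ++j;
          line += static_cast<unsigned int> (j - i); // yylloc->lines (yyleng)
          end = 0;
        }
      else if (is_blank (input[i]))
        {
          while (j < input.size () && is_blank (input[j]))
            ++j;
          end += static_cast<unsigned int> (j - i);  // yylloc->columns (yyleng)
        }
      else
        {
          while (j < input.size () && !is_blank (input[j]) && input[j] != '\n')
            ++j;
          end += static_cast<unsigned int> (j - i);  // yylloc->columns (yyleng)
          span s = { line, begin, end };
          out.push_back (s);
        }
      i = j;
    }
  return out;
}
```

Because the begin cursor is moved onto the end cursor before every match, blanks and newlines never widen the range reported for the token that follows them.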
8233
8234 The rules are simple, just note the use of the driver to report errors.
8235 It is convenient to use a typedef to shorten
8236 `yy::calcxx_parser::token::identifier' into `token::identifier' for
8237 instance.
8238
8239 %{
8240 typedef yy::calcxx_parser::token token;
8241 %}
8242 /* Convert ints to the actual type of tokens. */
8243 [-+*/] return yy::calcxx_parser::token_type (yytext[0]);
8244 ":=" return token::ASSIGN;
8245 {int} {
8246 errno = 0;
8247 long n = strtol (yytext, NULL, 10);
8248 if (! (INT_MIN <= n && n <= INT_MAX && errno != ERANGE))
8249 driver.error (*yylloc, "integer is out of range");
8250 yylval->ival = n;
8251 return token::NUMBER;
8252 }
8253 {id} yylval->sval = new std::string (yytext); return token::IDENTIFIER;
8254 . driver.error (*yylloc, "invalid character");
8255 %%
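The range check in the `{int}' rule can be tried in isolation. This is a sketch: `to_int' and its `bool' result are hypothetical, whereas the rule above reports the problem through the driver and still stores the converted value.

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>

// Convert TEXT to an int using the same test as the {int} rule.
// Returns false, leaving OUT untouched, when the value does not fit.
inline bool
to_int (const char *text, int &out)
{
  errno = 0;
  long n = std::strtol (text, NULL, 10);
  if (! (INT_MIN <= n && n <= INT_MAX && errno != ERANGE))
    return false;
  out = static_cast<int> (n);
  return true;
}
```

Both tests are needed: where `long' is wider than `int' the comparison against `INT_MIN' and `INT_MAX' catches values that `strtol' converted successfully, and where the two types have the same width only `ERANGE' reveals the overflow.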
8256
8257 Finally, because the scanner-related member functions of the driver
8258 depend on the scanner's data, it is simpler to implement them in this file.
8259
8260 void
8261 calcxx_driver::scan_begin ()
8262 {
8263 yy_flex_debug = trace_scanning;
8264 if (file == "-")
8265 yyin = stdin;
8266 else if (!(yyin = fopen (file.c_str (), "r")))
8267 {
8268 error (std::string ("cannot open ") + file);
8269 exit (1);
8270 }
8271 }
8272
8273 void
8274 calcxx_driver::scan_end ()
8275 {
8276 fclose (yyin);
8277 }
8278
8279 
8280 File: bison.info, Node: Calc++ Top Level, Prev: Calc++ Scanner, Up: A Complete C++ Example
8281
8282 10.1.6.5 Calc++ Top Level
8283 .........................
8284
8285 The top level file, `calc++.cc', poses no problem.
8286
8287 #include <iostream>
8288 #include "calc++-driver.hh"
8289
8290 int
8291 main (int argc, char *argv[])
8292 {
8293 calcxx_driver driver;
8294 for (++argv; argv[0]; ++argv)
8295 if (*argv == std::string ("-p"))
8296 driver.trace_parsing = true;
8297 else if (*argv == std::string ("-s"))
8298 driver.trace_scanning = true;
8299 else if (!driver.parse (*argv))
8300 std::cout << driver.result << std::endl;
8301 }
8302
8303 
8304 File: bison.info, Node: Java Parsers, Prev: C++ Parsers, Up: Other Languages
8305
8306 10.2 Java Parsers
8307 =================
8308
8309 * Menu:
8310
8311 * Java Bison Interface:: Asking for Java parser generation
8312 * Java Semantic Values:: %type and %token vs. Java
8313 * Java Location Values:: The position and location classes
8314 * Java Parser Interface:: Instantiating and running the parser
8315 * Java Scanner Interface:: Specifying the scanner for the parser
8316 * Java Action Features:: Special features for use in actions
8317 * Java Differences:: Differences between C/C++ and Java Grammars
8318 * Java Declarations Summary:: List of Bison declarations used with Java
8319
8320 
8321 File: bison.info, Node: Java Bison Interface, Next: Java Semantic Values, Up: Java Parsers
8322
8323 10.2.1 Java Bison Interface
8324 ---------------------------
8325
8326 (The current Java interface is experimental and may evolve. More user
8327 feedback will help to stabilize it.)
8328
8329 The Java parser skeletons are selected using the `%language "Java"'
8330 directive or the `-L java'/`--language=java' option.
8331
8332 When generating a Java parser, `bison BASENAME.y' will create a
8333 single Java source file named `BASENAME.java'. Using an input file
8334 without a `.y' suffix is currently broken. The basename of the output
8335 file can be changed by the `%file-prefix' directive or the
8336 `-b'/`--file-prefix' option. The entire output file name can be
8337 changed by the `%output' directive or the `-o'/`--output' option. The
8338 output file contains a single class for the parser.
8339
8340 You can create documentation for generated parsers using Javadoc.
8341
8342 Contrary to C parsers, Java parsers do not use global variables; the
8343 state of the parser is always local to an instance of the parser class.
8344 Therefore, all Java parsers are "pure", and the `%pure-parser' and
8345 `%define api.pure' directives do not do anything when used in Java.
8346
8347 Push parsers are currently unsupported in Java and `%define
8348 api.push_pull' has no effect.
8349
8350 GLR parsers are currently unsupported in Java. Do not use the
8351 `%glr-parser' directive.
8352
8353 No header file can be generated for Java parsers. Do not use the
8354 `%defines' directive or the `-d'/`--defines' options.
8355
8356 Currently, support for debugging and verbose errors is always
8357 compiled in. Thus the `%debug' and `%token-table' directives and the
8358 `-t'/`--debug' and `-k'/`--token-table' options have no effect. This
8359 may change in the future to eliminate unused code in the generated
8360 parser, so use `%debug' and `%error-verbose' explicitly if needed.
8361 Also, in the future the `%token-table' directive might enable a public
8362 interface to access the token names and codes.
8363
8364 
8365 File: bison.info, Node: Java Semantic Values, Next: Java Location Values, Prev: Java Bison Interface, Up: Java Parsers
8366
8367 10.2.2 Java Semantic Values
8368 ---------------------------
8369
8370 There is no `%union' directive in Java parsers. Instead, the semantic
8371 values' types (class names) should be specified in the `%type' or
8372 `%token' directive:
8373
8374 %type <Expression> expr assignment_expr term factor
8375 %type <Integer> number
8376
8377 By default, the semantic stack is declared to have `Object' members,
8378 which means that the class types you specify can be of any class. To
8379 improve the type safety of the parser, you can declare the common
8380 superclass of all the semantic values using the `%define stype'
8381 directive. For example, after the following declaration:
8382
8383 %define stype "ASTNode"
8384
8385 any `%type' or `%token' specifying a semantic type which is not a
8386 subclass of ASTNode, will cause a compile-time error.
8387
8388 Types used in the directives may be qualified with a package name.
8389 Primitive data types are accepted for Java version 1.5 or later. Note
8390 that in this case the autoboxing feature of Java 1.5 will be used.
8391 Generic types may not be used; this is due to a limitation in the
8392 implementation of Bison, and may change in future releases.
8393
8394 Java parsers do not support `%destructor', since the language adopts
8395 garbage collection. The parser will try to hold references to semantic
8396 values for as little time as needed.
8397
8398 Java parsers do not support `%printer', as `toString()' can be used
8399 to print the semantic values. This however may change (in a
8400 backwards-compatible way) in future versions of Bison.
8401
8402 
8403 File: bison.info, Node: Java Location Values, Next: Java Parser Interface, Prev: Java Semantic Values, Up: Java Parsers
8404
8405 10.2.3 Java Location Values
8406 ---------------------------
8407
8408 When the directive `%locations' is used, the Java parser supports
8409 location tracking, see *Note Locations Overview: Locations. An
8410 auxiliary user-defined class defines a "position", a single point in a
8411 file; Bison itself defines a class representing a "location", a range
8412 composed of a pair of positions (possibly spanning several files). The
8413 location class is an inner class of the parser; the name is `Location'
8414 by default, and may also be renamed using `%define location_type
8415 "CLASS-NAME"'.
8416
8417 The location class treats the position as a completely opaque value.
8418 By default, the class name is `Position', but this can be changed with
8419 `%define position_type "CLASS-NAME"'. This class must be supplied by
8420 the user.
8421
8422 -- Instance Variable of Location: Position begin
8423 -- Instance Variable of Location: Position end
8424 The first, inclusive, position of the range, and the first beyond.
8425
8426 -- Constructor on Location: Location (Position LOC)
8427 Create a `Location' denoting an empty range located at a given
8428 point.
8429
8430 -- Constructor on Location: Location (Position BEGIN, Position END)
8431 Create a `Location' from the endpoints of the range.
8432
8433 -- Method on Location: String toString ()
8434 Prints the range represented by the location. For this to work
8435 properly, the position class should override the `equals' and
8436 `toString' methods appropriately.
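
As a rough sketch of the user-supplied position class described above: the fields `line` and `column` are assumptions chosen for illustration, not something Bison prescribes. All the manual asks is that the class exist and, for `Location.toString()' to print sensibly, that it override `equals' and `toString'.

```java
import java.util.Objects;

// Hypothetical user-supplied position class; the field names are illustrative.
final class Position {
    final int line;
    final int column;

    Position(int line, int column) {
        this.line = line;
        this.column = column;
    }

    // Two positions are equal when they denote the same point.
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof Position)) {
            return false;
        }
        Position p = (Position) other;
        return line == p.line && column == p.column;
    }

    // Kept consistent with equals, as the Object contract requires.
    @Override
    public int hashCode() {
        return Objects.hash(line, column);
    }

    // Renders the position as "line.column".
    @Override
    public String toString() {
        return line + "." + column;
    }
}
```

The "line.column" rendering is only one possible choice; any format works as long as `equals' lets the location class recognize an empty range.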
8437
8438 
8439 File: bison.info, Node: Java Parser Interface, Next: Java Scanner Interface, Prev: Java Location Values, Up: Java Parsers
8440
8441 10.2.4 Java Parser Interface
8442 ----------------------------
8443
8444 The name of the generated parser class defaults to `YYParser'. The
8445 `YY' prefix may be changed using the `%name-prefix' directive or the
8446 `-p'/`--name-prefix' option. Alternatively, use `%define
8447 parser_class_name "NAME"' to give a custom name to the class. The
8448 interface of this class is detailed below.
8449
8450 By default, the parser class has package visibility. A declaration
8451 `%define public' will change to public visibility. Remember that,
8452 according to the Java language specification, the name of the `.java'
8453 file should match the name of the class in this case. Similarly, you
8454 can use `abstract', `final' and `strictfp' with the `%define'
8455 declaration to add other modifiers to the parser class.
8456
8457 The Java package name of the parser class can be specified using the
8458 `%define package' directive. The superclass and the implemented
8459 interfaces of the parser class can be specified with the `%define
8460 extends' and `%define implements' directives.
8461
8462 The parser class defines an inner class, `Location', that is used
8463 for location tracking (see *Note Java Location Values::), and an inner
8464 interface, `Lexer' (see *Note Java Scanner Interface::). Other than
8465 this inner class and interface, and the members described in the interface
8466 below, all the other members and fields are prefixed with `yy' or
8467 `YY' prefix to avoid clashes with user code.
8468
8469 The parser class can be extended using the `%parse-param' directive.
8470 Each occurrence of the directive will add a `protected final' field to
8471 the parser class, and an argument to its constructor, which initializes
8472 the field automatically.
8473
8474 Token names defined by `%token' and the predefined `EOF' token name
8475 are added as constant fields to the parser class.
8476
8477 -- Constructor on YYParser: YYParser (LEX_PARAM, ..., PARSE_PARAM,
8478 ...)
8479 Build a new parser object with embedded `%code lexer'. There are
8480 no parameters, unless `%parse-param's and/or `%lex-param's are
8481 used.
8482
8483 -- Constructor on YYParser: YYParser (Lexer LEXER, PARSE_PARAM, ...)
8484 Build a new parser object using the specified scanner. There are
8485 no additional parameters unless `%parse-param's are used.
8486
8487 If the scanner is defined by `%code lexer', this constructor is
8488 declared `protected' and is called automatically with a scanner
8489 created with the correct `%lex-param's.
8490
8491 -- Method on YYParser: boolean parse ()
8492 Run the syntactic analysis, and return `true' on success, `false'
8493 otherwise.
8494
8495 -- Method on YYParser: boolean recovering ()
8496 During the syntactic analysis, return `true' if recovering from a
8497 syntax error. *Note Error Recovery::.
8498
8499 -- Method on YYParser: java.io.PrintStream getDebugStream ()
8500 -- Method on YYParser: void setDebugStream (java.io.PrintStream O)
8501 Get or set the stream used for tracing the parsing. It defaults to
8502 `System.err'.
8503
8504 -- Method on YYParser: int getDebugLevel ()
8505 -- Method on YYParser: void setDebugLevel (int L)
8506 Get or set the tracing level. Currently its value is either 0, no
8507 trace, or nonzero, full tracing.
8508
8509 
8510 File: bison.info, Node: Java Scanner Interface, Next: Java Action Features, Prev: Java Parser Interface, Up: Java Parsers
8511
8512 10.2.5 Java Scanner Interface
8513 -----------------------------
8514
8515 There are two possible ways to interface a Bison-generated Java parser
8516 with a scanner: the scanner may be defined by `%code lexer', or defined
8517 elsewhere. In either case, the scanner has to implement the `Lexer'
8518 inner interface of the parser class.
8519
8520 In the first case, the body of the scanner class is placed in `%code
8521 lexer' blocks. If you want to pass parameters from the parser
8522 constructor to the scanner constructor, specify them with `%lex-param';
8523 they are passed before `%parse-param's to the constructor.
8524
8525 In the second case, the scanner has to implement the `Lexer'
8526 interface, which is defined within the parser class (e.g.,
8527 `YYParser.Lexer'). The constructor of the parser object will then
8528 accept an object implementing the interface; `%lex-param' is not used
8529 in this case.
8530
8531 In both cases, the scanner has to implement the following methods.
8532
8533 -- Method on Lexer: void yyerror (Location LOC, String MSG)
8534 This method is defined by the user to emit an error message. The
8535 first parameter is omitted if location tracking is not active.
8536 Its type can be changed using `%define location_type "CLASS-NAME"'.
8537
8538 -- Method on Lexer: int yylex ()
8539 Return the next token. Its type is the return value; its semantic
8540 value and location are saved and returned by the other methods in
8541 the interface.
8542
8543 Use `%define lex_throws' to specify any uncaught exceptions.
8544 Default is `java.io.IOException'.
8545
8546 -- Method on Lexer: Position getStartPos ()
8547 -- Method on Lexer: Position getEndPos ()
8548 Return respectively the first position of the last token that
8549 `yylex' returned, and the first position beyond it. These methods
8550 are not needed unless location tracking is active.
8551
8552 The return type can be changed using `%define position_type
8553 "CLASS-NAME"'.
8554
8555 -- Method on Lexer: Object getLVal ()
8556 Return the semantic value of the last token that `yylex' returned.
8557
8558 The return type can be changed using `%define stype "CLASS-NAME"'.
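
The second case (a scanner defined outside the grammar) might look roughly like the sketch below. Because the real `YYParser.Lexer' only exists in generated code, a stand-in interface is declared here so that the example compiles on its own. The token code `NUM', the value given to `EOF', and the class name `NumberLexer' are assumptions made for illustration. Location tracking is assumed to be off, so `yyerror' takes only the message and `getStartPos'/`getEndPos' are omitted.

```java
import java.io.IOException;

// Stand-in for the inner interface that Bison would generate as
// YYParser.Lexer when location tracking is off; declared here only so
// that the sketch compiles without a generated parser.
interface Lexer {
    int EOF = 0;    // end-of-input token; value assumed for this sketch
    int NUM = 258;  // hypothetical code for a token declared with %token NUM

    int yylex() throws IOException;  // mirrors the default lex_throws
    Object getLVal();
    void yyerror(String msg);
}

// Hand-written scanner: each run of decimal digits becomes one NUM token
// whose semantic value is an Integer; every other character is skipped.
final class NumberLexer implements Lexer {
    private final String input;
    private int pos = 0;
    private Object lval = null;

    NumberLexer(String input) {
        this.input = input;
    }

    @Override
    public int yylex() {
        // Skip everything that cannot start a number.
        while (pos < input.length() && !Character.isDigit(input.charAt(pos))) {
            pos++;
        }
        if (pos == input.length()) {
            return EOF;
        }
        // Accumulate the digits of one number.
        int value = 0;
        while (pos < input.length() && Character.isDigit(input.charAt(pos))) {
            value = value * 10 + Character.digit(input.charAt(pos), 10);
            pos++;
        }
        lval = value;  // autoboxed to Integer
        return NUM;
    }

    @Override
    public Object getLVal() {
        return lval;
    }

    @Override
    public void yyerror(String msg) {
        System.err.println(msg);
    }
}
```

With a generated parser, an instance of such a class would be passed to the `YYParser (Lexer LEXER, ...)' constructor; the scanner saves the semantic value in `yylex' and hands it out later through `getLVal', as the interface requires.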
8559
8560 
8561 File: bison.info, Node: Java Action Features, Next: Java Differences, Prev: Java Scanner Interface, Up: Java Parsers
8562
8563 10.2.6 Special Features for Use in Java Actions
8564 -----------------------------------------------
8565
8566 The following special constructs can be used in Java actions. Other
8567 analogous C action features are currently unavailable for Java.
8568
8569 Use `%define throws' to specify any uncaught exceptions from parser
8570 actions, and initial actions specified by `%initial-action'.
8571
8572 -- Variable: $N
8573 The semantic value for the Nth component of the current rule.
8574 This may not be assigned to. *Note Java Semantic Values::.
8575
8576 -- Variable: $<TYPEALT>N
8577 Like `$N' but specifies an alternative type TYPEALT. *Note Java
8578 Semantic Values::.
8579
8580 -- Variable: $$
8581 The semantic value for the grouping made by the current rule. As a
8582 value, this is in the base type (`Object' or as specified by
8583 `%define stype') and is not cast to the declared subtype because
8584 casts are not allowed on the left-hand side of Java assignments.
8585 Use an explicit Java cast if the correct subtype is needed. *Note
8586 Java Semantic Values::.
8587
8588 -- Variable: $<TYPEALT>$
8589 Same as `$$' since Java always allows assigning to the base type.
8590 Perhaps we should use this and `$<>$' for the value and `$$' for
8591 setting the value but there is currently no easy way to distinguish
8592 these constructs. *Note Java Semantic Values::.
8593
8594 -- Variable: @N
8595 The location information of the Nth component of the current rule.
8596 This may not be assigned to. *Note Java Location Values::.
8597
8598 -- Variable: @$
8599 The location information of the grouping made by the current rule.
8600 *Note Java Location Values::.
8601
8602 -- Statement: return YYABORT;
8603 Return immediately from the parser, indicating failure. *Note
8604 Java Parser Interface::.
8605
8606 -- Statement: return YYACCEPT;
8607 Return immediately from the parser, indicating success. *Note
8608 Java Parser Interface::.
8609
8610 -- Statement: return YYERROR;
8611 Start error recovery without printing an error message. *Note
8612 Error Recovery::.
8613
8614 -- Statement: return YYFAIL;
8615 Print an error message and start error recovery. *Note Error
8616 Recovery::.
8617
8618 -- Function: boolean recovering ()
8619 Return whether error recovery is being done. In this state, the
8620 parser reads tokens until it reaches a known state, and then
8621 restarts normal operation. *Note Error Recovery::.
8622
8623 -- Function: protected void yyerror (String msg)
8624 -- Function: protected void yyerror (Position pos, String msg)
8625 -- Function: protected void yyerror (Location loc, String msg)
8626 Print an error message using the `yyerror' method of the scanner
8627 instance in use.
8628
8629 
8630 File: bison.info, Node: Java Differences, Next: Java Declarations Summary, Prev: Java Action Features, Up: Java Parsers
8631
8632 10.2.7 Differences between C/C++ and Java Grammars
8633 --------------------------------------------------
8634
8635 The different structure of the Java language forces several differences
8636 between C/C++ grammars and grammars designed for Java parsers. This
8637 section summarizes these differences.
8638
8639 * Java lacks a preprocessor, so the `YYERROR', `YYACCEPT', `YYABORT'
8640 symbols (*note Table of Symbols::) obviously cannot be macros.
8641 Instead, they should be preceded by `return' when they appear in
8642 an action. The actual definition of these symbols is opaque to
8643 the Bison grammar, and it might change in the future. The only
8644 meaningful operation that you can perform on them is to return them. See
8645 *note Java Action Features::.
8646
8647 Note that of these three symbols, only `YYACCEPT' and `YYABORT'
8648 will cause a return from the `yyparse' method(1).
8649
8650 * Java lacks unions, so `%union' has no effect. Instead, semantic
8651 values have a common base type: `Object' or as specified by
8652 `%define stype'. Angle brackets on `%token', `%type', `$N' and `$$'
8653 specify subtypes rather than fields of a union. The type of
8654 `$$', even with angle brackets, is the base type since Java casts
8655 are not allowed on the left-hand side of assignments. Also, `$N'
8656 and `@N' are not allowed on the left-hand side of assignments. See
8657 *note Java Semantic Values:: and *note Java Action Features::.
8658
8659 * The prolog declarations have a different meaning than in C/C++
8660 code.
8661 `%code imports'
8662 blocks are placed at the beginning of the Java source code.
8663 They may include copyright notices. For `package'
8664 declarations, it is suggested to use `%define package'
8665 instead.
8666
8667 unqualified `%code'
8668 blocks are placed inside the parser class.
8669
8670 `%code lexer'
8671 blocks, if specified, should include the implementation of the
8672 scanner. If there is no such block, the scanner can be any
8673 class that implements the appropriate interface (see *note
8674 Java Scanner Interface::).
8675
8676 Other `%code' blocks are not supported in Java parsers. In
8677 particular, `%{ ... %}' blocks should not be used and may give an
8678 error in future versions of Bison.
8679
8680 The epilogue has the same meaning as in C/C++ code and it can be
8681 used to define other classes used by the parser _outside_ the
8682 parser class.
8683
8684 ---------- Footnotes ----------
8685
8686 (1) Java parsers include the actions in a method separate from
8687 `yyparse' in order to have an intuitive syntax that corresponds to
8688 these C macros.
8689
8690 
8691 File: bison.info, Node: Java Declarations Summary, Prev: Java Differences, Up: Java Parsers
8692
8693 10.2.8 Java Declarations Summary
8694 --------------------------------
8695
8696 This summary includes only declarations that are specific to Java or
8697 have special meaning when used in a Java parser.
8698
8699 -- Directive: %language "Java"
8700 Generate a Java class for the parser.
8701
8702 -- Directive: %lex-param {TYPE NAME}
8703 A parameter for the lexer class defined by `%code lexer' _only_,
8704 added as parameters to the lexer constructor and the parser
8705 constructor that _creates_ a lexer. Default is none. *Note Java
8706 Scanner Interface::.
8707
8708 -- Directive: %name-prefix "PREFIX"
8709 The prefix of the parser class name `PREFIXParser' if `%define
8710 parser_class_name' is not used. Default is `YY'. *Note Java
8711 Bison Interface::.
8712
8713 -- Directive: %parse-param {TYPE NAME}
8714 A parameter for the parser class added as parameters to
8715 constructor(s) and as fields initialized by the constructor(s).
8716 Default is none. *Note Java Parser Interface::.
8717
8718 -- Directive: %token <TYPE> TOKEN ...
8719 Declare tokens. Note that the angle brackets enclose a Java
8720 _type_. *Note Java Semantic Values::.
8721
8722 -- Directive: %type <TYPE> NONTERMINAL ...
8723 Declare the type of nonterminals. Note that the angle brackets
8724 enclose a Java _type_. *Note Java Semantic Values::.
8725
8726 -- Directive: %code { CODE ... }
8727 Code appended to the inside of the parser class. *Note Java
8728 Differences::.
8729
8730 -- Directive: %code imports { CODE ... }
8731 Code inserted just after the `package' declaration. *Note Java
8732 Differences::.
8733
8734 -- Directive: %code lexer { CODE ... }
8735 Code added to the body of an inner lexer class within the parser
8736 class. *Note Java Scanner Interface::.
8737
8738 -- Directive: %% CODE ...
8739 Code (after the second `%%') appended to the end of the file,
8740 _outside_ the parser class. *Note Java Differences::.
8741
8742 -- Directive: %{ CODE ... %}
8743 Not supported. Use `%code imports' instead. *Note Java
8744 Differences::.
8745
8746 -- Directive: %define abstract
8747 Whether the parser class is declared `abstract'. Default is false.
8748 *Note Java Bison Interface::.
8749
8750 -- Directive: %define extends "SUPERCLASS"
8751 The superclass of the parser class. Default is none. *Note Java
8752 Bison Interface::.
8753
8754 -- Directive: %define final
8755 Whether the parser class is declared `final'. Default is false.
8756 *Note Java Bison Interface::.
8757
8758 -- Directive: %define implements "INTERFACES"
8759 The implemented interfaces of the parser class, a comma-separated
8760 list. Default is none. *Note Java Bison Interface::.
8761
8762 -- Directive: %define lex_throws "EXCEPTIONS"
8763 The exceptions thrown by the `yylex' method of the lexer, a
8764 comma-separated list. Default is `java.io.IOException'. *Note
8765 Java Scanner Interface::.
8766
8767 -- Directive: %define location_type "CLASS"
8768 The name of the class used for locations (a range between two
8769 positions). This class is generated as an inner class of the
8770 parser class by `bison'. Default is `Location'. *Note Java
8771 Location Values::.
8772
8773 -- Directive: %define package "PACKAGE"
8774 The package to put the parser class in. Default is none. *Note
8775 Java Bison Interface::.
8776
8777 -- Directive: %define parser_class_name "NAME"
8778 The name of the parser class. Default is `YYParser' or
8779 `NAME-PREFIXParser'. *Note Java Bison Interface::.
8780
8781 -- Directive: %define position_type "CLASS"
8782 The name of the class used for positions. This class must be
8783 supplied by the user. Default is `Position'. *Note Java Location
8784 Values::.
8785
8786 -- Directive: %define public
8787 Whether the parser class is declared `public'. Default is false.
8788 *Note Java Bison Interface::.
8789
8790 -- Directive: %define stype "CLASS"
8791 The base type of semantic values. Default is `Object'. *Note
8792 Java Semantic Values::.
8793
8794 -- Directive: %define strictfp
8795 Whether the parser class is declared `strictfp'. Default is false.
8796 *Note Java Bison Interface::.
8797
8798 -- Directive: %define throws "EXCEPTIONS"
8799 The exceptions thrown by user-supplied parser actions and
8800 `%initial-action', a comma-separated list. Default is none.
8801 *Note Java Parser Interface::.
8802
8803 
8804 File: bison.info, Node: FAQ, Next: Table of Symbols, Prev: Other Languages, Up: Top
8805
8806 11 Frequently Asked Questions
8807 *****************************
8808
8809 Several questions about Bison come up occasionally. Here some of them
8810 are addressed.
8811
8812 * Menu:
8813
8814 * Memory Exhausted:: Breaking the Stack Limits
8815 * How Can I Reset the Parser:: `yyparse' Keeps some State
8816 * Strings are Destroyed:: `yylval' Loses Track of Strings
8817 * Implementing Gotos/Loops:: Control Flow in the Calculator
8818 * Multiple start-symbols:: Factoring closely related grammars
8819 * Secure? Conform?:: Is Bison POSIX safe?
8820 * I can't build Bison:: Troubleshooting
8821 * Where can I find help?:: Troubleshouting
8822 * Bug Reports:: Troublereporting
8823 * More Languages:: Parsers in C++, Java, and so on
8824 * Beta Testing:: Experimenting with development versions
8825 * Mailing Lists:: Meeting other Bison users
8826
8827 
8828 File: bison.info, Node: Memory Exhausted, Next: How Can I Reset the Parser, Up: FAQ
8829
8830 11.1 Memory Exhausted
8831 =====================
8832
8833 My parser returns with error with a `memory exhausted'
8834 message. What can I do?
8835
8836 This question is already addressed elsewhere, *Note Recursive Rules:
8837 Recursion.
8838
8839 
8840 File: bison.info, Node: How Can I Reset the Parser, Next: Strings are Destroyed, Prev: Memory Exhausted, Up: FAQ
8841
8842 11.2 How Can I Reset the Parser
8843 ===============================
8844
8845 The following phenomenon has several symptoms, resulting in the
8846 following typical questions:
8847
8848 I invoke `yyparse' several times, and on correct input it works
8849 properly; but when a parse error is found, all the other calls fail
8850 too. How can I reset the error flag of `yyparse'?
8851
8852 or
8853
8854 My parser includes support for an `#include'-like feature, in
8855 which case I run `yyparse' from `yyparse'. This fails
8856 although I did specify `%define api.pure'.
8857
8858 These problems typically come not from Bison itself, but from
8859 Lex-generated scanners. Because these scanners use large buffers for
8860 speed, they might not notice a change of input file. As a
8861 demonstration, consider the following source file, `first-line.l':
8862
8863
8864 %{
8865 #include <stdio.h>
8866 #include <stdlib.h>
8867 %}
8868 %%
8869 .*\n ECHO; return 1;
8870 %%
8871 int
8872 yyparse (char const *file)
8873 {
8874   yyin = fopen (file, "r");
8875   if (!yyin)
8876     exit (2);
8877   /* One token only. */
8878   yylex ();
8879   if (fclose (yyin) != 0)
8880     exit (3);
8881   return 0;
8882 }
8883
8884 int
8885 main (void)
8886 {
8887   yyparse ("input");
8888   yyparse ("input");
8889   return 0;
8890 }
8891
8892 If the file `input' contains
8893
8894
8895 input:1: Hello,
8896 input:2: World!
8897
8898 then instead of getting the first line twice, you get:
8899
8900 $ flex -ofirst-line.c first-line.l
8901 $ gcc -ofirst-line first-line.c -ll
8902 $ ./first-line
8903 input:1: Hello,
8904 input:2: World!
8905
8906 Therefore, whenever you change `yyin', you must tell the
8907 Lex-generated scanner to discard its current buffer and switch to the
8908 new one. This depends upon your implementation of Lex; see its
8909 documentation for more. For Flex, it suffices to call
8910 `YY_FLUSH_BUFFER' after each change to `yyin'. If your Flex-generated
8911 scanner needs to read from several input streams to handle features
8912 like include files, you might consider using Flex functions like
8913 `yy_switch_to_buffer' that manipulate multiple input buffers.
8914
8915 If your Flex-generated scanner uses start conditions (*note Start
8916 conditions: (flex)Start conditions.), you might also want to reset the
8917 scanner's state, i.e., go back to the initial start condition, through
8918 a call to `BEGIN (0)'.
8919
8920 
8921 File: bison.info, Node: Strings are Destroyed, Next: Implementing Gotos/Loops, Prev: How Can I Reset the Parser, Up: FAQ
8922
8923 11.3 Strings are Destroyed
8924 ==========================
8925
8926 My parser seems to destroy old strings, or maybe it loses track of
8927 them. Instead of reporting `"foo", "bar"', it reports
8928 `"bar", "bar"', or even `"foo\nbar", "bar"'.
8929
8930 This error is probably the single most frequent "bug report" sent to
8931 Bison lists, but it stems only from a misunderstanding of the role
8932 of the scanner. Consider the following Lex code:
8933
8934
8935 %{
8936 #include <stdio.h>
8937 char *yylval = NULL;
8938 %}
8939 %%
8940 .* yylval = yytext; return 1;
8941 \n /* IGNORE */
8942 %%
8943 int
8944 main ()
8945 {
8946   /* Similar to using $1, $2 in a Bison action. */
8947   char *fst = (yylex (), yylval);
8948   char *snd = (yylex (), yylval);
8949   printf ("\"%s\", \"%s\"\n", fst, snd);
8950   return 0;
8951 }
8952
8953 If you compile and run this code, you get:
8954
8955 $ flex -osplit-lines.c split-lines.l
8956 $ gcc -osplit-lines split-lines.c -ll
8957 $ printf 'one\ntwo\n' | ./split-lines
8958 "one
8959 two", "two"
8960
8961 This is because `yytext' is a buffer provided for _reading_ in the
8962 action, but if you want to keep it, you have to duplicate it (e.g.,
8963 using `strdup'). Note that the output may depend on how your
8964 implementation of Lex handles `yytext'. For instance, when given the
8965 Lex compatibility option `-l' (which triggers the option `%array') Flex
8966 generates a different behavior:
8967
8968 $ flex -l -osplit-lines.c split-lines.l
8969 $ gcc -osplit-lines split-lines.c -ll
8970 $ printf 'one\ntwo\n' | ./split-lines
8971 "two", "two"
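
The hazard is not specific to C or Lex: any scanner that hands out a reference to a buffer it later overwrites behaves this way. The Java sketch below is only an analogy, not part of Bison or Flex. The class `ReusedBuffer' is hypothetical; its `StringBuilder' plays the role of `yytext', and calling `toString()' plays the role of `strdup'.

```java
// Hypothetical scanner-like class that reuses one buffer for every token,
// in the way a Lex-generated scanner reuses yytext.
final class ReusedBuffer {
    private final StringBuilder text = new StringBuilder();

    // Overwrites the shared buffer with the next token and returns the
    // buffer itself, not a copy.
    CharSequence next(String token) {
        text.setLength(0);
        text.append(token);
        return text;
    }
}

final class ReusedBufferDemo {
    // Keeps references to the shared buffer: both end up showing "two".
    static String aliased() {
        ReusedBuffer scanner = new ReusedBuffer();
        CharSequence fst = scanner.next("one");
        CharSequence snd = scanner.next("two");
        return "\"" + fst + "\", \"" + snd + "\"";
    }

    // Copies each token before asking for the next one.
    static String copied() {
        ReusedBuffer scanner = new ReusedBuffer();
        String fst = scanner.next("one").toString();
        String snd = scanner.next("two").toString();
        return "\"" + fst + "\", \"" + snd + "\"";
    }
}
```

Here `aliased()' yields the same text for both tokens, while `copied()' keeps them apart, which is the effect that duplicating `yytext' has in the Lex example.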
8972
8973 
8974 File: bison.info, Node: Implementing Gotos/Loops, Next: Multiple start-symbols, Prev: Strings are Destroyed, Up: FAQ
8975
8976 11.4 Implementing Gotos/Loops
8977 =============================
8978
8979 My simple calculator supports variables, assignments, and functions,
8980 but how can I implement gotos, or loops?
8981
8982 Although very pedagogical, the examples included in the document blur
8983 the distinction between the parser--whose job is to recover the
8984 structure of a text and to transmit it to subsequent modules of the
8985 program--and the processing (such as the execution) of this structure.
8986 This works well with so-called straight-line programs, i.e., precisely
8987 those that have a straightforward execution model: execute simple
8988 instructions one after the other.
8989
8990 If you want a richer model, you will probably need to use the parser
8991 to construct a tree that does represent the structure it has recovered;
8992 this tree is usually called the "abstract syntax tree", or "AST" for
8993 short. Then, walking through this tree, traversing it in various ways,
8994 will enable treatments such as its execution or its translation, which
8995 will result in an interpreter or a compiler.
8996
8997 This topic is way beyond the scope of this manual, and the reader is
8998 invited to consult the dedicated literature.
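
To make the idea slightly more concrete, here is a minimal sketch of a tree-walking interpreter with a loop. Every name in it (`Expr', `Stmt', `Num', `Var', `Add', `Assign', `While', `AstDemo') is invented for illustration. In a real grammar the nodes would be built by the parser's actions; here the tree is constructed by hand so that the example stands alone.

```java
import java.util.HashMap;
import java.util.Map;

// Expression nodes evaluate to an int in a given environment.
interface Expr { int eval(Map<String, Integer> env); }

// Statement nodes are executed for their effect on the environment.
interface Stmt { void exec(Map<String, Integer> env); }

final class Num implements Expr {
    private final int value;
    Num(int value) { this.value = value; }
    @Override public int eval(Map<String, Integer> env) { return value; }
}

final class Var implements Expr {
    private final String name;
    Var(String name) { this.name = name; }
    // Unset variables read as 0 in this sketch.
    @Override public int eval(Map<String, Integer> env) { return env.getOrDefault(name, 0); }
}

final class Add implements Expr {
    private final Expr left;
    private final Expr right;
    Add(Expr left, Expr right) { this.left = left; this.right = right; }
    @Override public int eval(Map<String, Integer> env) { return left.eval(env) + right.eval(env); }
}

final class Assign implements Stmt {
    private final String name;
    private final Expr value;
    Assign(String name, Expr value) { this.name = name; this.value = value; }
    @Override public void exec(Map<String, Integer> env) { env.put(name, value.eval(env)); }
}

// The loop re-walks its subtree, which a one-pass action cannot do.
final class While implements Stmt {
    private final Expr condition;
    private final Stmt[] body;
    While(Expr condition, Stmt... body) { this.condition = condition; this.body = body; }
    @Override public void exec(Map<String, Integer> env) {
        while (condition.eval(env) != 0) {
            for (Stmt s : body) {
                s.exec(env);
            }
        }
    }
}

final class AstDemo {
    // Tree for: i = 3; sum = 0; while (i) { sum = sum + i; i = i + -1; }
    static int run() {
        Stmt[] program = {
            new Assign("i", new Num(3)),
            new Assign("sum", new Num(0)),
            new While(new Var("i"),
                new Assign("sum", new Add(new Var("sum"), new Var("i"))),
                new Assign("i", new Add(new Var("i"), new Num(-1))))
        };
        Map<String, Integer> env = new HashMap<>();
        for (Stmt s : program) {
            s.exec(env);
        }
        return env.get("sum");
    }
}
```

The point of the sketch is the separation of concerns: the parser would only build the `While' node, and the decision to evaluate its body several times belongs to the walker.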
8999
9000 
9001 File: bison.info, Node: Multiple start-symbols, Next: Secure? Conform?, Prev: Implementing Gotos/Loops, Up: FAQ
9002
9003 11.5 Multiple start-symbols
9004 ===========================
9005
9006 I have several closely related grammars, and I would like to share their
9007 implementations. In fact, I could use a single grammar but with
9008 multiple entry points.
9009
9010 Bison does not support multiple start-symbols, but there is a very
9011 simple means to simulate them. If `foo' and `bar' are the two pseudo
9012 start-symbols, then introduce two new tokens, say `START_FOO' and
9013 `START_BAR', and use them as switches from the real start-symbol:
9014
9015 %token START_FOO START_BAR;
9016 %start start;
9017 start: START_FOO foo
9018      | START_BAR bar;
9019
9020 These tokens prevent the introduction of new conflicts. As far as
9021 the parser goes, that is all that is needed.
9022
9023 Now the difficult part is ensuring that the scanner will send these
9024 tokens first. If your scanner is hand-written, that should be
9025 straightforward. If your scanner is generated by Lex, then there is a
9026 simple means to do it: recall that anything between `%{ ... %}' after
9027 the first `%%' is copied verbatim to the top of the generated `yylex'
9028 function. Make sure a variable `start_token' is available in the
9029 scanner (e.g., a global variable or using `%lex-param' etc.), and use
9030 the following:
9031
9032 /* Prologue. */
9033 %%
9034 %{
9035   if (start_token)
9036     {
9037       int t = start_token;
9038       start_token = 0;
9039       return t;
9040     }
9041 %}
9042 /* The rules. */
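
For the hand-written case, the same trick might be sketched as follows in Java. The class name `StartTokenScanner', the token codes, and the pre-tokenized input list are all assumptions made for illustration; a real scanner would read characters and would implement the parser's `Lexer' interface.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical hand-written scanner that delivers a pseudo start token
// before any token of the real input.
final class StartTokenScanner {
    static final int EOF = 0;          // value assumed for this sketch
    static final int START_FOO = 258;  // hypothetical token codes
    static final int START_BAR = 259;

    private int startToken;            // reset to 0 once delivered
    private final Iterator<Integer> rest;

    StartTokenScanner(int startToken, List<Integer> tokens) {
        this.startToken = startToken;
        this.rest = tokens.iterator();
    }

    int yylex() {
        // Same logic as the Lex snippet: emit the switch token exactly once.
        if (startToken != 0) {
            int t = startToken;
            startToken = 0;
            return t;
        }
        return rest.hasNext() ? rest.next() : EOF;
    }
}
```

Constructing the scanner with `START_FOO' or `START_BAR' then selects which pseudo start-symbol the parser will recognize, without touching the grammar.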
9043
9044 
9045 File: bison.info, Node: Secure? Conform?, Next: I can't build Bison, Prev: Multiple start-symbols, Up: FAQ
9046
9047 11.6 Secure? Conform?
9048 ======================
9049
9050 Is Bison secure? Does it conform to POSIX?
9051
9052 If you're looking for a guarantee or certification, we don't provide
9053 it. However, Bison is intended to be a reliable program that conforms
9054 to the POSIX specification for Yacc. If you run into problems, please
9055 send us a bug report.
9056
9057 
9058 File: bison.info, Node: I can't build Bison, Next: Where can I find help?, Prev: Secure? Conform?, Up: FAQ
9059
9060 11.7 I can't build Bison
9061 ========================
9062
9063 I can't build Bison because `make' complains that
9064 `msgfmt' is not found.
9065 What should I do?
9066
9067 Like most GNU packages with internationalization support, that
9068 feature is turned on by default. If you have problems building in the
9069 `po' subdirectory, it indicates that your system's internationalization
9070 support is lacking. You can re-configure Bison with `--disable-nls' to
9071 turn off this support, or you can install GNU gettext from
9072 `ftp://ftp.gnu.org/gnu/gettext/' and re-configure Bison. See the file
9073 `ABOUT-NLS' for more information.
9074
9075 
9076 File: bison.info, Node: Where can I find help?, Next: Bug Reports, Prev: I can't build Bison, Up: FAQ
9077
9078 11.8 Where can I find help?
9079 ===========================
9080
9081 I'm having trouble using Bison. Where can I find help?
9082
9083 First, read this fine manual. Beyond that, you can send mail to
9084 <help-bison@gnu.org>. This mailing list is intended to be populated
9085 with people who are willing to answer questions about using and
9086 installing Bison. Please keep in mind that (most of) the people on the
9087 list have aspects of their lives which are not related to Bison (!), so
9088 you may not receive an answer to your question right away. This can be
9089 frustrating, but please try not to honk them off; remember that any
9090 help they provide is purely voluntary and out of the kindness of their
9091 hearts.
9092
9093 
9094 File: bison.info, Node: Bug Reports, Next: More Languages, Prev: Where can I find help?, Up: FAQ
9095
9096 11.9 Bug Reports
9097 ================
9098
9099 I found a bug. What should I include in the bug report?
9100
9101 Before you send a bug report, make sure you are using the latest
9102 version. Check `ftp://ftp.gnu.org/pub/gnu/bison/' or one of its
9103 mirrors. Be sure to include the version number in your bug report. If
9104 the bug is present in the latest version but not in a previous version,
9105 try to determine the most recent version which did not contain the bug.
9106
9107 If the bug is parser-related, you should include the smallest grammar
9108 you can which demonstrates the bug. The grammar file should also be
9109 complete (i.e., I should be able to run it through Bison without having
9110 to edit or add anything). The smaller and simpler the grammar, the
9111 easier it will be to fix the bug.
9112
9113 Include information about your compilation environment, including
9114 your operating system's name and version and your compiler's name and
9115 version. If you have trouble compiling, you should also include a
9116 transcript of the build session, starting with the invocation of
9117 `configure'. Depending on the nature of the bug, you may be asked to
9118 send additional files as well (such as `config.h' or `config.cache').
9119
9120 Patches are most welcome, but not required. That is, do not
9121 hesitate to send a bug report just because you cannot provide a fix.
9122
9123 Send bug reports to <bug-bison@gnu.org>.
9124
9125 
9126 File: bison.info, Node: More Languages, Next: Beta Testing, Prev: Bug Reports, Up: FAQ
9127
9128 11.10 More Languages
9129 ====================
9130
9131 Will Bison ever have C++ and Java support? How about INSERT YOUR
9132 FAVORITE LANGUAGE HERE?
9133
9134 C++ and Java support is there now, and is documented. We'd love to
9135 add other languages; contributions are welcome.
9136
9137 
9138 File: bison.info, Node: Beta Testing, Next: Mailing Lists, Prev: More Languages, Up: FAQ
9139
9140 11.11 Beta Testing
9141 ==================
9142
9143 What is involved in being a beta tester?
9144
9145 It's not terribly involved. Basically, you would download a test
9146 release, compile it, and use it to build and run a parser or two. After
9147 that, you would submit either a bug report or a message saying that
9148 everything is okay. It is important to report successes as well as
9149 failures because test releases eventually become mainstream releases,
9150 but only if they are adequately tested. If no one tests, development is
9151 essentially halted.
9152
9153 Beta testers are particularly needed for operating systems to which
9154 the developers do not have easy access. They currently have easy
9155 access to recent GNU/Linux and Solaris versions. Reports about other
9156 operating systems are especially welcome.
9157
9158 
9159 File: bison.info, Node: Mailing Lists, Prev: Beta Testing, Up: FAQ
9160
9161 11.12 Mailing Lists
9162 ===================
9163
9164 How do I join the help-bison and bug-bison mailing lists?
9165
9166 See `http://lists.gnu.org/'.
9167
9168 
9169 File: bison.info, Node: Table of Symbols, Next: Glossary, Prev: FAQ, Up: Top
9170
9171 Appendix A Bison Symbols
9172 ************************
9173
9174 -- Variable: @$
9175 In an action, the location of the left-hand side of the rule.
9176 *Note Locations Overview: Locations.
9177
9178 -- Variable: @N
9179 In an action, the location of the N-th symbol of the right-hand
9180 side of the rule. *Note Locations Overview: Locations.
9181
9182 -- Variable: $$
9183 In an action, the semantic value of the left-hand side of the rule.
9184 *Note Actions::.
9185
9186 -- Variable: $N
9187 In an action, the semantic value of the N-th symbol of the
9188 right-hand side of the rule. *Note Actions::.
9189
9190 -- Delimiter: %%
9191 Delimiter used to separate the grammar rule section from the Bison
9192 declarations section or the epilogue. *Note The Overall Layout of
9193 a Bison Grammar: Grammar Layout.
9194
9195 -- Delimiter: %{CODE%}
9196 All code listed between `%{' and `%}' is copied directly to the
9197 output file uninterpreted. Such code forms the prologue of the
9198 input file. *Note Outline of a Bison Grammar: Grammar Outline.
9199
9200 -- Construct: /*...*/
9201 Comment delimiters, as in C.
9202
9203 -- Delimiter: :
9204 Separates a rule's result from its components. *Note Syntax of
9205 Grammar Rules: Rules.
9206
9207 -- Delimiter: ;
9208 Terminates a rule. *Note Syntax of Grammar Rules: Rules.
9209
9210 -- Delimiter: |
9211 Separates alternate rules for the same result nonterminal. *Note
9212 Syntax of Grammar Rules: Rules.
9213
9214 -- Directive: <*>
9215 Used to define a default tagged `%destructor' or default tagged
9216 `%printer'.
9217
9218 This feature is experimental. More user feedback will help to
9219 determine whether it should become a permanent feature.
9220
9221 *Note Freeing Discarded Symbols: Destructor Decl.
9222
9223 -- Directive: <>
9224 Used to define a default tagless `%destructor' or default tagless
9225 `%printer'.
9226
9227 This feature is experimental. More user feedback will help to
9228 determine whether it should become a permanent feature.
9229
9230 *Note Freeing Discarded Symbols: Destructor Decl.
9231
9232 -- Symbol: $accept
9233 The predefined nonterminal whose only rule is `$accept: START
9234 $end', where START is the start symbol. *Note The Start-Symbol:
9235 Start Decl. It cannot be used in the grammar.
9236
9237 -- Directive: %code {CODE}
9238 -- Directive: %code QUALIFIER {CODE}
9239 Insert CODE verbatim into output parser source. *Note %code: Decl
9240 Summary.
9241
9242 -- Directive: %debug
9243 Equip the parser for debugging. *Note Decl Summary::.
9244
9248 -- Directive: %define DEFINE-VARIABLE
9249 -- Directive: %define DEFINE-VARIABLE VALUE
9250 Define a variable to adjust Bison's behavior. *Note %define: Decl
9251 Summary.
9252
9253 -- Directive: %defines
9254 Bison declaration to create a header file meant for the scanner.
9255 *Note Decl Summary::.
9256
9257 -- Directive: %defines DEFINES-FILE
9258 Same as above, but save in the file DEFINES-FILE. *Note Decl
9259 Summary::.
9260
9261 -- Directive: %destructor
9262 Specify how the parser should reclaim the memory associated with
9263 discarded symbols. *Note Freeing Discarded Symbols: Destructor
9264 Decl.
9265
9266 -- Directive: %dprec
9267 Bison declaration to assign a precedence to a rule that is used at
9268 parse time to resolve reduce/reduce conflicts. *Note Writing GLR
9269 Parsers: GLR Parsers.
9270
9271 -- Symbol: $end
9272 The predefined token marking the end of the token stream. It
9273 cannot be used in the grammar.
9274
9275 -- Symbol: error
9276 A token name reserved for error recovery. This token may be used
9277 in grammar rules so as to allow the Bison parser to recognize an
9278 error in the grammar without halting the process. In effect, a
9279 sentence containing an error may be recognized as valid. On a
9280 syntax error, the token `error' becomes the current lookahead
9281 token. Actions corresponding to `error' are then executed, and
9282 the lookahead token is reset to the token that originally caused
9283 the violation. *Note Error Recovery::.
9284
9285 -- Directive: %error-verbose
9286 Bison declaration to request verbose, specific error message
9287 strings when `yyerror' is called.
9288
9289 -- Directive: %file-prefix "PREFIX"
9290 Bison declaration to set the prefix of the output files. *Note
9291 Decl Summary::.
9292
9293 -- Directive: %glr-parser
9294 Bison declaration to produce a GLR parser. *Note Writing GLR
9295 Parsers: GLR Parsers.
9296
9297 -- Directive: %initial-action
9298 Run user code before parsing. *Note Performing Actions before
9299 Parsing: Initial Action Decl.
9300
9301 -- Directive: %language
9302 Specify the programming language for the generated parser. *Note
9303 Decl Summary::.
9304
9305 -- Directive: %left
9306 Bison declaration to assign left associativity to token(s). *Note
9307 Operator Precedence: Precedence Decl.
9308
9309 -- Directive: %lex-param {ARGUMENT-DECLARATION}
9310 Bison declaration to specify an additional parameter that
9311 `yylex' should accept. *Note Calling Conventions for Pure
9312 Parsers: Pure Calling.
9313
9314 -- Directive: %merge
9315 Bison declaration to assign a merging function to a rule. If
9316 there is a reduce/reduce conflict with a rule having the same
9317 merging function, the function is applied to the two semantic
9318 values to get a single result. *Note Writing GLR Parsers: GLR
9319 Parsers.
9320
9321 -- Directive: %name-prefix "PREFIX"
9322 Bison declaration to rename the external symbols. *Note Decl
9323 Summary::.
9324
9325 -- Directive: %no-lines
9326 Bison declaration to avoid generating `#line' directives in the
9327 parser file. *Note Decl Summary::.
9328
9329 -- Directive: %nonassoc
9330 Bison declaration to assign nonassociativity to token(s). *Note
9331 Operator Precedence: Precedence Decl.
9332
9333 -- Directive: %output "FILE"
9334 Bison declaration to set the name of the parser file. *Note Decl
9335 Summary::.
9336
9337 -- Directive: %parse-param {ARGUMENT-DECLARATION}
9338 Bison declaration to specify an additional parameter that
9339 `yyparse' should accept. *Note The Parser Function `yyparse':
9340 Parser Function.
9341
9342 -- Directive: %prec
9343 Bison declaration to assign a precedence to a specific rule.
9344 *Note Context-Dependent Precedence: Contextual Precedence.
9345
9346 -- Directive: %pure-parser
9347 Deprecated version of `%define api.pure' (*note %define: Decl
9348 Summary.), for which Bison is more careful to warn about
9349 unreasonable usage.
9350
9351 -- Directive: %require "VERSION"
9352 Require version VERSION or higher of Bison. *Note Require a
9353 Version of Bison: Require Decl.
9354
9355 -- Directive: %right
9356 Bison declaration to assign right associativity to token(s).
9357 *Note Operator Precedence: Precedence Decl.
9358
9359 -- Directive: %skeleton
9360 Specify the skeleton to use; usually for development. *Note Decl
9361 Summary::.
9362
9363 -- Directive: %start
9364 Bison declaration to specify the start symbol. *Note The
9365 Start-Symbol: Start Decl.
9366
9367 -- Directive: %token
9368 Bison declaration to declare token(s) without specifying
9369 precedence. *Note Token Type Names: Token Decl.
9370
9371 -- Directive: %token-table
9372 Bison declaration to include a token name table in the parser file.
9373 *Note Decl Summary::.
9374
9375 -- Directive: %type
9376 Bison declaration to declare nonterminals. *Note Nonterminal
9377 Symbols: Type Decl.
9378
9379 -- Symbol: $undefined
9380 The predefined token onto which all undefined values returned by
9381 `yylex' are mapped. It cannot be used in the grammar; rather, use
9382 `error'.
9383
9384 -- Directive: %union
9385 Bison declaration to specify several possible data types for
9386 semantic values. *Note The Collection of Value Types: Union Decl.
9387
9388 -- Macro: YYABORT
9389 Macro to pretend that an unrecoverable syntax error has occurred,
9390 by making `yyparse' return 1 immediately. The error reporting
9391 function `yyerror' is not called. *Note The Parser Function
9392 `yyparse': Parser Function.
9393
9394 For Java parsers, this functionality is invoked using `return
9395 YYABORT;' instead.
9396
9397 -- Macro: YYACCEPT
9398 Macro to pretend that a complete utterance of the language has been
9399 read, by making `yyparse' return 0 immediately. *Note The Parser
9400 Function `yyparse': Parser Function.
9401
9402 For Java parsers, this functionality is invoked using `return
9403 YYACCEPT;' instead.
9404
9405 -- Macro: YYBACKUP
9406 Macro to discard a value from the parser stack and fake a lookahead
9407 token. *Note Special Features for Use in Actions: Action Features.
9408
9409 -- Variable: yychar
9410 External integer variable that contains the integer value of the
9411 lookahead token. (In a pure parser, it is a local variable within
9412 `yyparse'.) Error-recovery rule actions may examine this variable.
9413 *Note Special Features for Use in Actions: Action Features.
9414
9415 -- Macro: yyclearin
9416 Macro used in error-recovery rule actions. It clears the previous
9417 lookahead token. *Note Error Recovery::.
9418
9419 -- Macro: YYDEBUG
9420 Macro to define to equip the parser with tracing code. *Note
9421 Tracing Your Parser: Tracing.
9422
9423 -- Variable: yydebug
9424 External integer variable set to zero by default. If `yydebug' is
9425 given a nonzero value, the parser will output information on input
9426 symbols and parser action. *Note Tracing Your Parser: Tracing.
9427
9428 -- Macro: yyerrok
9429 Macro to cause the parser to recover immediately to its normal mode
9430 after a syntax error. *Note Error Recovery::.
9431
9432 -- Macro: YYERROR
9433 Macro to pretend that a syntax error has just been detected: call
9434 `yyerror' and then perform normal error recovery if possible
9435 (*note Error Recovery::), or (if recovery is impossible) make
9436 `yyparse' return 1. *Note Error Recovery::.
9437
9438 For Java parsers, this functionality is invoked using `return
9439 YYERROR;' instead.
9440
9441 -- Function: yyerror
9442 User-supplied function to be called by `yyparse' on error. *Note
9443 The Error Reporting Function `yyerror': Error Reporting.
9444
9445 -- Macro: YYERROR_VERBOSE
9446 An obsolete macro that you define with `#define' in the prologue
9447 to request verbose, specific error message strings when `yyerror'
9448 is called. It doesn't matter what definition you use for
9449 `YYERROR_VERBOSE', just whether you define it. Using
9450 `%error-verbose' is preferred.
9451
9452 -- Macro: YYINITDEPTH
9453 Macro for specifying the initial size of the parser stack. *Note
9454 Memory Management::.
9455
9456 -- Function: yylex
9457 User-supplied lexical analyzer function, called with no arguments
9458 to get the next token. *Note The Lexical Analyzer Function
9459 `yylex': Lexical.
9460
9461 -- Macro: YYLEX_PARAM
9462 An obsolete macro for specifying an extra argument (or list of
9463 extra arguments) for `yyparse' to pass to `yylex'. The use of this
9464 macro is deprecated, and is supported only for Yacc-like parsers.
9465 *Note Calling Conventions for Pure Parsers: Pure Calling.
9466
9467 -- Variable: yylloc
9468 External variable in which `yylex' should place the line and column
9469 numbers associated with a token. (In a pure parser, it is a local
9470 variable within `yyparse', and its address is passed to `yylex'.)
9471 You can ignore this variable if you don't use the `@' feature in
9472 the grammar actions. *Note Textual Locations of Tokens: Token
9473 Locations. In semantic actions, it stores the location of the
9474 lookahead token. *Note Actions and Locations: Actions and
9475 Locations.
9476
9477 -- Type: YYLTYPE
9478 Data type of `yylloc'; by default, a structure with four members.
9479 *Note Data Types of Locations: Location Type.
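
As a hedged illustration of the "four members" mentioned in the YYLTYPE entry: the default location structure records a first and last line and a first and last column. The sketch below mirrors those members in a stand-alone C type. The names `Location' and `span_locations' are hypothetical and are not part of Bison; the helper only approximates the default behavior for `@$', in which a rule's location runs from the start of its first component to the end of its last one.

```c
#include <assert.h>

/* Hypothetical stand-in for the default YYLTYPE: four int members. */
typedef struct Location {
    int first_line;
    int first_column;
    int last_line;
    int last_column;
} Location;

/* Combine the locations of the first and last right-hand-side symbols
   into one location covering the whole rule (an approximation of the
   default behavior for @$, not Bison's own code). */
static Location span_locations(Location first, Location last)
{
    Location result;

    /* The span must not end before it begins. */
    assert(first.first_line < last.last_line ||
           (first.first_line == last.last_line &&
            first.first_column <= last.last_column));

    result.first_line = first.first_line;
    result.first_column = first.first_column;
    result.last_line = last.last_line;
    result.last_column = last.last_column;
    return result;
}
```

A grammar that needs more information per token (a file name, for instance) would define its own location type instead of relying on the default.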
9480
9481 -- Variable: yylval
9482 External variable in which `yylex' should place the semantic value
9483 associated with a token. (In a pure parser, it is a local
9484 variable within `yyparse', and its address is passed to `yylex'.)
9485 *Note Semantic Values of Tokens: Token Values. In semantic
9486 actions, it stores the semantic value of the lookahead token.
9487 *Note Actions: Actions.
9488
9489 -- Macro: YYMAXDEPTH
9490 Macro for specifying the maximum size of the parser stack. *Note
9491 Memory Management::.
9492
9493 -- Variable: yynerrs
9494 Global variable which Bison increments each time it reports a
9495 syntax error. (In a pure parser, it is a local variable within
9496 `yyparse'. In a pure push parser, it is a member of yypstate.)
9497 *Note The Error Reporting Function `yyerror': Error Reporting.
9498
9499 -- Function: yyparse
9500 The parser function produced by Bison; call this function to start
9501 parsing. *Note The Parser Function `yyparse': Parser Function.
9502
9503 -- Function: yypstate_delete
9504 The function to delete a parser instance, produced by Bison in
9505 push mode; call this function to delete the memory associated with
9506 a parser. *Note The Parser Delete Function `yypstate_delete':
9507 Parser Delete Function. (The current push parsing interface is
9508 experimental and may evolve. More user feedback will help to
9509 stabilize it.)
9510
9511 -- Function: yypstate_new
9512 The function to create a parser instance, produced by Bison in
9513 push mode; call this function to create a new parser. *Note The
9514 Parser Create Function `yypstate_new': Parser Create Function.
9515 (The current push parsing interface is experimental and may evolve.
9516 More user feedback will help to stabilize it.)
9517
9518 -- Function: yypull_parse
9519 The parser function produced by Bison in push mode; call this
9520 function to parse the rest of the input stream. *Note The Pull
9521 Parser Function `yypull_parse': Pull Parser Function. (The
9522 current push parsing interface is experimental and may evolve.
9523 More user feedback will help to stabilize it.)
9524
9525 -- Function: yypush_parse
9526 The parser function produced by Bison in push mode; call this
9527 function to parse a single token. *Note The Push Parser Function
9528 `yypush_parse': Push Parser Function. (The current push parsing
9529 interface is experimental and may evolve. More user feedback will
9530 help to stabilize it.)
9531
9532 -- Macro: YYPARSE_PARAM
9533 An obsolete macro for specifying the name of a parameter that
9534 `yyparse' should accept. The use of this macro is deprecated, and
9535 is supported only for Yacc-like parsers. *Note Calling
9536 Conventions for Pure Parsers: Pure Calling.
9537
9538 -- Macro: YYRECOVERING
9539 The expression `YYRECOVERING ()' yields 1 when the parser is
9540 recovering from a syntax error, and 0 otherwise. *Note Special
9541 Features for Use in Actions: Action Features.
9542
9543 -- Macro: YYSTACK_USE_ALLOCA
9544 Macro used to control the use of `alloca' when the C LALR(1)
9545 parser needs to extend its stacks. If defined to 0, the parser
9546 will use `malloc' to extend its stacks. If defined to 1, the
9547 parser will use `alloca'. Values other than 0 and 1 are reserved
9548 for future Bison extensions. If not defined, `YYSTACK_USE_ALLOCA'
9549 defaults to 0.
9550
9551 In the all-too-common case where your code may run on a host with a
9552 limited stack and with unreliable stack-overflow checking, you
9553 should set `YYMAXDEPTH' to a value that cannot possibly result in
9554 unchecked stack overflow on any of your target hosts when `alloca'
9555 is called. You can inspect the code that Bison generates in order
9556 to determine the proper numeric values. This will require some
9557 expertise in low-level implementation details.
9558
9559 -- Type: YYSTYPE
9560 Data type of semantic values; `int' by default. *Note Data Types
9561 of Semantic Values: Value Type.
9562
9563 
9564 File: bison.info, Node: Glossary, Next: Copying This Manual, Prev: Table of Symbols, Up: Top
9565
9566 Appendix B Glossary
9567 *******************
9568
9569 Backus-Naur Form (BNF; also called "Backus Normal Form")
9570 Formal method of specifying context-free grammars originally
9571 proposed by John Backus, and slightly improved by Peter Naur in
9572 his 1960-01-02 committee document contributing to what became the
9573 Algol 60 report. *Note Languages and Context-Free Grammars:
9574 Language and Grammar.
9575
9576 Context-free grammars
9577 Grammars specified as rules that can be applied regardless of
9578 context. Thus, if there is a rule which says that an integer can
9579 be used as an expression, integers are allowed _anywhere_ an
9580 expression is permitted. *Note Languages and Context-Free
9581 Grammars: Language and Grammar.
9582
9583 Dynamic allocation
9584 Allocation of memory that occurs during execution, rather than at
9585 compile time or on entry to a function.
9586
9587 Empty string
9588 Analogous to the empty set in set theory, the empty string is a
9589 character string of length zero.
9590
9591 Finite-state stack machine
9592 A "machine" that has discrete states in which it is said to exist
9593 at each instant in time. As input to the machine is processed, the
9594 machine moves from state to state as specified by the logic of the
9595 machine. In the case of the parser, the input is the language
9596 being parsed, and the states correspond to various stages in the
9597 grammar rules. *Note The Bison Parser Algorithm: Algorithm.
9598
9599 Generalized LR (GLR)
9600 A parsing algorithm that can handle all context-free grammars,
9601 including those that are not LALR(1). It resolves situations that
9602 Bison's usual LALR(1) algorithm cannot by effectively splitting
9603 off multiple parsers, trying all possible parsers, and discarding
9604 those that fail in the light of additional right context. *Note
9605 Generalized LR Parsing: Generalized LR Parsing.
9606
9607 Grouping
9608 A language construct that is (in general) grammatically divisible;
9609 for example, `expression' or `declaration' in C. *Note Languages
9610 and Context-Free Grammars: Language and Grammar.
9611
9612 Infix operator
9613 An arithmetic operator that is placed between the operands on
9614 which it performs some operation.
9615
9616 Input stream
9617 A continuous flow of data between devices or programs.
9618
9619 Language construct
9620 One of the typical usage schemas of the language. For example,
9621 one of the constructs of the C language is the `if' statement.
9622 *Note Languages and Context-Free Grammars: Language and Grammar.
9623
9624 Left associativity
9625 Operators having left associativity are analyzed from left to
9626 right: `a+b+c' first computes `a+b' and then combines with `c'.
9627 *Note Operator Precedence: Precedence.
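
To make the grouping concrete, here is a small C sketch that evaluates a chain of subtractions both ways. It is not taken from Bison, and `fold_left' and `fold_right' are hypothetical helper names. Subtraction is used rather than addition because the two groupings then give different results.

```c
#include <assert.h>
#include <stddef.h>

/* Left-associative grouping: ((v[0] - v[1]) - v[2]) - ... */
static int fold_left(const int *v, size_t n)
{
    int acc;
    size_t i;

    assert(n > 0);
    acc = v[0];
    for (i = 1; i < n; i++)
        acc = acc - v[i];
    return acc;
}

/* Right-associative grouping: v[0] - (v[1] - (v[2] - ...)) */
static int fold_right(const int *v, size_t n)
{
    int acc;
    size_t i;

    assert(n > 0);
    acc = v[n - 1];
    for (i = n - 1; i > 0; i--)
        acc = v[i - 1] - acc;
    return acc;
}
```

With the values 10, 4 and 3, the left-associative grouping yields 3, which is what C itself computes for `10 - 4 - 3', while the right-associative grouping yields 9.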
9628
9629 Left recursion
9630 A rule whose result symbol is also its first component symbol; for
9631 example, `expseq1 : expseq1 ',' exp;'. *Note Recursive Rules:
9632 Recursion.
9633
9634 Left-to-right parsing
9635 Parsing a sentence of a language by analyzing it token by token
9636 from left to right. *Note The Bison Parser Algorithm: Algorithm.
9637
9638 Lexical analyzer (scanner)
9639 A function that reads an input stream and returns tokens one by
9640 one. *Note The Lexical Analyzer Function `yylex': Lexical.
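
As a rough sketch of what "returns tokens one by one" can look like, the following C fragment scans an in-memory string. Everything named here (`Scanner', `next_token', `TOK_NUM', `TOK_END') is hypothetical. A real `yylex' would typically read from an input stream and store the semantic value in `yylval' rather than in a private structure.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token codes.  Zero marks the end of the input; TOK_NUM
   is an arbitrary value outside the range of character codes. */
enum { TOK_END = 0, TOK_NUM = 258 };

/* Hypothetical scanner state: a cursor into the input text and the
   semantic value of the most recent TOK_NUM. */
typedef struct Scanner {
    const char *cursor;
    int value;
} Scanner;

/* Return the next token code: TOK_NUM for a run of digits, the
   character itself for any other non-blank character, and TOK_END
   at the end of the input. */
static int next_token(Scanner *s)
{
    assert(s != NULL && s->cursor != NULL);

    /* Skip blanks between tokens. */
    while (*s->cursor == ' ' || *s->cursor == '\t')
        s->cursor++;

    if (*s->cursor == '\0')
        return TOK_END;

    if (isdigit((unsigned char)*s->cursor)) {
        s->value = 0;
        while (isdigit((unsigned char)*s->cursor)) {
            s->value = s->value * 10 + (*s->cursor - '0');
            s->cursor++;
        }
        return TOK_NUM;
    }

    /* Any other character is its own token; consume it. */
    return (unsigned char)*s->cursor++;
}
```

Calling `next_token' repeatedly on the text `12 + 7' would produce a number token, the character `+', another number token, and then the end marker.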
9641
9642 Lexical tie-in
9643 A flag, set by actions in the grammar rules, which alters the way
9644 tokens are parsed. *Note Lexical Tie-ins::.
9645
9646 Literal string token
9647 A token which consists of two or more fixed characters. *Note
9648 Symbols::.
9649
9650 Lookahead token
9651 A token already read but not yet shifted. *Note Lookahead Tokens:
9652 Lookahead.
9653
9654 LALR(1)
9655 The class of context-free grammars that Bison (like most other
9656 parser generators) can handle; a subset of LR(1). *Note
9657 Mysterious Reduce/Reduce Conflicts: Mystery Conflicts.
9658
9659 LR(1)
9660 The class of context-free grammars in which at most one token of
9661 lookahead is needed to disambiguate the parsing of any piece of
9662 input.
9663
9664 Nonterminal symbol
9665 A grammar symbol standing for a grammatical construct that can be
9666 expressed through rules in terms of smaller constructs; in other
9667 words, a construct that is not a token. *Note Symbols::.
9668
9669 Parser
9670 A function that recognizes valid sentences of a language by
9671 analyzing the syntax structure of a set of tokens passed to it
9672 from a lexical analyzer.
9673
9674 Postfix operator
9675 An arithmetic operator that is placed after the operands upon
9676 which it performs some operation.
9677
9678 Reduction
9679 Replacing a string of nonterminals and/or terminals with a single
9680 nonterminal, according to a grammar rule. *Note The Bison Parser
9681 Algorithm: Algorithm.
9682
9683 Reentrant
9684 A reentrant subprogram is a subprogram which can be invoked any
9685 number of times in parallel, without interference between the
9686 various invocations. *Note A Pure (Reentrant) Parser: Pure Decl.
9687
9688 Reverse Polish notation
9689 A language in which all operators are postfix operators.
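
Because every operator follows its operands, such a language can be evaluated with nothing more than a stack. The C sketch below is one minimal way to do that without a generated parser. The function name `eval_rpn' is hypothetical, and the code assumes well-formed input made of non-negative integers and the operators `+', `-' and `*'.

```c
#include <assert.h>
#include <ctype.h>

/* Evaluate a reverse Polish expression made of non-negative integers
   and the operators + - *, separated by blanks.  Hypothetical helper;
   it assumes the input is well formed and needs at most 32 stack
   slots. */
static int eval_rpn(const char *text)
{
    int stack[32];
    int depth = 0;

    while (*text != '\0') {
        if (isdigit((unsigned char)*text)) {
            /* Operand: read the whole number and push it. */
            int value = 0;
            while (isdigit((unsigned char)*text))
                value = value * 10 + (*text++ - '0');
            assert(depth < 32);
            stack[depth++] = value;
        } else if (*text == '+' || *text == '-' || *text == '*') {
            /* Operator: pop two operands, push the result. */
            int right, left;
            assert(depth >= 2);
            right = stack[--depth];
            left = stack[--depth];
            if (*text == '+')
                stack[depth++] = left + right;
            else if (*text == '-')
                stack[depth++] = left - right;
            else
                stack[depth++] = left * right;
            text++;
        } else {
            text++;            /* skip blanks and anything else */
        }
    }

    /* A well-formed expression leaves exactly one value. */
    assert(depth == 1);
    return stack[0];
}
```

For example, `3 4 + 2 *' evaluates to 14, with no parentheses or precedence rules needed.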
9690
9691 Right recursion
9692 A rule whose result symbol is also its last component symbol; for
9693 example, `expseq1: exp ',' expseq1;'. *Note Recursive Rules:
9694 Recursion.
9695
9696 Semantics
9697 In computer languages, the semantics are specified by the actions
9698 taken for each instance of the language, i.e., the meaning of each
9699 statement. *Note Defining Language Semantics: Semantics.
9700
9701 Shift
9702 A parser is said to shift when it makes the choice of analyzing
9703 further input from the stream rather than reducing immediately some
9704 already-recognized rule. *Note The Bison Parser Algorithm:
9705 Algorithm.
9706
9707 Single-character literal
9708 A single character that is recognized and interpreted as is.
9709 *Note From Formal Rules to Bison Input: Grammar in Bison.
9710
9711 Start symbol
9712 The nonterminal symbol that stands for a complete valid utterance
9713 in the language being parsed. The start symbol is usually listed
9714 as the first nonterminal symbol in a language specification.
9715 *Note The Start-Symbol: Start Decl.
9716
9717 Symbol table
9718 A data structure where symbol names and associated data are stored
9719 during parsing to allow for recognition and use of existing
9720 information in repeated uses of a symbol. *Note Multi-function
9721 Calc::.
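
A symbol table can be as simple as a linked list of named records. The following C sketch shows one possible shape. The type and function names (`Symbol', `insert_symbol', `lookup_symbol') are hypothetical and are not claimed to match the code in the manual's examples.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical symbol record: a name, a numeric value, and a link to
   the next record in the table. */
typedef struct Symbol {
    char *name;
    double value;
    struct Symbol *next;
} Symbol;

/* Find NAME in the table, or return NULL if it is not there. */
static Symbol *lookup_symbol(Symbol *table, const char *name)
{
    Symbol *sym;

    assert(name != NULL);
    for (sym = table; sym != NULL; sym = sym->next)
        if (strcmp(sym->name, name) == 0)
            return sym;
    return NULL;
}

/* Add NAME to the front of the table and return the new record,
   or NULL if memory could not be allocated. */
static Symbol *insert_symbol(Symbol **table, const char *name, double value)
{
    Symbol *sym;

    assert(table != NULL && name != NULL);
    sym = malloc(sizeof *sym);
    if (sym == NULL)
        return NULL;
    sym->name = malloc(strlen(name) + 1);
    if (sym->name == NULL) {
        free(sym);
        return NULL;
    }
    strcpy(sym->name, name);
    sym->value = value;
    sym->next = *table;
    *table = sym;
    return sym;
}
```

A lexical analyzer would typically call the lookup function each time it reads an identifier, and insert a new record only when the lookup fails. A linear list keeps the sketch short; a hash table would be a more usual choice for large inputs.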
9722
9723 Syntax error
9724 An error encountered during parsing of an input stream due to
9725 invalid syntax. *Note Error Recovery::.
9726
9727 Token
9728 A basic, grammatically indivisible unit of a language. The symbol
9729 that describes a token in the grammar is a terminal symbol. The
9730 input of the Bison parser is a stream of tokens which comes from
9731 the lexical analyzer. *Note Symbols::.
9732
9733 Terminal symbol
9734 A grammar symbol that has no rules in the grammar and therefore is
9735 grammatically indivisible. The piece of text it represents is a
9736 token. *Note Languages and Context-Free Grammars: Language and
9737 Grammar.
9738
9739 
9740 File: bison.info, Node: Copying This Manual, Next: Index, Prev: Glossary, Up: Top
9741
9742 Appendix C Copying This Manual
9743 ******************************
9744
9745 Version 1.2, November 2002
9746
9747 Copyright (C) 2000,2001,2002 Free Software Foundation, Inc.
9748 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA
9749
9750 Everyone is permitted to copy and distribute verbatim copies
9751 of this license document, but changing it is not allowed.
9752
9753 0. PREAMBLE
9754
9755 The purpose of this License is to make a manual, textbook, or other
9756 functional and useful document "free" in the sense of freedom: to
9757 assure everyone the effective freedom to copy and redistribute it,
9758 with or without modifying it, either commercially or
9759 noncommercially. Secondarily, this License preserves for the
9760 author and publisher a way to get credit for their work, while not
9761 being considered responsible for modifications made by others.
9762
9763 This License is a kind of "copyleft", which means that derivative
9764 works of the document must themselves be free in the same sense.
9765 It complements the GNU General Public License, which is a copyleft
9766 license designed for free software.
9767
9768 We have designed this License in order to use it for manuals for
9769 free software, because free software needs free documentation: a
9770 free program should come with manuals providing the same freedoms
9771 that the software does. But this License is not limited to
9772 software manuals; it can be used for any textual work, regardless
9773 of subject matter or whether it is published as a printed book.
9774 We recommend this License principally for works whose purpose is
9775 instruction or reference.
9776
9777 1. APPLICABILITY AND DEFINITIONS
9778
9779 This License applies to any manual or other work, in any medium,
9780 that contains a notice placed by the copyright holder saying it
9781 can be distributed under the terms of this License. Such a notice
9782 grants a world-wide, royalty-free license, unlimited in duration,
9783 to use that work under the conditions stated herein. The
9784 "Document", below, refers to any such manual or work. Any member
9785 of the public is a licensee, and is addressed as "you". You
9786 accept the license if you copy, modify or distribute the work in a
9787 way requiring permission under copyright law.
9788
9789 A "Modified Version" of the Document means any work containing the
9790 Document or a portion of it, either copied verbatim, or with
9791 modifications and/or translated into another language.
9792
9793 A "Secondary Section" is a named appendix or a front-matter section
9794 of the Document that deals exclusively with the relationship of the
9795 publishers or authors of the Document to the Document's overall
9796 subject (or to related matters) and contains nothing that could
9797 fall directly within that overall subject. (Thus, if the Document
9798 is in part a textbook of mathematics, a Secondary Section may not
9799 explain any mathematics.) The relationship could be a matter of
9800 historical connection with the subject or with related matters, or
9801 of legal, commercial, philosophical, ethical or political position
9802 regarding them.
9803
9804 The "Invariant Sections" are certain Secondary Sections whose
9805 titles are designated, as being those of Invariant Sections, in
9806 the notice that says that the Document is released under this
9807 License. If a section does not fit the above definition of
9808 Secondary then it is not allowed to be designated as Invariant.
9809 The Document may contain zero Invariant Sections. If the Document
9810 does not identify any Invariant Sections then there are none.
9811
9812 The "Cover Texts" are certain short passages of text that are
9813 listed, as Front-Cover Texts or Back-Cover Texts, in the notice
9814 that says that the Document is released under this License. A
9815 Front-Cover Text may be at most 5 words, and a Back-Cover Text may
9816 be at most 25 words.
9817
9818 A "Transparent" copy of the Document means a machine-readable copy,
9819 represented in a format whose specification is available to the
9820 general public, that is suitable for revising the document
9821 straightforwardly with generic text editors or (for images
9822 composed of pixels) generic paint programs or (for drawings) some
9823 widely available drawing editor, and that is suitable for input to
9824 text formatters or for automatic translation to a variety of
9825 formats suitable for input to text formatters. A copy made in an
9826 otherwise Transparent file format whose markup, or absence of
9827 markup, has been arranged to thwart or discourage subsequent
9828 modification by readers is not Transparent. An image format is
9829 not Transparent if used for any substantial amount of text. A
9830 copy that is not "Transparent" is called "Opaque".
9831
9832 Examples of suitable formats for Transparent copies include plain
9833 ASCII without markup, Texinfo input format, LaTeX input format,
9834 SGML or XML using a publicly available DTD, and
9835 standard-conforming simple HTML, PostScript or PDF designed for
9836 human modification. Examples of transparent image formats include
9837 PNG, XCF and JPG. Opaque formats include proprietary formats that
9838 can be read and edited only by proprietary word processors, SGML or
9839 XML for which the DTD and/or processing tools are not generally
9840 available, and the machine-generated HTML, PostScript or PDF
9841 produced by some word processors for output purposes only.
9842
9843 The "Title Page" means, for a printed book, the title page itself,
9844 plus such following pages as are needed to hold, legibly, the
9845 material this License requires to appear in the title page. For
9846 works in formats which do not have any title page as such, "Title
9847 Page" means the text near the most prominent appearance of the
9848 work's title, preceding the beginning of the body of the text.
9849
9850 A section "Entitled XYZ" means a named subunit of the Document
9851 whose title either is precisely XYZ or contains XYZ in parentheses
9852 following text that translates XYZ in another language. (Here XYZ
9853 stands for a specific section name mentioned below, such as
9854 "Acknowledgements", "Dedications", "Endorsements", or "History".)
9855 To "Preserve the Title" of such a section when you modify the
9856 Document means that it remains a section "Entitled XYZ" according
9857 to this definition.
9858
9859 The Document may include Warranty Disclaimers next to the notice
9860 which states that this License applies to the Document. These
9861 Warranty Disclaimers are considered to be included by reference in
9862 this License, but only as regards disclaiming warranties: any other
9863 implication that these Warranty Disclaimers may have is void and
9864 has no effect on the meaning of this License.
9865
9866 2. VERBATIM COPYING
9867
9868 You may copy and distribute the Document in any medium, either
9869 commercially or noncommercially, provided that this License, the
9870 copyright notices, and the license notice saying this License
9871 applies to the Document are reproduced in all copies, and that you
9872 add no other conditions whatsoever to those of this License. You
9873 may not use technical measures to obstruct or control the reading
9874 or further copying of the copies you make or distribute. However,
9875 you may accept compensation in exchange for copies. If you
9876 distribute a large enough number of copies you must also follow
9877 the conditions in section 3.
9878
9879 You may also lend copies, under the same conditions stated above,
9880 and you may publicly display copies.
9881
9882 3. COPYING IN QUANTITY
9883
9884 If you publish printed copies (or copies in media that commonly
9885 have printed covers) of the Document, numbering more than 100, and
9886 the Document's license notice requires Cover Texts, you must
9887 enclose the copies in covers that carry, clearly and legibly, all
9888 these Cover Texts: Front-Cover Texts on the front cover, and
9889 Back-Cover Texts on the back cover. Both covers must also clearly
9890 and legibly identify you as the publisher of these copies. The
9891 front cover must present the full title with all words of the
9892 title equally prominent and visible. You may add other material
9893 on the covers in addition. Copying with changes limited to the
9894 covers, as long as they preserve the title of the Document and
9895 satisfy these conditions, can be treated as verbatim copying in
9896 other respects.
9897
9898 If the required texts for either cover are too voluminous to fit
9899 legibly, you should put the first ones listed (as many as fit
9900 reasonably) on the actual cover, and continue the rest onto
9901 adjacent pages.
9902
9903 If you publish or distribute Opaque copies of the Document
9904 numbering more than 100, you must either include a
9905 machine-readable Transparent copy along with each Opaque copy, or
9906 state in or with each Opaque copy a computer-network location from
9907 which the general network-using public has access to download
9908 using public-standard network protocols a complete Transparent
9909 copy of the Document, free of added material. If you use the
9910 latter option, you must take reasonably prudent steps, when you
9911 begin distribution of Opaque copies in quantity, to ensure that
9912 this Transparent copy will remain thus accessible at the stated
9913 location until at least one year after the last time you
9914 distribute an Opaque copy (directly or through your agents or
9915 retailers) of that edition to the public.
9916
9917 It is requested, but not required, that you contact the authors of
9918 the Document well before redistributing any large number of
9919 copies, to give them a chance to provide you with an updated
9920 version of the Document.
9921
9922 4. MODIFICATIONS
9923
9924 You may copy and distribute a Modified Version of the Document
9925 under the conditions of sections 2 and 3 above, provided that you
9926 release the Modified Version under precisely this License, with
9927 the Modified Version filling the role of the Document, thus
9928 licensing distribution and modification of the Modified Version to
9929 whoever possesses a copy of it. In addition, you must do these
9930 things in the Modified Version:
9931
9932 A. Use in the Title Page (and on the covers, if any) a title
9933 distinct from that of the Document, and from those of
9934 previous versions (which should, if there were any, be listed
9935 in the History section of the Document). You may use the
9936 same title as a previous version if the original publisher of
9937 that version gives permission.
9938
9939 B. List on the Title Page, as authors, one or more persons or
9940 entities responsible for authorship of the modifications in
9941 the Modified Version, together with at least five of the
9942 principal authors of the Document (all of its principal
9943 authors, if it has fewer than five), unless they release you
9944 from this requirement.
9945
9946 C. State on the Title page the name of the publisher of the
9947 Modified Version, as the publisher.
9948
9949 D. Preserve all the copyright notices of the Document.
9950
9951 E. Add an appropriate copyright notice for your modifications
9952 adjacent to the other copyright notices.
9953
9954 F. Include, immediately after the copyright notices, a license
9955 notice giving the public permission to use the Modified
9956 Version under the terms of this License, in the form shown in
9957 the Addendum below.
9958
9959 G. Preserve in that license notice the full lists of Invariant
9960 Sections and required Cover Texts given in the Document's
9961 license notice.
9962
9963 H. Include an unaltered copy of this License.
9964
9965 I. Preserve the section Entitled "History", Preserve its Title,
9966 and add to it an item stating at least the title, year, new
9967 authors, and publisher of the Modified Version as given on
9968 the Title Page. If there is no section Entitled "History" in
9969 the Document, create one stating the title, year, authors,
9970 and publisher of the Document as given on its Title Page,
9971 then add an item describing the Modified Version as stated in
9972 the previous sentence.
9973
9974 J. Preserve the network location, if any, given in the Document
9975 for public access to a Transparent copy of the Document, and
9976 likewise the network locations given in the Document for
9977 previous versions it was based on. These may be placed in
9978 the "History" section. You may omit a network location for a
9979 work that was published at least four years before the
9980 Document itself, or if the original publisher of the version
9981 it refers to gives permission.
9982
9983 K. For any section Entitled "Acknowledgements" or "Dedications",
9984 Preserve the Title of the section, and preserve in the
9985 section all the substance and tone of each of the contributor
9986 acknowledgements and/or dedications given therein.
9987
9988 L. Preserve all the Invariant Sections of the Document,
9989 unaltered in their text and in their titles. Section numbers
9990 or the equivalent are not considered part of the section
9991 titles.
9992
9993 M. Delete any section Entitled "Endorsements". Such a section
9994 may not be included in the Modified Version.
9995
9996 N. Do not retitle any existing section to be Entitled
9997 "Endorsements" or to conflict in title with any Invariant
9998 Section.
9999
10000 O. Preserve any Warranty Disclaimers.
10001
10002 If the Modified Version includes new front-matter sections or
10003 appendices that qualify as Secondary Sections and contain no
10004 material copied from the Document, you may at your option
10005 designate some or all of these sections as invariant. To do this,
10006 add their titles to the list of Invariant Sections in the Modified
10007 Version's license notice. These titles must be distinct from any
10008 other section titles.
10009
10010 You may add a section Entitled "Endorsements", provided it contains
10011 nothing but endorsements of your Modified Version by various
10012 parties--for example, statements of peer review or that the text
10013 has been approved by an organization as the authoritative
10014 definition of a standard.
10015
10016 You may add a passage of up to five words as a Front-Cover Text,
10017 and a passage of up to 25 words as a Back-Cover Text, to the end
10018 of the list of Cover Texts in the Modified Version. Only one
10019 passage of Front-Cover Text and one of Back-Cover Text may be
10020 added by (or through arrangements made by) any one entity. If the
10021 Document already includes a cover text for the same cover,
10022 previously added by you or by arrangement made by the same entity
10023 you are acting on behalf of, you may not add another; but you may
10024 replace the old one, on explicit permission from the previous
10025 publisher that added the old one.
10026
10027 The author(s) and publisher(s) of the Document do not by this
10028 License give permission to use their names for publicity for or to
10029 assert or imply endorsement of any Modified Version.
10030
10031 5. COMBINING DOCUMENTS
10032
10033 You may combine the Document with other documents released under
10034 this License, under the terms defined in section 4 above for
10035 modified versions, provided that you include in the combination
10036 all of the Invariant Sections of all of the original documents,
10037 unmodified, and list them all as Invariant Sections of your
10038 combined work in its license notice, and that you preserve all
10039 their Warranty Disclaimers.
10040
10041 The combined work need only contain one copy of this License, and
10042 multiple identical Invariant Sections may be replaced with a single
10043 copy. If there are multiple Invariant Sections with the same name
10044 but different contents, make the title of each such section unique
10045 by adding at the end of it, in parentheses, the name of the
10046 original author or publisher of that section if known, or else a
10047 unique number. Make the same adjustment to the section titles in
10048 the list of Invariant Sections in the license notice of the
10049 combined work.
10050
10051 In the combination, you must combine any sections Entitled
10052 "History" in the various original documents, forming one section
10053 Entitled "History"; likewise combine any sections Entitled
10054 "Acknowledgements", and any sections Entitled "Dedications". You
10055 must delete all sections Entitled "Endorsements."
10056
10057 6. COLLECTIONS OF DOCUMENTS
10058
10059 You may make a collection consisting of the Document and other
10060 documents released under this License, and replace the individual
10061 copies of this License in the various documents with a single copy
10062 that is included in the collection, provided that you follow the
10063 rules of this License for verbatim copying of each of the
10064 documents in all other respects.
10065
10066 You may extract a single document from such a collection, and
10067 distribute it individually under this License, provided you insert
10068 a copy of this License into the extracted document, and follow
10069 this License in all other respects regarding verbatim copying of
10070 that document.
10071
10072 7. AGGREGATION WITH INDEPENDENT WORKS
10073
10074 A compilation of the Document or its derivatives with other
10075 separate and independent documents or works, in or on a volume of
10076 a storage or distribution medium, is called an "aggregate" if the
10077 copyright resulting from the compilation is not used to limit the
10078 legal rights of the compilation's users beyond what the individual
10079 works permit. When the Document is included in an aggregate, this
10080 License does not apply to the other works in the aggregate which
10081 are not themselves derivative works of the Document.
10082
10083 If the Cover Text requirement of section 3 is applicable to these
10084 copies of the Document, then if the Document is less than one half
10085 of the entire aggregate, the Document's Cover Texts may be placed
10086 on covers that bracket the Document within the aggregate, or the
10087 electronic equivalent of covers if the Document is in electronic
10088 form. Otherwise they must appear on printed covers that bracket
10089 the whole aggregate.
10090
10091 8. TRANSLATION
10092
10093 Translation is considered a kind of modification, so you may
10094 distribute translations of the Document under the terms of section
10095 4. Replacing Invariant Sections with translations requires special
10096 permission from their copyright holders, but you may include
10097 translations of some or all Invariant Sections in addition to the
10098 original versions of these Invariant Sections. You may include a
10099 translation of this License, and all the license notices in the
10100 Document, and any Warranty Disclaimers, provided that you also
10101 include the original English version of this License and the
10102 original versions of those notices and disclaimers. In case of a
10103 disagreement between the translation and the original version of
10104 this License or a notice or disclaimer, the original version will
10105 prevail.
10106
10107 If a section in the Document is Entitled "Acknowledgements",
10108 "Dedications", or "History", the requirement (section 4) to
10109 Preserve its Title (section 1) will typically require changing the
10110 actual title.
10111
10112 9. TERMINATION
10113
10114 You may not copy, modify, sublicense, or distribute the Document
10115 except as expressly provided for under this License. Any other
10116 attempt to copy, modify, sublicense or distribute the Document is
10117 void, and will automatically terminate your rights under this
10118 License. However, parties who have received copies, or rights,
10119 from you under this License will not have their licenses
10120 terminated so long as such parties remain in full compliance.
10121
10122 10. FUTURE REVISIONS OF THIS LICENSE
10123
10124 The Free Software Foundation may publish new, revised versions of
10125 the GNU Free Documentation License from time to time. Such new
10126 versions will be similar in spirit to the present version, but may
10127 differ in detail to address new problems or concerns. See
10128 `http://www.gnu.org/copyleft/'.
10129
10130 Each version of the License is given a distinguishing version
10131 number. If the Document specifies that a particular numbered
10132 version of this License "or any later version" applies to it, you
10133 have the option of following the terms and conditions either of
10134 that specified version or of any later version that has been
10135 published (not as a draft) by the Free Software Foundation. If
10136 the Document does not specify a version number of this License,
10137 you may choose any version ever published (not as a draft) by the
10138 Free Software Foundation.
10139
10140 ADDENDUM: How to use this License for your documents
10141 ====================================================
10142
10143 To use this License in a document you have written, include a copy of
10144 the License in the document and put the following copyright and license
10145 notices just after the title page:
10146
10147 Copyright (C) YEAR YOUR NAME.
10148 Permission is granted to copy, distribute and/or modify this document
10149 under the terms of the GNU Free Documentation License, Version 1.2
10150 or any later version published by the Free Software Foundation;
10151 with no Invariant Sections, no Front-Cover Texts, and no Back-Cover
10152 Texts. A copy of the license is included in the section entitled ``GNU
10153 Free Documentation License''.
10154
10155 If you have Invariant Sections, Front-Cover Texts and Back-Cover
10156 Texts, replace the "with...Texts." line with this:
10157
10158 with the Invariant Sections being LIST THEIR TITLES, with
10159 the Front-Cover Texts being LIST, and with the Back-Cover Texts
10160 being LIST.
10161
10162 If you have Invariant Sections without Cover Texts, or some other
10163 combination of the three, merge those two alternatives to suit the
10164 situation.
10165
10166 If your document contains nontrivial examples of program code, we
10167 recommend releasing these examples in parallel under your choice of
10168 free software license, such as the GNU General Public License, to
10169 permit their use in free software.
10170
10171 
10172 File: bison.info, Node: Index, Prev: Copying This Manual, Up: Top
10173
10174 Index
10175 *****
10176
10177 [index]
10178 * Menu:
10179
10180 * $ <1>: Table of Symbols. (line 19)
10181 * $ <2>: Action Features. (line 14)
10182 * $: Java Action Features.
10183 (line 13)
10184 * $$ <1>: Action Features. (line 10)
10185 * $$ <2>: Java Action Features.
10186 (line 21)
10187 * $$ <3>: Actions. (line 6)
10188 * $$: Table of Symbols. (line 15)
10189 * $< <1>: Java Action Features.
10190 (line 17)
10191 * $< <2>: Action Features. (line 23)
10192 * $< <3>: Java Action Features.
10193 (line 29)
10194 * $<: Action Features. (line 18)
10195 * $accept: Table of Symbols. (line 65)
10196 * $end: Table of Symbols. (line 104)
10197 * $N: Actions. (line 6)
10198 * $undefined: Table of Symbols. (line 212)
10199 * % <1>: Java Declarations Summary.
10200 (line 53)
10201 * %: Table of Symbols. (line 28)
10202 * %% <1>: Table of Symbols. (line 23)
10203 * %%: Java Declarations Summary.
10204 (line 49)
10205 * %code <1>: Table of Symbols. (line 71)
10206 * %code <2>: Prologue Alternatives.
10207 (line 6)
10208 * %code <3>: Java Declarations Summary.
10209 (line 37)
10210 * %code <4>: Calc++ Parser. (line 64)
10211 * %code: Decl Summary. (line 63)
10212 * %code imports <1>: Java Declarations Summary.
10213 (line 41)
10214 * %code imports: Decl Summary. (line 115)
10215 * %code lexer: Java Declarations Summary.
10216 (line 45)
10217 * %code provides <1>: Prologue Alternatives.
10218 (line 6)
10219 * %code provides: Decl Summary. (line 303)
10220 * %code requires <1>: Decl Summary. (line 72)
10221 * %code requires <2>: Calc++ Parser. (line 17)
10222 * %code requires: Prologue Alternatives.
10223 (line 6)
10224 * %code top <1>: Decl Summary. (line 98)
10225 * %code top: Prologue Alternatives.
10226 (line 6)
10227 * %debug <1>: Table of Symbols. (line 78)
10228 * %debug <2>: Tracing. (line 23)
10229 * %debug <3>: Decl Summary. (line 134)
10230 * %debug: Table of Symbols. (line 75)
10231 * %define <1>: Table of Symbols. (line 81)
10232 * %define <2>: Decl Summary. (line 140)
10233 * %define: Table of Symbols. (line 82)
10234 * %define abstract: Java Declarations Summary.
10235 (line 57)
10236 * %define api.pure <1>: Decl Summary. (line 166)
10237 * %define api.pure: Pure Decl. (line 6)
10238 * %define api.push_pull <1>: Push Decl. (line 6)
10239 * %define api.push_pull: Decl Summary. (line 177)
10240 * %define extends: Java Declarations Summary.
10241 (line 61)
10242 * %define final: Java Declarations Summary.
10243 (line 65)
10244 * %define implements: Java Declarations Summary.
10245 (line 69)
10246 * %define lex_throws: Java Declarations Summary.
10247 (line 73)
10248 * %define location_type: Java Declarations Summary.
10249 (line 78)
10250 * %define lr.keep_unreachable_states: Decl Summary. (line 190)
10251 * %define namespace <1>: Decl Summary. (line 232)
10252 * %define namespace: C++ Bison Interface. (line 10)
10253 * %define package: Java Declarations Summary.
10254 (line 84)
10255 * %define parser_class_name: Java Declarations Summary.
10256 (line 88)
10257 * %define position_type: Java Declarations Summary.
10258 (line 92)
10259 * %define public: Java Declarations Summary.
10260 (line 97)
10261 * %define strictfp: Java Declarations Summary.
10262 (line 105)
10263 * %define stype: Java Declarations Summary.
10264 (line 101)
10265 * %define throws: Java Declarations Summary.
10266 (line 109)
10267 * %defines <1>: Table of Symbols. (line 90)
10268 * %defines <2>: Decl Summary. (line 307)
10269 * %defines: Table of Symbols. (line 86)
10270 * %destructor <1>: Destructor Decl. (line 22)
10271 * %destructor <2>: Decl Summary. (line 310)
10272 * %destructor <3>: Destructor Decl. (line 6)
10273 * %destructor <4>: Mid-Rule Actions. (line 59)
10274 * %destructor <5>: Table of Symbols. (line 94)
10275 * %destructor: Destructor Decl. (line 22)
10276 * %dprec <1>: Table of Symbols. (line 99)
10277 * %dprec: Merging GLR Parses. (line 6)
10278 * %error-verbose <1>: Table of Symbols. (line 118)
10279 * %error-verbose: Error Reporting. (line 17)
10280 * %expect <1>: Decl Summary. (line 38)
10281 * %expect: Expect Decl. (line 6)
10282 * %expect-rr <1>: Expect Decl. (line 6)
10283 * %expect-rr: Simple GLR Parsers. (line 6)
10284 * %file-prefix <1>: Decl Summary. (line 315)
10285 * %file-prefix: Table of Symbols. (line 122)
10286 * %glr-parser <1>: Simple GLR Parsers. (line 6)
10287 * %glr-parser <2>: Table of Symbols. (line 126)
10288 * %glr-parser: GLR Parsers. (line 6)
10289 * %initial-action <1>: Table of Symbols. (line 130)
10290 * %initial-action: Initial Action Decl. (line 11)
10291 * %language <1>: Decl Summary. (line 319)
10292 * %language: Table of Symbols. (line 134)
10293 * %language "Java": Java Declarations Summary.
10294 (line 10)
10295 * %left <1>: Using Precedence. (line 6)
10296 * %left <2>: Decl Summary. (line 21)
10297 * %left: Table of Symbols. (line 138)
10298 * %lex-param <1>: Table of Symbols. (line 142)
10299 * %lex-param <2>: Pure Calling. (line 31)
10300 * %lex-param: Java Declarations Summary.
10301 (line 13)
10302 * %locations: Decl Summary. (line 327)
10303 * %merge <1>: Merging GLR Parses. (line 6)
10304 * %merge: Table of Symbols. (line 147)
10305 * %name-prefix <1>: Java Declarations Summary.
10306 (line 19)
10307 * %name-prefix <2>: Decl Summary. (line 334)
10308 * %name-prefix: Table of Symbols. (line 154)
10309 * %no-lines <1>: Decl Summary. (line 346)
10310 * %no-lines: Table of Symbols. (line 158)
10311 * %nonassoc <1>: Table of Symbols. (line 162)
10312 * %nonassoc <2>: Using Precedence. (line 6)
10313 * %nonassoc: Decl Summary. (line 25)
10314 * %output <1>: Decl Summary. (line 354)
10315 * %output: Table of Symbols. (line 166)
10316 * %parse-param <1>: Java Declarations Summary.
10317 (line 24)
10318 * %parse-param <2>: Parser Function. (line 36)
10319 * %parse-param <3>: Table of Symbols. (line 170)
10320 * %parse-param: Parser Function. (line 36)
10321 * %prec <1>: Table of Symbols. (line 175)
10322 * %prec: Contextual Precedence.
10323 (line 6)
10324 * %pure-parser <1>: Table of Symbols. (line 179)
10325 * %pure-parser: Decl Summary. (line 357)
10326 * %require <1>: Table of Symbols. (line 184)
10327 * %require <2>: Require Decl. (line 6)
10328 * %require: Decl Summary. (line 362)
10329 * %right <1>: Using Precedence. (line 6)
10330 * %right <2>: Decl Summary. (line 17)
10331 * %right: Table of Symbols. (line 188)
10332 * %skeleton <1>: Decl Summary. (line 366)
10333 * %skeleton: Table of Symbols. (line 192)
10334 * %start <1>: Table of Symbols. (line 196)
10335 * %start <2>: Decl Summary. (line 34)
10336 * %start: Start Decl. (line 6)
10337 * %token <1>: Decl Summary. (line 13)
10338 * %token <2>: Token Decl. (line 6)
10339 * %token <3>: Java Declarations Summary.
10340 (line 29)
10341 * %token: Table of Symbols. (line 200)
10342 * %token-table <1>: Decl Summary. (line 374)
10343 * %token-table: Table of Symbols. (line 204)
10344 * %type <1>: Java Declarations Summary.
10345 (line 33)
10346 * %type <2>: Type Decl. (line 6)
10347 * %type <3>: Table of Symbols. (line 208)
10348 * %type: Decl Summary. (line 30)
10349 * %union <1>: Decl Summary. (line 9)
10350 * %union <2>: Union Decl. (line 6)
10351 * %union: Table of Symbols. (line 217)
10352 * %verbose: Decl Summary. (line 407)
10353 * %yacc: Decl Summary. (line 413)
10354 * *yypstate_new: Parser Create Function.
10355 (line 15)
10356 * /*: Table of Symbols. (line 33)
10357 * :: Table of Symbols. (line 36)
10358 * ;: Table of Symbols. (line 40)
10359 * <*> <1>: Destructor Decl. (line 6)
10360 * <*>: Table of Symbols. (line 47)
10361 * <> <1>: Destructor Decl. (line 6)
10362 * <>: Table of Symbols. (line 56)
10363 * @$ <1>: Action Features. (line 98)
10364 * @$ <2>: Java Action Features.
10365 (line 39)
10366 * @$ <3>: Table of Symbols. (line 7)
10367 * @$: Actions and Locations.
10368 (line 6)
10369 * @N <1>: Action Features. (line 104)
10370 * @N <2>: Actions and Locations.
10371 (line 6)
10372 * @N <3>: Table of Symbols. (line 11)
10373 * @N <4>: Action Features. (line 104)
10374 * @N: Java Action Features.
10375 (line 35)
10376 * abstract syntax tree: Implementing Gotos/Loops.
10377 (line 17)
10378 * action: Actions. (line 6)
10379 * action data types: Action Types. (line 6)
10380 * action features summary: Action Features. (line 6)
10381 * actions in mid-rule <1>: Mid-Rule Actions. (line 6)
10382 * actions in mid-rule: Destructor Decl. (line 88)
10383 * actions, location: Actions and Locations.
10384 (line 6)
10385 * actions, semantic: Semantic Actions. (line 6)
10386 * additional C code section: Epilogue. (line 6)
10387 * algorithm of parser: Algorithm. (line 6)
10388 * ambiguous grammars <1>: Generalized LR Parsing.
10389 (line 6)
10390 * ambiguous grammars: Language and Grammar.
10391 (line 33)
10392 * associativity: Why Precedence. (line 33)
10393 * AST: Implementing Gotos/Loops.
10394 (line 17)
10395 * Backus-Naur form: Language and Grammar.
10396 (line 16)
10397 * begin of Location: Java Location Values.
10398 (line 21)
10399 * begin on location: C++ Location Values. (line 44)
10400 * Bison declaration summary: Decl Summary. (line 6)
10401 * Bison declarations: Declarations. (line 6)
10402 * Bison declarations (introduction): Bison Declarations. (line 6)
10403 * Bison grammar: Grammar in Bison. (line 6)
10404 * Bison invocation: Invocation. (line 6)
10405 * Bison parser: Bison Parser. (line 6)
10406 * Bison parser algorithm: Algorithm. (line 6)
10407 * Bison symbols, table of: Table of Symbols. (line 6)
10408 * Bison utility: Bison Parser. (line 6)
10409 * bison-i18n.m4: Internationalization.
10410 (line 20)
10411 * bison-po: Internationalization.
10412 (line 6)
10413 * BISON_I18N: Internationalization.
10414 (line 27)
10415 * BISON_LOCALEDIR: Internationalization.
10416 (line 27)
10417 * BNF: Language and Grammar.
10418 (line 16)
10419 * braced code: Rules. (line 31)
10420 * C code, section for additional: Epilogue. (line 6)
10421 * C-language interface: Interface. (line 6)
10422 * calc: Infix Calc. (line 6)
10423 * calculator, infix notation: Infix Calc. (line 6)
10424 * calculator, location tracking: Location Tracking Calc.
10425 (line 6)
10426 * calculator, multi-function: Multi-function Calc. (line 6)
10427 * calculator, simple: RPN Calc. (line 6)
10428 * character token: Symbols. (line 31)
10429 * column on position: C++ Location Values. (line 25)
10430 * columns on location: C++ Location Values. (line 48)
10431 * columns on position: C++ Location Values. (line 28)
10432 * compiling the parser: Rpcalc Compile. (line 6)
10433 * conflicts <1>: Shift/Reduce. (line 6)
10434 * conflicts <2>: Merging GLR Parses. (line 6)
10435 * conflicts <3>: GLR Parsers. (line 6)
10436 * conflicts: Simple GLR Parsers. (line 6)
10437 * conflicts, reduce/reduce: Reduce/Reduce. (line 6)
10438 * conflicts, suppressing warnings of: Expect Decl. (line 6)
10439 * context-dependent precedence: Contextual Precedence.
10440 (line 6)
10441 * context-free grammar: Language and Grammar.
10442 (line 6)
10443 * controlling function: Rpcalc Main. (line 6)
10444 * core, item set: Understanding. (line 129)
10445 * dangling else: Shift/Reduce. (line 6)
10446 * data type of locations: Location Type. (line 6)
10447 * data types in actions: Action Types. (line 6)
10448 * data types of semantic values: Value Type. (line 6)
10449 * debug_level on parser: C++ Parser Interface.
10450 (line 31)
10451 * debug_stream on parser: C++ Parser Interface.
10452 (line 26)
10453 * debugging: Tracing. (line 6)
10454 * declaration summary: Decl Summary. (line 6)
10455 * declarations: Prologue. (line 6)
10456 * declarations section: Prologue. (line 6)
10457 * declarations, Bison: Declarations. (line 6)
10458 * declarations, Bison (introduction): Bison Declarations. (line 6)
10459 * declaring literal string tokens: Token Decl. (line 6)
10460 * declaring operator precedence: Precedence Decl. (line 6)
10461 * declaring the start symbol: Start Decl. (line 6)
10462 * declaring token type names: Token Decl. (line 6)
10463 * declaring value types: Union Decl. (line 6)
10464 * declaring value types, nonterminals: Type Decl. (line 6)
10465 * default action: Actions. (line 50)
10466 * default data type: Value Type. (line 6)
10467 * default location type: Location Type. (line 6)
10468 * default stack limit: Memory Management. (line 30)
10469 * default start symbol: Start Decl. (line 6)
10470 * deferred semantic actions: GLR Semantic Actions.
10471 (line 6)
10472 * defining language semantics: Semantics. (line 6)
10473 * discarded symbols: Destructor Decl. (line 98)
10474 * discarded symbols, mid-rule actions: Mid-Rule Actions. (line 59)
10475 * else, dangling: Shift/Reduce. (line 6)
10476 * end of Location: Java Location Values.
10477 (line 22)
10478 * end on location: C++ Location Values. (line 45)
10479 * epilogue: Epilogue. (line 6)
10480 * error <1>: Error Recovery. (line 20)
10481 * error: Table of Symbols. (line 108)
10482 * error on parser: C++ Parser Interface.
10483 (line 37)
10484 * error recovery: Error Recovery. (line 6)
10485 * error recovery, mid-rule actions: Mid-Rule Actions. (line 59)
10486 * error recovery, simple: Simple Error Recovery.
10487 (line 6)
10488 * error reporting function: Error Reporting. (line 6)
10489 * error reporting routine: Rpcalc Error. (line 6)
10490 * examples, simple: Examples. (line 6)
10491 * exercises: Exercises. (line 6)
10492 * file format: Grammar Layout. (line 6)
10493 * file on position: C++ Location Values. (line 13)
10494 * finite-state machine: Parser States. (line 6)
10495 * formal grammar: Grammar in Bison. (line 6)
10496 * format of grammar file: Grammar Layout. (line 6)
10497 * freeing discarded symbols: Destructor Decl. (line 6)
10498 * frequently asked questions: FAQ. (line 6)
10499 * generalized LR (GLR) parsing <1>: Generalized LR Parsing.
10500 (line 6)
10501 * generalized LR (GLR) parsing <2>: Language and Grammar.
10502 (line 33)
10503 * generalized LR (GLR) parsing: GLR Parsers. (line 6)
10504 * generalized LR (GLR) parsing, ambiguous grammars: Merging GLR Parses.
10505 (line 6)
10506 * generalized LR (GLR) parsing, unambiguous grammars: Simple GLR Parsers.
10507 (line 6)
10508 * getDebugLevel on YYParser: Java Parser Interface.
10509 (line 67)
10510 * getDebugStream on YYParser: Java Parser Interface.
10511 (line 62)
10512 * getEndPos on Lexer: Java Scanner Interface.
10513 (line 39)
10514 * getLVal on Lexer: Java Scanner Interface.
10515 (line 47)
10516 * getStartPos on Lexer: Java Scanner Interface.
10517 (line 38)
10518 * gettext: Internationalization.
10519 (line 6)
10520 * glossary: Glossary. (line 6)
10521 * GLR parsers and inline: Compiler Requirements.
10522 (line 6)
10523 * GLR parsers and yychar: GLR Semantic Actions.
10524 (line 10)
10525 * GLR parsers and yyclearin: GLR Semantic Actions.
10526 (line 18)
10527 * GLR parsers and YYERROR: GLR Semantic Actions.
10528 (line 28)
10529 * GLR parsers and yylloc: GLR Semantic Actions.
10530 (line 10)
10531 * GLR parsers and YYLLOC_DEFAULT: Location Default Action.
10532 (line 6)
10533 * GLR parsers and yylval: GLR Semantic Actions.
10534 (line 10)
10535 * GLR parsing <1>: Language and Grammar.
10536 (line 33)
10537 * GLR parsing <2>: Generalized LR Parsing.
10538 (line 6)
10539 * GLR parsing: GLR Parsers. (line 6)
10540 * GLR parsing, ambiguous grammars: Merging GLR Parses. (line 6)
10541 * GLR parsing, unambiguous grammars: Simple GLR Parsers. (line 6)
10542 * grammar file: Grammar Layout. (line 6)
10543 * grammar rule syntax: Rules. (line 6)
10544 * grammar rules section: Grammar Rules. (line 6)
10545 * grammar, Bison: Grammar in Bison. (line 6)
10546 * grammar, context-free: Language and Grammar.
10547 (line 6)
10548 * grouping, syntactic: Language and Grammar.
10549 (line 47)
10550 * i18n: Internationalization.
10551 (line 6)
10552 * infix notation calculator: Infix Calc. (line 6)
10553 * inline: Compiler Requirements.
10554 (line 6)
10555 * interface: Interface. (line 6)
10556 * internationalization: Internationalization.
10557 (line 6)
10558 * introduction: Introduction. (line 6)
10559 * invoking Bison: Invocation. (line 6)
10560 * item: Understanding. (line 107)
10561 * item set core: Understanding. (line 129)
10562 * kernel, item set: Understanding. (line 129)
10563 * LALR(1): Mystery Conflicts. (line 36)
10564 * LALR(1) grammars: Language and Grammar.
10565 (line 22)
10566 * language semantics, defining: Semantics. (line 6)
10567 * layout of Bison grammar: Grammar Layout. (line 6)
10568 * left recursion: Recursion. (line 16)
10569 * lex-param: Pure Calling. (line 31)
10570 * lexical analyzer: Lexical. (line 6)
10571 * lexical analyzer, purpose: Bison Parser. (line 6)
10572 * lexical analyzer, writing: Rpcalc Lexer. (line 6)
10573 * lexical tie-in: Lexical Tie-ins. (line 6)
10574 * line on position: C++ Location Values. (line 19)
10575 * lines on location: C++ Location Values. (line 49)
10576 * lines on position: C++ Location Values. (line 22)
10577 * literal string token: Symbols. (line 53)
10578 * literal token: Symbols. (line 31)
10579 * location <1>: Locations Overview. (line 6)
10580 * location: Locations. (line 6)
10581 * location actions: Actions and Locations.
10582 (line 6)
10583 * Location on Location: Java Location Values.
10584 (line 25)
10585 * location tracking calculator: Location Tracking Calc.
10586 (line 6)
10587 * location, textual <1>: Locations. (line 6)
10588 * location, textual: Locations Overview. (line 6)
10589 * location_value_type: C++ Parser Interface.
10590 (line 16)
10591 * lookahead token: Lookahead. (line 6)
10592 * LR(1): Mystery Conflicts. (line 36)
10593 * LR(1) grammars: Language and Grammar.
10594 (line 22)
10595 * ltcalc: Location Tracking Calc.
10596 (line 6)
10597 * main function in simple example: Rpcalc Main. (line 6)
10598 * memory exhaustion: Memory Management. (line 6)
10599 * memory management: Memory Management. (line 6)
10600 * mfcalc: Multi-function Calc. (line 6)
10601 * mid-rule actions <1>: Destructor Decl. (line 88)
10602 * mid-rule actions: Mid-Rule Actions. (line 6)
10603 * multi-function calculator: Multi-function Calc. (line 6)
10604 * multicharacter literal: Symbols. (line 53)
10605 * mutual recursion: Recursion. (line 32)
10606 * NLS: Internationalization.
10607 (line 6)
10608 * nondeterministic parsing <1>: Generalized LR Parsing.
10609 (line 6)
10610 * nondeterministic parsing: Language and Grammar.
10611 (line 33)
10612 * nonterminal symbol: Symbols. (line 6)
10613 * nonterminal, useless: Understanding. (line 62)
10614 * operator precedence: Precedence. (line 6)
10615 * operator precedence, declaring: Precedence Decl. (line 6)
10616 * operator+ on location: C++ Location Values. (line 53)
10617 * operator+ on position: C++ Location Values. (line 33)
10618 * operator+= on location: C++ Location Values. (line 57)
10619 * operator+= on position: C++ Location Values. (line 31)
10620 * operator- on position: C++ Location Values. (line 36)
10621 * operator-= on position: C++ Location Values. (line 35)
10622 * operator<< on position: C++ Location Values. (line 40)
10623 * options for invoking Bison: Invocation. (line 6)
10624 * overflow of parser stack: Memory Management. (line 6)
10625 * parse error: Error Reporting. (line 6)
10626 * parse on parser: C++ Parser Interface.
10627 (line 23)
10628 * parse on YYParser: Java Parser Interface.
10629 (line 54)
10630 * parser: Bison Parser. (line 6)
10631 * parser on parser: C++ Parser Interface.
10632 (line 19)
10633 * parser stack: Algorithm. (line 6)
10634 * parser stack overflow: Memory Management. (line 6)
10635 * parser state: Parser States. (line 6)
10636 * pointed rule: Understanding. (line 107)
10637 * polish notation calculator: RPN Calc. (line 6)
10638 * precedence declarations: Precedence Decl. (line 6)
10639 * precedence of operators: Precedence. (line 6)
10640 * precedence, context-dependent: Contextual Precedence.
10641 (line 6)
10642 * precedence, unary operator: Contextual Precedence.
10643 (line 6)
10644 * preventing warnings about conflicts: Expect Decl. (line 6)
10645 * Prologue <1>: Decl Summary. (line 129)
10646 * Prologue <2>: Prologue. (line 6)
10647 * Prologue: Decl Summary. (line 50)
10648 * Prologue Alternatives: Prologue Alternatives.
10649 (line 6)
10650 * pure parser: Pure Decl. (line 6)
10651 * push parser: Push Decl. (line 6)
10652 * questions: FAQ. (line 6)
10653 * recovering: Java Action Features.
10654 (line 59)
10655 * recovering on YYParser: Java Parser Interface.
10656 (line 58)
10657 * recovery from errors: Error Recovery. (line 6)
10658 * recursive rule: Recursion. (line 6)
10659 * reduce/reduce conflict: Reduce/Reduce. (line 6)
10660 * reduce/reduce conflicts <1>: GLR Parsers. (line 6)
10661 * reduce/reduce conflicts <2>: Simple GLR Parsers. (line 6)
10662 * reduce/reduce conflicts: Merging GLR Parses. (line 6)
10663 * reduction: Algorithm. (line 6)
10664 * reentrant parser: Pure Decl. (line 6)
10665 * requiring a version of Bison: Require Decl. (line 6)
10666 * return YYABORT;: Java Action Features.
10667 (line 43)
10668 * return YYACCEPT;: Java Action Features.
10669 (line 47)
10670 * return YYERROR;: Java Action Features.
10671 (line 51)
10672 * return YYFAIL;: Java Action Features.
10673 (line 55)
10674 * reverse polish notation: RPN Calc. (line 6)
10675 * right recursion: Recursion. (line 16)
10676 * rpcalc: RPN Calc. (line 6)
10677 * rule syntax: Rules. (line 6)
10678 * rule, pointed: Understanding. (line 107)
10679 * rule, useless: Understanding. (line 62)
10680 * rules section for grammar: Grammar Rules. (line 6)
10681 * running Bison (introduction): Rpcalc Generate. (line 6)
10682 * semantic actions: Semantic Actions. (line 6)
10683 * semantic value: Semantic Values. (line 6)
10684 * semantic value type: Value Type. (line 6)
10685 * semantic_value_type: C++ Parser Interface.
10686 (line 15)
10687 * set_debug_level on parser: C++ Parser Interface.
10688 (line 32)
10689 * set_debug_stream on parser: C++ Parser Interface.
10690 (line 27)
10691 * setDebugLevel on YYParser: Java Parser Interface.
10692 (line 68)
10693 * setDebugStream on YYParser: Java Parser Interface.
10694 (line 63)
10695 * shift/reduce conflicts <1>: Simple GLR Parsers. (line 6)
10696 * shift/reduce conflicts <2>: Shift/Reduce. (line 6)
10697 * shift/reduce conflicts: GLR Parsers. (line 6)
10698 * shifting: Algorithm. (line 6)
10699 * simple examples: Examples. (line 6)
10700 * single-character literal: Symbols. (line 31)
10701 * stack overflow: Memory Management. (line 6)
10702 * stack, parser: Algorithm. (line 6)
10703 * stages in using Bison: Stages. (line 6)
10704 * start symbol: Language and Grammar.
10705 (line 96)
10706 * start symbol, declaring: Start Decl. (line 6)
10707 * state (of parser): Parser States. (line 6)
10708 * step on location: C++ Location Values. (line 60)
10709 * string token: Symbols. (line 53)
10710 * summary, action features: Action Features. (line 6)
10711 * summary, Bison declaration: Decl Summary. (line 6)
10712 * suppressing conflict warnings: Expect Decl. (line 6)
10713 * symbol: Symbols. (line 6)
10714 * symbol table example: Mfcalc Symbol Table. (line 6)
10715 * symbols (abstract): Language and Grammar.
10716 (line 47)
10717 * symbols in Bison, table of: Table of Symbols. (line 6)
10718 * syntactic grouping: Language and Grammar.
10719 (line 47)
10720 * syntax error: Error Reporting. (line 6)
10721 * syntax of grammar rules: Rules. (line 6)
10722 * terminal symbol: Symbols. (line 6)
10723 * textual location <1>: Locations Overview. (line 6)
10724 * textual location: Locations. (line 6)
10725 * token: Language and Grammar.
10726 (line 47)
10727 * token type: Symbols. (line 6)
10728 * token type names, declaring: Token Decl. (line 6)
10729 * token, useless: Understanding. (line 62)
10730 * toString on Location: Java Location Values.
10731 (line 32)
10732 * tracing the parser: Tracing. (line 6)
10733 * unary operator precedence: Contextual Precedence.
10734 (line 6)
10735 * useless nonterminal: Understanding. (line 62)
10736 * useless rule: Understanding. (line 62)
10737 * useless token: Understanding. (line 62)
10738 * using Bison: Stages. (line 6)
10739 * value type, semantic: Value Type. (line 6)
10740 * value types, declaring: Union Decl. (line 6)
10741 * value types, nonterminals, declaring: Type Decl. (line 6)
10742 * value, semantic: Semantic Values. (line 6)
10743 * version requirement: Require Decl. (line 6)
10744 * warnings, preventing: Expect Decl. (line 6)
10745 * writing a lexical analyzer: Rpcalc Lexer. (line 6)
10746 * YYABORT <1>: Table of Symbols. (line 221)
10747 * YYABORT: Parser Function. (line 29)
10748 * YYABORT;: Action Features. (line 28)
10749 * YYACCEPT <1>: Table of Symbols. (line 230)
10750 * YYACCEPT: Parser Function. (line 26)
10751 * YYACCEPT;: Action Features. (line 32)
10752 * YYBACKUP <1>: Table of Symbols. (line 238)
10753 * YYBACKUP: Action Features. (line 36)
10754 * yychar <1>: Action Features. (line 69)
10755 * yychar <2>: Lookahead. (line 47)
10756 * yychar <3>: Table of Symbols. (line 242)
10757 * yychar: GLR Semantic Actions.
10758 (line 10)
10759 * yyclearin <1>: GLR Semantic Actions.
10760 (line 18)
10761 * yyclearin <2>: Table of Symbols. (line 248)
10762 * yyclearin: Error Recovery. (line 97)
10763 * yyclearin;: Action Features. (line 76)
10764 * yydebug <1>: Tracing. (line 6)
10765 * yydebug: Table of Symbols. (line 256)
10766 * YYDEBUG <1>: Table of Symbols. (line 252)
10767 * YYDEBUG: Tracing. (line 12)
10768 * YYEMPTY: Action Features. (line 49)
10769 * YYENABLE_NLS: Internationalization.
10770 (line 27)
10771 * YYEOF: Action Features. (line 52)
10772 * yyerrok <1>: Table of Symbols. (line 261)
10773 * yyerrok: Error Recovery. (line 92)
10774 * yyerrok;: Action Features. (line 81)
10775 * YYERROR: Action Features. (line 56)
10776 * yyerror: Java Action Features.
10777 (line 64)
10778 * YYERROR: Table of Symbols. (line 265)
10779 * yyerror <1>: Table of Symbols. (line 274)
10780 * yyerror: Error Reporting. (line 6)
10781 * YYERROR: GLR Semantic Actions.
10782 (line 28)
10783 * yyerror on Lexer: Java Scanner Interface.
10784 (line 25)
10785 * YYERROR;: Action Features. (line 56)
10786 * YYERROR_VERBOSE: Table of Symbols. (line 278)
10787 * YYINITDEPTH <1>: Table of Symbols. (line 285)
10788 * YYINITDEPTH: Memory Management. (line 32)
10789 * yylex <1>: Table of Symbols. (line 289)
10790 * yylex: Lexical. (line 6)
10791 * yylex on Lexer: Java Scanner Interface.
10792 (line 30)
10793 * yylex on parser: C++ Scanner Interface.
10794 (line 12)
10795 * YYLEX_PARAM: Table of Symbols. (line 294)
10796 * yylloc <1>: Token Locations. (line 6)
10797 * yylloc <2>: Table of Symbols. (line 300)
10798 * yylloc <3>: GLR Semantic Actions.
10799 (line 10)
10800 * yylloc <4>: Action Features. (line 86)
10801 * yylloc <5>: Lookahead. (line 47)
10802 * yylloc: Actions and Locations.
10803 (line 60)
10804 * YYLLOC_DEFAULT: Location Default Action.
10805 (line 6)
10806 * YYLTYPE <1>: Table of Symbols. (line 310)
10807 * YYLTYPE: Token Locations. (line 19)
10808 * yylval <1>: Actions. (line 74)
10809 * yylval <2>: Action Features. (line 92)
10810 * yylval <3>: Table of Symbols. (line 314)
10811 * yylval <4>: GLR Semantic Actions.
10812 (line 10)
10813 * yylval <5>: Lookahead. (line 47)
10814 * yylval: Token Values. (line 6)
10815 * YYMAXDEPTH <1>: Table of Symbols. (line 322)
10816 * YYMAXDEPTH: Memory Management. (line 14)
10817 * yynerrs <1>: Error Reporting. (line 92)
10818 * yynerrs: Table of Symbols. (line 326)
10819 * yyparse <1>: Table of Symbols. (line 332)
10820 * yyparse: Parser Function. (line 6)
10821 * YYPARSE_PARAM: Table of Symbols. (line 365)
10822 * YYParser on YYParser: Java Parser Interface.
10823 (line 41)
10824 * YYPRINT: Tracing. (line 71)
10825 * yypstate_delete <1>: Table of Symbols. (line 336)
10826 * yypstate_delete: Parser Delete Function.
10827 (line 6)
10828 * yypstate_new <1>: Parser Create Function.
10829 (line 6)
10830 * yypstate_new: Table of Symbols. (line 344)
10831 * yypull_parse <1>: Pull Parser Function.
10832 (line 6)
10833 * yypull_parse <2>: Table of Symbols. (line 351)
10834 * yypull_parse: Pull Parser Function.
10835 (line 14)
10836 * yypush_parse <1>: Push Parser Function.
10837 (line 15)
10838 * yypush_parse: Table of Symbols. (line 358)
10839 * YYRECOVERING <1>: Action Features. (line 64)
10840 * YYRECOVERING <2>: Error Recovery. (line 109)
10841 * YYRECOVERING <3>: Action Features. (line 64)
10842 * YYRECOVERING: Table of Symbols. (line 371)
10843 * YYSTACK_USE_ALLOCA: Table of Symbols. (line 376)
10844 * YYSTYPE: Table of Symbols. (line 392)
10845 * | <1>: Table of Symbols. (line 43)
10846 * |: Rules. (line 49)
10847
10848
10849 
10850 Tag Table:
10851 Node: Top1174
10852 Node: Introduction13739
10853 Node: Conditions15002
10854 Node: Copying16893
10855 Node: Concepts54431
10856 Node: Language and Grammar55612
10857 Node: Grammar in Bison61501
10858 Node: Semantic Values63430
10859 Node: Semantic Actions65536
10860 Node: GLR Parsers66723
10861 Node: Simple GLR Parsers69470
10862 Node: Merging GLR Parses76122
10863 Node: GLR Semantic Actions80691
10864 Node: Compiler Requirements82581
10865 Node: Locations Overview83317
10866 Node: Bison Parser84770
10867 Node: Stages87710
10868 Node: Grammar Layout88998
10869 Node: Examples90330
10870 Node: RPN Calc91533
10871 Node: Rpcalc Declarations92533
10872 Node: Rpcalc Rules94461
10873 Node: Rpcalc Input96277
10874 Node: Rpcalc Line97752
10875 Node: Rpcalc Expr98880
10876 Node: Rpcalc Lexer100847
10877 Node: Rpcalc Main103441
10878 Node: Rpcalc Error103848
10879 Node: Rpcalc Generate104881
10880 Node: Rpcalc Compile106016
10881 Node: Infix Calc106895
10882 Node: Simple Error Recovery109658
10883 Node: Location Tracking Calc111553
10884 Node: Ltcalc Declarations112249
10885 Node: Ltcalc Rules113338
10886 Node: Ltcalc Lexer115354
10887 Node: Multi-function Calc117677
10888 Node: Mfcalc Declarations119253
10889 Node: Mfcalc Rules121300
10890 Node: Mfcalc Symbol Table122695
10891 Node: Exercises128871
10892 Node: Grammar File129385
10893 Node: Grammar Outline130234
10894 Node: Prologue131084
10895 Node: Prologue Alternatives132873
10896 Node: Bison Declarations142558
10897 Node: Grammar Rules142986
10898 Node: Epilogue143457
10899 Node: Symbols144473
10900 Node: Rules151176
10901 Node: Recursion153655
10902 Node: Semantics155373
10903 Node: Value Type156472
10904 Node: Multiple Types157307
10905 Node: Actions158474
10906 Node: Action Types161889
10907 Node: Mid-Rule Actions163201
10908 Node: Locations169666
10909 Node: Location Type170317
10910 Node: Actions and Locations171103
10911 Node: Location Default Action173564
10912 Node: Declarations177284
10913 Node: Require Decl178811
10914 Node: Token Decl179130
10915 Node: Precedence Decl181556
10916 Node: Union Decl183566
10917 Node: Type Decl185340
10918 Node: Initial Action Decl186266
10919 Node: Destructor Decl187037
10920 Node: Expect Decl192501
10921 Node: Start Decl194494
10922 Node: Pure Decl194882
10923 Node: Push Decl196632
10924 Node: Decl Summary201131
10925 Ref: Decl Summary-Footnote-1218017
10926 Node: Multiple Parsers218221
10927 Node: Interface219860
10928 Node: Parser Function221178
10929 Node: Push Parser Function223194
10930 Node: Pull Parser Function224004
10931 Node: Parser Create Function224655
10932 Node: Parser Delete Function225478
10933 Node: Lexical226249
10934 Node: Calling Convention227681
10935 Node: Token Values230641
10936 Node: Token Locations231805
10937 Node: Pure Calling232699
10938 Node: Error Reporting234580
10939 Node: Action Features238710
10940 Node: Internationalization243012
10941 Node: Algorithm245553
10942 Node: Lookahead247919
10943 Node: Shift/Reduce250128
10944 Node: Precedence253023
10945 Node: Why Precedence253679
10946 Node: Using Precedence255552
10947 Node: Precedence Examples256529
10948 Node: How Precedence257239
10949 Node: Contextual Precedence258396
10950 Node: Parser States260192
10951 Node: Reduce/Reduce261436
10952 Node: Mystery Conflicts264977
10953 Node: Generalized LR Parsing268684
10954 Node: Memory Management273303
10955 Node: Error Recovery275516
10956 Node: Context Dependency280819
10957 Node: Semantic Tokens281668
10958 Node: Lexical Tie-ins284738
10959 Node: Tie-in Recovery286315
10960 Node: Debugging288492
10961 Node: Understanding289158
10962 Node: Tracing300317
10963 Node: Invocation304419
10964 Node: Bison Options305818
10965 Node: Option Cross Key312822
10966 Node: Yacc Library313874
10967 Node: Other Languages314699
10968 Node: C++ Parsers315026
10969 Node: C++ Bison Interface315523
10970 Node: C++ Semantic Values316791
10971 Ref: C++ Semantic Values-Footnote-1317733
10972 Node: C++ Location Values317886
10973 Node: C++ Parser Interface320259
10974 Node: C++ Scanner Interface321976
10975 Node: A Complete C++ Example322678
10976 Node: Calc++ --- C++ Calculator323620
10977 Node: Calc++ Parsing Driver324134
10978 Node: Calc++ Parser327915
10979 Node: Calc++ Scanner331705
10980 Node: Calc++ Top Level335131
10981 Node: Java Parsers335780
10982 Node: Java Bison Interface336457
10983 Node: Java Semantic Values338420
10984 Node: Java Location Values340034
10985 Node: Java Parser Interface341590
10986 Node: Java Scanner Interface344828
10987 Node: Java Action Features347013
10988 Node: Java Differences349740
10989 Ref: Java Differences-Footnote-1352315
10990 Node: Java Declarations Summary352465
10991 Node: FAQ356713
10992 Node: Memory Exhausted357660
10993 Node: How Can I Reset the Parser357970
10994 Node: Strings are Destroyed360239
10995 Node: Implementing Gotos/Loops361828
10996 Node: Multiple start-symbols363111
10997 Node: Secure? Conform?364656
10998 Node: I can't build Bison365104
10999 Node: Where can I find help?365822
11000 Node: Bug Reports366615
11001 Node: More Languages368076
11002 Node: Beta Testing368434
11003 Node: Mailing Lists369308
11004 Node: Table of Symbols369519
11005 Node: Glossary384901
11006 Node: Copying This Manual391798
11007 Node: Index414191
11008 
11009 End Tag Table