OLD | NEW |
(Empty) | |
| 1 YACC(1P) POSIX Programmer's Manual YACC(1P) |
| 2 |
| 3 |
| 4 |
| 5 PROLOG |
| 6 This manual page is part of the POSIX Programmer's Man- |
| 7 ual. The Linux implementation of this interface may |
| 8 differ (consult the corresponding Linux manual page for |
| 9 details of Linux behavior), or the interface may not be |
| 10 implemented on Linux. |
| 11 |
| 12 NAME |
| 13 yacc - yet another compiler compiler (DEVELOPMENT) |
| 14 |
| 15 SYNOPSIS |
| 16 yacc [-dltv][-b file_prefix][-p sym_prefix] grammar |
| 17 |
| 18 DESCRIPTION |
| 19 The yacc utility shall read a description of a context- |
| 20 free grammar in grammar and write C source code, con- |
| 21 forming to the ISO C standard, to a code file, and |
| 22 optionally header information into a header file, in the |
| 23 current directory. The C code shall define a function |
| 24 and related routines and macros for an automaton that |
| 25 executes a parsing algorithm meeting the requirements in |
| 26 Algorithms . |
| 27 |
| 28 The form and meaning of the grammar are described in the |
| 29 EXTENDED DESCRIPTION section. |
| 30 |
| 31 The C source code and header file shall be produced in a |
| 32 form suitable as input for the C compiler (see c99 ). |
| 33 |
| 34 OPTIONS |
| 35 The yacc utility shall conform to the Base Definitions |
| 36 volume of IEEE Std 1003.1-2001, Section 12.2, Utility |
| 37 Syntax Guidelines. |
| 38 |
| 39 The following options shall be supported: |
| 40 |
| 41 -b file_prefix |
| 42 Use file_prefix instead of y as the prefix for |
| 43 all output filenames. The code file y.tab.c, the |
| 44 header file y.tab.h (created when -d is speci- |
| 45 fied), and the description file y.output (created |
| 46 when -v is specified), shall be changed to |
| 47 file_prefix .tab.c, file_prefix .tab.h, and |
| 48 file_prefix .output, respectively. |
| 49 |
| 50 -d Write the header file; by default only the code |
| 51 file is written. The #define statements associate |
| 52 the token codes assigned by yacc with the user- |
| 53 declared token names. This allows source files |
| 54 other than y.tab.c to access the token codes. |
| 55 |
| 56 -l Produce a code file that does not contain any |
| 57 #line constructs. If this option is not present, |
| 58 it is unspecified whether the code file or header |
| 59 file contains #line directives. This should only |
| 60 be used after the grammar and the associated |
| 61 actions are fully debugged. |
| 62 |
| 63 -p sym_prefix |
| 64 |
| 65 Use sym_prefix instead of yy as the prefix for |
| 66 all external names produced by yacc. The names |
| 67 affected shall include the functions yyparse(), |
| 68 yylex(), and yyerror(), and the variables yylval, |
| 69 yychar, and yydebug. (In the remainder of this |
| 70 section, the six symbols cited are referenced |
| 71 using their default names only as a notational |
| 72 convenience.) Local names may also be affected by |
| 73 the -p option; however, the -p option shall not |
| 74 affect #define symbols generated by yacc. |
| 75 |
| 76 -t Modify conditional compilation directives to per- |
| 77 mit compilation of debugging code in the code |
| 78 file. Runtime debugging statements shall always |
| 79 be contained in the code file, but by default |
| 80 conditional compilation directives prevent their |
| 81 compilation. |
| 82 |
| 83 -v Write a file containing a description of the |
| 84 parser and a report of conflicts generated by |
| 85 ambiguities in the grammar. |
| 86 |
| 87 |
| 88 OPERANDS |
| 89 The following operand is required: |
| 90 |
| 91 grammar |
| 92 A pathname of a file containing instructions, |
| 93 hereafter called grammar, for which a parser is |
| 94 to be created. The format for the grammar is |
| 95 described in the EXTENDED DESCRIPTION section. |
| 96 |
| 97 |
| 98 STDIN |
| 99 Not used. |
| 100 |
| 101 INPUT FILES |
| 102 The file grammar shall be a text file formatted as spec- |
| 103 ified in the EXTENDED DESCRIPTION section. |
| 104 |
| 105 ENVIRONMENT VARIABLES |
| 106 The following environment variables shall affect the |
| 107 execution of yacc: |
| 108 |
| 109 LANG Provide a default value for the internationaliza- |
| 110 tion variables that are unset or null. (See the |
| 111 Base Definitions volume of IEEE Std 1003.1-2001, |
| 112 Section 8.2, Internationalization Variables for |
| 113 the precedence of internationalization variables |
| 114 used to determine the values of locale cate- |
| 115 gories.) |
| 116 |
| 117 LC_ALL If set to a non-empty string value, override the |
| 118 values of all the other internationalization |
| 119 variables. |
| 120 |
| 121 LC_CTYPE |
| 122 Determine the locale for the interpretation of |
| 123 sequences of bytes of text data as characters |
| 124 (for example, single-byte as opposed to multi- |
| 125 byte characters in arguments and input files). |
| 126 |
| 127 LC_MESSAGES |
| 128 Determine the locale that should be used to |
| 129 affect the format and contents of diagnostic mes- |
| 130 sages written to standard error. |
| 131 |
| 132 NLSPATH |
| 133 Determine the location of message catalogs for |
| 134 the processing of LC_MESSAGES . |
| 135 |
| 136 |
| 137 The LANG and LC_* variables affect the execution of the |
| 138 yacc utility as stated. The main() function defined in |
| 139 Yacc Library shall call: |
| 140 |
| 141 |
| 142 setlocale(LC_ALL, "") |
| 143 |
| 144 and thus the program generated by yacc shall also be |
| 145 affected by the contents of these variables at runtime. |
| 146 |
| 147 ASYNCHRONOUS EVENTS |
| 148 Default. |
| 149 |
| 150 STDOUT |
| 151 Not used. |
| 152 |
| 153 STDERR |
| 154 If shift/reduce or reduce/reduce conflicts are detected |
| 155 in grammar, yacc shall write a report of those conflicts |
| 156 to the standard error in an unspecified format. |
| 157 |
| 158 Standard error shall also be used for diagnostic mes- |
| 159 sages. |
| 160 |
| 161 OUTPUT FILES |
| 162 The code file, the header file, and the description file |
| 163 shall be text files. All are described in the following |
| 164 sections. |
| 165 |
| 166 Code File |
| 167 This file shall contain the C source code for the |
| 168 yyparse() function. It shall contain code for the vari- |
| 169 ous semantic actions with macro substitution performed |
| 170 on them as described in the EXTENDED DESCRIPTION sec- |
| 171 tion. It also shall contain a copy of the #define state- |
| 172 ments in the header file. If a %union declaration is |
| 173 used, the declaration for YYSTYPE shall also be included |
| 174 in this file. |
| 175 |
| 176 Header File |
| 177 The header file shall contain #define statements that |
| 178 associate the token numbers with the token names. This |
| 179 allows source files other than the code file to access |
| 180 the token codes. If a %union declaration is used, the |
| 181 declaration for YYSTYPE and an extern YYSTYPE yylval |
| 182 declaration shall also be included in this file. |
| 183 |
| 184 Description File |
| 185 The description file shall be a text file containing a |
| 186 description of the state machine corresponding to the |
| 187 parser, using an unspecified format. Limits for internal |
| 188 tables (see Limits ) shall also be reported, in an |
| 189 implementation-defined manner. (Some implementations may |
| 190 use dynamic allocation techniques and have no specific |
| 191 limit values to report.) |
| 192 |
| 193 EXTENDED DESCRIPTION |
| 194 The yacc command accepts a language that is used to |
| 195 define a grammar for a target language to be parsed by |
| 196 the tables and code generated by yacc. The language |
| 197 accepted by yacc as a grammar for the target language is |
| 198 described below using the yacc input language itself. |
| 199 |
| 200 The input grammar includes rules describing the input |
| 201 structure of the target language and code to be invoked |
| 202 when these rules are recognized to provide the associ- |
| 203 ated semantic action. The code to be executed shall |
| 204 appear as bodies of text that are intended to be C-lan- |
| 205 guage code. The C-language inclusions are presumed to |
| 206 form a correct function when processed by yacc into its |
| 207 output files. The code included in this way shall be |
| 208 executed during the recognition of the target language. |
| 209 |
| 210 Given a grammar, the yacc utility generates the files |
| 211 described in the OUTPUT FILES section. The code file can |
| 212 be compiled and linked using c99. If the declaration and |
| 213 programs sections of the grammar file did not include |
| 214 definitions of main(), yylex(), and yyerror(), the com- |
| 215 piled output requires linking with externally supplied |
| 216 versions of those functions. Default versions of main() |
| 217 and yyerror() are supplied in the yacc library and can |
| 218 be linked in by using the -l y operand to c99. The yacc |
| 219 library interfaces need not support interfaces with |
| 220 other than the default yy symbol prefix. The application |
| 221 provides the lexical analyzer function, yylex(); the lex |
| 222 utility is specifically designed to generate such a rou- |
| 223 tine. |
| 224 |
| 225 Input Language |
| 226 The application shall ensure that every specification |
| 227 file consists of three sections in order: declarations, |
| 228 grammar rules, and programs, separated by double percent |
| 229 signs ( "%%" ). The declarations and programs sections |
| 230 can be empty. If the latter is empty, the preceding "%%" |
| 231 mark separating it from the rules section can be omit- |
| 232 ted. |
| 233 |
| 234 The input is free form text following the structure of |
| 235 the grammar defined below. |
| 236 |
| 237 Lexical Structure of the Grammar |
| 238 The <blank>s, <newline>s, and <form-feed>s shall be |
| 239 ignored, except that the application shall ensure that |
| 240 they do not appear in names or multi-character reserved |
| 241 symbols. Comments shall be enclosed in "/* ... */", and |
| 242 can appear wherever a name is valid. |
| 243 |
| 244 Names are of arbitrary length, made up of letters, peri- |
| 245 ods ( '.' ), underscores ( '_' ), and non-initial dig- |
| 246 its. Uppercase and lowercase letters are distinct. Con- |
| 247 forming applications shall not use names beginning in yy |
| 248 or YY since the yacc parser uses such names. Many of the |
| 249 names appear in the final output of yacc, and thus they |
| 250 should be chosen to conform with any additional rules |
| 251 created by the C compiler to be used. In particular they |
| 252 appear in #define statements. |
| 253 |
| 254 A literal shall consist of a single character enclosed |
| 255 in single-quotes ( '" ). All of the escape sequences |
| 256 supported for character constants by the ISO C standard |
| 257 shall be supported by yacc. |
| 258 |
| 259 The relationship with the lexical analyzer is discussed |
| 260 in detail below. |
| 261 |
| 262 The application shall ensure that the NUL character is |
| 263 not used in grammar rules or literals. |
| 264 |
| 265 Declarations Section |
| 266 The declarations section is used to define the symbols |
| 267 used to define the target language and their relation- |
| 268 ship with each other. In particular, much of the addi- |
| 269 tional information required to resolve ambiguities in |
| 270 the context-free grammar for the target language is pro- |
| 271 vided here. |
| 272 |
| 273 Usually yacc assigns the relationship between the sym- |
| 274 bolic names it generates and their underlying numeric |
| 275 value. The declarations section makes it possible to |
| 276 control the assignment of these values. |
| 277 |
| 278 It is also possible to keep semantic information associ- |
| 279 ated with the tokens currently on the parse stack in a |
| 280 user-defined C-language union, if the members of the |
| 281 union are associated with the various names in the gram- |
| 282 mar. The declarations section provides for this as well. |
| 283 |
| 284 The first group of declarators below all take a list of |
| 285 names as arguments. That list can optionally be pre- |
| 286 ceded by the name of a C union member (called a tag |
| 287 below) appearing within '<' and '>' . (As an exception |
| 288 to the typographical conventions of the rest of this |
| 289 volume of IEEE Std 1003.1-2001, in this case <tag> does |
| 290 not represent a metavariable, but the literal angle |
| 291 bracket characters surrounding a symbol.) The use of tag |
| 292 specifies that the tokens named on this line shall be of |
| 293 the same C type as the union member referenced by tag. |
| 294 This is discussed in more detail below. |
| 295 |
| 296 For lists used to define tokens, the first appearance of |
| 297 a given token can be followed by a positive integer (as |
| 298 a string of decimal digits). If this is done, the under- |
| 299 lying value assigned to it for lexical purposes shall be |
| 300 taken to be that number. |
| 301 |
| 302 The following declares name to be a token: |
| 303 |
| 304 |
| 305 %token [<tag>] name [number][name [number]]... |
| 306 |
| 307 If tag is present, the C type for all tokens on this |
| 308 line shall be declared to be the type referenced by tag. |
| 309 If a positive integer, number, follows a name, that |
| 310 value shall be assigned to the token. |
| 311 |
| 312 The following declares name to be a token, and assigns |
| 313 precedence to it: |
| 314 |
| 315 |
| 316 %left [<tag>] name [number][name [number]]... |
| 317 %right [<tag>] name [number][name [number]]... |
| 318 |
| 319 One or more lines, each beginning with one of these sym- |
| 320 bols, can appear in this section. All tokens on the same |
| 321 line have the same precedence level and associativity; |
| 322 the lines are in order of increasing precedence or bind- |
| 323 ing strength. %left denotes that the operators on that |
| 324 line are left associative, and %right similarly denotes |
| 325 right associative operators. If tag is present, it shall |
| 326 declare a C type for names as described for %token. |
| 327 |
| 328 The following declares name to be a token, and indicates |
| 329 that this cannot be used associatively: |
| 330 |
| 331 |
| 332 %nonassoc [<tag>] name [number][name [number]]... |
| 333 |
| 334 If the parser encounters associative use of this token |
| 335 it reports an error. If tag is present, it shall declare |
| 336 a C type for names as described for %token. |
| 337 |
| 338 The following declares that union member names are non- |
| 339 terminals, and thus it is required to have a tag field |
| 340 at its beginning: |
| 341 |
| 342 |
| 343 %type <tag> name... |
| 344 |
| 345 Because it deals with non-terminals only, assigning a |
| 346 token number or using a literal is also prohibited. If |
| 347 this construct is present, yacc shall perform type |
| 348 checking; if this construct is not present, the parse |
| 349 stack shall hold only the int type. |
| 350 |
| 351 Every name used in grammar not defined by a %token, |
| 352 %left, %right, or %nonassoc declaration is assumed to |
| 353 represent a non-terminal symbol. The yacc utility shall |
| 354 report an error for any non-terminal symbol that does |
| 355 not appear on the left side of at least one grammar |
| 356 rule. |
| 357 |
| 358 Once the type, precedence, or token number of a name is |
| 359 specified, it shall not be changed. If the first decla- |
| 360 ration of a token does not assign a token number, yacc |
| 361 shall assign a token number. Once this assignment is |
| 362 made, the token number shall not be changed by explicit |
| 363 assignment. |
| 364 |
| 365 The following declarators do not follow the previous |
| 366 pattern. |
| 367 |
| 368 The following declares the non-terminal name to be the |
| 369 start symbol, which represents the largest, most general |
| 370 structure described by the grammar rules: |
| 371 |
| 372 |
| 373 %start name |
| 374 |
| 375 By default, it is the left-hand side of the first gram- |
| 376 mar rule; this default can be overridden with this dec- |
| 377 laration. |
| 378 |
| 379 The following declares the yacc value stack to be a |
| 380 union of the various types of values desired: |
| 381 |
| 382 |
| 383 %union { body of union (in C) } |
| 384 |
| 385 By default, the values returned by actions (see below) |
| 386 and the lexical analyzer shall be of type int. The yacc |
| 387 utility keeps track of types, and it shall insert corre- |
| 388 sponding union member names in order to perform strict |
| 389 type checking of the resulting parser. |
| 390 |
| 391 Alternatively, given that at least one <tag> construct |
| 392 is used, the union can be declared in a header file |
| 393 (which shall be included in the declarations section by |
| 394 using a #include construct within %{ and %}), and a |
| 395 typedef used to define the symbol YYSTYPE to represent |
| 396 this union. The effect of %union is to provide the dec- |
| 397 laration of YYSTYPE directly from the yacc input. |
| 398 |
| 399 C-language declarations and definitions can appear in |
| 400 the declarations section, enclosed by the following |
| 401 marks: |
| 402 |
| 403 |
| 404 %{ ... %} |
| 405 |
| 406 These statements shall be copied into the code file, and |
| 407 have global scope within it so that they can be used in |
| 408 the rules and program sections. |
| 409 |
| 410 The application shall ensure that the declarations sec- |
| 411 tion is terminated by the token %%. |
| 412 |
| 413 Grammar Rules in yacc |
| 414 The rules section defines the context-free grammar to be |
| 415 accepted by the function yacc generates, and associates |
| 416 with those rules C-language actions and additional |
| 417 precedence information. The grammar is described below, |
| 418 and a formal definition follows. |
| 419 |
| 420 The rules section is comprised of one or more grammar |
| 421 rules. A grammar rule has the form: |
| 422 |
| 423 |
| 424 A : BODY ; |
| 425 |
| 426 The symbol A represents a non-terminal name, and BODY |
| 427 represents a sequence of zero or more names, literals, |
| 428 and semantic actions that can then be followed by |
| 429 optional precedence rules. Only the names and literals |
| 430 participate in the formation of the grammar; the seman- |
| 431 tic actions and precedence rules are used in other ways. |
| 432 The colon and the semicolon are yacc punctuation. If |
| 433 there are several successive grammar rules with the same |
| 434 left-hand side, the vertical bar '|' can be used to |
| 435 avoid rewriting the left-hand side; in this case the |
| 436 semicolon appears only after the last rule. The BODY |
| 437 part can be empty (or empty of names and literals) to |
| 438 indicate that the non-terminal symbol matches the empty |
| 439 string. |
| 440 |
| 441 The yacc utility assigns a unique number to each rule. |
| 442 Rules using the vertical bar notation are distinct |
| 443 rules. The number assigned to the rule appears in the |
| 444 description file. |
| 445 |
| 446 The elements comprising a BODY are: |
| 447 |
| 448 name, literal |
| 449 These form the rules of the grammar: name is |
| 450 either a token or a non-terminal; literal stands |
| 451 for itself (less the lexically required quotation |
| 452 marks). |
| 453 |
| 454 semantic action |
| 455 |
| 456 With each grammar rule, the user can associate |
| 457 actions to be performed each time the rule is |
| 458 recognized in the input process. (Note that the |
| 459 word "action" can also refer to the actions of |
| 460 the parser-shift, reduce, and so on.) |
| 461 |
| 462 These actions can return values and can obtain the val- |
| 463 ues returned by previous actions. These values are kept |
| 464 in objects of type YYSTYPE (see %union). The result |
| 465 value of the action shall be kept on the parse stack |
| 466 with the left-hand side of the rule, to be accessed by |
| 467 other reductions as part of their right-hand side. By |
| 468 using the <tag> information provided in the declarations |
| 469 section, the code generated by yacc can be strictly type |
| 470 checked and contain arbitrary information. In addition, |
| 471 the lexical analyzer can provide the same kinds of val- |
| 472 ues for tokens, if desired. |
| 473 |
| 474 An action is an arbitrary C statement and as such can do |
| 475 input or output, call subprograms, and alter external |
| 476 variables. An action is one or more C statements |
| 477 enclosed in curly braces '{' and '}' . |
| 478 |
| 479 Certain pseudo-variables can be used in the action. |
| 480 These are macros for access to data structures known |
| 481 internally to yacc. |
| 482 |
| 483 $$ |
| 484 The value of the action can be set by assigning |
| 485 it to $$. If type checking is enabled and the |
| 486 type of the value to be assigned cannot be deter- |
| 487 mined, a diagnostic message may be generated. |
| 488 |
| 489 $number |
| 490 This refers to the value returned by the compo- |
| 491 nent specified by the token number in the right |
| 492 side of a rule, reading from left to right; num- |
| 493 ber can be zero or negative. If number is zero or |
| 494 negative, it refers to the data associated with |
| 495 the name on the parser's stack preceding the |
| 496 leftmost symbol of the current rule. (That is, |
| 497 "$0" refers to the name immediately preceding the |
| 498 leftmost name in the current rule to be found on |
| 499 the parser's stack and "$-1" refers to the symbol |
| 500 to its left.) If number refers to an element past |
| 501 the current point in the rule, or beyond the bot- |
| 502 tom of the stack, the result is undefined. If |
| 503 type checking is enabled and the type of the |
| 504 value to be assigned cannot be determined, a |
| 505 diagnostic message may be generated. |
| 506 |
| 507 $<tag>number |
| 508 |
| 509 These correspond exactly to the corresponding |
| 510 symbols without the tag inclusion, but allow for |
| 511 strict type checking (and preclude unwanted type |
| 512 conversions). The effect is that the macro is |
| 513 expanded to use tag to select an element from the |
| 514 YYSTYPE union (using dataname.tag). This is par- |
| 515 ticularly useful if number is not positive. |
| 516 |
| 517 $<tag>$ |
| 518 This imposes on the reference the type of the |
| 519 union member referenced by tag. This construction |
| 520 is applicable when a reference to a left context |
| 521 value occurs in the grammar, and provides yacc |
| 522 with a means for selecting a type. |
| 523 |
| 524 |
| 525 Actions can occur anywhere in a rule (not just at the |
| 526 end); an action can access values returned by actions to |
| 527 its left, and in turn the value it returns can be |
| 528 accessed by actions to its right. An action appearing |
| 529 in the middle of a rule shall be equivalent to replacing |
| 530 the action with a new non-terminal symbol and adding an |
| 531 empty rule with that non-terminal symbol on the left- |
| 532 hand side. The semantic action associated with the new |
| 533 rule shall be equivalent to the original action. The use |
| 534 of actions within rules might introduce conflicts that |
| 535 would not otherwise exist. |
| 536 |
| 537 By default, the value of a rule shall be the value of |
| 538 the first element in it. If the first element does not |
| 539 have a type (particularly in the case of a literal) and |
| 540 type checking is turned on by %type, an error message |
| 541 shall result. |
| 542 |
| 543 precedence |
| 544 The keyword %prec can be used to change the |
| 545 precedence level associated with a particular |
| 546 grammar rule. Examples of this are in cases where |
| 547 a unary and binary operator have the same sym- |
| 548 bolic representation, but need to be given dif- |
| 549 ferent precedences, or where the handling of an |
| 550 ambiguous if-else construction is necessary. The |
| 551 reserved symbol %prec can appear immediately |
| 552 after the body of the grammar rule and can be |
| 553 followed by a token name or a literal. It shall |
| 554 cause the precedence of the grammar rule to |
| 555 become that of the following token name or lit- |
| 556 eral. The action for the rule as a whole can fol- |
| 557 low %prec. |
| 558 |
| 559 |
| 560 If a program section follows, the application shall |
| 561 ensure that the grammar rules are terminated by %%. |
| 562 |
| 563 Programs Section |
| 564 The programs section can include the definition of the |
| 565 lexical analyzer yylex(), and any other functions; for |
| 566 example, those used in the actions specified in the |
| 567 grammar rules. It is unspecified whether the programs |
| 568 section precedes or follows the semantic actions in the |
| 569 output file; therefore, if the application contains any |
| 570 macro definitions and declarations intended to apply to |
| 571 the code in the semantic actions, it shall place them |
| 572 within "%{ ... %}" in the declarations section. |
| 573 |
| 574 Input Grammar |
| 575 The following input to yacc yields a parser for the |
| 576 input to yacc. This formal syntax takes precedence over |
| 577 the preceding text syntax description. |
| 578 |
| 579 The lexical structure is defined less precisely; Lexical |
| 580 Structure of the Grammar defines most terms. The corre- |
| 581 spondence between the previous terms and the tokens |
| 582 below is as follows. |
| 583 |
| 584 IDENTIFIER |
| 585 This corresponds to the concept of name, given |
| 586 previously. It also includes literals as defined |
| 587 previously. |
| 588 |
| 589 C_IDENTIFIER |
| 590 This is a name, and additionally it is known to |
| 591 be followed by a colon. A literal cannot yield |
| 592 this token. |
| 593 |
| 594 NUMBER A string of digits (a non-negative decimal inte- |
| 595 ger). |
| 596 |
| 597 TYPE, LEFT, MARK, LCURL, RCURL |
| 598 |
| 599 These correspond directly to %type, %left, %%, |
| 600 %{, and %}. |
| 601 |
| 602 { ... } |
| 603 This indicates C-language source code, with the |
| 604 possible inclusion of '$' macros as discussed |
| 605 previously. |
| 606 |
| 607 |
| 608 |
| 609 /* Grammar for the input to yacc. */ |
| 610 /* Basic entries. */ |
| 611 /* The following are recognized by the lexical analyzer. */ |
| 612 |
| 613 |
| 614 %token IDENTIFIER /* Includes identifiers and literals */ |
| 615 %token C_IDENTIFIER /* identifier (but not literal) |
| 616 followed by a :. */ |
| 617 %token NUMBER /* [0-9][0-9]* */ |
| 618 |
| 619 |
| 620 /* Reserved words : %type=>TYPE %left=>LEFT, and so on */ |
| 621 |
| 622 |
| 623 %token LEFT RIGHT NONASSOC TOKEN PREC TYPE START UNION |
| 624 |
| 625 |
| 626 %token MARK /* The %% mark. */ |
| 627 %token LCURL /* The %{ mark. */ |
| 628 %token RCURL /* The %} mark. */ |
| 629 |
| 630 |
| 631 /* 8-bit character literals stand for themselves; */ |
| 632 /* tokens have to be defined for multi-byte characters. */ |
| 633 |
| 634 |
| 635 %start spec |
| 636 |
| 637 |
| 638 %% |
| 639 |
| 640 |
| 641 spec : defs MARK rules tail |
| 642 ; |
| 643 tail : MARK |
| 644 { |
| 645 /* In this action, set up the rest of the file. */ |
| 646 } |
| 647 | /* Empty; the second MARK is optional. */ |
| 648 ; |
| 649 defs : /* Empty. */ |
| 650 | defs def |
| 651 ; |
| 652 def : START IDENTIFIER |
| 653 | UNION |
| 654 { |
| 655 /* Copy union definition to output. */ |
| 656 } |
| 657 | LCURL |
| 658 { |
| 659 /* Copy C code to output file. */ |
| 660 } |
| 661 RCURL |
| 662 | rword tag nlist |
| 663 ; |
| 664 rword : TOKEN |
| 665 | LEFT |
| 666 | RIGHT |
| 667 | NONASSOC |
| 668 | TYPE |
| 669 ; |
| 670 tag : /* Empty: union tag ID optional. */ |
| 671 | '<' IDENTIFIER '>' |
| 672 ; |
| 673 nlist : nmno |
| 674 | nlist nmno |
| 675 ; |
| 676 nmno : IDENTIFIER /* Note: literal invalid with % type. *
/ |
| 677 | IDENTIFIER NUMBER /* Note: invalid with % type. */ |
| 678 ; |
| 679 |
| 680 |
| 681 /* Rule section */ |
| 682 |
| 683 |
| 684 rules : C_IDENTIFIER rbody prec |
| 685 | rules rule |
| 686 ; |
| 687 rule : C_IDENTIFIER rbody prec |
| 688 | '|' rbody prec |
| 689 ; |
| 690 rbody : /* empty */ |
| 691 | rbody IDENTIFIER |
| 692 | rbody act |
| 693 ; |
| 694 act : '{' |
| 695 { |
| 696 /* Copy action, translate $$, and so on. */ |
| 697 } |
| 698 '}' |
| 699 ; |
| 700 prec : /* Empty */ |
| 701 | PREC IDENTIFIER |
| 702 | PREC IDENTIFIER act |
| 703 | prec ';' |
| 704 ; |
| 705 |
| 706 Conflicts |
| 707 The parser produced for an input grammar may contain |
| 708 states in which conflicts occur. The conflicts occur |
| 709 because the grammar is not LALR(1). An ambiguous grammar |
| 710 always contains at least one LALR(1) conflict. The yacc |
| 711 utility shall resolve all conflicts, using either |
| 712 default rules or user-specified precedence rules. |
| 713 |
| 714 Conflicts are either shift/reduce conflicts or |
| 715 reduce/reduce conflicts. A shift/reduce conflict is |
| 716 where, for a given state and lookahead symbol, both a |
| 717 shift action and a reduce action are possible. A |
| 718 reduce/reduce conflict is where, for a given state and |
| 719 lookahead symbol, reductions by two different rules are |
| 720 possible. |
| 721 |
| 722 The rules below describe how to specify what actions to |
| 723 take when a conflict occurs. Not all shift/reduce con- |
| 724 flicts can be successfully resolved this way because the |
| 725 conflict may be due to something other than ambiguity, |
| 726 so incautious use of these facilities can cause the lan- |
| 727 guage accepted by the parser to be much different from |
| 728 that which was intended. The description file shall con- |
| 729 tain sufficient information to understand the cause of |
| 730 the conflict. Where ambiguity is the reason either the |
| 731 default or explicit rules should be adequate to produce |
| 732 a working parser. |
| 733 |
| 734 The declared precedences and associativities (see Decla- |
| 735 rations Section ) are used to resolve parsing conflicts |
| 736 as follows: |
| 737 |
| 738 1. A precedence and associativity is associated with |
| 739 each grammar rule; it is the precedence and associa- |
| 740 tivity of the last token or literal in the body of |
| 741 the rule. If the %prec keyword is used, it overrides |
| 742 this default. Some grammar rules might not have both |
| 743 precedence and associativity. |
| 744 |
| 745 |
| 746 2. If there is a shift/reduce conflict, and both the |
| 747 grammar rule and the input symbol have precedence |
| 748 and associativity associated with them, then the |
| 749 conflict is resolved in favor of the action (shift |
| 750 or reduce) associated with the higher precedence. If |
| 751 the precedences are the same, then the associativity |
| 752 is used; left associative implies reduce, right |
| 753 associative implies shift, and non-associative |
| 754 implies an error in the string being parsed. |
| 755 |
| 756 |
| 757 3. When there is a shift/reduce conflict that cannot be |
| 758 resolved by rule 2, the shift is done. Conflicts |
| 759 resolved this way are counted in the diagnostic out- |
| 760 put described in Error Handling . |
| 761 |
| 762 |
| 763 4. When there is a reduce/reduce conflict, a reduction |
| 764 is done by the grammar rule that occurs earlier in |
| 765 the input sequence. Conflicts resolved this way are |
| 766 counted in the diagnostic output described in Error |
| 767 Handling . |
| 768 |
| 769 |
| 770 Conflicts resolved by precedence or associativity shall |
| 771 not be counted in the shift/reduce and reduce/reduce |
| 772 conflicts reported by yacc on either standard error or |
| 773 in the description file. |
| 774 |
| 775 Error Handling |
| 776 The token error shall be reserved for error handling. |
| 777 The name error can be used in grammar rules. It indi- |
| 778 cates places where the parser can recover from a syntax |
| 779 error. The default value of error shall be 256. Its |
| 780 value can be changed using a %token declaration. The |
| 781 lexical analyzer should not return the value of error. |
| 782 |
| 783 The parser shall detect a syntax error when it is in a |
| 784 state where the action associated with the lookahead |
| 785 symbol is error. A semantic action can cause the parser |
| 786 to initiate error handling by executing the macro YYER- |
| 787 ROR. When YYERROR is executed, the semantic action |
| 788 passes control back to the parser. YYERROR cannot be |
| 789 used outside of semantic actions. |
| 790 |
| 791 When the parser detects a syntax error, it normally |
| 792 calls yyerror() with the character string "syntax error" |
| 793 as its argument. The call shall not be made if the |
| 794 parser is still recovering from a previous error when |
| 795 the error is detected. The parser is considered to be |
| 796 recovering from a previous error until the parser has |
| 797 shifted over at least three normal input symbols since |
| 798 the last error was detected or a semantic action has |
| 799 executed the macro yyerrok. The parser shall not call |
| 800 yyerror() when YYERROR is executed. |
| 801 |
| 802 The macro function YYRECOVERING shall return 1 if a syn- |
| 803 tax error has been detected and the parser has not yet |
| 804 fully recovered from it. Otherwise, zero shall be |
| 805 returned. |
| 806 |
| 807 When a syntax error is detected by the parser, the |
| 808 parser shall check if a previous syntax error has been |
| 809 detected. If a previous error was detected, and if no |
| 810 normal input symbols have been shifted since the preced- |
| 811 ing error was detected, the parser checks if the looka- |
| 812 head symbol is an endmarker (see Interface to the Lexi- |
| 813 cal Analyzer ). If it is, the parser shall return with a |
| 814 non-zero value. Otherwise, the lookahead symbol shall be |
| 815 discarded and normal parsing shall resume. |
| 816 |
| 817 When YYERROR is executed or when the parser detects a |
| 818 syntax error and no previous error has been detected, or |
| 819 at least one normal input symbol has been shifted since |
| 820 the previous error was detected, the parser shall pop |
| 821 back one state at a time until the parse stack is empty |
| 822 or the current state allows a shift over error. If the |
| 823 parser empties the parse stack, it shall return with a |
| 824 non-zero value. Otherwise, it shall shift over error and |
| 825 then resume normal parsing. If the parser reads a looka- |
| 826 head symbol before the error was detected, that symbol |
| 827 shall still be the lookahead symbol when parsing is |
| 828 resumed. |
| 829 |
| 830 The macro yyerrok in a semantic action shall cause the |
| 831 parser to act as if it has fully recovered from any pre- |
| 832 vious errors. The macro yyclearin shall cause the parser |
| 833 to discard the current lookahead token. If the current |
| 834 lookahead token has not yet been read, yyclearin shall |
| 835 have no effect. |
| 836 |
| 837 The macro YYACCEPT shall cause the parser to return with |
| 838 the value zero. The macro YYABORT shall cause the parser |
| 839 to return with a non-zero value. |
| 840 |
| 841 Interface to the Lexical Analyzer |
| 842 The yylex() function is an integer-valued function that |
| 843 returns a token number representing the kind of token |
| 844 read. If there is a value associated with the token |
| 845 returned by yylex() (see the discussion of tag above), |
| 846 it shall be assigned to the external variable yylval. |
| 847 |
| 848 If the parser and yylex() do not agree on these token |
| 849 numbers, reliable communication between them cannot |
| 850 occur. For (single-byte character) literals, the token |
| 851 is simply the numeric value of the character in the cur- |
| 852 rent character set. The numbers for other tokens can |
| 853 either be chosen by yacc, or chosen by the user. In |
| 854 either case, the #define construct of C is used to allow |
| 855 yylex() to return these numbers symbolically. The |
| 856 #define statements are put into the code file, and the |
| 857 header file if that file is requested. The set of char- |
| 858 acters permitted by yacc in an identifier is larger than |
| 859 that permitted by C. Token names found to contain such |
| 860 characters shall not be included in the #define declara- |
| 861 tions. |
| 862 |
| 863 If the token numbers are chosen by yacc, the tokens |
| 864 other than literals shall be assigned numbers greater |
| 865 than 256, although no order is implied. A token can be |
| 866 explicitly assigned a number by following its first |
| 867 appearance in the declarations section with a number. |
| 868 Names and literals not defined this way retain their |
| 869 default definition. All token numbers assigned by yacc |
| 870 shall be unique and distinct from the token numbers used |
| 871 for literals and user-assigned tokens. If duplicate |
| 872 token numbers cause conflicts in parser generation, yacc |
| 873 shall report an error; otherwise, it is unspecified |
| 874 whether the token assignment is accepted or an error is |
| 875 reported. |
| 876 |
| 877 The end of the input is marked by a special token called |
| 878 the endmarker, which has a token number that is zero or |
| 879 negative. (These values are invalid for any other |
| 880 token.) All lexical analyzers shall return zero or nega- |
| 881 tive as a token number upon reaching the end of their |
| 882 input. If the tokens up to, but excluding, the endmarker |
| 883 form a structure that matches the start symbol, the |
| 884 parser shall accept the input. If the endmarker is seen |
| 885 in any other context, it shall be considered an error. |
| 886 |
| 887 Completing the Program |
| 888 In addition to yyparse() and yylex(), the functions |
| 889 yyerror() and main() are required to make a complete |
| 890 program. The application can supply main() and yyer- |
| 891 ror(), or those routines can be obtained from the yacc |
| 892 library. |
| 893 |
| 894 Yacc Library |
| 895 The following functions shall appear only in the yacc |
| 896 library accessible through the -l y operand to c99; they |
| 897 can therefore be redefined by a conforming application: |
| 898 |
| 899 int main(void) |
| 900 |
| 901 This function shall call yyparse() and exit with |
| 902 an unspecified value. Other actions within this |
| 903 function are unspecified. |
| 904 |
| 905 int yyerror(const char *s) |
| 906 |
| 907 This function shall write the NUL-terminated |
| 908 argument to standard error, followed by a <new- |
| 909 line>. |
| 910 |
| 911 |
| 912 The order of the -l y and -l l operands given to c99 is |
| 913 significant; the application shall either provide its |
| 914 own main() function or ensure that -l y precedes -l l. |
| 915 |
| 916 Debugging the Parser |
| 917 The parser generated by yacc shall have diagnostic |
| 918 facilities in it that can be optionally enabled at |
| 919 either compile time or at runtime (if enabled at compile |
| 920 time). The compilation of the runtime debugging code is |
| 921 under the control of YYDEBUG, a preprocessor symbol. If |
| 922 YYDEBUG has a non-zero value, the debugging code shall |
| 923 be included. If its value is zero, the code shall not be |
| 924 included. |
| 925 |
| 926 In parsers where the debugging code has been included, |
| 927 the external int yydebug can be used to turn debugging |
| 928 on (with a non-zero value) and off (zero value) at run- |
| 929 time. The initial value of yydebug shall be zero. |
| 930 |
| 931 When -t is specified, the code file shall be built such |
| 932 that, if YYDEBUG is not already defined at compilation |
| 933 time (using the c99 -D YYDEBUG option, for example), |
| 934 YYDEBUG shall be set explicitly to 1. When -t is not |
| 935 specified, the code file shall be built such that, if |
| 936 YYDEBUG is not already defined, it shall be set explic- |
| 937 itly to zero. |
| 938 |
| 939 The format of the debugging output is unspecified but |
| 940 includes at least enough information to determine the |
| 941 shift and reduce actions, and the input symbols. It also |
| 942 provides information about error recovery. |
| 943 |
| 944 Algorithms |
| 945 The parser constructed by yacc implements an LALR(1) |
| 946 parsing algorithm as documented in the literature. It is |
| 947 unspecified whether the parser is table-driven or |
| 948 direct-coded. |
| 949 |
| 950 A parser generated by yacc shall never request an input |
| 951 symbol from yylex() while in a state where the only |
| 952 actions other than the error action are reductions by a |
| 953 single rule. |
| 954 |
| 955 The literature of parsing theory defines these concepts. |
| 956 |
| 957 Limits |
| 958 The yacc utility may have several internal tables. The |
| 959 minimum maximums for these tables are shown in the fol- |
| 960 lowing table. The exact meaning of these values is |
| 961 implementation-defined. The implementation shall define |
| 962 the relationship between these values and between them |
| 963 and any error messages that the implementation may gen- |
| 964 erate should it run out of space for any internal struc- |
| 965 ture. An implementation may combine groups of these |
| 966 resources into a single pool as long as the total avail- |
| 967 able to the user does not fall below the sum of the |
| 968 sizes specified by this section. |
| 969 |
| 970 Table: Internal Limits in yacc |
| 971 |
| 972 Minimum |
| 973 Limit Maximum Description |
| 974 {NTERMS} 126 Number of tokens. |
| 975 {NNONTERM} 200 Number of non-terminals. |
| 976 {NPROD} 300 Number of rules. |
| 977 {NSTATES} 600 Number of states. |
| 978 {MEMSIZE} 5200 Length of rules. The total length, in |
| 979 names (tokens and non-terminals), of all |
| 980 the rules of the grammar. The left-hand |
| 981 side is counted for each rule, even if |
| 982 it is not explicitly repeated, as speci- |
| 983 fied in Grammar Rules in yacc . |
| 984 {ACTSIZE} 4000 Number of actions. "Actions" here (and |
| 985 in the description file) refer to parser |
| 986 actions (shift, reduce, and so on) not |
| 987 to semantic actions defined in Grammar |
| 988 Rules in yacc . |
| 989 |
| 990 EXIT STATUS |
| 991 The following exit values shall be returned: |
| 992 |
| 993 0 Successful completion. |
| 994 |
| 995 >0 An error occurred. |
| 996 |
| 997 |
| 998 CONSEQUENCES OF ERRORS |
| 999 If any errors are encountered, the run is aborted and |
| 1000 yacc exits with a non-zero status. Partial code files |
| 1001 and header files may be produced. The summary informa- |
| 1002 tion in the description file shall always be produced if |
| 1003 the -v flag is present. |
| 1004 |
| 1005 The following sections are informative. |
| 1006 |
| 1007 APPLICATION USAGE |
| 1008 Historical implementations experience name conflicts on |
| 1009 the names yacc.tmp, yacc.acts, yacc.debug, y.tab.c, |
| 1010 y.tab.h, and y.output if more than one copy of yacc is |
| 1011 running in a single directory at one time. The -b option |
| 1012 was added to overcome this problem. The related problem |
| 1013 of allowing multiple yacc parsers to be placed in the |
| 1014 same file was addressed by adding a -p option to over- |
| 1015 ride the previously hard-coded yy variable prefix. |
| 1016 |
| 1017 The description of the -p option specifies the minimal |
| 1018 set of function and variable names that cause conflict |
| 1019 when multiple parsers are linked together. YYSTYPE does |
| 1020 not need to be changed. Instead, the programmer can use |
| 1021 -b to give the header files for different parsers dif- |
| 1022 ferent names, and then the file with the yylex() for a |
| 1023 given parser can include the header for that parser. |
| 1024 Names such as yyclearerr do not need to be changed |
| 1025 because they are used only in the actions; they do not |
| 1026 have linkage. It is possible that an implementation has |
| 1027 other names, either internal ones for implementing |
| 1028 things such as yyclearerr, or providing non-standard |
| 1029 features that it wants to change with -p. |
| 1030 |
| 1031 Unary operators that are the same token as a binary |
| 1032 operator in general need their precedence adjusted. This |
| 1033 is handled by the %prec advisory symbol associated with |
| 1034 the particular grammar rule defining that unary opera- |
| 1035 tor. (See Grammar Rules in yacc .) Applications are not |
| 1036 required to use this operator for unary operators, but |
| 1037 the grammars that do not require it are rare. |
| 1038 |
| 1039 EXAMPLES |
| 1040 Access to the yacc library is obtained with library |
| 1041 search operands to c99. To use the yacc library main(): |
| 1042 |
| 1043 |
| 1044 c99 y.tab.c -l y |
| 1045 |
| 1046 Both the lex library and the yacc library contain |
| 1047 main(). To access the yacc main(): |
| 1048 |
| 1049 |
| 1050 c99 y.tab.c lex.yy.c -l y -l l |
| 1051 |
| 1052 This ensures that the yacc library is searched first, so |
| 1053 that its main() is used. |
| 1054 |
| 1055 The historical yacc libraries have contained two simple |
| 1056 functions that are normally coded by the application |
| 1057 programmer. These functions are similar to the follow- |
| 1058 ing code: |
| 1059 |
| 1060 |
| 1061 #include <locale.h> |
| 1062 int main(void) |
| 1063 { |
| 1064 extern int yyparse(); |
| 1065 |
| 1066 |
| 1067 setlocale(LC_ALL, ""); |
| 1068 |
| 1069 |
| 1070 /* If the following parser is one created by lex, the |
| 1071 application must be careful to ensure that LC_CTYPE |
| 1072 and LC_COLLATE are set to the POSIX locale. */ |
| 1073 (void) yyparse(); |
| 1074 return (0); |
| 1075 } |
| 1076 |
| 1077 |
| 1078 #include <stdio.h> |
| 1079 |
| 1080 |
| 1081 int yyerror(const char *msg) |
| 1082 { |
| 1083 (void) fprintf(stderr, "%s\n", msg); |
| 1084 return (0); |
| 1085 } |
| 1086 |
| 1087 RATIONALE |
| 1088 The references in may be helpful in constructing the |
| 1089 parser generator. The referenced DeRemer and Pennello |
| 1090 article (along with the works it references) describes a |
| 1091 technique to generate parsers that conform to this vol- |
| 1092 ume of IEEE Std 1003.1-2001. Work in this area contin- |
| 1093 ues to be done, so implementors should consult current |
| 1094 literature before doing any new implementations. The |
| 1095 original Knuth article is the theoretical basis for this |
| 1096 kind of parser, but the tables it generates are imprac- |
| 1097 tically large for reasonable grammars and should not be |
| 1098 used. The "equivalent to" wording is intentional to |
| 1099 assure that the best tables that are LALR(1) can be gen- |
| 1100 erated. |
| 1101 |
| 1102 There has been confusion between the class of grammars, |
| 1103 the algorithms needed to generate parsers, and the algo- |
| 1104 rithms needed to parse the languages. They are all rea- |
| 1105 sonably orthogonal. In particular, a parser generator |
| 1106 that accepts the full range of LR(1) grammars need not |
| 1107 generate a table any more complex than one that accepts |
| 1108 SLR(1) (a relatively weak class of LR grammars) for a |
| 1109 grammar that happens to be SLR(1). Such an implementa- |
| 1110 tion need not recognize the case, either; table compres- |
| 1111 sion can yield the SLR(1) table (or one even smaller |
| 1112 than that) without recognizing that the grammar is |
| 1113 SLR(1). The speed of an LR(1) parser for any class is |
| 1114 dependent more upon the table representation and com- |
| 1115 pression (or the code generation if a direct parser is |
| 1116 generated) than upon the class of grammar that the table |
| 1117 generator handles. |
| 1118 |
| 1119 The speed of the parser generator is somewhat dependent |
| 1120 upon the class of grammar it handles. However, the orig- |
| 1121 inal Knuth article algorithms for constructing LR |
| 1122 parsers were judged by its author to be impractically |
| 1123 slow at that time. Although full LR is more complex than |
| 1124 LALR(1), as computer speeds and algorithms improve, the |
| 1125 difference (in terms of acceptable wall-clock execution |
| 1126 time) is becoming less significant. |
| 1127 |
| 1128 Potential authors are cautioned that the referenced |
| 1129 DeRemer and Pennello article previously cited identifies |
| 1130 a bug (an over-simplification of the computation of |
| 1131 LALR(1) lookahead sets) in some of the LALR(1) algorithm |
| 1132 statements that preceded it to publication. They should |
| 1133 take the time to seek out that paper, as well as current |
| 1134 relevant work, particularly Aho's. |
| 1135 |
| 1136 The -b option was added to provide a portable method for |
| 1137 permitting yacc to work on multiple separate parsers in |
| 1138 the same directory. If a directory contains more than |
| 1139 one yacc grammar, and both grammars are constructed at |
| 1140 the same time (by, for example, a parallel make pro- |
| 1141 gram), conflict results. While the solution is not his- |
| 1142 torical practice, it corrects a known deficiency in his- |
| 1143 torical implementations. Corresponding changes were made |
| 1144 to all sections that referenced the filenames y.tab.c |
| 1145 (now "the code file"), y.tab.h (now "the header file"), |
| 1146 and y.output (now "the description file"). |
| 1147 |
| 1148 The grammar for yacc input is based on System V documen- |
| 1149 tation. The textual description shows there that the |
| 1150 ';' is required at the end of the rule. The grammar and |
| 1151 the implementation do not require this. (The use of |
| 1152 C_IDENTIFIER causes a reduce to occur in the right |
| 1153 place.) |
| 1154 |
| 1155 Also, in that implementation, the constructs such as |
| 1156 %token can be terminated by a semicolon, but this is not |
| 1157 permitted by the grammar. The keywords such as %token |
| 1158 can also appear in uppercase, which is again not dis- |
| 1159 cussed. In most places where '%' is used, '\' can be |
| 1160 substituted, and there are alternate spellings for some |
| 1161 of the symbols (for example, %LEFT can be "%<" or even |
| 1162 "\<" ). |
| 1163 |
| 1164 Historically, <tag> can contain any characters except |
| 1165 '>', including white space, in the implementation. How- |
| 1166 ever, since the tag must reference an ISO C standard |
| 1167 union member, in practice conforming implementations |
| 1168 need to support only the set of characters for ISO C |
| 1169 standard identifiers in this context. |
| 1170 |
| 1171 Some historical implementations are known to accept |
| 1172 actions that are terminated by a period. Historical |
| 1173 implementations often allow '$' in names. A conforming |
| 1174 implementation does not need to support either of these |
| 1175 behaviors. |
| 1176 |
| 1177 Deciding when to use %prec illustrates the difficulty in |
| 1178 specifying the behavior of yacc. There may be situations |
| 1179 in which the grammar is not, strictly speaking, in |
| 1180 error, and yet yacc cannot interpret it unambiguously. |
| 1181 The resolution of ambiguities in the grammar can in many |
| 1182 instances be resolved by providing additional informa- |
| 1183 tion, such as using %type or %union declarations. It is |
| 1184 often easier and it usually yields a smaller parser to |
| 1185 take this alternative when it is appropriate. |
| 1186 |
| 1187 The size and execution time of a program produced with- |
| 1188 out the runtime debugging code is usually smaller and |
| 1189 slightly faster in historical implementations. |
| 1190 |
| 1191 Statistics messages from several historical implementa- |
| 1192 tions include the following types of information: |
| 1193 |
| 1194 |
| 1195 n/512 terminals, n/300 non-terminals |
| 1196 n/600 grammar rules, n/1500 states |
| 1197 n shift/reduce, n reduce/reduce conflicts reported |
| 1198 n/350 working sets used |
| 1199 Memory: states, etc. n/15000, parser n/15000 |
| 1200 n/600 distinct lookahead sets |
| 1201 n extra closures |
| 1202 n shift entries, n exceptions |
| 1203 n goto entries |
| 1204 n entries saved by goto default |
| 1205 Optimizer space used: input n/15000, output n/15000 |
| 1206 n table entries, n zero |
| 1207 Maximum spread: n, Maximum offset: n |
| 1208 |
| 1209 The report of internal tables in the description file is |
| 1210 left implementation-defined because all aspects of these |
| 1211 limits are also implementation-defined. Some implementa- |
| 1212 tions may use dynamic allocation techniques and have no |
| 1213 specific limit values to report. |
| 1214 |
| 1215 The format of the y.output file is not given because |
| 1216 specification of the format was not seen to enhance |
| 1217 applications portability. The listing is primarily |
| 1218 intended to help human users understand and debug the |
| 1219 parser; use of y.output by a conforming application |
| 1220 script would be unusual. Furthermore, implementations |
| 1221 have not produced consistent output and no popular for- |
| 1222 mat was apparent. The format selected by the implementa- |
| 1223 tion should be human-readable, in addition to the |
| 1224 requirement that it be a text file. |
| 1225 |
| 1226 Standard error reports are not specifically described |
| 1227 because they are seldom of use to conforming applica- |
| 1228 tions and there was no reason to restrict implementa- |
| 1229 tions. |
| 1230 |
| 1231 Some implementations recognize "={" as equivalent to '{' |
| 1232 because it appears in historical documentation. This |
| 1233 construction was recognized and documented as obsolete |
| 1234 as long ago as 1978, in the referenced Yacc: Yet Another |
| 1235 Compiler-Compiler. This volume of IEEE Std 1003.1-2001 |
| 1236 chose to leave it as obsolete and omit it. |
| 1237 |
| 1238 Multi-byte characters should be recognized by the lexi- |
| 1239 cal analyzer and returned as tokens. They should not be |
| 1240 returned as multi-byte character literals. The token |
| 1241 error that is used for error recovery is normally |
| 1242 assigned the value 256 in the historical implementation. |
| 1243 Thus, the token value 256, which is used in many multi- |
| 1244 byte character sets, is not available for use as the |
| 1245 value of a user-defined token. |
| 1246 |
| 1247 FUTURE DIRECTIONS |
| 1248 None. |
| 1249 |
| 1250 SEE ALSO |
| 1251 c99, lex |
| 1252 |
| 1253 COPYRIGHT |
| 1254 Portions of this text are reprinted and reproduced in |
| 1255 electronic form from IEEE Std 1003.1, 2003 Edition, |
| 1256 Standard for Information Technology -- Portable Operat- |
| 1257 ing System Interface (POSIX), The Open Group Base Speci- |
| 1258 fications Issue 6, Copyright (C) 2001-2003 by the Insti- |
| 1259 tute of Electrical and Electronics Engineers, Inc and |
| 1260 The Open Group. In the event of any discrepancy between |
| 1261 this version and the original IEEE and The Open Group |
| 1262 Standard, the original IEEE and The Open Group Standard |
| 1263 is the referee document. The original Standard can be |
| 1264 obtained online at http://www.open- |
| 1265 group.org/unix/online.html . |
| 1266 |
| 1267 |
| 1268 |
| 1269 IEEE/The Open Group 2003 YACC(1P) |
OLD | NEW |