This directory contains data needed by Bison.

# Directory content
## Skeletons
Bison skeletons: the general shapes of the different parser kinds, which are
specialized for specific grammars by the bison program.

Currently, the supported skeletons are:

- yacc.c
  It used to be named bison.simple: it corresponds to C Yacc
  compatible LALR(1) parsers.

- lalr1.cc
  Produces a C++ parser class.

- lalr1.java
  Produces a Java parser class.

- glr.c
  A Generalized LR C parser based on Bison's LALR(1) tables.

- glr.cc
  A Generalized LR C++ parser. Actually a C++ wrapper around glr.c.

These skeletons are the only ones supported by the Bison team. Because the
interface between skeletons and the bison program is not finished, *we are
not bound to it*. In particular, Bison is not mature enough for us to
consider that "foreign skeletons" are supported.

## m4sugar
This directory contains M4sugar, sort of an extended library for M4, which
is used by Bison to instantiate the skeletons.

## xslt
This directory contains XSLT programs that transform Bison's XML output into
various formats.

- bison.xsl
  A library of routines used by the other XSLT programs.

- xml2dot.xsl
  Conversion into GraphViz's dot format.

- xml2text.xsl
  Conversion into text.

- xml2xhtml.xsl
  Conversion into XHTML.

# Implementation note about the skeletons

"Skeleton" in Bison parlance means "backend": a skeleton is fed by the bison
executable with LR tables, facts about the symbols, etc. and it generates
the output (say parser.cc, parser.hh, location.hh, etc.). Skeletons are only
in charge of generating the parser and its auxiliary files; they do not
generate the XML output, the parser.output reports, nor the graphical
rendering.

The bits of information passed from bison to the backend are named
"muscles". Muscles are passed to M4 via its standard input, as a set of
m4 definitions. To see them, use `--trace=muscles`.

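To make the idea of muscles as a stream of m4 definitions more concrete,
here is a small Python sketch that renders a few name/value pairs as such a
stream. It is only a model: the muscle names, their values, and the exact
quoting (`m4_define([b4_NAME], [[VALUE]])`) are assumptions made for
illustration. The output of `--trace=muscles` on a real grammar remains the
authoritative reference.

```python
def render_muscle(name: str, value: str) -> str:
    """Render one muscle as an m4 definition (illustrative quoting only)."""
    # Assumption: a muscle reaches M4 as a b4_-prefixed macro whose value
    # is double-quoted so that M4 does not expand it prematurely.
    return f"m4_define([b4_{name}], [[{value}]])"


def render_muscles(muscles: dict[str, str]) -> str:
    """Build the text that would be written to M4's standard input."""
    return "\n".join(render_muscle(n, v) for n, v in muscles.items()) + "\n"


# Hypothetical muscles, not taken from a real Bison run.
example = {"tokens_number": "5", "file_name": "calc.y"}
stream = render_muscles(example)
print(stream, end="")
```
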
Except for muscles, whose names are generated by bison, the skeletons have
no constraint at all on the macro names: there is no technical/theoretical
limitation. As long as you generate the output, you can do what you want.
However, of course, it would be a bad idea if, say, the C and C++
skeletons used different approaches and had completely different
implementations. That would be a maintenance nightmare.

Below, we document some of the macros that we use in several of the
skeletons. If you are to write a new skeleton, please implement them for
your language. Overall, be sure to follow the same patterns as the existing
skeletons.

## Symbols

### `b4_symbol(NUM, FIELD)`
In order to unify the handling of the various aspects of symbols (tag, type
name, whether terminal, etc.), bison.exe defines one macro per (token,
field), where field can be `has_id`, `id`, etc.: see
`prepare_symbols_definitions()` in `src/output.c`.

The macro `b4_symbol(NUM, FIELD)` gives access to the following FIELDS:

- `has_id`: 0 or 1
  Whether the symbol has an id.

- `id`: string
  If has_id, the id (prefixed by api.token.prefix if defined), otherwise
  defined as empty. Guaranteed to be usable as a C identifier.

- `tag`: string
  A representation of the symbol. Can be 'foo', 'foo.id', '"foo"' etc.

- `user_number`: integer
  The external number as used by yylex. Can be ASCII code when a character,
  some number chosen by bison, or some user number in the case of
  %token FOO <NUM>. Corresponds to yychar in yacc.c.

- `is_token`: 0 or 1
  Whether this is a terminal symbol.

- `number`: integer
  The internal number (computed from the external number by yytranslate).
  Corresponds to yytoken in yacc.c. This is the same number that serves as
  key in b4_symbol(NUM, FIELD).

  In bison, symbols are first assigned increasing numbers in order of
  appearance (but tokens first, then nterms). After grammar reduction,
  unused nterms are then renumbered to appear last (i.e., first tokens, then
  used nterms and finally unused nterms). This final number NUM is the one
  contained in this field, and it is the one used as key in `b4_symbol(NUM,
  FIELD)`.

  The code of the rule actions, however, is emitted before we know what
  symbols are unused, so it uses the original numbers. To avoid confusion,
  it actually uses "orig NUM" instead of just "NUM". bison also emits
  definitions for `b4_symbol(orig NUM, number)` that map from original
  numbers to the new ones. `b4_symbol` also resolves `orig NUM` for the
  other fields, i.e., `b4_symbol(orig 42, tag)` would return the tag of the
  symbol whose original number was 42.

- `has_type`: 0, 1
  Whether the symbol has a semantic value.

- `type_tag`: string
  When api.value.type=union, the generated name for the union member.
  yytype_INT etc. for symbols that have an id (has_id), otherwise yytype_1
  etc.

- `type`
  If it has a semantic value, its type tag, or, if variants are used,
  its type.
  In the case of api.value.type=union, type is the real type (e.g. int).

- `has_printer`: 0, 1
- `printer`: string
- `printer_file`: string
- `printer_line`: integer
- `printer_loc`: location
  If the symbol has a printer, everything about it.

- `has_destructor`, `destructor`, `destructor_file`, `destructor_line`, `destructor_loc`
  Likewise.

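The renumbering described for the `number` field, and the way `orig NUM`
keys are resolved, can be modelled in a few lines of Python. This is a
simplified sketch, not Bison's implementation: the `Symbol` class, the
`renumber` and `make_lookup` helpers, and the sample grammar are all
hypothetical, and special symbols that a real grammar contains are ignored.

```python
from dataclasses import dataclass


@dataclass
class Symbol:
    tag: str
    is_token: bool
    used: bool = True  # only meaningful for nterms


def renumber(symbols: list[Symbol]) -> dict[int, int]:
    """Map original numbers to final numbers.

    `symbols` is indexed by original number (tokens first, then nterms).
    Final order: tokens, then used nterms, then unused nterms; the
    relative order inside each group is preserved.
    """
    def group(sym: Symbol) -> int:
        if sym.is_token:
            return 0
        return 1 if sym.used else 2

    # sorted() is stable, so ties keep their order of appearance.
    order = sorted(range(len(symbols)), key=lambda i: group(symbols[i]))
    return {orig: final for final, orig in enumerate(order)}


def make_lookup(symbols: list[Symbol]):
    """Return a b4_symbol-like lookup accepting NUM or "orig NUM" keys."""
    orig_to_final = renumber(symbols)
    final_to_orig = {f: o for o, f in orig_to_final.items()}

    def lookup(key, field: str):
        if isinstance(key, str) and key.startswith("orig "):
            orig = int(key.split()[1])
        else:
            orig = final_to_orig[int(key)]
        if field == "number":
            return orig_to_final[orig]
        return getattr(symbols[orig], field)

    return lookup


# Hypothetical grammar: two tokens, then three nterms, one of them unused.
table = [
    Symbol("NUM", is_token=True),
    Symbol("'+'", is_token=True),
    Symbol("dead", is_token=False, used=False),
    Symbol("exp", is_token=False),
    Symbol("input", is_token=False),
]
b4_symbol = make_lookup(table)
print(b4_symbol("orig 2", "number"))  # final number of "dead"
print(b4_symbol(2, "tag"))            # tag of the symbol finally numbered 2
```

In this model the unused nterm `dead` moves from original number 2 to the
last position, so code written against original numbers has to go through
the `orig` key to find it.
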
### `b4_symbol_value(VAL, [SYMBOL-NUM], [TYPE-TAG])`
Expansion of `$$`, `$1`, `$<TYPE-TAG>3`, etc.

The semantic value from a given VAL.
- `VAL`: some semantic value storage (typically a union), e.g., `yylval`
- `SYMBOL-NUM`: the symbol number from which we extract the type tag.
- `TYPE-TAG`: the type tag forced by the user with `<TYPE-TAG>`.

The result can be used safely: it is put in parens to avoid nasty precedence
issues.

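As a rough illustration of how the three arguments interact, the Python
sketch below builds the text of a C expression for a union-based value. It
is an assumption-laden model, not the macro itself: the function name
`symbol_value` is hypothetical, it takes the symbol's union member name
directly instead of a symbol number, and returning an untyped value without
parentheses is a guess.

```python
def symbol_value(val, symbol_type=None, type_tag=None):
    """Return the C expression text for a semantic value.

    val         -- storage expression, e.g. "yylval"
    symbol_type -- member name derived from the symbol, if it has one
    type_tag    -- tag forced by the user with <TYPE-TAG>, if any
    """
    # A user-forced tag wins over the tag derived from the symbol.
    member = type_tag or symbol_type
    if member is None:
        # Untyped value: there is no union member to select.
        return val
    # Parenthesize so the expansion is safe inside larger expressions.
    return f"({val}.{member})"


print(symbol_value("yylval", symbol_type="ival"))
print(symbol_value("yyvsp[-1]", symbol_type="ival", type_tag="fval"))
```

The parentheses matter when the expansion lands inside a larger expression,
for example as the operand of a cast or of a unary operator.
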
### `b4_lhs_value(SYMBOL-NUM, [TYPE])`
Expansion of `$$` or `$<TYPE>$`, for symbol `SYMBOL-NUM`.

### `b4_rhs_data(RULE-LENGTH, POS)`
The data corresponding to the symbol `#POS`, where the current rule has
`RULE-LENGTH` symbols on RHS.

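Why both `RULE-LENGTH` and `POS` are needed becomes clearer with a small
model. The Python sketch below assumes that the values of the RHS symbols
sit on top of a value stack, the last RHS symbol being the topmost entry,
as is usual in LR parsers; the helper names and the sample stack are
hypothetical, and a given skeleton may address its data differently.

```python
def rhs_index(rule_length: int, pos: int) -> int:
    """Offset of RHS symbol #pos relative to the top of the stack."""
    # The last RHS symbol is on top (offset 0); earlier ones lie below it.
    return pos - rule_length


def rhs_data(stack: list, rule_length: int, pos: int):
    """Fetch the stack entry for RHS symbol #pos of the rule being reduced."""
    top = len(stack) - 1
    return stack[top + rhs_index(rule_length, pos)]


# Hypothetical value stack while reducing `exp: exp '+' exp` (3 RHS symbols).
values = ["<older entry>", 1, "+", 2]
print(rhs_data(values, 3, 1), rhs_data(values, 3, 3))
```

With three symbols on the RHS, position 1 is two entries below the top and
position 3 is the top itself, which is why the rule length has to be passed
along with the position.
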
### `b4_rhs_value(RULE-LENGTH, POS, SYMBOL-NUM, [TYPE])`
Expansion of `$<TYPE>POS`, where the current rule has `RULE-LENGTH` symbols
on RHS.

-----

Local Variables:
mode: markdown
fill-column: 76
ispell-dictionary: "american"
End:

Copyright (C) 2002, 2008-2015, 2018-2019 Free Software Foundation, Inc.

This file is part of GNU Bison.

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.