## Generalised reduction modified LR parsing for domain specific language prototyping (2002)

Venue: | Proc. 35th Annual Hawaii International Conference On System Sciences (HICSS02), IEEE Computer Society |

Citations: | 5 - 1 self |

### BibTeX

@INPROCEEDINGS{Johnstone02generalisedreduction,

author = {Adrian Johnstone and Elizabeth Scott},

title = {Generalised reduction modified LR parsing for domain specific language prototyping},

booktitle = {Proc. 35th Annual Hawaii International Conference On System Sciences (HICSS02), IEEE Computer Society},

year = {2002}

}

### Abstract

Domain specific languages should support syntax that is comfortable for specialist users. We discuss the impact of the standard deterministic parsing techniques such as LALR(1) and LL(1) on the design of programming languages and the desirability of more flexible parsers in a development environment. We present a new bottom-up nondeterministic parsing algorithm (GRMLR) that combines a modified notion of reduction with a Tomita-style breadth-first search of parallel parsing stacks. We give experimental results for standard programming language grammars and LR(0), SLR(1) and LR(1) tables; the weaker tables generate significant amounts of nondeterminism. We show that GRMLR parsing corrects errors in the standard Tomita algorithm without incurring the performance overheads associated with other published solutions. We also demonstrate that the performance of GRMLR is upper-bounded by the performance of Tomita’s algorithm, and that for one realistic language grammar GRMLR only needs to search around 74% of the nodes. Our heavily instrumented development version of the algorithm achieves parsing rates of around 4,000–10,000 tokens per second on a 400 MHz Pentium II processor. Proof of correctness and details of our implementation are omitted here for space reasons but are available in an accompanying technical report.

### Citations

301 |
Compilers: principles, techniques, and tools
- AHO, SETHI, et al.
- 1986
Citation Context ...rence on System Sciences (HICSS-35’02) 0-7695-1435-9/02 $17.00 © 2002 IEEE. Proceedings of the 35th Hawaii International Conference on System Sciences - 2002. The labels [...] are called items [ASU86] of [...], and the right hand sides of the rules, together with their position in the input string are the so-called handles. If [...] we have the so-called LR(0)-items. To get the SLR(1)-items we tak... |

198 |
Efficient Parsing for Natural Language
- Tomita
- 1986
Citation Context ...oretical approach to the resolution of nondeterminism in bottom up parsers for general context free languages [Lan74]. The best known concrete algorithmic solution is the Tomita GLR parsing algorithm [Tom86], originally described in 1985, which generalises LR-style parsing through the use of a recombinant graph that captures the state of multiple parallel stacks. Tomita’s contribution was the design of t... |

174 | Yacc: Yet Another Compiler-Compiler
- Johnson
- 1978
Citation Context ... parsing model and table structure as for LR parsing but with table construction algorithms that produce smaller tables whilst maintaining most of the LR algorithm’s parsing power. The Unix tool YACC [Joh75] combines LALR parsing with disambiguation constructs advocated by Aho, Johnson and Ullman [AJU75] to provide a practical parser generator that still forms a sort-of parsing lingua franca a quarter of... |

163 |
On the translation of languages from left to right
- Knuth
- 1965
Citation Context ...or the description of Algol-60 syntax [BBG 63] and its subsequent connection to Chomsky’s work on Context Free Languages by Ginsburg and Rice [GR62]. By 1965, Knuth had described bottom-up LR parsing [Knu65] and demonstrated that LR parsers run in time proportional to the length of the input string. This theoretical work was not immediately applicable because, although the time complexity of an LR parser... |

126 | Syntax Definition for Language Prototyping
- Visser
- 1997
Citation Context ...trial application [vdBHdJ ar] and thus has been reliably field tested.) Rekers took Farshi’s algorithm (see page 17 of [Rek92]) and added efficient derivation tree construction mechanisms, and Visser [Vis97] modified the algorithm further to improve the efficiency in cases which are typically encountered when languages are specified down to lexical level in the grammar. However, both of these algorithms ... |

104 | Parser Generation for Interactive Environments
- Rekers
- 1992
Citation Context ...grams. More formal approaches include the use of the ASF+SDF toolkit [vdBHdJ ar] to perform COBOL reverse-engineering work, in which a generalised LR parser of the type described in Section 4 is used [Rek92]. A perhaps more significant observation is that humans, even when working formally, rarely restrict themselves to simple deterministically-parsable notations. Most mathematical papers require extensi... |

85 |
Deterministic techniques for efficient non-deterministic parsers
- Lang
- 1974
Citation Context ...therwise. 3 Resolution of non-determinism in bottom-up parsers In 1974 Lang described a theoretical approach to the resolution of nondeterminism in bottom up parsers for general context free languages [Lan74]. The best known concrete algorithmic solution is the Tomita GLR parsing algorithm [Tom86], originally described in 1985, which generalises LR-style parsing through the use of a recombinant graph that... |

77 |
Syntax-directed transduction
- Lewis, Stearns
- 1968
Citation Context ...written compilers had been formalised into the theory of LL(1) parsers [LS68]. Top down parsing also brings with it the advantage of easily implementable inherited attributes (i.e. the propagation of information down a derivation tree whilst it is being constructed) which can ... |

43 |
Simple LR(k) Grammars
- DeRemer
- 1971
Citation Context ...parser is linear in the length of the input string, the space complexity is in worst case exponential in the size of the grammar. This situation was transformed with DeRemer’s SLR [DeR69] and LALR [DeR71] algorithms which use the same parsing model and table structure as for LR parsing but with table construction algorithms that produce smaller tables whilst maintaining most of the LR algorithm’s pars... |

42 |
Two families of languages related to ALGOL
- Ginsburg, Rice
- 1962
Citation Context ... successes was the development of Backus-Naur Form (BNF) for the description of Algol-60 syntax [BBG 63] and its subsequent connection to Chomsky’s work on Context Free Languages by Ginsburg and Rice [GR62]. By 1965, Knuth had described bottom-up LR parsing [Knu65] and demonstrated that LR parsers run in time proportional to the length of the input string. This theoretical work was not immediately appli... |

37 |
Deterministic parsing of ambiguous grammars
- Aho, Johnson, et al.
- 1975
Citation Context ...roduce smaller tables whilst maintaining most of the LR algorithm’s parsing power. The Unix tool YACC [Joh75] combines LALR parsing with disambiguation constructs advocated by Aho, Johnson and Ullman [AJU75] to provide a practical parser generator that still forms a sort-of parsing lingua franca a quarter of a century later. Meanwhile, in another part of the forest, the top-down recursive descent techniq... |

25 |
Practical Translators for LR(k) Languages
- DeRemer
- 1969
Citation Context ...plexity of an LR parser is linear in the length of the input string, the space complexity is in worst case exponential in the size of the grammar. This situation was transformed with DeRemer’s SLR [DeR69] and LALR [DeR71] algorithms which use the same parsing model and table structure as for LR parsing but with table construction algorithms that produce smaller tables whilst maintaining most of the LR... |

20 | Language Translation Using PCCTS - Parr - 1997 |

11 | Heering, Hayco de Jong, Merijn de Jonge - Brand, Deursen |

7 |
GLR parsing for ε-grammars
- Nozohoor-Farshi
- 1991
Citation Context ...imple recognition rather than constructing a derivation tree. Subsequently, errors were found in Tomita’s algorithms for epsilon grammars: the (now) well known problem of hidden left recursion. Farshi [NF91] produced a corrected algorithm which was combined with more efficient ‘parse-forest’ construction by Rekers [Rek92]. The Farshi algorithm is a rather obvious fix with unpleasant performance impl... |

6 |
Faster generalised LR parsing
- Aycock, Horspool
- 1999
Citation Context ...thm is a rather obvious fix with unpleasant performance implications. In 1999, Aycock and Horspool described an approach which they interpret as GLR parsing but seems to us to be only loosely related [AH99] although displaying very good performance on their examples. 4 Tomita’s construction In this section we review the LR parsing technique and Tomita’s generalized LR parsing technique. We describe the ... |

3 |
Generalised recursive descent parsing and follow determinism
- Johnstone, Scott
- 1998
Citation Context ...nsions beyond the strict LL(1) properties of a grammar whilst allowing a designer to prototype a language, and then support the refinement of a grammar to produce a parser with acceptable performance [JS98]. In this paper we explore an alternative approach based on a new reworking of Lang and Tomita’s Generalised LR parsing. Our Reduction Modified parsers are LR-like, but use a slightly different defini... |

3 | Tomita-style generalised LR parsers
- Scott, Johnstone, et al.
- 2000
Citation Context ...to a minimum. There is a straightforward modification to Tomita’s Algorithm 1 which allows it to accept grammars with ε-rules, which we have described in the technical report that supports this paper [JS00], and we have used this algorithm, which we call Tomita’s Algorithm 1E or GLR-1E, in the experiments reported below. Algorithm GLR-1E is simpler and more efficient than Tomita’s Algorithm 2, and it te... |

1 | report on the programming language ALGOL 60 - Backus, Bauer, et al. - 1963 |