
## Parse table composition - separate compilation and binary extensibility of grammars (2008)

Venue: | In Proc. of Intl. Conf. on Software Language Engineering (SLE |

Citations: | 5 - 0 self |

### Citations

4809 | Introduction to automata theory, languages and computation, Addison-Wesley Pub
- Hopcroft, Ullman
- 1979
Citation Context ...nts of this algorithm and the correspondence to the one discussed previously naturally lead to the solution to the modularity problem of LR(0) parse table generation. Generating LR(0) ε-NFA. An ε-NFA =-=[27]-=- is an NFA that allows transitions on ε, the empty string. Using ε-transitions an ε-NFA can make a transition without reading an input symbol. An ε-NFA A is a tuple 〈Q, Σ, δ〉 with Q a set of states, Σ... |

1404 | Depth-first search and linear graph algorithms
- Tarjan
- 1972
Citation Context ...calculation of the follow set of a nonterminal requires a global analysis of the grammar. For this reason, reduce actions of parse table components are guarded by a symbolic reference to the follow set of a nonterminal. The actual follow sets are calculated at composition-time. Parse table components store their contribution to the global nullable, first, and follow relations. These relations induce first and follow graphs, which are traversed using the efficient Digraph [34] algorithm to calculate follow sets. The digraph algorithm is based on Tarjan’s strongly connected components algorithm [35, 36]. We also apply a series of optimizations for strongly connected component transitive closure algorithms [37]. Section 6 shows that the time spent on calculating follow sets at composition-time is minimal. The full algorithm is presented in [38]. Lexical Analysis. Before syntax analysis, a lexical analyzer splits the input stream of characters into a sequence of tokens that correspond to the terminals of a context-free grammar. Until now we have ignored the composition of the lexical syntax definition, but any solution for extensible syntax needs to consider the lexical analysis phase as well.... |

797 | An efficient context-free parsing algorithm
- Earley
- 1970
Citation Context ...emantic actions. On modification of the grammar it generates a new LR(0) automaton. Dypgen is currently being extended by its developers to incorporate our algorithm for runtime extensibility. Earley =-=[35]-=- parsers work directly on the productions of a context-free grammar at parse-time. Because of this the Earley algorithm is relatively easy to extend to an extensible parser [36, 37]. Due to the lack o... |

360 | Compilers: principles, techniques, and tools
- Aho, Sethi, et al.
- 1986
Citation Context ... to the item resulting from shifting the dot over A. For an item that predicts a terminal, there is just a single transition to the next item. Figure 8 shows the algorithm for generating the LR(0) ε-NFA for a grammar G. Note that states are singleton sets of an item or just a dot before a nonterminal (the station states). The ε-NFA of a grammar G accepts the same language as the DFA generated by the algorithm of Figure 6, i.e. the language of viable prefixes. Eliminating ε-Transitions. The ε-NFA can be turned into a DFA by eliminating the ε-transitions using the subset construction algorithm [28, 25], well-known from automata theory and lexical analysis. Figure 9 shows the algorithm for converting an ε-NFA to a DFA. The function ε-closure extends a given set of states S to include all the states reachable through ε-transitions. The function move determines the states reachable from a set of states S through transitions on the argument X. The function labels is a utility function that returns the symbols (which does not include ε) for which there are transitions from the states of S. The main function ε-subset-construction drives the construction of the DFA. For every state S ⊆ QE it dete... |
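
The ε-closure, move, and labels functions named in the context above can be sketched as follows. This is a minimal illustration of textbook subset construction, not the paper's Figure 9; the dictionary encoding of δ, the `EPS` marker, and all function names are assumptions made for this sketch.

```python
from collections import deque

EPS = None  # label standing in for ε (an assumption of this sketch)


def eps_closure(states, delta):
    """Extend `states` with every state reachable through ε-transitions only."""
    closure = set(states)
    work = list(states)
    while work:
        q = work.pop()
        for r in delta.get((q, EPS), ()):
            if r not in closure:
                closure.add(r)
                work.append(r)
    return frozenset(closure)


def move(states, symbol, delta):
    """States reachable from `states` by a single transition on `symbol`."""
    return {r for q in states for r in delta.get((q, symbol), ())}


def labels(states, delta):
    """Non-ε symbols labelling at least one transition out of `states`."""
    return {x for (q, x), targets in delta.items()
            if q in states and x is not EPS and targets}


def eps_subset_construction(start, delta):
    """Convert an ε-NFA (delta: (state, label) -> set of states) to a DFA.

    Returns (start_state, dfa_states, dfa_delta); DFA states are frozensets
    of NFA states.
    """
    s0 = eps_closure({start}, delta)
    dfa_states = {s0}
    dfa_delta = {}
    work = deque([s0])
    while work:
        s = work.popleft()
        for x in labels(s, delta):
            t = eps_closure(move(s, x, delta), delta)
            dfa_delta[(s, x)] = t
            if t not in dfa_states:
                dfa_states.add(t)
                work.append(t)
    return s0, dfa_states, dfa_delta
```

A worklist replaces the fixpoint iteration here, so each DFA state is expanded exactly once.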

253 | Polyglot: An extensible compiler framework for java
- Nystrom, Clarkson, et al.
- 2003
Citation Context ...r help protect the program against security vulnerabilities caused by user input that is not properly escaped. Extensible compilers such as ableJ [1] (based on Silver [2]), JastAddJ [3], and Polyglot =-=[4]-=- support the modular extension of the base language with new language features or embeddings of domain-specific languages. For example, the security vulnerabilities caused by the use of string literal... |

198 | On the translation of languages from left to right
- Knuth
- 1965
Citation Context ...or reduce A → α, goto a function Q × N → Q, and finally accept ⊆ Q, where we use the following additional notation: q for variables ranging over Q; and S for variables ranging over P(Q). An LR parser =-=[23, 24]-=- is a transition system with as configuration a stack of states and symbols qoX1q1X2q2 . . . Xnqn and an input string v of terminals. The next configuration of the parser is determined by reading the ... |

156 | Syntax Definition for Language Prototyping.
- Visser
- 1997
Citation Context ...g of executable programs and libraries. We have implemented parse table composition in a prototype that generates parse tables for scannerless [16] generalized LR (GLR) [17, 18] parsers. It takes SDF =-=[19]-=- grammars as input. The technical contributions of this work are: – The idea of parse table composition as symmetric composition of parse tables as opposed to incrementally adding productions, as done... |

156 | Randomized search trees.
- Seidel, Aragon
- 1996
Citation Context ...3 Optimization In worst case, subset construction applied to a NFA can result in an exponential number of states in the resulting DFA. There is nothing that can be done about the number of states that have to be created in subset reconstruction, except for creating these states as efficiently as possible. As stated by research on subset construction [29, 30], it is important to choose the appropriate algorithms and data structures. For example, the fixpoint iteration should be replaced, checking for the existence of a subset of states in Q must be efficient (we use uniquely represented treaps [31]), and the kernel for a transition on a symbol X from a subset must be determined efficiently. In our implementation we have applied some of the basic optimizations, but have focused on optimizations specific to parse table composition. The performance of parse table composition mostly depends on (1) the number of ε-closure invocations and (2) the cardinality of the resulting ε-closures. Avoiding Closure Calls. In the plain subset construction algorithm ε-closure calls are inevitable for every subset. However, subset construction has already been applied to the parse table components. If we kn... |

147 | Efficient Parsing for Natural Language: A Fast Algorithm for Practical Systems
- Tomita
- 1986
Citation Context ...iler, similar to dynamic linking of executable programs and libraries. We have implemented parse table composition in a prototype that generates parse tables for scannerless [16] generalized LR (GLR) =-=[17, 18]-=- parsers. It takes SDF [19] grammars as input. The technical contributions of this work are: – The idea of parse table composition as symmetric composition of parse tables as opposed to incrementally ... |

116 | JTS: Tools for Implementing Domain-Specific Languages. ICSR
- Batory, Lofaso, et al.
- 1998
Citation Context ...d by the SDF parser generator. 6 Related and Future Work Modular Grammar Formalisms. There are a number of parser generators that support splitting a grammar into multiple files, e.g. Rats! [33], JTS =-=[9]-=-, PPG [4], and SDF [19]. They vary in the expressiveness of their modularity features, their support for the extension of lexical syntax, and the parsing algorithm that is employed. Many tools ignore ... |

113 | Parser Generation for Interactive Environments.
- Rekers
- 1992
Citation Context ...iler, similar to dynamic linking of executable programs and libraries. We have implemented parse table composition in a prototype that generates parse tables for scannerless [16] generalized LR (GLR) =-=[17, 18]-=- parsers. It takes SDF [19] grammars as input. The technical contributions of this work are: – The idea of parse table composition as symmetric composition of parse tables as opposed to incrementally ... |

107 | The JastAdd extensible java compiler
- EKMAN, G
- 2007
Citation Context ...or can the compiler help protect the program against security vulnerabilities caused by user input that is not properly escaped. Extensible compilers such as ableJ [1] (based on Silver [2]), JastAddJ =-=[3]-=-, and Polyglot [4] support the modular extension of the base language with new language features or embeddings of domain-specific languages. For example, the security vulnerabilities caused by the use... |

102 | Scalable component abstractions
- Odersky, Zenger
- 2005
Citation Context ...re a grammar for the object language, i.e. the compilation of the embedded syntax is generically defined [7, 11]. Extensibility and Composition. Current extensible compilers focus on source-level extensibility, which requires users to compile the compiler with a specific configuration of extensions. Thus, every extension or combination of extensions results in a different compiler. Some recent extensible compilers support composition of extensions by specifying language extensions as attribute grammars [1, 12] or using new language features for modular, type-safe scalable software composition [13, 14]. In contrast to the extensive research on composition of later compiler phases, the grammar formalisms used by current extensible compilers do not have advanced features for the modular definition of syntax. They do not support separate compilation of grammars and do not feature a method for deploying grammars as components. Indeed, for the parsing phase of the compiler, a compound parser needs to be generated for every combination of extensions using whole program compilation. Similarly, in metaprogramming systems with support for concrete syntax, grammars for a particular combination of obj... |

82 | Meta-programming with concrete object syntax. In Generative programming and component engineering,
- Visser
- 2002
Citation Context ...t syntax of Java, the second uses the concrete syntax. Metaprogramming systems often only require a grammar for the object language, i.e. the compilation of the embedded syntax is generically defined =-=[8, 10]-=-. Extensibility and Composition. Current extensible compilers focus on source-level extensibility, which requires users to compile the compiler with a specific configuration of extensions. Thus, every... |

64 | Silver: an extensible attribute grammar system.
- Wyk, Bodin, et al.
- 2010
Citation Context ...ally correct, nor can the compiler help protect the program against security vulnerabilities caused by user input that is not properly escaped. Extensible compilers such as ableJ [1] (based on Silver =-=[2]-=-), JastAddJ [3], and Polyglot [4] support the modular extension of the base language with new language features or embeddings of domain-specific languages. For example, the security vulnerabilities ca... |

63 | Parsing techniques: a practical guide.
- Grune, Jacobs
- 2008
Citation Context ...gorithm that first constructs a nondeterministic finite automaton (NFA) with ε-transitions (ε-NFA) and converts the ε-NFA into an LR(0) DFA in a separate step using the subset construction algorithm =-=[25, 26]-=-. The ingredients of this algorithm and the correspondence to the one discussed previously naturally lead to the solution to the modularity problem of LR(0) parse table generation. Generating LR(0) ε-... |

57 | Jedd: a BDD-based relational extension of Java
- Lhotak, Hendren
- 2004
Citation Context ...orts grammars of PHP and Shell. Finally, the Polyglot [4] compiler has been used for the implementation of many language extensions, for example Jedd’s database relations and binary decision diagrams =-=[6]-=- and JPred’s predicate dispatch [7]. Figure 3 illustrates the JPred extension. Similar to the syntax extensions implemented using extensible compilers, several metaprogramming systems feature an exten... |

57 | Simple LR(k) grammars.
- DeRemer
- 1971
Citation Context ...dy visited. The visit function stops traversing the graph if it encounters a node with a different source (but continues if the source is ε), thus avoiding states that are already closed in other station states. 5 Extensions SLR. For many languages, the LR(0) parse table generation algorithm results in states where the parser can perform a shift as well as a reduce action. If the parser generator targets a deterministic parser, then the parser generation fails at this point, or applies heuristics to resolve the conflict. To support a bigger class of languages, the SLR (Simple LR) algorithm [32] extends LR(0) by guarding the application of reduce actions by examining the next terminal in the input stream. An SLR parse table generator determines the set of terminals that can follow the nonterminal A and a reduce action for A → α is only applied if the next terminal in the input stream is in this set. If using a deterministic parser, then the main reason for restricting the application of reduce actions is to support a larger class of grammars. For a GLR parser this is not necessary: a GLR parser can perform both actions of a shift/reduce conflict and the continuation of the parsing pr... |

50 | Incremental generation of parsers
- Heering, Klint, et al.
- 1989
Citation Context ... very sparse, therefore we use Digraph. This could be done incrementally as well, but due to the very limited amount of time spend on the follow sets, it is hard to make a substantial difference. IPG =-=[21]-=- is a lazy and incremental parser generator targeting a GLR parser using LR(0) parse tables. This work was motivated by interactive metaprogramming environments. The parse table is generated by need d... |

47 | Attribute grammar-based language extensions for Java. In:
- Wyk, Krishnan, et al.
- 2007
Citation Context ...rograms are syntactically correct, nor can the compiler help protect the program against security vulnerabilities caused by user input that is not properly escaped. Extensible compilers such as ableJ =-=[1]-=- (based on Silver [2]), JastAddJ [3], and Polyglot [4] support the modular extension of the base language with new language features or embeddings of domain-specific languages. For example, the securi... |

47 | Extensible syntax with lexical scoping
- Cardelli, Matthes, et al.
- 1994
Citation Context ...ficient than GLR parsers for programming languages that are close to LR. Maya [38] uses LALR for providing extensible syntax but regenerates the automaton from scratch for every extension. Cardelli’s =-=[22]-=- extensible syntax uses an extensible LL(1) parser. Camlp4 [39] is a preprocessor for OCaml using an extensible top down recursive descent parser. Tatoo [40] allows incomplete grammars to be compiled ... |

46 | Practical predicate dispatch. In:
- Millstein
- 2004
Citation Context ...L and Shell commands in PHP. In both cases, the StringBorg compiler guarantees that embedded sentences are syntactically correct and that strings are properly escaped, as opposed to the unhygienic string concatenation on the previous lines. The implementations of these extensions are modular, e.g. the grammar for the embedding of Shell in PHP is a module that imports grammars of PHP and Shell. Finally, the Polyglot [3] compiler has been used for the implementation of many language extensions, for example Jedd’s database relations and binary decision diagrams [5] and JPred’s predicate dispatch [6]. Figure 3 illustrates the JPred extension. Similar to the syntax extensions implemented using extensible compilers, several metaprogramming systems feature an extensible syntax. Metaprograms usually manipulate programs in a structured representation, but writing code generators and transformations in the abstract syntax of a language can be very unwieldy. Therefore, several metaprogramming systems [7–10] support the embedding of an object language syntax in the metalanguage. The embedded code fragments are parsed statically and translated to a structured representation, thus providing a conci... |

45 | Scannerless NSLR(1) parsing of programming languages. In:
- Salomon, Cormack
- 1989
Citation Context ...every invocation of a compiler, similar to dynamic linking of executable programs and libraries. We have implemented parse table composition in a prototype that generates parse tables for scannerless =-=[16]-=- generalized LR (GLR) [17, 18] parsers. It takes SDF [19] grammars as input. The technical contributions of this work are: – The idea of parse table composition as symmetric composition of parse table... |

42 | Better extensibility through modular syntax.
- Grimm
- 2006
Citation Context ...e time used by the SDF parser generator. 6 Related and Future Work Modular Grammar Formalisms. There are a number of parser generators that support splitting a grammar into multiple files, e.g. Rats! =-=[33]-=-, JTS [9], PPG [4], and SDF [19]. They vary in the expressiveness of their modularity features, their support for the extension of lexical syntax, and the parsing algorithm that is employed. Many tool... |

41 | Stratego/XT 0.16. Components for transformation systems. In:
- Bravenboer, Kalleberg, et al.
- 2006
Citation Context ...n heavily optimized because its performance is not relevant to the performance of runtime parse table composition. We have implemented a generator and composer for scannerless GLR SLR parse tables. This affects the performance of the composer: grammars have more productions, more symbols, and layout occurs between many symbols. Note that the composed parse tables are identical to parse tables generated using the SDF parser generator. Therefore, the performance of the parsers is equivalent. Figure 14 presents the results for a series of metaprogramming concrete syntax extensions using Stratego [9], StringBorg [4] extensions, and AspectJ. For Stratego and StringBorg, the number of overlapping symbols is very limited and there are many single-state closures. Depending on the application, different comparisons of the timings are useful. For runtime parse table composition, we need to compare the performance to generation of the full automaton. The total composition time (col. 16) is only about 2% to 16% of the SDF parse table generation time (col. 4). The performance benefit increases more if we include the required normalization (col. 3) (SDF does not support separate normalization of mo... |

39 | Efficient computation of LALR(1) look-ahead sets. In:
- DeRemer, Pennello
- 1979
Citation Context ...unately, the follow set of a nonterminal can (and usually will) change if new productions are added to a grammar. Thus, the calculation of the follow set of a nonterminal requires a global analysis of the grammar. For this reason, reduce actions of parse table components are guarded by a symbolic reference to the follow set of a nonterminal. The actual follow sets are calculated at composition-time. Parse table components store their contribution to the global nullable, first, and follow relations. These relations induce first and follow graphs, which are traversed using the efficient Digraph [34] algorithm to calculate follow sets. The digraph algorithm is based on Tarjan’s strongly connected components algorithm [35, 36]. We also apply a series of optimizations for strongly connected component transitive closure algorithms [37]. Section 6 shows that the time spent on calculating follow sets at composition-time is minimal. The full algorithm is presented in [38]. Lexical Analysis. Before syntax analysis, a lexical analyzer splits the input stream of characters into a sequence of tokens that correspond to the terminals of a context-free grammar. Until now we have ignored the compositio... |
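
The follow-set computation described in this context, a traversal of the first and follow graphs that collapses strongly connected components, can be sketched with a textbook version of the Digraph procedure. This is a minimal illustration and not the paper's implementation, which according to the context applies further optimizations; the names `digraph`, `successors`, and `initial` are hypothetical.

```python
import math


def digraph(nodes, successors, initial):
    """Compute F(x) = initial(x) ∪ ⋃ { F(y) | y in successors(x) } for all x.

    Nodes in the same strongly connected component end up with equal sets.
    Every successor must itself appear in `nodes`.
    """
    depth = {x: 0 for x in nodes}  # 0 = unvisited, math.inf = finished
    result = {}
    stack = []

    def traverse(x):
        stack.append(x)
        d = len(stack)
        depth[x] = d
        result[x] = set(initial(x))
        for y in successors(x):
            if depth[y] == 0:
                traverse(y)
            depth[x] = min(depth[x], depth[y])
            result[x] |= result[y]
        if depth[x] == d:
            # x is the root of a strongly connected component: pop the whole
            # component and give every member the root's set.
            while True:
                top = stack.pop()
                depth[top] = math.inf
                result[top] = set(result[x])
                if top == x:
                    break

    for x in nodes:
        if depth[x] == 0:
            traverse(x)
    return result
```

In an SLR setting, `nodes` would be the nonterminals, `initial` the directly contributed terminals, and `successors` the inclusion edges of the follow graph. The recursion is kept for readability; a large grammar may call for an explicit stack.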

34 | Preventing Injection Attacks with Syntax Embeddings.
- Bravenboer, Dolstra, et al.
- 2007
Citation Context ...piler have also been applied to implement extensions for complex numbers, algebraic datatypes, and computational geometry. Similar to the SQL extension of ableJ, the StringBorg syntactic preprocessor =-=[5]-=- supports the embedding of languages in arbitrary base languages to prevent security vulnerabilities. Figure 2 shows applications of embeddings of SQL and Shell commands in PHP. In both cases, the Str... |

31 | Domain specific language implementation via compile-time metaprogramming.
- Tratt
- 2008
Citation Context ... extensibility. Earley [35] parsers work directly on the productions of a context-free grammar at parse-time. Because of this the Earley algorithm is relatively easy to extend to an extensible parser =-=[36, 37]-=-. Due to the lack of a generation phase, Earley parsers are less efficient than GLR parsers for programming languages that are close to LR. Maya [38] uses LALR for providing extensible syntax but rege... |

29 | Efficient Transitive Closure Computation in Large Digraphs.
- Nuutila
- 1995
Citation Context ... actions of parse table components are guarded by a symbolic reference to the follow set of a nonterminal. The actual follow sets are calculated at composition-time. Parse table components store their contribution to the global nullable, first, and follow relations. These relations induce first and follow graphs, which are traversed using the efficient Digraph [34] algorithm to calculate follow sets. The digraph algorithm is based on Tarjan’s strongly connected components algorithm [35, 36]. We also apply a series of optimizations for strongly connected component transitive closure algorithms [37]. Section 6 shows that the time spent on calculating follow sets at composition-time is minimal. The full algorithm is presented in [38]. Lexical Analysis. Before syntax analysis, a lexical analyzer splits the input stream of characters into a sequence of tokens that correspond to the terminals of a context-free grammar. Until now we have ignored the composition of the lexical syntax definition, but any solution for extensible syntax needs to consider the lexical analysis phase as well. If different languages are used together in the same source file, then the finite automata-based implementati... |

24 | Context-aware scanning for Parsing Extensible Languages.
- Wyk, Schwerdfeger
- 2006
Citation Context ...n of lexical syntax, and the parsing algorithm that is employed. Many tools ignore the intricate lexical aspects of syntax extensions, whereas some apply scannerless parsing or context-aware scanning =-=[34]-=-. However, except for a few research prototypes discussed next, these parser generators all generate a parser by first collecting all the sources, essentially resulting in whole-program compilation. E... |

22 | Declarative, formal, and extensible syntax definition for AspectJ – A case for scannerless generalized-LR parsing. In: OOPSLA’06,
- Bravenboer, Tanter, et al.
- 2006
Citation Context ...ate normalization of modules). For larger grammars (e.g. involving Java) the performance benefit is bigger. The AspectJ composition is clearly different from the other embeddings. The AspectJ grammar =-=[32]-=- uses grammar mixins to reuse the Java grammar in 5 different contexts. All contexts customize their instance of Java, e.g. reserved keywords differ per context. Without separate compilation this resu... |

22 | Parsing Techniques - A Practical Guide. Ellis Horwood, Upper Saddle River,
- Grune, Jacobs
- 1990
Citation Context ...peat until Q and δ do not change: for each q = {[A → α • Xβ]} ∈ Q: q′ := {[A → αX • β]}; Q := Q ∪ {q′}; δ := δ ∪ {q →X q′}; for each q = {[A → α • Bγ]} ∈ Q: δ := δ ∪ {q →ε •B}; return 〈Q, Σ(G) ∪ N(G), δ〉. (Fig. 7: LR(0) ε-NFA for grammar G1. Fig. 8: LR(0) ε-NFA generation.) ...generation time. As an introduction to the solution for separate compilation of grammars, we discuss a variation of the LR(0) algorithm that first constructs a non-deterministic finite automaton (NFA) with ε-transitions (ε-NFA) and converts the ε-NFA into an LR(0) DFA in a separate step using the subset construction algorithm [26, 27]. The ingredients of this algorithm and the correspondence to the one discussed previously naturally lead to the solution to the modularity problem of LR(0) parse table generation. Generating LR(0) ε-NFA. An ε-NFA [28] is an NFA that allows transitions on ε, the empty string. Using ε-transitions an ε-NFA can make a transition without reading an input symbol. An ε-NFA A is a tuple 〈Q, Σ, δ〉 with Q a set of states, Σ a set of symbols, and δ a transition function Q × (Σ ∪ {ε}) → P(Q), where we use the following notation: q for variables ranging over Q; S for variables ranging over P(Q), but rangi... |
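
The ε-NFA generation captured in this context can be sketched as follows. The excerpt is truncated before the loop, so the initialization (one station state per nonterminal, plus an ε-edge from each station to the initial item of each of its productions) is an assumption of this sketch, as are the tuple encodings of states and the name `lr0_eps_nfa`. A worklist stands in for the "repeat until Q and δ do not change" loop.

```python
EPS = None  # label standing in for ε (an assumption of this sketch)


def lr0_eps_nfa(productions, nonterminals):
    """Build an LR(0) ε-NFA for a grammar.

    productions: iterable of (lhs, rhs) pairs, rhs being a sequence of symbols.
    States are ("station", A) or ("item", lhs, rhs, dot); delta maps
    (state, label) to a set of successor states.
    """
    states = set()
    delta = {}

    def add_edge(src, label, dst):
        delta.setdefault((src, label), set()).add(dst)

    # Assumed initialization: station states and initial items.
    for a in nonterminals:
        states.add(("station", a))
    work = []
    for lhs, rhs in productions:
        item = ("item", lhs, tuple(rhs), 0)
        states.add(item)
        add_edge(("station", lhs), EPS, item)
        work.append(item)

    while work:
        q = work.pop()
        _, lhs, rhs, dot = q
        if dot == len(rhs):
            continue  # dot at the end: nothing left to shift
        x = rhs[dot]
        nxt = ("item", lhs, rhs, dot + 1)
        if nxt not in states:
            states.add(nxt)
            work.append(nxt)
        add_edge(q, x, nxt)  # shift the dot over X
        if x in nonterminals:
            add_edge(q, EPS, ("station", x))  # predict X via its station
    return states, delta
```

Because every state holds a single item, the automaton grows linearly with the total length of the productions; the exponential cost, if any, appears only when the ε-transitions are later eliminated.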

19 | J&: nested intersection for scalable software composition.
- Nystrom, Qi, et al.
- 2006
Citation Context ...ensible compilers support composition of extensions by specifying language extensions as attribute grammars [1, 12] or using new language features for modular, type-safe scalable software composition =-=[13, 14]-=-. In contrast to the extensive research on composition of later compiler phases, the grammar formalisms used by current extensible compilers do not have advanced features for the modular definition of... |

19 | On Computing the Transitive Closure of a Relation
- Eve, Kurki-Suonio
- 1977
Citation Context ...calculation of the follow set of a nonterminal requires a global analysis of the grammar. For this reason, reduce actions of parse table components are guarded by a symbolic reference to the follow set of a nonterminal. The actual follow sets are calculated at composition-time. Parse table components store their contribution to the global nullable, first, and follow relations. These relations induce first and follow graphs, which are traversed using the efficient Digraph [34] algorithm to calculate follow sets. The digraph algorithm is based on Tarjan’s strongly connected components algorithm [35, 36]. We also apply a series of optimizations for strongly connected component transitive closure algorithms [37]. Section 6 shows that the time spent on calculating follow sets at composition-time is minimal. The full algorithm is presented in [38]. Lexical Analysis. Before syntax analysis, a lexical analyzer splits the input stream of characters into a sequence of tokens that correspond to the terminals of a context-free grammar. Until now we have ignored the composition of the lexical syntax definition, but any solution for extensible syntax needs to consider the lexical analysis phase as well.... |

17 | Efficient approaches to subset construction
- Leslie
- 1995
Citation Context ...that can be done about the number of states that have to be created in subset reconstruction, except for creating these states as efficiently as possible. As stated by research on subset construction =-=[28, 29]-=-, it is important to choose the appropriate algorithms and data structures. For example, the fixpoint iteration should be replaced, checking for the existence of a subset of states in Q must be effici... |

12 | The Converge programming language.
- Tratt
- 2005
Citation Context ...e to our applications, since the syntax extensions are not a fixed set and typically provided by other parties. Dypgen [15] is a GLR self-extensible parser generator focusing on scoped modification of the grammar from its semantic actions. On modification of the grammar it generates a new LR(0) automaton. Dypgen is currently being extended by its developers to incorporate our algorithm for runtime extensibility. Earley [42] parsers work directly on the productions of a context-free grammar at parse-time. Because of this the Earley algorithm is relatively easy to extend to an extensible parser [43, 44]. Due to the lack of a generation phase, Earley parsers are less efficient than GLR parsers for programming languages that are close to LR. Maya [45] uses LALR for providing extensible syntax but regenerates the automaton from scratch for every extension. Cardelli’s [22] extensible syntax uses an extensible LL(1) parser. Camlp4 [46] is a preprocessor for OCaml using an extensible top down recursive descent parser. Automata Theory and Applications. The egrep pattern matching tool uses a DFA for efficient matching in combination with lazy state construction to avoid the initial overhead of const... |

10 | The treatment of epsilon moves in subset construction.
- van Noord
- 1998
Citation Context ...that can be done about the number of states that have to be created in subset reconstruction, except for creating these states as efficiently as possible. As stated by research on subset construction =-=[28, 29]-=-, it is important to choose the appropriate algorithms and data structures. For example, the fixpoint iteration should be replaced, checking for the existence of a subset of states in Q must be effici... |

10 | LR parsing.
- Aho, Johnson
- 1974

9 | Incremental generation of LR parsers.
- Horspool
- 1990
Citation Context ...LR(0) automaton can change drastically if it is combined with another LR(0) automaton. The composition with an automaton for just a single production can introduce an exponential number of new states =-=[20]-=-. Parse table composition cannot circumvent this worst case. Fortunately, grammars of typical programming languages do not exhibit this behaviour. To evaluate our method, we measured how parse table c... |

9 | multiple-dispatch syntax extension in Java
- Baker, Hsieh
- 2002
Citation Context ...atively easy to extend to an extensible parser [36, 37]. Due to the lack of a generation phase, Earley parsers are less efficient than GLR parsers for programming languages that are close to LR. Maya =-=[38]-=- uses LALR for providing extensible syntax but regenerates the automaton from scratch for every extension. Cardelli’s [22] extensible syntax uses an extensible LL(1) parser. Camlp4 [39] is a preproces... |

8 | Repleo: A syntax-safe template engine
- Arnoldus, Bijpost, et al.
- 2007
Citation Context ...t syntax of Java, the second uses the concrete syntax. Metaprogramming systems often only require a grammar for the object language, i.e. the compilation of the embedded syntax is generically defined =-=[8, 10]-=-. Extensibility and Composition. Current extensible compilers focus on source-level extensibility, which requires users to compile the compiler with a specific configuration of extensions. Thus, every... |

8 | Randomized search trees, Algorithmica 16
- Seidel, Aragon
- 1996
Citation Context ...ate algorithms and data structures. For example, the fixpoint iteration should be replaced, checking for the existence of a subset of states in Q must be efficient (we use uniquely represented treaps =-=[30]-=-), and the kernel for a transition on a symbol X from a subset must be determined efficiently. In our implementation we have applied some of the basic optimizations, but have focused on optimizations ... |

7 | Adding syntax and static analysis to libraries via extensible compilers and language extensions. In:
- Wyk, Bodin, et al.
- 2006
Citation Context ...very extension or combination of extensions results in a different compiler. Some recent extensible compilers support composition of extensions by specifying language extensions as attribute grammars =-=[1, 12]-=- or using new language features for modular, type-safe scalable software composition [13, 14]. In contrast to the extensive research on composition of later compiler phases, the grammar formalisms use... |

7 | Extensible Language Implementation.
- Kolbly
- 2002
Citation Context ... extensibility. Earley [35] parsers work directly on the productions of a context-free grammar at parse-time. Because of this the Earley algorithm is relatively easy to extend to an extensible parser =-=[36, 37]-=-. Due to the lack of a generation phase, Earley parsers are less efficient than GLR parsers for programming languages that are close to LR. Maya [38] uses LALR for providing extensible syntax but rege... |

7 | Maya: multiple-dispatch syntax extension in Java.
- Baker, Hsieh
- 2002
Citation Context ...er generator focusing on scoped modification of the grammar from its semantic actions. On modification of the grammar it generates a new LR(0) automaton. Dypgen is currently being extended by its developers to incorporate our algorithm for runtime extensibility. Earley [42] parsers work directly on the productions of a context-free grammar at parse-time. Because of this the Earley algorithm is relatively easy to extend to an extensible parser [43, 44]. Due to the lack of a generation phase, Earley parsers are less efficient than GLR parsers for programming languages that are close to LR. Maya [45] uses LALR for providing extensible syntax but regenerates the automaton from scratch for every extension. Cardelli’s [22] extensible syntax uses an extensible LL(1) parser. Camlp4 [46] is a preprocessor for OCaml using an extensible top down recursive descent parser. Automata Theory and Applications. The egrep pattern matching tool uses a DFA for efficient matching in combination with lazy state construction to avoid the initial overhead of constructing a DFA. egrep determines the transitions of the DFA only when they are actually needed at runtime. Conceptually, this is related to lazy parse... |

6 | Generalised reduction modified LR parsing for domain specific language prototyping.
- Johnstone, Scott
- 2002
Citation Context ...gorithm that first constructs a nondeterministic finite automaton (NFA) with ε-transitions (ε-NFA) and converts the ε-NFA into an LR(0) DFA in a separate step using the subset construction algorithm =-=[25, 26]-=-. The ingredients of this algorithm and the correspondence to the one discussed previously naturally lead to the solution to the modularity problem of LR(0) parse table generation. Generating LR(0) ε-... |

4 |
Camlp4 Reference Manual.
- Rauglaudre
- 2003
Citation Context ...se to LR. Maya [38] uses LALR for providing extensible syntax but regenerates the automaton from scratch for every extension. Cardelli’s [22] extensible syntax uses an extensible LL(1) parser. Camlp4 =-=[39]-=- is a preprocessor for OCaml using an extensible top down recursive descent parser. Tatoo [40] allows incomplete grammars to be compiled into separate parse tables that can be linked together. However... |

4 |
Efficient approaches to subset construction. Master’s thesis,
- Leslie
- 1995
Citation Context ... be applied partially to an automaton, so extending a deterministic automaton with new states and transitions and applying subset construction subsequently is not different from applying subset construction to an extension of the original ε-NFA. 4.3 Optimization. In the worst case, subset construction applied to an NFA can result in an exponential number of states in the resulting DFA. There is nothing that can be done about the number of states that have to be created in subset reconstruction, except for creating these states as efficiently as possible. As stated by research on subset construction [29, 30], it is important to choose the appropriate algorithms and data structures. For example, the fixpoint iteration should be replaced, checking for the existence of a subset of states in Q must be efficient (we use uniquely represented treaps [31]), and the kernel for a transition on a symbol X from a subset must be determined efficiently. In our implementation we have applied some of the basic optimizations, but have focused on optimizations specific to parse table composition. The performance of parse table composition mostly depends on (1) the number of ε-closure invocations and (2) the cardin... |

3 |
Scalable component abstractions. In: OOPSLA ’05
- Odersky, Zenger
Citation Context ...ensible compilers support composition of extensions by specifying language extensions as attribute grammars [1, 12] or using new language features for modular, type-safe scalable software composition =-=[13, 14]-=-. In contrast to the extensive research on composition of later compiler phases, the grammar formalisms used by current extensible compilers do not have advanced features for the modular definition of... |

3 |
Exercises in Free Syntax. Syntax Definition, Parsing, and Assimilation of Language Conglomerates.
- Bravenboer
- 2008
Citation Context ...rately, which explains why some numbers do not sum up exactly. Results are the average of three runs. ficient algorithms for the SLR and scannerless extensions of the LR(0) algorithm are presented in =-=[31]-=- (Section 6.6 and 6.7). Scannerless parsing affects the performance of the composer: grammars have more productions, more symbols, and layout occurs between many symbols. The composed parse tables are... |

3 |
Evaluating GLR parsing algorithms.
- Johnstone, Scott, et al.
- 2006
Citation Context ...r determines the set of terminals that can follow the nonterminal A and a reduce action for A → α is only applied if the next terminal in the input stream is in this set. If using a deterministic parser, then the main reason for restricting the application of reduce actions is to support a larger class of grammars. For a GLR parser this is not necessary: a GLR parser can perform both actions of a shift/reduce conflict and the continuation of the parsing process will determine which action was correct. However, for GLR it is still useful to reduce the number of conflicts to improve performance [33]. The SLR algorithm determines the follow set of nonterminals by analyzing the productions of the grammar. Unfortunately, the follow set of a nonterminal can (and usually will) change if new productions are added to a grammar. Thus, the calculation of the follow set of a nonterminal requires a global analysis of the grammar. For this reason, reduce actions of parse table components are guarded by a symbolic reference to the follow set of a nonterminal. The actual follow sets are calculated at composition-time. Parse table components store their contribution to the global nullable, first, and f... |

2 |
Practical predicate dispatch. In: OOPSLA ’04
- Millstein
- 2004
Citation Context ...ally, the Polyglot [4] compiler has been used for the implementation of many language extensions, for example Jedd’s database relations and binary decision diagrams [6] and JPred’s predicate dispatch =-=[7]-=-. Figure 3 illustrates the JPred extension. Similar to the syntax extensions implemented using extensible compilers, several metaprogramming systems feature an extensible syntax. Metaprograms usually ... |

2 |
Dypgen: Self-extensible parsers for ocaml. http://dypgen.free.fr
- Onzon
- 2007
Citation Context ... foundation for separate compilation in parser generators, load-time composition of parse table components (cf. dynamic linking), and even runtime extension of parse tables for self-extensible parsers =-=[15]-=-. As illustrated by our AspectJ evaluation, separate compilation of modules can even improve the performance of a whole program parser generator. Using parse table composition extensible compilers can... |

2 | Separate compilation of grammars with Tatoo
- Cervelle, Forax, et al.
Citation Context ...om scratch for every extension. Cardelli’s [22] extensible syntax uses an extensible LL(1) parser. Camlp4 [39] is a preprocessor for OCaml using an extensible top down recursive descent parser. Tatoo =-=[40]-=- allows incomplete grammars to be compiled into separate parse tables that can be linked together. However, Tatoo does not compose parse tables, but parsers. It switches to a different parser when it ... |