Results 1-10 of 18
Parsing with derivatives: a functional pearl
In Proceedings of the 16th ACM SIGPLAN International Conference on Functional Programming (ICFP ’11). ACM, 2011
Cited by 12 (1 self)
We present a functional approach to parsing unrestricted context-free grammars based on Brzozowski’s derivative of regular expressions. If we consider context-free grammars as recursive regular expressions, Brzozowski’s equational theory extends without modification to context-free grammars (and it generalizes to parser combinators). The supporting actors in this story are three concepts familiar to functional programmers: laziness, memoization and fixed points; these allow Brzozowski’s original equations to be transliterated into purely functional code in about 30 lines spread over three functions. Yet, this almost impossibly brief implementation has a drawback: its performance is sour, in both theory and practice. The culprit? Each derivative can double the size of a grammar, and with it, the cost of the next derivative. Fortunately, much of the new structure inflicted by the derivative is either dead on arrival, or it dies after the very next derivative. To eliminate it, we once again exploit laziness and memoization to transliterate an equational theory that prunes such debris into working code. Thanks to this compaction, parsing times become reasonable in practice. We equip the functional programmer with two equational theories that, when combined, make for an abbreviated understanding and implementation of a system for parsing context-free languages.
Categories and Subject Descriptors: F.4.3 [Formal Languages]: Operations on languages
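As a rough illustration of the starting point this abstract builds on, the following Python sketch implements Brzozowski's derivative for plain regular expressions only. It is not the paper's code: the names (`Re`, `nullable`, `derive`, `matches`) are chosen here for illustration, and the sketch leaves out the laziness, memoization, fixed points and compaction that the abstract describes for the context-free case.

```python
from dataclasses import dataclass


class Re:
    """Base class for regular-expression nodes (illustrative names)."""


@dataclass(frozen=True)
class Empty(Re):
    """Matches no string at all."""


@dataclass(frozen=True)
class Eps(Re):
    """Matches only the empty string."""


@dataclass(frozen=True)
class Char(Re):
    c: str


@dataclass(frozen=True)
class Alt(Re):
    left: Re
    right: Re


@dataclass(frozen=True)
class Seq(Re):
    first: Re
    second: Re


@dataclass(frozen=True)
class Star(Re):
    inner: Re


def nullable(r):
    """Return True if r accepts the empty string."""
    if isinstance(r, (Eps, Star)):
        return True
    if isinstance(r, (Empty, Char)):
        return False
    if isinstance(r, Alt):
        return nullable(r.left) or nullable(r.right)
    if isinstance(r, Seq):
        return nullable(r.first) and nullable(r.second)
    raise TypeError(f"unknown node: {r!r}")


def derive(r, c):
    """Derivative of r with respect to c: what may follow after reading c."""
    if isinstance(r, (Empty, Eps)):
        return Empty()
    if isinstance(r, Char):
        return Eps() if r.c == c else Empty()
    if isinstance(r, Alt):
        return Alt(derive(r.left, c), derive(r.right, c))
    if isinstance(r, Seq):
        head = Seq(derive(r.first, c), r.second)
        # If the first part can be empty, c may also start the second part.
        return Alt(head, derive(r.second, c)) if nullable(r.first) else head
    if isinstance(r, Star):
        return Seq(derive(r.inner, c), r)
    raise TypeError(f"unknown node: {r!r}")


def matches(r, s):
    """Derive by each character in turn, then test for the empty string."""
    for c in s:
        r = derive(r, c)
    return nullable(r)


ab_star = Star(Seq(Char("a"), Char("b")))  # the language (ab)*
```

Because no simplification is applied, each call to `derive` can leave behind dead nodes such as `Seq(Empty(), ...)`, which gives a small-scale view of the growth that the abstract attributes to the derivative.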
Parser Combinators in Scala
2008
Cited by 10 (1 self)
Parser combinators are well-known in functional programming languages such as Haskell. In this paper, we describe how they are implemented as a library in Scala, a functional object-oriented language. Thanks to Scala’s flexible syntax, we are able to closely approximate the EBNF notation supported by dedicated parser generators. For the uninitiated, we first explain the concept of parser combinators by developing a minimal library from scratch. We then turn to the existing Scala library, and discuss its features using various examples.
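To make the idea of a minimal combinator library concrete, here is a small sketch in Python. It is not the Scala library's API and not the paper's code: the names (`char`, `seq`, `alt`, `many`, `fmap`) and the convention that a parser maps `(text, pos)` to `(value, new_pos)` or `None` are assumptions made for this example.

```python
def char(pred):
    """Parser that accepts one character satisfying pred."""
    def parse(text, pos):
        if pos < len(text) and pred(text[pos]):
            return text[pos], pos + 1
        return None
    return parse


def seq(*parsers):
    """Run parsers one after another; succeed only if all succeed."""
    def parse(text, pos):
        values = []
        for p in parsers:
            result = p(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return parse


def alt(*parsers):
    """Try parsers in order and return the first success."""
    def parse(text, pos):
        for p in parsers:
            result = p(text, pos)
            if result is not None:
                return result
        return None
    return parse


def many(p):
    """Apply p zero or more times, collecting the results."""
    def parse(text, pos):
        values = []
        while True:
            result = p(text, pos)
            # Stop on failure, or if p succeeded without consuming input.
            if result is None or result[1] == pos:
                return values, pos
            value, pos = result
            values.append(value)
    return parse


def fmap(p, f):
    """Transform the value produced by p with f."""
    def parse(text, pos):
        result = p(text, pos)
        if result is None:
            return None
        value, new_pos = result
        return f(value), new_pos
    return parse


# Grammar, EBNF style:  number = digit {digit} ;  sum = number {"+" number} ;
digit = char(str.isdigit)
number = fmap(seq(digit, many(digit)), lambda v: int(v[0] + "".join(v[1])))
plus = char(lambda c: c == "+")
sum_expr = fmap(seq(number, many(seq(plus, number))),
                lambda v: v[0] + sum(n for _, n in v[1]))
```

Each grammar rule becomes an ordinary value built from smaller parsers, which is what lets the definitions read close to EBNF.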
Validating LR(1) Parsers
Cited by 7 (1 self)
An LR(1) parser is a finite-state automaton, equipped with a stack, which uses a combination of its current state and one lookahead symbol in order to determine which action to perform next. We present a validator which, when applied to a context-free grammar G and an automaton A, checks that A and G agree. Validating the parser provides the correctness guarantees required by verified compilers and other high-assurance software that involves parsing. The validation process is independent of which technique was used to construct A. The validator is implemented and proved correct using the Coq proof assistant. As an application, we build a formally-verified parser for the C99 language.
Direct Left-Recursive Parsing Expressing Grammars
Cited by 2 (0 self)
Parsing Expression Grammars (PEGs) are specifications of unambiguous recursive-descent style parsers. PEGs incorporate both lexing and parsing phases and have valuable properties, such as being closed under composition. In common with most recursive-descent systems, raw PEGs cannot handle left-recursion; traditional approaches to left-recursion elimination lead to incorrect parses. In this paper, I show how the approach proposed for direct left-recursive Packrat parsing by Warth et al. can be adapted for ‘pure’ PEGs. I then demonstrate that this approach results in incorrect parses for some PEGs, before outlining a restrictive subset of left-recursive PEGs which can safely work with this algorithm. Finally I suggest an alteration to Warth et al.’s algorithm that can correctly parse a less restrictive subset of directly recursive PEGs.
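The "grow the seed" idea usually associated with direct left-recursive Packrat parsing can be sketched for a single rule as follows. This is a heavily simplified illustration in Python, not Warth et al.'s full algorithm and not the alteration this abstract proposes; the rule `expr <- expr '-' num / num` and all function names are assumptions made for the example.

```python
def parse_num(text, pos):
    """num <- [0-9]+ ; returns (value, end) or None."""
    end = pos
    while end < len(text) and text[end].isdigit():
        end += 1
    return (int(text[pos:end]), end) if end > pos else None


def make_expr_parser(text):
    memo = {}  # position -> best result so far: (value, end) or None

    def expr(pos):
        if pos in memo:
            return memo[pos]
        # Seed: the left-recursive call at this position fails at first,
        # so the body falls through to its non-recursive alternative.
        memo[pos] = None
        while True:
            result = expr_body(pos)
            best = memo[pos]
            # Stop once re-running the body no longer consumes more input.
            if result is None or (best is not None and result[1] <= best[1]):
                return memo[pos]
            memo[pos] = result  # grow the seed and try again

    def expr_body(pos):
        # expr <- expr '-' num / num   (ordered choice)
        left = expr(pos)  # answered from the memo table, never recursing
        if left is not None:
            value, p = left
            if p < len(text) and text[p] == "-":
                right = parse_num(text, p + 1)
                if right is not None:
                    return value - right[0], right[1]
        return parse_num(text, pos)

    return expr
```

On the input `7-2-1` the seed grows from `7` to `7-2` to `7-2-1`, so the subtraction associates to the left. The sketch handles only this one directly recursive rule; the abstract's point is that the general approach needs care to avoid incorrect parses.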
Left Recursion in Parsing Expression Grammars
Cited by 1 (1 self)
Parsing Expression Grammars (PEGs) are a formalism that can describe all deterministic context-free languages through a set of rules that specify a top-down parser for some language. PEGs are easy to use, and there are efficient implementations of PEG libraries in several programming languages. A frequently missed feature of PEGs is left recursion, which is commonly used in Context-Free Grammars (CFGs) to encode left-associative operations. We present a simple conservative extension to the semantics of PEGs that gives useful meaning to direct and indirect left-recursive rules, and show that our extensions make it easy to express left-recursive idioms from CFGs in PEGs, with similar results. We prove the conservativeness of these extensions, and also prove that they work with any left-recursive PEG.
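For context on why left recursion is missed, the common workaround in a PEG without it is to replace the left-recursive CFG rule `expr -> expr '-' num | num` with repetition and rebuild the left-associative tree by hand. The Python sketch below shows that workaround only, not the extension this abstract proposes; the rule and the function name `parse_sub_chain` are assumptions made for the example.

```python
def parse_sub_chain(text):
    """Parse  expr <- num ('-' num)*  and fold the operands to the left."""
    pos = 0

    def num():
        nonlocal pos
        start = pos
        while pos < len(text) and text[pos].isdigit():
            pos += 1
        if pos == start:
            raise ValueError(f"expected a digit at position {start}")
        return int(text[start:pos])

    tree = num()
    while pos < len(text) and text[pos] == "-":
        pos += 1
        # Nest the tree built so far as the LEFT operand of the new node.
        tree = ("-", tree, num())
    if pos != len(text):
        raise ValueError(f"unexpected input at position {pos}")
    return tree
```

The grammar itself no longer says that `-` is left-associative; that fact lives in the fold, which is the kind of indirection a left-recursive rule would avoid.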
Automating Grammar Comparison
The Design and Implementation of Object Grammars
An Object Grammar is a variation on traditional BNF grammars, where the notation is extended to support declarative bidirectional mappings between text and object graphs. The two directions for interpreting Object Grammars are parsing and formatting. Parsing transforms text into an object graph by recognizing syntactic features and creating the corresponding object structure. In the reverse direction, formatting recognizes object graph features and generates an appropriate textual presentation. The key to Object Grammars is the expressive power of the mapping, which decouples the syntactic structure from the graph structure. To handle graphs, Object Grammars support declarative annotations for resolving textual names that refer to arbitrary objects in the graph structure. Predicates on the semantic structure provide additional control over the mapping. Furthermore, Object Grammars are compositional so that languages may be defined in a modular fashion. We have implemented our approach to Object Grammars as one of the foundations of the Ensō system and illustrate the utility of our approach by showing how it enables definition and composition of domain-specific languages (DSLs).
Parsing with Derivatives A Functional Pearl Matthew
We present a functional approach to parsing unrestricted context-free grammars based on Brzozowski’s derivative of regular expressions. If we consider context-free grammars as recursive regular expressions, Brzozowski’s equational theory extends without modification to context-free grammars (and it generalizes to parser combinators). The supporting actors in this story are three concepts familiar to functional programmers: laziness, memoization and fixed points; these allow Brzozowski’s original equations to be transliterated into purely functional code in about 30 lines spread over three functions. Yet, this almost impossibly brief implementation has a drawback: its performance is sour, in both theory and practice. The culprit? Each derivative can double the size of a grammar, and with it, the cost of the next derivative. Fortunately, much of the new structure inflicted by the derivative is either dead on arrival, or it dies after the very next derivative. To eliminate it, we once again exploit laziness and memoization to transliterate an equational theory that prunes such debris into working code. Thanks to this compaction, parsing times become reasonable in practice. We equip the functional programmer with two equational theories that, when combined, make for an abbreviated understanding and implementation of a system for parsing context-free languages.
Towards Dynamically Extensible Syntax Tijs van der Storm
Domain specific language embedding requires either a very flexible host language to approximate the desired level of abstraction, or elaborate tool support for “compiling away” embedded notation. The former confines the language designer to reinterpreting a given syntax. The latter prohibits runtime analysis, interpretation and transformation of the embedded language. I tentatively present CherryLisp: a Lisp dialect with dynamically user-definable syntax that suffers from neither of these drawbacks. Jan Heering often speaks fondly of Lisp and much of his research has been dedicated to programming environments, syntax, semantics, and domain specific languages. On the occasion of his retirement I would like to honour him and his work with this extended abstract which touches upon some of these subjects.
rmod.lille.inria.fr
2012
HAL is a multidisciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.