Coarse-to-fine n-best parsing and MaxEnt discriminative reranking
In ACL, 2005
Cited by 261 (13 self)
Discriminative reranking is one method for constructing high-performance statistical parsers (Collins, 2000). A discriminative reranker requires a source of candidate parses for each sentence. This paper describes a simple yet novel method for constructing sets of 50-best parses based on a coarse-to-fine generative parser (Charniak, 2000). This method generates 50-best lists that are of substantially higher quality than previously obtainable.
Intricacies of Collins' Parsing Model
Computational Linguistics
Cited by 87 (1 self)
This paper documents a large set of heretofore unpublished details Collins used in his parser, such that, along with Collins' thesis (Collins, 1999), this paper contains all information necessary to duplicate Collins' benchmark results. Indeed, these as-yet-unpublished details account for an 11% relative reduction in error between a clean-room implementation of Collins' model and an implementation including all details. We also show a cleaner and equally well-performing method for the handling of punctuation and conjunction, and reveal certain other probabilistic oddities about Collins' parser. We analyze not only the effect of the unpublished details, but also re-analyze the effect of certain well-known details, revealing that bilexical dependencies are barely used by the model and that head choice is not nearly as important to overall parsing performance as once thought. Finally, we perform experiments that show that the true discriminative power of lexicalization appears to lie in the fact that unlexicalized syntactic structures are generated conditioning on the head word and its part of speech.
A novel use of statistical parsing to extract information from text
ANLP, 2000
Cited by 78 (4 self)
Since 1995, a few statistical parsing algorithms have demonstrated a breakthrough in parsing accuracy, as measured against the UPenn TREEBANK as a gold standard. In this paper we report adapting a lexicalized, probabilistic context-free parser to information extraction and evaluate this new technique on MUC-7 template elements and template relations.
Parsing Inside-Out
1998
Cited by 65 (2 self)
Probabilistic Context-Free Grammars (PCFGs) and variations on them have recently become some of the most common formalisms for parsing. It is common with PCFGs to compute the inside and outside probabilities. When these probabilities are multiplied together and normalized, they produce the probability that any given non-terminal covers any piece of the input sentence. The traditional use of these probabilities is to improve the probabilities of grammar rules. In this thesis we show that these values are useful for solving many other problems in Statistical Natural Language Processing. We give a framework for describing parsers. The framework generalizes the inside and outside values to semirings. It makes it easy to describe parsers that compute a wide variety of interesting quantities, including the inside and outside probabilities, as well as related quantities such as Viterbi probabilities and n-best lists. We also present three novel uses for the inside and outside probabilities. T...
A* Parsing: Fast Exact Viterbi Parse Selection
In Proceedings of the Human Language Technology Conference and the North American Association for Computational Linguistics (HLT-NAACL), 2003
Cited by 65 (3 self)
We present an extension of the classic A* search procedure to tabular PCFG parsing. The use of A* search can dramatically reduce the time required to find a best parse by conservatively estimating the probabilities of parse completions. We discuss various estimates and give efficient algorithms for computing them. On average-length Penn treebank sentences, our most detailed estimate reduces the total number of edges processed to less than 3% of that required by exhaustive parsing, and a simpler estimate, which requires less than a minute of precomputation, reduces the work to less than 5%. Unlike best-first and finite-beam methods for achieving this kind of speed-up, an A* method is guaranteed to find the most likely parse, not just an approximation. Our parser
Probabilistic Top-Down Parsing and Language Modeling
Computational Linguistics, 2004
Cited by 54 (1 self)
This paper describes the functioning of a broad-coverage probabilistic top-down parser, and its application to the problem of language modeling for speech recognition. The paper first introduces key notions in language modeling and probabilistic parsing, and briefly reviews some previous approaches to using syntactic structure for language modeling. A lexicalized probabilistic top-down parser is then presented, which performs very well, in terms of both the accuracy of returned parses and the efficiency with which they are found, relative to the best broad-coverage statistical parsers. A new language model that utilizes probabilistic top-down parsing is then outlined, and empirical results show that it improves upon previous work in test corpus perplexity. Interpolation with a trigram model yields an exceptional improvement relative to the improvement observed by other models, demonstrating the degree to which the information captured by our parsing model is orthogonal to that captured by a trigram model. A small recognition experiment also demonstrates the utility of the model.
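The abstract above does not say how the parser-based model is combined with the trigram model. As a rough illustration only, the sketch below assumes plain linear interpolation of per-word probabilities and computes perplexity from them. The function names, the weight `lam`, and the probabilities are hypothetical and are not taken from the paper.

```python
import math


def interpolate(p_parser: float, p_trigram: float, lam: float) -> float:
    """Linearly mix two per-word probabilities (assumed scheme, not the paper's)."""
    return lam * p_parser + (1.0 - lam) * p_trigram


def perplexity(word_probs: list[float]) -> float:
    """Perplexity = exp of the negative mean log-probability per word."""
    log_sum = sum(math.log(p) for p in word_probs)
    return math.exp(-log_sum / len(word_probs))


# Hypothetical per-word probabilities from each model for a 3-word test string.
parser_probs = [0.20, 0.05, 0.10]
trigram_probs = [0.10, 0.15, 0.10]
mixed = [interpolate(p, t, lam=0.5) for p, t in zip(parser_probs, trigram_probs)]
print(perplexity(mixed))
```

When the two models make different errors, the mixture can reach a lower perplexity than either model alone, which is the sense in which their information is described as orthogonal.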
Semiring Parsing
Computational Linguistics, 1999
Cited by 50 (1 self)
this paper is that all five of these commonly computed quantities can be described as elements of complete semirings (Kuich 1997). The relationship between grammars and semirings was discovered by Chomsky and Schützenberger (1963), and for parsing with the CKY algorithm, dates back to Teitelbaum (1973). A complete semiring is a set of values over which a multiplicative operator and a commutative additive operator have been defined, and for which infinite summations are defined. For parsing algorithms satisfying certain conditions, the multiplicative and additive operations of any complete semiring can be used in place of ∧ and ∨, and correct values will be returned. We will give a simple normal form for describing parsers, then precisely define complete semirings, and the conditions for correctness.
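The idea of swapping semiring operations into a single parser can be sketched as follows. This is a minimal illustration, not the paper's formulation: the `Semiring` class, its `lift` field, the `cky` function, and the toy grammar (S -> S S with probability 0.25, S -> a with probability 0.75) are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Semiring:
    plus: Callable[[Any, Any], Any]   # commutative additive operator
    times: Callable[[Any, Any], Any]  # multiplicative operator
    zero: Any                         # identity element for plus
    lift: Callable[[float], Any]      # maps a rule probability into the semiring


BOOLEAN = Semiring(lambda a, b: a or b, lambda a, b: a and b, False, lambda p: p > 0)
INSIDE = Semiring(lambda a, b: a + b, lambda a, b: a * b, 0.0, lambda p: p)
VITERBI = Semiring(max, lambda a, b: a * b, 0.0, lambda p: p)

# Hypothetical toy grammar in Chomsky normal form.
BINARY = {("S", "S", "S"): 0.25}   # (parent, left child, right child) -> probability
LEXICAL = {("S", "a"): 0.75}       # (parent, word) -> probability
SENTENCE = ["a", "a", "a"]


def cky(words, binary, lexical, start, sr):
    """CKY over an arbitrary semiring; returns the value of `start` over the whole input."""
    n = len(words)
    chart = defaultdict(lambda: sr.zero)  # (i, j, symbol) -> semiring value
    for i, word in enumerate(words):
        for (parent, w), p in lexical.items():
            if w == word:
                chart[i, i + 1, parent] = sr.plus(chart[i, i + 1, parent], sr.lift(p))
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for (parent, left, right), p in binary.items():
                    value = sr.times(sr.lift(p), sr.times(chart[i, k, left], chart[k, j, right]))
                    chart[i, j, parent] = sr.plus(chart[i, j, parent], value)
    return chart[0, n, start]


for name, sr in [("boolean", BOOLEAN), ("inside", INSIDE), ("viterbi", VITERBI)]:
    print(name, cky(SENTENCE, BINARY, LEXICAL, "S", sr))
```

The same loop yields recognition, the total probability of the string, or the probability of the best parse, depending only on which `Semiring` instance is passed in. The three-word toy sentence has two parses of equal probability, so the inside value is twice the Viterbi value.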
Edge-Based Best-First Chart Parsing
In Proceedings of the Sixth Workshop on Very Large Corpora, 1998
Cited by 45 (4 self)
Best-first probabilistic chart parsing attempts to parse efficiently by working on edges that are judged 'best' by some probabilistic figure of merit (FOM). Recent work has used probabilistic context-free grammars (PCFGs) to assign probabilities to constituents, and to use these probabilities as the starting point for the FOM. This paper extends this approach to using a probabilistic FOM to judge edges (incomplete constituents), thereby giving a much finer-grained control over parsing effort. We show how this can be accomplished in a particularly simple way using the common idea of binarizing the PCFG. The results obtained are about a factor of twenty improvement over the best prior results -- that is, our parser achieves equivalent results using one twentieth the number of edges. Furthermore we show that this improvement is obtained with parsing precision and recall levels superior to those achieved by exhaustive parsing.
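The abstract does not specify which binarization scheme the parser uses. As a hedged illustration of the general idea, the sketch below applies a simple left-factoring: every rule with more than two right-hand-side symbols is split into binary rules through new intermediate symbols, which play the role of incomplete constituents. The `binarize` function, the bracketed naming of intermediate symbols, and the example rule are assumptions for this example.

```python
def binarize(rules):
    """Left-factor PCFG rules so that every right-hand side has at most two symbols.

    rules: iterable of (lhs, rhs_tuple, probability).
    Returns a dict mapping (lhs, rhs_tuple) -> probability. The original rule's
    probability stays on the top-level binary rule; each intermediate rule gets
    probability 1.0, so the probability of every derivation is unchanged.
    """
    out = {}
    for lhs, rhs, prob in rules:
        head, rhs, p = lhs, tuple(rhs), prob
        while len(rhs) > 2:
            # Intermediate symbol standing for "all but the last child".
            prefix = "<" + " ".join(rhs[:-1]) + ">"
            out[(head, (prefix, rhs[-1]))] = p
            head, rhs, p = prefix, rhs[:-1], 1.0
        out[(head, rhs)] = p
    return out


# Hypothetical rule: VP -> V NP PP with probability 0.3.
for rule, p in binarize([("VP", ("V", "NP", "PP"), 0.3)]).items():
    print(rule, p)
```

After binarization, a partially built VP that has found only `V NP` is itself a chart entry labeled `<V NP>`, so a figure of merit can be attached to it like any other constituent.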
Probabilistic Feature Grammars
In Proceedings of the International Workshop on Parsing Technologies, 1997
Cited by 35 (0 self)
We present a new formalism, probabilistic feature grammar (PFG). PFGs combine most of the best properties of several other formalisms, including those of Collins, Magerman, and Charniak, and in experiments have comparable or better performance. PFGs generate features one at a time, probabilistically, conditioning the probabilities of each feature on other features in a local context. Because the conditioning is local, efficient polynomial time parsing algorithms exist for computing inside, outside, and Viterbi parses. PFGs can produce probabilities of strings, making them potentially useful for language modeling. Precision and recall results are comparable to the state of the art with words, and the best reported without words.
1 Introduction
Recently, many researchers have worked on statistical parsing techniques which try to capture additional context beyond that of simple probabilistic context-free grammars (PCFGs), including Magerman (1995), Charniak (1996), Collins (1996; 1997), ...
Statistical Parsing With an Automatically-Extracted Tree Adjoining Grammar
2000
Cited by 33 (1 self)
We discuss the advantages of lexicalized tree-adjoining grammar as an alternative to lexicalized PCFG for statistical parsing, describing the induction of a probabilistic LTAG model from the Penn Treebank and evaluating its parsing performance. We find that this induction method is an improvement over the EM-based method of [Hwa, 1998], and that the induced model yields results comparable to lexicalized PCFG.

