Results 1-10 of 33
Head-Driven Statistical Models for Natural Language Parsing
, 1999
"... Mitch Marcus was a wonderful advisor. He gave consistently good advice, and allowed an ideal level of intellectual freedom in pursuing ideas and research topics. I would like to thank the members of my thesis committee Aravind Joshi, Mark Liberman, Fernando Pereira and Mark Steedman  for the remar ..."
Abstract

Cited by 955 (16 self)
 Add to MetaCart
Mitch Marcus was a wonderful advisor. He gave consistently good advice, and allowed an ideal level of intellectual freedom in pursuing ideas and research topics. I would like to thank the members of my thesis committee Aravind Joshi, Mark Liberman, Fernando Pereira and Mark Steedman for the remarkable breadth and depth of their feedback. I had countless impromptu but influential discussions with Jason Eisner, Dan Melamed and Adwait Ratnaparkhi in the LINC lab. They also provided feedback on many drafts of papers and thesis chapters. Paola Merlo pushed me to think about many new angles of the research. Dimitrios Samaras gave invaluable feedback on many portions of the work. Thanks to James Brooks for his contribution to the work that comprises chapter 5 of this thesis. The community of faculty, students and visitors involved with the Institute for Research in Cognitive Science at Penn provided an intensely varied and stimulating environment. I would like to thank them collectively. Some deserve special mention for discussions that contributed quite directly to this research: Breck Baldwin, Srinivas Bangalore, Dan
Coarse-to-fine n-best parsing and MaxEnt discriminative reranking
 In ACL
, 2005
"... Discriminative reranking is one method for constructing highperformance statistical parsers (Collins, 2000). A discriminative reranker requires a source of candidate parses for each sentence. This paper describes a simple yet novel method for constructing sets of 50best parses based on a co ..."
Abstract

Cited by 385 (14 self)
 Add to MetaCart
Discriminative reranking is one method for constructing high-performance statistical parsers (Collins, 2000). A discriminative reranker requires a source of candidate parses for each sentence. This paper describes a simple yet novel method for constructing sets of 50-best parses based on a coarse-to-fine generative parser (Charniak, 2000). This method generates 50-best lists that are of substantially higher quality than previously obtainable.
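The reranking step the abstract describes can be sketched as scoring each candidate parse with a weighted feature vector and returning the argmax. This is a minimal illustration: the feature names, weights, and candidate parses below are invented for the example, whereas a real reranker learns the weights from treebank data and uses far richer features.

```python
# Toy discriminative reranker: each candidate parse carries a feature
# vector; a weight vector scores candidates and the highest-scoring
# parse wins. Features and weights here are illustrative only.

def rerank(candidates, weights):
    """candidates: list of (parse, {feature: value}); returns best parse."""
    def score(feats):
        return sum(weights.get(f, 0.0) * v for f, v in feats.items())
    return max(candidates, key=lambda c: score(c[1]))[0]

# In practice a 50-best list; two candidates suffice to show the mechanics.
candidates = [
    ("(S (NP he) (VP saw (NP her)))",         {"log_p": -10.2, "right_branch": 1.0}),
    ("(S (NP he) (VP (V saw) (NP (N her))))", {"log_p": -10.5, "right_branch": 2.0}),
]
weights = {"log_p": 1.0, "right_branch": 0.5}
best = rerank(candidates, weights)
```

Because the reranker can use arbitrary (even global) features of a complete parse, it can prefer a candidate the generative parser ranked lower, as happens here.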
Intricacies of Collins’ parsing model
, 2003
"... This article documents a large set of heretofore unpublished details Collins used in his parser, such that, along with Collins ’ (1999) thesis, this article contains all information necessary to duplicate Collins ’ benchmark results. Indeed, these asyetunpublished details account for an 11 % relat ..."
Abstract

Cited by 111 (1 self)
 Add to MetaCart
This article documents a large set of heretofore unpublished details Collins used in his parser, such that, along with Collins’ (1999) thesis, this article contains all information necessary to duplicate Collins’ benchmark results. Indeed, these as-yet-unpublished details account for an 11% relative increase in error from an implementation including all details to a clean-room implementation of Collins’ model. We also show a cleaner and equally well-performing method for the handling of punctuation and conjunction and reveal certain other probabilistic oddities about Collins’ parser. We not only analyze the effect of the unpublished details, but also reanalyze the effect of certain well-known details, revealing that bilexical dependencies are barely used by the model and that head choice is not nearly as important to overall parsing performance as once thought. Finally, we perform experiments that show that the true discriminative power of lexicalization appears to lie in the fact that unlexicalized syntactic structures are generated conditioning on the headword and its part of speech.
A novel use of statistical parsing to extract information from text
 ANLP
, 2000
"... Since 1995, a few statistical parsing algorithms have demonstrated a breakthrough in parsing accuracy, as measured against the UPenn TREEBANK as a gold standard. In this paper we report adapting a lexicalized, probabilistic contextfree parser to information extraction and evaluate this new techniqu ..."
Abstract

Cited by 91 (4 self)
 Add to MetaCart
Since 1995, a few statistical parsing algorithms have demonstrated a breakthrough in parsing accuracy, as measured against the UPenn Treebank as a gold standard. In this paper we report adapting a lexicalized, probabilistic context-free parser to information extraction and evaluate this new technique on MUC-7 template elements and template relations.
Parsing Inside-Out
, 1998
"... Probabilistic ContextFree Grammars (PCFGs) and variations on them have recently become some of the most common formalisms for parsing. It is common with PCFGs to compute the inside and outside probabilities. When these probabilities are multiplied together and normalized, they produce the probabili ..."
Abstract

Cited by 82 (2 self)
 Add to MetaCart
Probabilistic Context-Free Grammars (PCFGs) and variations on them have recently become some of the most common formalisms for parsing. It is common with PCFGs to compute the inside and outside probabilities. When these probabilities are multiplied together and normalized, they produce the probability that any given nonterminal covers any piece of the input sentence. The traditional use of these probabilities is to improve the probabilities of grammar rules. In this thesis we show that these values are useful for solving many other problems in Statistical Natural Language Processing. We give a framework for describing parsers. The framework generalizes the inside and outside values to semirings. It makes it easy to describe parsers that compute a wide variety of interesting quantities, including the inside and outside probabilities, as well as related quantities such as Viterbi probabilities and n-best lists. We also present three novel uses for the inside and outside probabilities. T...
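The inside probabilities the abstract refers to can be computed bottom-up with a CKY-style chart: inside(i, j, A) is the probability that nonterminal A derives words i..j-1, summed over all derivations. A minimal sketch, using an invented two-rule grammar in Chomsky normal form (not any grammar from the cited work):

```python
from collections import defaultdict

# Inside probabilities for a toy PCFG in CNF. chart[i, j, A] is the
# probability that A derives words[i:j]. Grammar and sentence are
# illustrative only.

rules = {  # (A, B, C): P(A -> B C)
    ("S", "NP", "VP"): 1.0,
    ("VP", "V", "NP"): 1.0,
}
lex = {  # (A, word): P(A -> word)
    ("NP", "she"): 0.5, ("V", "saw"): 1.0, ("NP", "stars"): 0.5,
}

def inside(words):
    n = len(words)
    chart = defaultdict(float)
    for i, w in enumerate(words):          # terminal rules
        for (A, word), p in lex.items():
            if word == w:
                chart[i, i + 1, A] += p
    for span in range(2, n + 1):           # wider spans from narrower ones
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):      # split point
                for (A, B, C), p in rules.items():
                    chart[i, j, A] += p * chart[i, k, B] * chart[k, j, C]
    return chart

chart = inside(["she", "saw", "stars"])
# chart[0, 3, "S"] is the total probability of the sentence under the grammar.
```

Replacing the sum with a max yields Viterbi probabilities instead, which is exactly the kind of generalization the thesis develops via semirings.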
A* Parsing: Fast Exact Viterbi Parse Selection
 IN PROCEEDINGS OF THE HUMAN LANGUAGE TECHNOLOGY CONFERENCE AND THE NORTH AMERICAN ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (HLT-NAACL)
, 2003
"... We present an extension of the classic A* search procedure to tabular PCFG parsing. The use of A* search can dramatically reduce the time required to find a best parse by conservatively estimating the probabilities of parse completions. We discuss various estimates and give efficient algorithms ..."
Abstract

Cited by 78 (3 self)
 Add to MetaCart
We present an extension of the classic A* search procedure to tabular PCFG parsing. The use of A* search can dramatically reduce the time required to find a best parse by conservatively estimating the probabilities of parse completions. We discuss various estimates and give efficient algorithms for computing them. On average-length Penn Treebank sentences, our most detailed estimate reduces the total number of edges processed to less than 3% of that required by exhaustive parsing, and a simpler estimate, which requires less than a minute of precomputation, reduces the work to less than 5%. Unlike best-first and finite-beam methods for achieving this kind of speedup, an A* method is guaranteed to find the most likely parse, not just an approximation. Our parser
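The agenda-based idea behind A* parsing can be sketched with a priority queue over chart edges (i, j, A), ordered by cost = -log(best inside probability so far) plus an admissible estimate h of the cost of completing a parse around the edge. This sketch uses h = 0, which degenerates to uniform-cost search but is still exact; the cited paper's speedups come from tighter admissible outside estimates. The grammar is a toy, not from the paper.

```python
import heapq, math

# Exact agenda-based Viterbi parsing sketch with an admissible heuristic.
# Toy CNF grammar; h defaults to 0 (uniform-cost search, still exact).

rules = {("S", "NP", "VP"): 1.0, ("VP", "V", "NP"): 1.0}
lex = {("NP", "she"): 0.5, ("V", "saw"): 1.0, ("NP", "stars"): 0.5}

def astar_viterbi(words, h=lambda i, j, A: 0.0):
    n = len(words)
    agenda, best = [], {}          # best: (i, j, A) -> settled Viterbi cost
    for i, w in enumerate(words):  # seed agenda with lexical edges
        for (A, word), p in lex.items():
            if word == w:
                c = -math.log(p)
                heapq.heappush(agenda, (c + h(i, i + 1, A), i, i + 1, A, c))
    while agenda:
        _, i, j, A, cost = heapq.heappop(agenda)
        if (i, j, A) in best:
            continue               # already settled at a cheaper cost
        best[i, j, A] = cost
        if (0, n, "S") in best:    # goal edge settled: guaranteed optimal
            return math.exp(-best[0, n, "S"])
        for (P, B, C), p in rules.items():
            if B == A:             # combine with settled edges on the right
                for (k, l, D), c2 in list(best.items()):
                    if k == j and D == C:
                        nc = cost + c2 - math.log(p)
                        heapq.heappush(agenda, (nc + h(i, l, P), i, l, P, nc))
            if C == A:             # combine with settled edges on the left
                for (k, l, D), c2 in list(best.items()):
                    if l == i and D == B:
                        nc = cost + c2 - math.log(p)
                        heapq.heappush(agenda, (nc + h(k, j, P), k, j, P, nc))
    return 0.0

prob = astar_viterbi(["she", "saw", "stars"])
```

Because the heuristic never overestimates the completion cost, the first time the goal edge is settled its score is the true Viterbi probability; a better h only changes how many edges are popped before that happens.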
Probabilistic Top-Down Parsing and Language Modeling
 Computational Linguistics
, 2004
"... This paper describes the functioning of a broadcoverage probabilistic topdown parser, and its application to the problem of language modeling for speech recognition. The paper first introduces key notions in language modeling and probabilistic parsing, and briefly reviews some previous approaches ..."
Abstract

Cited by 66 (1 self)
 Add to MetaCart
This paper describes the functioning of a broad-coverage probabilistic top-down parser, and its application to the problem of language modeling for speech recognition. The paper first introduces key notions in language modeling and probabilistic parsing, and briefly reviews some previous approaches to using syntactic structure for language modeling. A lexicalized probabilistic top-down parser is then presented, which performs very well, in terms of both the accuracy of returned parses and the efficiency with which they are found, relative to the best broad-coverage statistical parsers. A new language model that utilizes probabilistic top-down parsing is then outlined, and empirical results show that it improves upon previous work in test corpus perplexity. Interpolation with a trigram model yields an exceptional improvement relative to the improvement observed by other models, demonstrating the degree to which the information captured by our parsing model is orthogonal to that captured by a trigram model. A small recognition experiment also demonstrates the utility of the model
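The interpolation the abstract mentions is, in its simplest form, a linear mixture of the two models' word probabilities. A one-line sketch with an illustrative mixture weight (the paper's actual weighting scheme may differ):

```python
# Linear interpolation of a parser-based language model with a trigram
# model: P(w | history) = lam * P_parser + (1 - lam) * P_trigram.
# The value of lam and the probabilities below are illustrative.

def interp(p_parser, p_trigram, lam=0.4):
    return lam * p_parser + (1.0 - lam) * p_trigram

p = interp(0.02, 0.01)  # 0.4 * 0.02 + 0.6 * 0.01 = 0.014
```

The perplexity gain from such a mixture is largest when the two component models make errors in different places, which is the orthogonality claim the abstract makes.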
Semiring Parsing
 Computational Linguistics
, 1999
"... this paper is that all five of these commonly computed quantities can be described as elements of complete semirings (Kuich 1997). The relationship between grammars and semirings was discovered by Chomsky and Schtitzenberger (1963), and for parsing with the CKY algorithm, dates back to Teitelbaum ( ..."
Abstract

Cited by 64 (1 self)
 Add to MetaCart
this paper is that all five of these commonly computed quantities can be described as elements of complete semirings (Kuich 1997). The relationship between grammars and semirings was discovered by Chomsky and Schützenberger (1963), and for parsing with the CKY algorithm, dates back to Teitelbaum (1973). A complete semiring is a set of values over which a multiplicative operator and a commutative additive operator have been defined, and for which infinite summations are defined. For parsing algorithms satisfying certain conditions, the multiplicative and additive operations of any complete semiring can be used in place of × and +, and correct values will be returned. We will give a simple normal form for describing parsers, then precisely define complete semirings, and the conditions for correctness
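The substitution of semiring operations for × and + can be made concrete by parameterizing CKY over a (plus, times, zero) triple: the inside semiring sums over parses, while the Viterbi semiring takes the best one, with no other change to the algorithm. The grammar below is invented for illustration:

```python
from collections import defaultdict

# CKY parameterized by a semiring, in the spirit of the paper: swapping
# the (plus, times, zero) triple changes which quantity is computed.

INSIDE  = (lambda a, b: a + b, lambda a, b: a * b, 0.0)  # sum of parse probs
VITERBI = (max,                lambda a, b: a * b, 0.0)  # best parse prob

rules = {("S", "NP", "VP"): 0.9, ("S", "NP", "NP"): 0.1,
         ("VP", "V", "NP"): 1.0, ("NP", "NP", "NP"): 0.2}
lex = {("NP", "she"): 0.5, ("V", "saw"): 1.0, ("NP", "saw"): 0.2,
       ("NP", "stars"): 0.5}

def cky(words, semiring):
    plus, times, zero = semiring
    n = len(words)
    chart = defaultdict(lambda: zero)
    for i, w in enumerate(words):
        for (A, word), p in lex.items():
            if word == w:
                chart[i, i + 1, A] = plus(chart[i, i + 1, A], p)
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (A, B, C), p in rules.items():
                    val = times(p, times(chart[i, k, B], chart[k, j, C]))
                    chart[i, j, A] = plus(chart[i, j, A], val)
    return chart[0, n, "S"]

total = cky(["she", "saw", "stars"], INSIDE)   # sums over all parses
best  = cky(["she", "saw", "stars"], VITERBI)  # keeps only the best parse
```

The sentence is ambiguous under this grammar, so the inside value strictly exceeds the Viterbi value; with an unambiguous grammar the two semirings would return the same number.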
Edge-Based Best-First Chart Parsing
 IN PROCEEDINGS OF THE SIXTH WORKSHOP ON VERY LARGE CORPORA
, 1998
"... Bestfirst probabilistic chart parsing attempts to parse efficiently by working on edges that are judged 'best' by some probabilistic figure of merit (FOM). Recent work has used proba bilistic contextfree grammars (PCFGs) to sign probabilities to constituents, and to use these probabilities as the ..."
Abstract

Cited by 53 (4 self)
 Add to MetaCart
Best-first probabilistic chart parsing attempts to parse efficiently by working on edges that are judged 'best' by some probabilistic figure of merit (FOM). Recent work has used probabilistic context-free grammars (PCFGs) to assign probabilities to constituents, and to use these probabilities as the starting point for the FOM. This paper extends this approach to using a probabilistic FOM to judge edges (incomplete constituents), thereby giving a much finer-grained control over parsing effort. We show how this can be accomplished in a particularly simple way using the common idea of binarizing the PCFG. The results obtained are about a factor of twenty improvement over the best prior results; that is, our parser achieves equivalent results using one twentieth the number of edges. Furthermore, we show that this improvement is obtained with parsing precision and recall levels superior to those achieved by exhaustive parsing.
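Binarization, the transformation the abstract leans on, rewrites each n-ary rule as a chain of binary rules with intermediate symbols, so that an "incomplete constituent" is just an ordinary chart item for an intermediate symbol. A sketch under one common convention (left-to-right binarization; the intermediate-symbol naming is invented here):

```python
# Binarize an n-ary PCFG rule into an equivalent chain of binary rules.
# The original probability stays on the first rule; intermediate rules
# are deterministic. Intermediate-symbol names are illustrative.

def binarize(lhs, rhs, prob):
    """lhs -> rhs (tuple, len >= 2) becomes a list of binary rules."""
    rules = []
    cur_lhs, cur_prob = lhs, prob
    rest = list(rhs)
    while len(rest) > 2:
        head, rest = rest[0], rest[1:]
        new_sym = "@%s_%s" % (cur_lhs, "_".join(rest))
        rules.append((cur_lhs, (head, new_sym), cur_prob))
        cur_lhs, cur_prob = new_sym, 1.0
    rules.append((cur_lhs, tuple(rest), cur_prob))
    return rules

binary = binarize("VP", ("V", "NP", "PP"), 0.3)
# -> [("VP", ("V", "@VP_NP_PP"), 0.3), ("@VP_NP_PP", ("NP", "PP"), 1.0)]
```

Because each intermediate symbol stands for a partially built constituent, a FOM can now score these edges directly, which is what gives the paper's finer-grained control over parsing effort.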
Efficient Deformable Template Detection and Localization without User Initialization
, 1998
"... A novel deformable template is presented which detects the boundary of an open hand in a grayscale image without initialization by the user. A dynamic programming algorithm enhanced by pruning techniques finds the hand contour in the image in as little as 19 seconds on a Pentium 150. The template is ..."
Abstract

Cited by 44 (12 self)
 Add to MetaCart
A novel deformable template is presented which detects the boundary of an open hand in a grayscale image without initialization by the user. A dynamic programming algorithm enhanced by pruning techniques finds the hand contour in the image in as little as 19 seconds on a Pentium 150. The template is translation- and rotation-invariant and accommodates shape deformation, significant occlusion and background clutter, and the presence of multiple hands. 2 Symbols Boldface letters, e.g. x, denote vectors. P(x|y) denotes the conditional probability of x given y. √a denotes the square root of a. Σ denotes summation. Π denotes repeated product. ∫ denotes integration. ⊥ denotes "perpendicular to." <, > denote less than and greater than, respectively. ∇I(x) denotes the gradient of I with respect to x. ∝ denotes "proportional to." ≈ denotes "approximately equal to." argmax_x f(x) denotes the value of x that maximizes f(x). f ∗ g denotes the convolution of f with g. denotes the ...