Results 1 - 10 of 34
Head-Driven Statistical Models for Natural Language Parsing, 2003
Cited by 780 (13 self)
This article describes three statistical models for natural language parsing. The models extend methods from probabilistic context-free grammars to lexicalized grammars, leading to approaches in which a parse tree is represented as the sequence of decisions corresponding to a head-centered, top-down derivation of the tree. Independence assumptions then lead to parameters that encode the X-bar schema, subcategorization, ordering of complements, placement of adjuncts, bigram lexical dependencies, wh-movement, and preferences for close attachment. All of these preferences are expressed by probabilities conditioned on lexical heads. The models are evaluated on the Penn Wall Street Journal Treebank, showing that their accuracy is competitive with other models in the literature. To gain a better understanding of the models, we also give results on different constituent types, as well as a breakdown of precision/recall results in recovering various types of dependencies. We analyze various characteristics of the models through experiments on parsing accuracy, by collecting frequencies of various structures in the treebank, and through linguistically motivated examples. Finally, we compare the models to others that have been applied to parsing the treebank, aiming to give some explanation of the difference in performance of the various models.
Three Generative, Lexicalised Models for Statistical Parsing, 1997
Cited by 427 (7 self)
In this paper we first propose a new statistical parsing model, which is a generative model of lexicalised context-free grammar. We then extend the model to include a probabilistic treatment of both subcategorisation and wh-movement. Results on Wall Street Journal text show that the parser performs at 88.1/87.5% constituent precision/recall, an average improvement of 2.3% over (Collins 96).
Discriminative Reranking for Natural Language Parsing, 2005
Cited by 220 (8 self)
This article considers approaches which rerank the output of an existing probabilistic parser. The base parser produces a set of candidate parses for each input sentence, with associated probabilities that define an initial ranking of these parses. A second model then attempts to improve upon this initial ranking, using additional features of the tree as evidence. The strength of our approach is that it allows a tree to be represented as an arbitrary set of features, without concerns about how these features interact or overlap and without the need to define a derivation or a generative model which takes these features into account. We introduce a new method for the reranking task, based on the boosting approach to ranking problems described in Freund et al. (1998). We apply the boosting method to parsing the Wall Street Journal treebank. The method combined the log-likelihood under a baseline model (that of Collins [1999]) with evidence from an additional 500,000 features over parse trees that were not included in the original model. The new model achieved 89.75% F-measure, a 13% relative decrease in F-measure error over the baseline model’s score of 88.2%. The article also introduces a new algorithm for the boosting approach which takes advantage of the sparsity of the feature space in the parsing data. Experiments show significant efficiency gains for the new algorithm over the obvious implementation of the boosting approach. We argue that the method is an appealing alternative, in terms of both simplicity and efficiency, to work on feature selection methods within log-linear (maximum-entropy) models. Although the experiments in this article are on natural language parsing (NLP), the approach should be applicable to many other NLP problems which are naturally framed as ranking tasks, for example, speech recognition, machine translation, or natural language generation.
A novel use of statistical parsing to extract information from text - ANLP, 2000
Cited by 78 (4 self)
Since 1995, a few statistical parsing algorithms have demonstrated a breakthrough in parsing accuracy, as measured against the UPenn TREEBANK as a gold standard. In this paper we report adapting a lexicalized, probabilistic context-free parser to information extraction and evaluate this new technique on MUC-7 template elements and template relations.
Training and Scaling Preference Functions For Disambiguation - Computational Linguistics, 1994
Converting Dependency Structures to Phrase Structures, 2001
Cited by 36 (1 self)
In this paper, we address the relationship between dependency structures and phrase structures from a practical perspective; namely, the exploration of different algorithms that convert dependency structures to phrase structures and the evaluation of their performance against an existing Treebank. This work not only provides ways to convert Treebanks from one type of representation to the other, but also clarifies the differences in representational coverage of the two approaches.
An Efficient Implementation of a New DOP Model - In EACL, 2003
Cited by 27 (6 self)
Two apparently opposing DOP models exist in the literature: one which computes the parse tree involving the most frequent subtrees from a treebank and one which computes the parse tree involving the fewest subtrees from a treebank. This paper proposes an integration of the two models which outperforms each of them separately. Together with a PCFG-reduction of DOP we obtain improved accuracy and efficiency on the Wall Street Journal treebank. Our results show an 11% relative reduction in error rate over previous models, and an average processing time of 3.6 seconds per WSJ sentence.
What is the Minimal Set of Fragments that Achieves Maximal Parse Accuracy? - In Proceedings of ACL 2001, 2001
Cited by 24 (2 self)
We aim at finding the minimal set of fragments which achieves maximal parse accuracy in Data Oriented Parsing. Experiments with the Penn Wall Street Journal treebank show that counts of almost arbitrary fragments within parse trees are important, leading to improved parse accuracy over previous models tested on this treebank (a precision of 90.8% and a recall of 90.6%). We isolate some dependency relations which previous models neglect but which contribute to higher parse accuracy.
A Unified Model of Structural Organization in Language and Music, 2002
Cited by 23 (6 self)
Is there a general model that can predict the perceived phrase structure in language and music? While it is usually assumed that humans have separate faculties for language and music, this work focuses on the commonalities rather than on the differences between these modalities, aiming at finding a deeper "faculty". Our key idea is that the perceptual system strives for the simplest structure (the "simplicity principle"), but in doing so it is biased by the likelihood of previous structures (the "likelihood principle"). We present a series of data-oriented parsing (DOP) models that combine these two principles and that are tested on the Penn Treebank and the Essen Folksong Collection. Our experiments show that (1) a combination of the two principles outperforms the use of either of them, and (2) exactly the same model with the same parameter setting achieves maximum accuracy for both language and music. We argue that our results suggest an interesting parallel between linguistic and musical structuring.
Combining Semantic And Syntactic Structure For Language Modeling - Proceedings ICSLP-2000, 2000
Cited by 16 (6 self)
Structured language models for speech recognition have been shown to remedy the weaknesses of n-gram models. All current structured language models, however, are limited in that they do not take into account dependencies between non-headwords. We show that non-headword dependencies contribute significantly to improved word error rate, and that a data-oriented parsing model trained on semantically and syntactically annotated data can exploit these dependencies. This paper contains the first published experiments with a data-oriented parsing model trained by means of a maximum likelihood reestimation procedure.
1. INTRODUCTION
Structured language models for speech recognition have recently gained considerable interest. They have been shown to outperform the 3-gram language model on various domains and they can be efficiently parsed in a left-to-right manner (Chelba & Jelinek 1998; Chelba 2000). Although it has been reported that higher order n-gram models perform as well as structur...

