Results 1 - 10
of
40
Head-Driven Statistical Models for Natural Language Parsing
, 2003
"... This article describes three statistical models for natural language parsing. The models extend methods from probabilistic context-free grammars to lexicalized grammars, leading to approaches in which a parse tree is represented as the sequence of decisions corresponding to a head-centered, top-down ..."
Abstract
-
Cited by 780 (13 self)
- Add to MetaCart
This article describes three statistical models for natural language parsing. The models extend methods from probabilistic context-free grammars to lexicalized grammars, leading to approaches in which a parse tree is represented as the sequence of decisions corresponding to a head-centered, top-down derivation of the tree. Independence assumptions then lead to parameters that encode the X-bar schema, subcategorization, ordering of complements, placement of adjuncts, bigram lexical dependencies, wh-movement, and preferences for close attachment. All of these preferences are expressed by probabilities conditioned on lexical heads. The models are evaluated on the Penn Wall Street Journal Treebank, showing that their accuracy is competitive with other models in the literature. To gain a better understanding of the models, we also give results on different constituent types, as well as a breakdown of precision/recall results in recovering various types of dependencies. We analyze various characteristics of the models through experiments on parsing accuracy, by collecting frequencies of various structures in the treebank, and through linguistically motivated examples. Finally, we compare the models to others that have been applied to parsing the treebank, aiming to give some explanation of the difference in performance of the various models
Calibrating features for semantic role labeling
- In Proceedings of EMNLP 2004
, 2004
"... This paper takes a critical look at the features used in the semantic role tagging literature and show that the information in the input, generally a syntactic parse tree, has yet to be fully exploited. We propose an additional set of features and our experiments show that these features lead to fai ..."
Abstract
-
Cited by 67 (4 self)
- Add to MetaCart
This paper takes a critical look at the features used in the semantic role tagging literature and show that the information in the input, generally a syntactic parse tree, has yet to be fully exploited. We propose an additional set of features and our experiments show that these features lead to fairly significant improvements in the tasks we performed. We further show that different features are needed for different subtasks. Finally, we show that by using a Maximum Entropy classifier and fewer features, we achieved results comparable with the best previously reported results obtained with SVM models. We believe this is a clear indication that developing features that capture the right kind of information is crucial to advancing the stateof-the-art in semantic analysis. 1
A study on convolution kernels for shallow semantic parsing
- in Proceedings of ACL’04
, 2004
"... In this paper we have designed and experimented novel convolution kernels for automatic classification of predicate arguments. Their main property is the ability to process structured representations. Support Vector Machines (SVMs), using a combination of such kernels and the flat feature kernel, cl ..."
Abstract
-
Cited by 56 (6 self)
- Add to MetaCart
In this paper we have designed and experimented novel convolution kernels for automatic classification of predicate arguments. Their main property is the ability to process structured representations. Support Vector Machines (SVMs), using a combination of such kernels and the flat feature kernel, classify Prop-Bank predicate arguments with accuracy higher than the current argument classification stateof-the-art. Additionally, experiments on FrameNet data have shown that SVMs are appealing for the classification of semantic roles even if the proposed kernels do not produce any improvement. 1
Cascaded Grammatical Relation Assignment
, 1999
"... In this paper we discuss cascaded Memory-Based grammatical relations assignment. In the first stages of the cascade, we find chunks of several types (NP,VP,ADJPADVP,PP) and label them with their adverbial function (e.g. local, temporal). In the last stage, we assign grammatical relations to pairs of ..."
Abstract
-
Cited by 49 (14 self)
- Add to MetaCart
In this paper we discuss cascaded Memory-Based grammatical relations assignment. In the first stages of the cascade, we find chunks of several types (NP,VP,ADJPADVP,PP) and label them with their adverbial function (e.g. local, temporal). In the last stage, we assign grammatical relations to pairs of chunks. We studied the effect of adding several levels to this cascaded classifier and we found that even the less peribrining chunkors enhanced the performance of the relation finder.
A Uniform Method of Grammar Extraction and Its Applications
, 2000
"... Grammars are core elements of many NLP applications. In this paper, we present a system that automatically extracts lexicalized grammars from annotated corpora. The data produced by this system have been used in several tasks, such as training NLP tools (such as Supertaggers) and estimating the cove ..."
Abstract
-
Cited by 27 (3 self)
- Add to MetaCart
Grammars are core elements of many NLP applications. In this paper, we present a system that automatically extracts lexicalized grammars from annotated corpora. The data produced by this system have been used in several tasks, such as training NLP tools (such as Supertaggers) and estimating the coverage of harid-crafted grammars. We report experimental results on two of those tasks and compare our approaches with related work.
Fast NP Chunking Using Memory-Based Learning Techniques
- In Proceedings of BENELEARN'98
, 1998
"... In this paper we discuss the application of Memory-Based Learning (MBL) to fast NP chunking. We first discuss the application of a fast decision tree variant of MBL (IGTree) on the dataset described in (Ramshaw and Marcus, 1995), which consists of roughly 50,000 test and 200,000 train items. In a se ..."
Abstract
-
Cited by 26 (1 self)
- Add to MetaCart
In this paper we discuss the application of Memory-Based Learning (MBL) to fast NP chunking. We first discuss the application of a fast decision tree variant of MBL (IGTree) on the dataset described in (Ramshaw and Marcus, 1995), which consists of roughly 50,000 test and 200,000 train items. In a second series of experiments we used an architecture of two cascaded IGTrees. In the second level of this cascaded classifier we added context predictions as extra features so that incorrect predictions from the first level can be corrected, yielding a 97.2% generalisation accuracy with training and testing times in the order of seconds to minutes. Submission Type: regular paper Topic Areas: robust parsing, NP chunking, memory-based learning Author of Record: Jorn Veenstra Under consideration for other conferences (specify)? no Fast NP Chunking Using Memory-Based Learning Techniques Abstract In this paper we discuss the application of Memory-Based Learning (MBL) to fast NP chunking. We fir...
Multidimensional Transformation-Based Learning
- Conference on Natural Language Learning
, 2001
"... This paper presents a novel method that allows a machine learning algorithm following the transformation-based learning paradigm (Brill, 1995) to be applied to multiple classication tasks by training jointly and simultaneously on all elds. The motivation for constructing such a system stems from the ..."
Abstract
-
Cited by 13 (1 self)
- Add to MetaCart
This paper presents a novel method that allows a machine learning algorithm following the transformation-based learning paradigm (Brill, 1995) to be applied to multiple classication tasks by training jointly and simultaneously on all elds. The motivation for constructing such a system stems from the observation that many tasks in natural language processing are naturally composed of multiple subtasks which need to be resolved simultaneously; also tasks usually learned in isolation can possibly benet from being learned in a joint framework, as the signals for the extra tasks usually constitute inductive bias.
Detecting Errors in Part-of-Speech Annotation
- In EACL ’03: Proceedings of the tenth conference on European chapter of the Association for Computational Linguistics
, 2003
"... We propose a new method for detecting errors in "gold-standard" part-ofspeech annotation. The approach locates errors with high precision based on n-grams occurring in the corpus with multiple taggings. Two further techniques, closed-class analysis and finitestate tagging guide patterns, are ..."
Abstract
-
Cited by 9 (0 self)
- Add to MetaCart
We propose a new method for detecting errors in "gold-standard" part-ofspeech annotation. The approach locates errors with high precision based on n-grams occurring in the corpus with multiple taggings. Two further techniques, closed-class analysis and finitestate tagging guide patterns, are discussed.
Relabeling syntax trees to improve syntax-based machine translation quality
- IN: PROC. HLT/NAACL, ASSOCIATION FOR COMPUTATIONAL LINGUISTICS
, 2006
"... We identify problems with the Penn Treebank that render it imperfect for syntaxbased machine translation and propose methods of relabeling the syntax trees to improve translation quality. We develop a system incorporating a handful of relabeling strategies that yields a statistically significant imp ..."
Abstract
-
Cited by 9 (0 self)
- Add to MetaCart
We identify problems with the Penn Treebank that render it imperfect for syntaxbased machine translation and propose methods of relabeling the syntax trees to improve translation quality. We develop a system incorporating a handful of relabeling strategies that yields a statistically significant improvement of 2.3 BLEU points over a baseline syntax-based system.
Discourse-new detectors for definite description resolution: a survey and preliminary proposal
- In Proceedings of the Refrence Resolution Workshop at ACL’04
, 2004
"... Vieira and Poesio (2000) proposed an algorithm for definite description (DD) resolution that incorporates a number of heuristics for detecting discoursenew descriptions. The inclusion of such detectors was motivated by the observation that more than 50 % of definite descriptions (DDs) in an average ..."
Abstract
-
Cited by 9 (1 self)
- Add to MetaCart
Vieira and Poesio (2000) proposed an algorithm for definite description (DD) resolution that incorporates a number of heuristics for detecting discoursenew descriptions. The inclusion of such detectors was motivated by the observation that more than 50 % of definite descriptions (DDs) in an average corpus are discourse new (Poesio and Vieira, 1998), but whereas the inclusion of detectors for non-anaphoric pronouns in algorithms such as Lappin and Leass ’ (1994) leads to clear improvements in precision, the improvements in anaphoric DD resolution (as opposed to classification) brought about by the detectors were rather small. In fact, Ng and Cardie (2002a) challenged the motivation for the inclusion of such detectors, reporting no improvements, or even worse performance. We re-examine the literature on the topic in detail, and propose a revised algorithm, taking advantage of the improved discourse-new detection techniques developed by Uryupina (2003). 1

