Accurate Unlexicalized Parsing
- In Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics, 2003
Cited by 422 (50 self)
We demonstrate that an unlexicalized PCFG can parse much more accurately than previously shown, by making use of simple, linguistically motivated state splits, which break down false independence assumptions latent in a vanilla treebank grammar. Indeed, its
Parsing the Wall Street Journal using a Lexical-Functional Grammar and Discriminative Estimation Techniques
- In Proceedings of the 40th Meeting of the ACL, 2002
Cited by 95 (8 self)
We present a stochastic parsing system consisting of a Lexical-Functional Grammar (LFG), a constraint-based parser and a stochastic disambiguation model. We report on the results of applying this system to parsing the UPenn Wall Street Journal (WSJ) treebank. The model combines full and partial parsing techniques to reach full grammar coverage on unseen data. The treebank annotations are used to provide partially labeled data for discriminative statistical estimation using exponential models. Disambiguation performance is evaluated by measuring matches of predicate-argument relations on two distinct test sets. On a gold standard of manually annotated f-structures for a subset of the WSJ treebank, this evaluation reaches 79% F-score. An evaluation on a gold standard of dependency relations for Brown corpus data achieves 76% F-score.
Generative Models for Statistical Parsing with Combinatory Categorial Grammar
- In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, 2002
Cited by 60 (7 self)
This paper compares a number of generative probability models for a wide-coverage Combinatory Categorial Grammar (CCG) parser. These models are trained and tested on a corpus obtained by translating the Penn Treebank trees into CCG normal-form derivations. According to an evaluation of unlabeled word-word dependencies, our best model achieves a performance of 89.9%, comparable to the figures given by Collins (1999) for a linguistically less expressive grammar. In contrast to Gildea (2001), we find a significant improvement from modeling word-word dependencies.
Parsing biomedical literature
- In Proceedings of the Second International Joint Conference on Natural Language Processing (IJCNLP-05), Jeju Island, Korea, 2005
Cited by 35 (2 self)
We present a preliminary study of several parser adaptation techniques evaluated on the GENIA corpus of MEDLINE abstracts [1, 2]. We begin by observing that the Penn Treebank (PTB) is lexically impoverished when measured on various genres of scientific and technical writing, and that this significantly impacts parse accuracy. To resolve this without requiring in-domain treebank data, we show how existing domain-specific lexical resources may be leveraged to augment PTB-training: part-of-speech tags, dictionary collocations, and named entities. Using a state-of-the-art statistical parser [3] as our baseline, our lexically-adapted parser achieves a 14.2% reduction in error. With oracle knowledge of named entities, this error reduction improves to 21.2%.
Comparison of Evaluation Metrics for a Broad Coverage Parser
- LREC Workshop: Beyond PARSEVAL Towards Improved Evaluation Measures for Parsing Systems, 2002
Cited by 29 (4 self)
This paper reports on the use of two distinct evaluation metrics for assessing a stochastic parsing model consisting of a broad-coverage Lexical-Functional Grammar (LFG), an efficient constraint-based parser and a stochastic disambiguation model. The first evaluation metric measures matches of predicate-argument relations in LFG f-structures (henceforth the LFG annotation scheme) to a gold standard of manually annotated f-structures for a subset of the UPenn Wall Street Journal treebank. The other metric maps predicate-argument relations in LFG f-structures to dependency relations (henceforth DR annotations) as proposed by Carroll et al. (1999). For evaluation, these relations are matched against Carroll et al.’s gold standard, which was manually annotated on a subset of the Brown corpus. The parser plus stochastic disambiguator gives an F-measure of 79% (LFG) or 73% (DR) on the WSJ test set. This shows that the two evaluation schemes are similar in spirit, although accuracy is impaired systematically by mapping one annotation scheme to the other. A systematic loss of accuracy is incurred also by corpus variation: training the stochastic disambiguation model on WSJ data and testing on Carroll et al.’s Brown corpus data yields an F-score of 74% (DR) for dependency-relation match. A variant of this measure comparable to the measure reported by Carroll et al. yields an F-measure of 76%. We examine divergences between annotation schemes aiming at a future improvement of methods for assessing parser quality.
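The relation-matching F-measure mentioned in this abstract can be illustrated with a minimal sketch. It assumes that each relation is a (label, head, dependent) triple and that a match means exact equality of triples; the function name `relation_f_score` and the toy triples are hypothetical and are not taken from the paper, whose actual matching criteria may differ.

```python
from typing import Iterable, Tuple

# Hypothetical representation: (relation label, head word, dependent word).
Relation = Tuple[str, str, str]


def relation_f_score(
    predicted: Iterable[Relation], gold: Iterable[Relation]
) -> Tuple[float, float, float]:
    """Return (precision, recall, F-score) for exact-match relation triples."""
    pred_set, gold_set = set(predicted), set(gold)
    matched = len(pred_set & gold_set)
    precision = matched / len(pred_set) if pred_set else 0.0
    recall = matched / len(gold_set) if gold_set else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    # Balanced F-score: harmonic mean of precision and recall.
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Toy data, invented for illustration only.
gold = [
    ("subj", "sleep", "dog"),
    ("det", "dog", "the"),
    ("adjunct", "sleep", "soundly"),
]
predicted = [
    ("subj", "sleep", "dog"),
    ("det", "dog", "the"),
    ("obj", "sleep", "soundly"),  # wrong label, so it does not match
]

precision, recall, f_score = relation_f_score(predicted, gold)
print(f"P={precision:.3f} R={recall:.3f} F={f_score:.3f}")
```

Mapping one annotation scheme onto another, as the abstract describes, would change which triples count as equal; under this sketch any systematic mismatch in labels lowers both precision and recall.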
A distributional analysis of a lexicalized statistical parsing model
- In EMNLP, 2004
Cited by 14 (0 self)
This paper presents some of the first data visualizations and analysis of distributions for a lexicalized statistical parsing model, in order to better understand their nature. In the course of this analysis, we have paid particular attention to parameters that include bilexical dependencies. The prevailing view has been that such statistics are very informative but suffer greatly from sparse data problems. By using a parser to constrain-parse its own output, and by hypothesizing and testing for distributional similarity with back-off distributions, we have evidence that finally explains that (a) bilexical statistics are actually getting used quite often but that (b) the distributions are so similar to those that do not include head words as to be nearly indistinguishable insofar as making parse decisions. Finally, our analysis has provided for the first time an effective way to do parameter selection for a generative lexicalized statistical parsing model.
Evaluating the Accuracy of an Unlexicalized Statistical Parser on the PARC DepBank
- In Proceedings of the Poster Session of COLING/ACL-06, 2006
Cited by 13 (1 self)
We evaluate the accuracy of an unlexicalized statistical parser, trained on 4K treebanked sentences from balanced data and tested on the PARC DepBank. We demonstrate that a parser which is competitive in accuracy (without sacrificing processing speed) can be quickly tuned without reliance on large in-domain manually constructed treebanks. This makes it more practical to use statistical parsers in applications that need access to aspects of predicate-argument structure. The comparison of systems using DepBank is not straightforward, so we extend and validate DepBank and highlight a number of representation and scoring issues for relational evaluation schemes.
QuestionBank: Creating a Corpus of Parse-Annotated Questions
- In Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the ACL (COLING-ACL-06), 2006
Cited by 12 (2 self)
This paper describes the development of QuestionBank, a corpus of 4000 parse-annotated questions for (i) use in training parsers employed in QA, and (ii) evaluation of question parsing. We present a series of experiments to investigate the effectiveness of QuestionBank as both an exclusive and supplementary training resource for a state-of-the-art parser in parsing both question and non-question test sets. We introduce a new method for recovering empty nodes and their antecedents (capturing long distance dependencies) from parser output in CFG trees using LFG f-structure reentrancies. Our main findings are (i) using QuestionBank training data improves parser performance to 89.75% labelled bracketing f-score, an increase of almost 11% over the baseline; (ii) back-testing experiments on non-question data (Penn-II WSJ Section 23) show that the retrained parser does not suffer a performance drop on non-question material; (iii) ablation experiments show that the size of training material provided by QuestionBank is sufficient to achieve optimal results; (iv) our method for recovering empty nodes captures long distance dependencies in questions from the ATIS corpus with high precision (96.82%) and low recall (39.38%). In summary, QuestionBank provides a useful new resource in parser-based QA research.
Uptraining for Accurate Deterministic Question Parsing
Cited by 9 (4 self)
It is well known that parsing accuracies drop significantly on out-of-domain data. What is less known is that some parsers suffer more from domain shifts than others. We show that dependency parsers have more difficulty parsing questions than constituency parsers. In particular, deterministic shift-reduce dependency parsers, which are of highest interest for practical applications because of their linear running time, drop to 60% labeled accuracy on a question test set. We propose an uptraining procedure in which a deterministic parser is trained on the output of a more accurate, but slower, latent variable constituency parser (converted to dependencies). Uptraining with 100K unlabeled questions achieves results comparable to having 2K labeled questions for training. With 100K unlabeled and 2K labeled questions, uptraining is able to improve parsing accuracy to 84%, closing the gap between in-domain and out-of-domain performance.
Supertagging and full parsing
- In Proceedings of the Workshop on Tree Adjoining Grammar and Related Formalisms (TAG+7), 2004
Cited by 9 (3 self)
We investigate an approach to parsing in which lexical information is used only in a first phase, supertagging, in which lexical syntactic properties are determined without building structure. In the second phase, the best parse tree is determined without using lexical information. We investigate different probabilistic models for adjunction, and we show that, assuming hypothetically perfect performance in the first phase, the error rate on dependency arc attachment can be reduced to 2.3% using a full chart parser. This is an improvement of about 50% over previously reported results using a simple heuristic parser.

