Statistical Parsing with an Automatically Extracted Tree Adjoining Grammar, in "Data-Oriented Parsing" (2003)

by D CHIANG

Results 1 - 10 of 19

An Efficient Implementation of a New DOP Model

by Rens Bod - In EACL, 2003
Cited by 27 (6 self)
Two apparently opposing DOP models exist in the literature: one which computes the parse tree involving the most frequent subtrees from a treebank and one which computes the parse tree involving the fewest subtrees from a treebank. This paper proposes an integration of the two models which outperforms each of them separately. Together with a PCFG-reduction of DOP we obtain improved accuracy and efficiency on the Wall Street Journal treebank. Our results show an 11% relative reduction in error rate over previous models, and an average processing time of 3.6 seconds per WSJ sentence.

What is the Minimal Set of Fragments that Achieves Maximal Parse Accuracy?

by Rens Bod - In Proceedings of ACL 2001, 2001
Cited by 24 (2 self)
We aim at finding the minimal set of fragments which achieves maximal parse accuracy in Data Oriented Parsing. Experiments with the Penn Wall Street Journal treebank show that counts of almost arbitrary fragments within parse trees are important, leading to improved parse accuracy over previous models tested on this treebank (a precision of 90.8% and a recall of 90.6%). We isolate some dependency relations which previous models neglect but which contribute to higher parse accuracy.

Efficient Parsing of DOP with PCFG-reductions

by Joshua Goodman, Rens Bod, Remko Scha, 2003
Cited by 17 (0 self)
Contents
R. Bod, R. Scha and K. Sima'an
PART I: The Basic Data-Oriented Parsing Model
1. A DOP model for phrase-structure trees R. Bod and R. Scha
2. Probability models for DOP
3. Encoding frequency information in stochastic parsing models J. Carroll and D. Weir
PART II: Computational Issues
1. Computational complexity of disambiguation under DOP
2. Parsing DOP with Monte Carlo techniques J. Chappelier and M. Rajman
3. Towards efficient Monte Carlo parsing
4. Efficient parsing of DOP with PCFG-reductions J. Goodman
5. An approximation of DOP through memory-based learning G. de Pauw
6. Compositional partial parsing by memory-based sequence learning I. Dagan and Y. Krymolowsky
PART III: Richer Models
1. A head-driven data-oriented approach to lexical dependency
2. A DOP model for Lexical-Functional Grammar representations R. Bod and R. Kaplan
3. A data-driven approach to Head-driven Phrase-Structure G. Neumann
4. Tree-Adjoining Grammars and its applic

Do All Fragments Count?

by Rens Bod - Natural Language Engineering, 2003
Cited by 4 (1 self)
We aim at finding the minimal set of fragments which achieves maximal parse accuracy in Data Oriented Parsing (DOP). Experiments with the Penn Wall Street Journal (WSJ) treebank show that counts of almost arbitrary fragments within parse trees are important, leading to improved parse accuracy over previous models tested on this treebank. We isolate a number of dependency relations which previous models neglect but which contribute to higher accuracy. We show that the history of statistical parsing models displays a tendency towards using more and larger fragments from training data.

Experimental Evaluation of LTAG-based Features for Semantic Role Labeling

by Yudong Liu, Anoop Sarkar
Cited by 4 (3 self)
This paper proposes the use of Lexicalized Tree-Adjoining Grammar (LTAG) formalism as an important additional source of features for the Semantic Role Labeling (SRL) task. Using a set of one-vs-all Support Vector Machines (SVMs), we evaluate these LTAG-based features. Our experiments show that LTAG-based features can improve SRL accuracy significantly. When compared with the best known set of features that are used in state of the art SRL systems we obtain an improvement in F-score from 82.34% to 85.25%.

TAG, Dynamic Programming, and the Perceptron for Efficient, Feature-rich Parsing

by Xavier Carreras, Michael Collins, Terry Koo, 2008
Cited by 3 (0 self)
We describe a parsing approach that makes use of the perceptron algorithm, in conjunction with dynamic programming methods, to recover full constituent-based parse trees. The formalism allows a rich set of parse-tree features, including PCFG-based features, bigram and trigram dependency features, and surface features. A severe challenge in applying such an approach to full syntactic parsing is the efficiency of the parsing algorithms involved. We show that efficient training is feasible, using a Tree Adjoining Grammar (TAG) based parsing formalism. A lower-order dependency parsing model is used to restrict the search space of the full model, thereby making it efficient. Experiments on the Penn WSJ treebank show that the model achieves state-of-the-art performance, for both constituent and dependency accuracy.

Combining Labeled and Unlabeled Data in Statistical Natural Language Parsing

by Anoop Sarkar, Mike Collins, Adwait Ratnaparkhi, Mark Dras, David Chiang, Dan Bikel, Tom Morton, 2002
Cited by 2 (1 self)
Prof. Aravind Joshi, my dissertation advisor, has been my guide and mentor for the entire time that I spent at Penn. I thank him for all his academic help and personal kindness. The external member on my dissertation committee was Steven Abney, whose suggestions and advice have made the ideas presented here stronger. My dissertation committee members from Penn, Mitch Marcus, Mark Liberman and Martha Palmer, provided questions whose answers shaped my dissertation proposal into the finished form in front of you. Many thanks to my academic collaborators: the work on prefix probabilities was done with Mark-Jan Nederhof and Giorgio Satta when they visited IRCS in 1998, and the work on subcategorization frame learning was done in collaboration with Daniel Zeman when he visited IRCS in 2000. Thanks to B. Srinivas, whose previous work provided the path to the experimental work in this dissertation. Thanks also to Paola Merlo and Suzanne Stevenson for discussions on their work on verb alternation classes. I also acknowledge the help of Woottiporn Tripasai in the extension of their work presented in this dissertation. Thanks to

Cross Parser Evaluation and Tagset Variation: a French Treebank Study

by Djamé Seddah, Marie C, Benoît Crabbé
Cited by 2 (1 self)
This paper presents preliminary investigations on the statistical parsing of French by bringing a complete evaluation on French data of the main probabilistic lexicalized and unlexicalized parsers first designed on the Penn Treebank. We adapted the parsers on the two existing treebanks of French (Abeillé et al., 2003; Schluter and van Genabith, 2007). To our knowledge, mostly all of the results reported here are state-of-the-art for the constituent parsing of French on every available treebank. Regarding the algorithms, the comparisons show that lexicalized parsing models are outperformed by the unlexicalized Berkeley parser. Regarding the treebanks, we observe that, depending on the parsing model, a tag set with specific features has direct influence over evaluation results. We show that the adapted lexicalized parsers do not share the same sensitivity towards the amount of lexical material used for training, thus questioning the relevance of using only one lexicalized model to study the usefulness of lexicalization for the parsing of French.

Towards Unifying Perception and Cognition: The Ubiquity of Trees. Prepublication

by Rens Bod, 2005
Cited by 1 (0 self)
Is there a single mechanism that underlies all perceptual and cognitive processing? This paper aims to solve a small part of Newell's challenge (A. Newell 1990, Unified Theories of Cognition, Harvard University Press) and proposes a model that unifies three different modalities: language, music and problem-solving. In doing so, we will focus on tree structures. Trees are ubiquitous in modeling high-level perception and cognition and have been used to represent grouping structures in linguistic, musical and visual perception and deductive structures in reasoning, learning and problem solving. We will show that an instantiation of the Data-Oriented Parsing (DOP) framework can accurately predict the correct tree structure for linguistic utterances, musical pieces and physics problems. The key idea of the DOP framework is that new input is analyzed by combining subtrees from a representative corpus of previous trees. While the labeling of the trees and the details of the combination operation may differ across the modalities, we argue that there is one model for predicting the tree that humans come up with. We report on experiments with manually annotated corpora for the three modalities, showing that the best performing model is the one which takes into account subtrees of arbitrary size and which selects the most probable tree from among the shortest derivations of an input.

The Data-Oriented Parsing Approach: Theory and Application

by Rens Bod
Cited by 1 (0 self)
Parsing models have many applications in AI, ranging from natural language processing (NLP) and computational music analysis to logic programming and computational learning. Broadly conceived, a parsing model seeks to uncover the underlying structure of an input, that is, the various ways in which