Results 1 - 10
of
24
Better k-best parsing
, 2005
"... We discuss the relevance of k-best parsing to recent applications in natural language processing, and develop efficient algorithms for k-best trees in the framework of hypergraph parsing. To demonstrate the efficiency, scalability and accuracy of these algorithms, we present experiments on Bikel’s i ..."
Abstract
-
Cited by 103 (14 self)
- Add to MetaCart
We discuss the relevance of k-best parsing to recent applications in natural language processing, and develop efficient algorithms for k-best trees in the framework of hypergraph parsing. To demonstrate the efficiency, scalability and accuracy of these algorithms, we present experiments on Bikel’s implementation of Collins ’ lexicalized PCFG model, and on Chiang’s CFG-based decoder for hierarchical phrase-based translation. We show in particular how the improved output of our algorithms has the potential to improve results from parse reranking systems and other applications. 1
An Efficient Algorithm for Projective Dependency Parsing
- Proceedings of the 8th International Workshop on Parsing Technologies (IWPT
, 2003
"... This paper presents a deterministic parsing algorithm for projective dependency grammar. The running time of the algorithm is linear in the length of the input string, and the dependency graph produced is guaranteed to be projective and acyclic. The algorithm has been experimentally evaluated in ..."
Abstract
-
Cited by 31 (9 self)
- Add to MetaCart
This paper presents a deterministic parsing algorithm for projective dependency grammar. The running time of the algorithm is linear in the length of the input string, and the dependency graph produced is guaranteed to be projective and acyclic. The algorithm has been experimentally evaluated in parsing unrestricted Swedish text, achieving an accuracy above 85% with a very simple grammar.
Grammatical Bigrams
- Advances in Neural Information Processing Systems 14
, 2001
"... Unsupervised learning algorithms have been derived for several statistical models of English grammar, but their computational complexity makes applying them to large data sets intractable. This paper presents a probabilistic model of English grammar that is much simpler than conventional models, but ..."
Abstract
-
Cited by 17 (0 self)
- Add to MetaCart
Unsupervised learning algorithms have been derived for several statistical models of English grammar, but their computational complexity makes applying them to large data sets intractable. This paper presents a probabilistic model of English grammar that is much simpler than conventional models, but which admits an efficient EM training algorithm. The model is based upon grammatical bigrams, i.e., syntactic relationships between pairs of words.
A distributional analysis of a lexicalized statistical parsing model
- In EMNLP
, 2004
"... This paper presents some of the first data visualizations and analysis of distributions for a lexicalized statistical parsing model, in order to better understand their nature. In the course of this analysis, we have paid particular attention to parameters that include bilexical dependencies. The pr ..."
Abstract
-
Cited by 14 (0 self)
- Add to MetaCart
This paper presents some of the first data visualizations and analysis of distributions for a lexicalized statistical parsing model, in order to better understand their nature. In the course of this analysis, we have paid particular attention to parameters that include bilexical dependencies. The prevailing view has been that such statistics are very informative but suffer greatly from sparse data problems. By using a parser to constrain-parse its own output, and by hypothesizing and testing for distributional similarity with back-off distributions, we have evidence that finally explains that (a) bilexical statistics are actually getting used quite often but that (b) the distributions are so similar to those that do not include head words as to be nearly indistinguishable insofar as making parse decisions. Finally, our analysis has provided for the first time an effective way to do parameter selection for a generative lexicalized statistical parsing model. 1
Constraints on non-projective dependency parsing
- In Proc. EACL
, 2006
"... An open issue in data-driven dependency parsing is how to handle non-projective dependencies, which seem to be required by linguistically adequate representations, but which pose problems in parsing with respect to both accuracy and efficiency. Using data from five different languages, we evaluate a ..."
Abstract
-
Cited by 13 (4 self)
- Add to MetaCart
An open issue in data-driven dependency parsing is how to handle non-projective dependencies, which seem to be required by linguistically adequate representations, but which pose problems in parsing with respect to both accuracy and efficiency. Using data from five different languages, we evaluate an incremental deterministic parser that derives non-projective dependency structures in O(n 2) time, supported by SVM classifiers for predicting the next parser action. The experiments show that unrestricted non-projective parsing gives a significant improvement in accuracy, compared to a strictly projective baseline, with up to 35 % error reduction, leading to state-of-the-art results for the given data sets. Moreover, by restricting the class of permissible structures to limited degrees of non-projectivity, the parsing time can be reduced by up to 50 % without a significant decrease in accuracy. 1
The CoNLL-2009 shared task: Syntactic and semantic dependencies in multiple languages
, 2009
"... For the 11th straight year, the Conference on Computational Natural Language Learning has been accompanied by a shared task whose purpose is to promote natural language processing applications and evaluate them in a standard setting. In 2009, the shared task was dedicated to the joint parsing of syn ..."
Abstract
-
Cited by 13 (2 self)
- Add to MetaCart
For the 11th straight year, the Conference on Computational Natural Language Learning has been accompanied by a shared task whose purpose is to promote natural language processing applications and evaluate them in a standard setting. In 2009, the shared task was dedicated to the joint parsing of syntactic and semantic dependencies in multiple languages. This shared task combines the shared tasks of the previous five years under a unique dependency-based formalism similar to the 2008 task. In this paper, we define the shared task, describe how the data sets were created and show their quantitative properties, report the results and summarize the approaches of the participating systems.
Supertagging and full parsing
- In: Proceedings of the Workshop on Tree Adjoining Grammar and Related Formalisms (TAG+7
, 2004
"... We investigate an approach to parsing in which lexical information is used only in a first phase, supertagging, in which lexical syntactic properties are determined without building structure. In the second phase, the best parse tree is determined without using lexical information. We investigate di ..."
Abstract
-
Cited by 9 (3 self)
- Add to MetaCart
We investigate an approach to parsing in which lexical information is used only in a first phase, supertagging, in which lexical syntactic properties are determined without building structure. In the second phase, the best parse tree is determined without using lexical information. We investigate different probabilistic models for adjunction, and we show that, assuming hypothetically perfect performance in the first phase, the error rate on dependency arc attachment can be reduced to 2.3 % using a full chart parser. This is an improvement of about 50% over previously reported results using a simple heuristic parser. 1
Unsupervised Induction of Tree Substitution Grammars for Dependency Parsing
"... Inducing a grammar directly from text is one of the oldest and most challenging tasks in Computational Linguistics. Significant progress has been made for inducing dependency grammars, however the models employed are overly simplistic, particularly in comparison to supervised parsing models. In this ..."
Abstract
-
Cited by 8 (1 self)
- Add to MetaCart
Inducing a grammar directly from text is one of the oldest and most challenging tasks in Computational Linguistics. Significant progress has been made for inducing dependency grammars, however the models employed are overly simplistic, particularly in comparison to supervised parsing models. In this paper we present an approach to dependency grammar induction using tree substitution grammar which is capable of learning large dependency fragments and thereby better modelling the text. We define a hierarchical non-parametric Pitman-Yor Process prior which biases towards a small grammar with simple productions. This approach significantly improves the state-of-the-art, when measured by head attachment accuracy. 1
Dependency grammar and dependency parsing
- Växjö University
, 2005
"... Despite a long and venerable tradition in descriptive linguistics, dependency grammar has until recently played a fairly marginal role both in theoretical linguistics and in natural language processing. The increasing interest in dependency-based ..."
Abstract
-
Cited by 6 (0 self)
- Add to MetaCart
Despite a long and venerable tradition in descriptive linguistics, dependency grammar has until recently played a fairly marginal role both in theoretical linguistics and in natural language processing. The increasing interest in dependency-based
Inducing Tree-Substitution Grammars
"... Inducing a grammar from text has proven to be a notoriously challenging learning task despite decades of research. The primary reason for its difficulty is that in order to induce plausible grammars, the underlying model must be capable of representing the intricacies of language while also ensuring ..."
Abstract
-
Cited by 4 (0 self)
- Add to MetaCart
Inducing a grammar from text has proven to be a notoriously challenging learning task despite decades of research. The primary reason for its difficulty is that in order to induce plausible grammars, the underlying model must be capable of representing the intricacies of language while also ensuring that it can be readily learned from data. The majority of existing work on grammar induction has favoured model simplicity (and thus learnability) over representational capacity by using context free grammars and first order dependency grammars, which are not sufficiently expressive to model many common linguistic constructions. We propose a novel compromise by inferring a probabilistic tree substitution grammar, a formalism which allows for arbitrarily large tree fragments and thereby better represent complex linguistic structures. To limit the model’s complexity we employ a Bayesian non-parametric prior which biases the model towards a sparse grammar with shallow productions. We demonstrate the model’s efficacy on supervised phrase-structure parsing, where we induce a latent segmentation of the training treebank, and on unsupervised dependency grammar induction. In both cases the model uncovers interesting latent linguistic structures while producing competitive results.

