Results 11 - 17 of 17
Dependency Parsing and Projection Based on Word-Pair Classification
Abstract
In this paper we describe an intuitive method for dependency parsing, where a classifier is used to determine whether a pair of words forms a dependency edge. We also propose an effective strategy for dependency projection, where the dependency relationships of the word pairs in the source language are projected to the word pairs of the target language, leading to a set of classification instances rather than a complete tree. Experiments show that the classifier trained on the projected classification instances significantly outperforms previous projected dependency parsers. More importantly, when this classifier is integrated into a maximum spanning tree (MST) dependency parser, a clear improvement is obtained over the MST baseline.
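The projection idea in this abstract can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes hard word alignments and binary labels, and the names `project_word_pairs`, `source_heads`, and `alignment` are hypothetical. Each aligned target word pair becomes one classification instance, labeled positive when some aligned source pair forms a dependency edge, so unaligned words simply produce no instance instead of breaking a tree.

```python
from typing import Dict, List, Set, Tuple


def project_word_pairs(
    source_heads: Dict[int, int],
    alignment: Set[Tuple[int, int]],
    target_len: int,
) -> List[Tuple[int, int, bool]]:
    """Project source dependency edges onto target word pairs.

    source_heads maps a source dependent index to its head index
    (the root word is simply absent from the map).
    alignment holds (source_index, target_index) links.
    Returns (target_head, target_dependent, is_edge) instances.
    """
    # Collect the source words aligned to each target word.
    tgt_to_src: Dict[int, List[int]] = {}
    for s, t in alignment:
        tgt_to_src.setdefault(t, []).append(s)

    instances: List[Tuple[int, int, bool]] = []
    for h in range(target_len):
        for d in range(target_len):
            # Unaligned words yield no instance at all (assumption of this sketch).
            if h == d or h not in tgt_to_src or d not in tgt_to_src:
                continue
            # Positive if any aligned source pair is a head -> dependent edge.
            is_edge = any(
                source_heads.get(sd) == sh
                for sh in tgt_to_src[h]
                for sd in tgt_to_src[d]
            )
            instances.append((h, d, is_edge))
    return instances
```

The resulting instances could then be fed to any binary classifier. The abstract does not say which classifier or features the authors use, so that part is left out here.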
Preference Grammars and Decoding Algorithms for Probabilistic Synchronous Context Free Grammar Based Translation
Abstract
Probabilistic Synchronous Context-free Grammars (PSCFGs) [Aho and Ullmann, 1969, Wu, 1996] define weighted transduction rules to represent translation and reordering operations. When translation models use features that are defined locally, on each rule, there are efficient dynamic programming algorithms to perform translation with these grammars [Kasami, 1965]. In general, the integration of non-local features into the translation model can make translation NP-hard, requiring decoding approximations that limit the impact of these features. In this thesis, we consider the impact and interaction between two non-local features, the n-gram language model (LM) and labels on rule nonterminal symbols in the Syntax-Augmented MT (SAMT) grammar [Zollmann and Venugopal, 2006]. While these features do not result in NP-hard search, they would lead to serious increases in wall-clock runtime if naïve dynamic programming methods are applied. We develop novel two-pass algorithms that make strong decoding approximations during a first pass search, generating a hypergraph of sentence-spanning translation derivations. In a second pass, we use knowledge about non-local features to explore
Research on Rule-based Chinese Syntactic Parsing Postprocess Using Verb Subcategorization
Abstract
We propose a simple post-processing approach for Chinese syntactic parsing. It matches verb subcategorization syntactic patterns against the n-best candidate parse trees output by a baseline parser. We extract various verb subcategorization features from the training corpus and use them to rerank the n-best list through pattern matching, relying on rules alone and no statistical information. We call this method rule-based reranking. The results show that our approach achieves good performance.
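A rough sketch of how such rule-based reranking might look is given below. The abstract does not specify the matching rules, so everything here is an assumption: each candidate parse is reduced to a list of (verb, frame) pairs, a frame is a tuple of complement categories, and the lexicon of attested frames is a plain dictionary. The names `rerank`, `Frame`, and `Candidate` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# A subcategorization frame: the complement categories a verb takes.
Frame = Tuple[str, ...]
# A candidate parse, reduced to the (verb, frame) pairs read off the tree.
Candidate = List[Tuple[str, Frame]]


def rerank(nbest: List[Candidate], lexicon: Dict[str, Set[Frame]]) -> List[int]:
    """Return indices of the n-best candidates in reranked order.

    A candidate scores one point per verb whose frame is attested in the
    lexicon; ties keep the baseline parser's original order.
    """
    def matches(cand: Candidate) -> int:
        return sum(1 for verb, frame in cand if frame in lexicon.get(verb, set()))

    return sorted(range(len(nbest)), key=lambda i: (-matches(nbest[i]), i))
```

Using the original rank as a tie-breaker keeps the baseline's preference whenever the frame lexicon has nothing to say, which fits a post-processing step that uses no statistical scores of its own.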
A Fast, Accurate, Non-Projective, Semantically-Enriched Parser
Abstract
Dependency parsers are critical components within many NLP systems. However, currently available dependency parsers each exhibit at least one of several weaknesses, including high running time, limited accuracy, vague dependency labels, and lack of nonprojectivity support. Furthermore, no commonly used parser provides additional shallow semantic interpretation, such as preposition sense disambiguation and noun compound interpretation. In this paper, we present a new dependency-tree conversion of the Penn Treebank along with its associated fine-grained dependency labels and a fast, accurate parser trained on it. We explain how a non-projective extension to shift-reduce parsing can be incorporated into non-directional easy-first parsing. The parser performs well when evaluated on the standard test section of the Penn Treebank, outperforming several popular open source dependency parsers; it is, to the best of our knowledge, the first dependency parser capable of parsing more than 75 sentences per second at over 93% accuracy.
Relaxed Cross-lingual Projection of Constituent Syntax
Abstract
We propose a relaxed correspondence assumption for cross-lingual projection of constituent syntax, which allows a supposed constituent of the target sentence to correspond to an unrestricted treelet in the source parse. Such a relaxed assumption fundamentally tolerates the syntactic non-isomorphism between languages, and enables us to learn the target-language-specific syntactic idiosyncrasy rather than a strained grammar directly projected from the source language syntax. Based on this assumption, a novel constituency projection method is also proposed in order to induce a projected constituent treebank from the source-parsed bilingual corpus. Experiments show that the parser trained on the projected treebank dramatically outperforms previous projected and unsupervised parsers.
A Search in the Forest: Efficient Algorithms for Parsing and Machine Translation based on Packed Forests
Abstract
Many problems in Natural Language Processing (NLP) involve an efficient search for the best derivation over (exponentially) many candidates. For example, a parser aims to find the best syntactic tree for a given sentence among all derivations under a grammar, and a machine translation (MT) decoder explores the space of all possible translations of the source-language sentence. In these cases, the concept of packed forest provides a compact representation of huge search spaces by sharing common sub-derivations, where efficient algorithms based on Dynamic Programming (DP) are possible. Building upon the hypergraph formulation of forests and well-known 1-best DP algorithms, this dissertation develops fast and exact k-best DP algorithms on forests, which are orders of magnitude faster than previously used methods on state-of-the-art parsers. We also show empirically how the improved output of our algorithms has the potential to improve results from parse reranking systems and other applications. We then extend these algorithms to approximate search when the forests are

