Binarization of Synchronous Context-Free Grammars
Cited by 18 (4 self)
Systems based on synchronous grammars and tree transducers promise to improve the quality of statistical machine translation output, but are often very computationally intensive. The complexity is exponential in the size of individual grammar rules due to arbitrary re-orderings between the two languages. We develop a theory of binarization for synchronous context-free grammars and present a linear-time algorithm for binarizing synchronous rules when possible. In our large-scale experiments, we found that almost all rules are binarizable and the resulting binarized rule set significantly improves the speed and accuracy of a state-of-the-art syntax-based machine translation system. We also discuss the more general, and computationally more difficult, problem of finding good parsing strategies for non-binarizable rules, and present an approximate polynomial-time algorithm for this problem.
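As a rough illustration of what "binarizable" means for a synchronous rule, the sketch below treats the re-ordering of a rule's nonterminals as a permutation and tests it with a shift-reduce style stack. This is a commonly described linear-time check, not necessarily the exact algorithm of the paper. The function name is_binarizable and the encoding of a rule as a list of 1-based target positions are assumptions made for this example.

```python
def is_binarizable(perm):
    """Return True if the permutation can be reduced to one span by
    repeatedly merging neighbouring items whose target spans are adjacent.

    perm[i] is the (assumed, 1-based) target-side position of the i-th
    source-side nonterminal of a synchronous rule.
    """
    stack = []  # each entry is a (lo, hi) span of target positions
    for x in perm:
        stack.append((x, x))  # shift the next nonterminal
        # reduce while the two topmost spans are adjacent on the target side
        while len(stack) >= 2:
            (lo1, hi1), (lo2, hi2) = stack[-2], stack[-1]
            if hi1 + 1 == lo2 or hi2 + 1 == lo1:
                stack[-2:] = [(min(lo1, lo2), max(hi1, hi2))]
            else:
                break
    return len(stack) == 1


if __name__ == "__main__":
    print(is_binarizable([2, 1, 4, 3]))  # two swapped pairs
    print(is_binarizable([2, 4, 1, 3]))  # a classic non-binarizable pattern
```

Each element is pushed once and every merge removes one stack entry, so the loop does a linear amount of work in the rule length.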
Extracting synchronous grammar rules from word-level alignments in linear time
In Proceedings of the 22nd International Conference on Computational Linguistics (COLING-08), 2008
Cited by 6 (1 self)
We generalize Uno and Yagiura’s algorithm for finding all common intervals of two permutations to the setting of two sequences with many-to-many alignment links across the two sides. We show how to maximally decompose a word-aligned sentence pair in linear time, which can be used to generate all possible phrase pairs or a Synchronous Context-Free Grammar (SCFG) with the simplest rules possible. We also use the algorithm to precisely analyze the maximum SCFG rule length needed to cover hand-aligned data from various language pairs.
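To make the notion of a common interval concrete, here is a minimal brute-force sketch for the simplest case: the identity permutation paired with one other permutation, that is, a one-to-one word alignment. It is quadratic and is not the linear-time algorithm the abstract refers to. The function name common_intervals and the 0-based position convention are assumptions for illustration only.

```python
def common_intervals(perm):
    """List all (i, j) with i < j such that perm[i..j] covers a contiguous
    range of values, i.e. max - min == j - i.

    Quadratic brute force over all position ranges; positions are 0-based.
    """
    n = len(perm)
    result = []
    for i in range(n):
        lo = hi = perm[i]
        for j in range(i + 1, n):
            lo = min(lo, perm[j])
            hi = max(hi, perm[j])
            # the values are distinct, so a value range of width j - i
            # means positions i..j map onto a contiguous block
            if hi - lo == j - i:
                result.append((i, j))
    return result


if __name__ == "__main__":
    print(common_intervals([2, 1, 4, 3]))
    print(common_intervals([2, 4, 1, 3]))
```

In the one-to-one setting, each returned range corresponds to a block of source positions that maps onto a contiguous block of target positions, which is the property a phrase pair needs.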
Optimal k-arization of Synchronous Tree-Adjoining Grammar
Cited by 3 (2 self)
Synchronous Tree-Adjoining Grammar (STAG) is a promising formalism for syntax-aware machine translation and simultaneous computation of natural-language syntax and semantics. Current research in both of these areas is actively pursuing its incorporation. However, STAG parsing is known to be NP-hard due to the potential for intertwined correspondences between the linked nonterminal symbols in the elementary structures. Given a particular grammar, the polynomial degree of efficient STAG parsing algorithms depends directly on the rank of the grammar: the maximum number of correspondences that appear within a single elementary structure. In this paper we present a compile-time algorithm for transforming a STAG into a strongly-equivalent STAG that optimally minimizes the rank, k, across the grammar. The algorithm performs in $O(|G| + |Y| \cdot L_G^3)$ time, where $L_G$ is the maximum number of links in any single synchronous tree pair in the grammar and $Y$ is the set of synchronous tree pairs of $G$.
Enumeration of Factorizable Multi-Dimensional Permutations
Cited by 1 (1 self)
A d-dimensional permutation is a sequence of d+1 permutations with the leading element being the identity permutation. It can be viewed as an alignment structure across d+1 sequences, or visualized as the result of permuting n hypercubes of (d+1) dimensions. We study the hierarchical decomposition of d-dimensional permutations. We show that when d ≥ 2, the ratio between non-decomposable or simple d-permutations and all d-permutations approaches 1. We also prove that the growth rate of the number of d-permutations that can be factorized into k-ary branching trees approaches $(k/e)^d$ as k grows.
Synchronous and Multicomponent Tree-Adjoining Grammars: Complexity, Algorithms and Linguistic Applications
2009
... languages is determined only in part by the synchronization. The base formalism chosen can offer greater or lesser opportunity for divergence in the derived structures. My choice of a base formalism is motivated directly by research into applications of synchronous TAG-based grammars to two natural language applications: semantic interpretation and natural language translation. I first examine a range of TAG variants that have not previously been studied in this level of detail to determine their computational properties and to develop algorithms that can be used to process them. Original results on the complexity of these formalisms are presented as well as novel algorithms for factorizing grammars to reduce the time required to process them. In Part II, I develop applications of synchronous Limited Delay Tree-Local Multicomponent TAG to semantic interpretation and probabilistic synchronous Tree Insertion Grammar to statistical natural language translation.

