Results 1 - 9 of 9
Binarization of Synchronous Context-Free Grammars
Cited by 24 (5 self)
Systems based on synchronous grammars and tree transducers promise to improve the quality of statistical machine translation output, but are often very computationally intensive. The complexity is exponential in the size of individual grammar rules due to arbitrary reorderings between the two languages. We develop a theory of binarization for synchronous context-free grammars and present a linear-time algorithm for binarizing synchronous rules when possible. In our large-scale experiments, we found that almost all rules are binarizable and the resulting binarized rule set significantly improves the speed and accuracy of a state-of-the-art syntax-based machine translation system. We also discuss the more general, and computationally more difficult, problem of finding good parsing strategies for non-binarizable rules, and present an approximate polynomial-time algorithm for this problem.
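To make the notion of a binarizable rule concrete, here is a minimal sketch of a shift-reduce style check on the reordering permutation of a rule. It is an illustration under stated assumptions, not necessarily the algorithm of the paper: the function name `is_binarizable` and the encoding of a rule as a permutation of 1..n (the target-side order of the source-side nonterminals) are hypothetical choices made for this example.

```python
def is_binarizable(perm):
    """Check whether a permutation of 1..n can be built by repeatedly
    merging two neighbouring blocks whose target positions are contiguous.

    Assumption for this sketch: perm[i] is the target-side position of the
    i-th source-side nonterminal of a synchronous rule.
    """
    stack = []  # each entry (lo, hi) is a block covering target positions lo..hi
    for p in perm:
        stack.append((p, p))  # shift: the new nonterminal is a block of size 1
        # reduce: merge the top two blocks while they are adjacent on the target side
        while len(stack) >= 2:
            lo1, hi1 = stack[-2]
            lo2, hi2 = stack[-1]
            if hi1 + 1 == lo2 or hi2 + 1 == lo1:
                stack[-2:] = [(min(lo1, lo2), max(hi1, hi2))]
            else:
                break
    # a single remaining block means every merge step was binary
    return len(stack) == 1


print(is_binarizable([3, 4, 1, 2]))  # two swapped pairs
print(is_binarizable([2, 4, 1, 3]))  # the classic "inside-out" pattern
```

Each nonterminal is pushed once and every merge shrinks the stack by one, so the loop does a linear amount of work in the rule length. The pattern (2, 4, 1, 3) leaves four unmerged blocks on the stack, which is why rules containing it cannot be binarized.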
Extracting synchronous grammar rules from word-level alignments in linear time
In Proceedings of the 22nd International Conference on Computational Linguistics (COLING-08), 2008
Cited by 10 (1 self)
We generalize Uno and Yagiura’s algorithm for finding all common intervals of two permutations to the setting of two sequences with many-to-many alignment links across the two sides. We show how to maximally decompose a word-aligned sentence pair in linear time, which can be used to generate all possible phrase pairs or a Synchronous Context-Free Grammar (SCFG) with the simplest rules possible. We also use the algorithm to precisely analyze the maximum SCFG rule length needed to cover hand-aligned data from various language pairs.
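As a rough illustration of what "phrase pairs" of a word-aligned sentence pair are, the sketch below enumerates them by brute force. It is deliberately naive and is not the linear-time decomposition described in the abstract; the function name `consistent_phrase_pairs` and the representation of the alignment as a set of `(source_index, target_index)` links are assumptions made for this example.

```python
def consistent_phrase_pairs(links, src_len):
    """Enumerate source/target span pairs that are closed under the alignment.

    Assumptions for this sketch: `links` is a set of (i, j) pairs with i a
    0-based source index and j a 0-based target index. Only the tightest
    target span is reported for each source span.
    """
    pairs = []
    for i1 in range(src_len):
        for i2 in range(i1, src_len):
            # target positions linked to the source span [i1, i2]
            tgt = [j for (i, j) in links if i1 <= i <= i2]
            if not tgt:
                continue  # skip spans that cover only unaligned words
            j1, j2 = min(tgt), max(tgt)
            # consistency: nothing inside the target span links outside the source span
            if all(i1 <= i <= i2 for (i, j) in links if j1 <= j <= j2):
                pairs.append(((i1, i2), (j1, j2)))
    return pairs


# Three source words; the first two are swapped on the target side.
links = {(0, 1), (1, 0), (2, 2)}
for src_span, tgt_span in consistent_phrase_pairs(links, 3):
    print(src_span, tgt_span)
```

The source span (1, 2) is rejected in this example because its tightest target span (0, 2) also contains target word 1, which is linked to source word 0 outside the span. The brute-force loop scans the links for every candidate span, which is exactly the cost a linear-time method avoids.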
Optimal k-arization of Synchronous Tree-Adjoining Grammar
Cited by 4 (3 self)
Synchronous Tree-Adjoining Grammar (STAG) is a promising formalism for syntax-aware machine translation and simultaneous computation of natural-language syntax and semantics. Current research in both of these areas is actively pursuing its incorporation. However, STAG parsing is known to be NP-hard due to the potential for intertwined correspondences between the linked nonterminal symbols in the elementary structures. Given a particular grammar, the polynomial degree of efficient STAG parsing algorithms depends directly on the rank of the grammar: the maximum number of correspondences that appear within a single elementary structure. In this paper we present a compile-time algorithm for transforming a STAG into a strongly-equivalent STAG that optimally minimizes the rank, k, across the grammar. The algorithm performs in $O(|G| + |Y| \cdot L_G^3)$ time, where $L_G$ is the maximum number of links in any single synchronous tree pair in the grammar and $Y$ is the set of synchronous tree pairs of $G$.
Enumeration of Factorizable Multi-Dimensional Permutations
Cited by 1 (1 self)
A d-dimensional permutation is a sequence of d + 1 permutations with the leading element being the identity permutation. It can be viewed as an alignment structure across d + 1 sequences, or visualized as the result of permuting n hypercubes of (d + 1) dimensions. We study the hierarchical decomposition of d-dimensional permutations. We show that when d ≥ 2, the ratio between non-decomposable or simple d-permutations and all d-permutations approaches 1. We also prove that the growth rate of the number of d-permutations that can be factorized into k-ary branching trees approaches $(k/e)^d$ as k grows.
Synchronous and Multicomponent Tree-Adjoining Grammars: Complexity, Algorithms and Linguistic Applications
, 2009
[...] languages is determined only in part by the synchronization. The base formalism chosen can offer greater or lesser opportunity for divergence in the derived structures. My choice of a base formalism is motivated directly by research into applications of synchronous TAG-based grammars to two natural language applications: semantic interpretation and natural language translation. I first examine a range of TAG variants that have not previously been studied in this level of detail to determine their computational properties and to develop algorithms that can be used to process them. Original results on the complexity of these formalisms are presented as well as novel algorithms for factorizing grammars to reduce the time required to process them. In Part II, I develop applications of synchronous Limited Delay Tree-Local Multicomponent TAG to semantic interpretation and probabilistic synchronous Tree Insertion Grammar to statistical natural language translation.
Research Statement
My research interests are algorithms for massive data, data structures, and approximation/online algorithms.
Supervised by:
I declare that: this work has been prepared by myself, all literal or content-related quotations from other sources are clearly referenced, and no other sources or aids outside the declared references are used. Hamburg, 24.10.2005, Jun Zhang. I would like to thank Professor Joachim W. Schmidt of STS for supervising this thesis and being very helpful with guiding the project overall and finding a topic for my work. Thanks also go to Dipl. Inform. Rainer Marrone, who guided me through the whole project and this thesis and offered great help in developing the whole work. Thanks also go to Birgit Guth, Jürgen Meincke and Werner Wendt from Dataport, who provided useful information about Dataport and advice throughout this work. Dr. Hans-Werner Sehring and Sebastian Boung of STS were very helpful in providing advice
unknown title
Synchronous Tree-Adjoining Grammar (STAG) is a promising formalism for syntax-aware machine translation and simultaneous computation of natural-language syntax and semantics. Current research in both of these areas is actively pursuing its incorporation. However, STAG parsing is known to be NP-hard due to the potential for intertwined correspondences between the linked nonterminal symbols in the elementary structures. Given a particular grammar, the polynomial degree of efficient STAG parsing algorithms depends directly on the rank of the grammar: the maximum number of correspondences that appear within a single elementary structure. In this paper we present a compile-time algorithm for transforming a STAG into a strongly-equivalent STAG that optimally minimizes the rank, k, across the grammar. The algorithm performs in $O(|G| + |Y| \cdot L_G^3)$ time, where $L_G$ is the maximum number of links in any single synchronous tree pair in the grammar and $Y$ is the set of synchronous tree pairs of $G$.
On Hierarchical Reordering and Permutation Parsing for Phrase-based Decoding
The addition of a deterministic permutation parser can provide valuable hierarchical information to a phrase-based statistical machine translation (PBSMT) system. Permutation parsers have been used to implement hierarchical reordering models (Galley and Manning, 2008) and to enforce inversion transduction grammar (ITG) constraints (Feng et al., 2010). We present a number of theoretical results regarding the use of permutation parsers in PBSMT. In particular, we show that an existing ITG constraint (Zens et al., 2004) does not prevent all non-ITG permutations, and we demonstrate that the hierarchical reordering model can produce analyses during decoding that are inconsistent with analyses made during training. Experimentally, we verify the utility of hierarchical reordering, and compare several theoretically-motivated variants in terms of both translation quality and the syntactic complexity of their output.