Synchronous binarization for machine translation
 In Proc. HLT-NAACL
, 2006
Cited by 35 (10 self)
Systems based on synchronous grammars and tree transducers promise to improve the quality of statistical machine translation output, but are often very computationally intensive. The complexity is exponential in the size of individual grammar rules due to arbitrary reorderings between the two languages, and rules extracted from parallel corpora can be quite large. We devise a linear-time algorithm for factoring syntactic reorderings by binarizing synchronous rules when possible and show that the resulting rule set significantly improves the speed and accuracy of a state-of-the-art syntax-based machine translation system.
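As a rough illustration of what binarizing a synchronous rule involves, the sketch below treats a rule's reordering as a permutation of its right-hand-side nonterminals and makes one shift-reduce pass, merging the two topmost stack items whenever their target-side positions form a single contiguous span. The function name `binarize`, the permutation encoding, and the nested-tuple output are assumptions made for this example; it is not the paper's implementation, and it ignores terminals and rule weights.

```python
def binarize(perm):
    """Try to binarize a reordering given as a permutation of 0..n-1.

    perm[i] is the target-side position of the i-th source-side nonterminal.
    Returns a nested tuple tree, or None if no binarization exists.
    """
    stack = []  # items are (lo, hi, tree) covering target positions lo..hi
    for src_index, tgt_pos in enumerate(perm):
        stack.append((tgt_pos, tgt_pos, src_index))
        # Reduce while the two topmost spans are adjacent on the target side.
        while len(stack) >= 2:
            lo1, hi1, t1 = stack[-2]
            lo2, hi2, t2 = stack[-1]
            if hi1 + 1 == lo2:      # same order on both sides
                merged = (lo1, hi2, ("straight", t1, t2))
            elif hi2 + 1 == lo1:    # order swapped on the target side
                merged = (lo2, hi1, ("inverted", t1, t2))
            else:
                break
            stack[-2:] = [merged]
    return stack[0][2] if len(stack) == 1 else None


print(binarize([1, 0, 2]))     # a binarizable reordering
print(binarize([1, 3, 0, 2]))  # the 2-4-1-3 pattern, which cannot be binarized
```

Each nonterminal is pushed once and every reduction shrinks the stack, so the pass does a number of steps linear in the rule length. Reorderings for which the function returns `None` would have to be kept as larger rules.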
Empirical lower bounds on the complexity of translational equivalence
 In Proceedings of ACL 2006
, 2006
Cited by 31 (1 self)
This paper describes a study of the patterns of translational equivalence exhibited by a variety of bitexts. The study found that the complexity of these patterns in every bitext was higher than suggested in the literature. These findings shed new light on why “syntactic” constraints have not helped to improve statistical translation models, including finite-state phrase-based models, tree-to-string models, and tree-to-tree models. The paper also presents evidence that inversion transduction grammars cannot generate some translational equivalence relations, even in relatively simple real bitexts in syntactically similar languages with rigid word order. Instructions for replicating our experiments are at
An Introduction to Synchronous Grammars
, 2006
Cited by 11 (0 self)
Synchronous context-free grammars are a generalization of context-free grammars (CFGs) that generate
Extracting synchronous grammar rules from word-level alignments in linear time
 In Proceedings of the 22nd International Conference on Computational Linguistics (COLING-08)
, 2008
Cited by 10 (1 self)
We generalize Uno and Yagiura’s algorithm for finding all common intervals of two permutations to the setting of two sequences with many-to-many alignment links across the two sides. We show how to maximally decompose a word-aligned sentence pair in linear time, which can be used to generate all possible phrase pairs or a Synchronous Context-Free Grammar (SCFG) with the simplest rules possible. We also use the algorithm to precisely analyze the maximum SCFG rule length needed to cover hand-aligned data from various language pairs.
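To make the notion of a common interval concrete, the following naive sketch enumerates the common intervals of a permutation and the identity: index ranges whose values form a set of consecutive integers. It runs in quadratic time and is only an illustration of the definition, not the linear-time algorithm the abstract refers to, and it does not handle many-to-many alignment links. The function name `common_intervals` is hypothetical, and single-element intervals are omitted.

```python
def common_intervals(perm):
    """Return all (i, j), i < j, such that perm[i..j] is a set of consecutive integers."""
    result = []
    n = len(perm)
    for i in range(n):
        lo = hi = perm[i]
        for j in range(i + 1, n):
            lo = min(lo, perm[j])
            hi = max(hi, perm[j])
            # Values are distinct, so they are consecutive exactly when
            # the value range is as wide as the index range.
            if hi - lo == j - i:
                result.append((i, j))
    return result


print(common_intervals([1, 0, 2, 3]))
print(common_intervals([1, 3, 0, 2]))
```

In the second call only the full range qualifies, which is the sense in which such a permutation cannot be decomposed into smaller aligned blocks.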
Factoring synchronous grammars by sorting
 In Proceedings of the International Conference on Computational Linguistics and the Association for Computational Linguistics (COLING/ACL-06)
, 2006
Cited by 8 (4 self)
Synchronous Context-Free Grammars (SCFGs) have been successfully exploited as translation models in machine translation applications. When parsing with an SCFG, computational complexity grows exponentially with the length of the rules, in the worst case. In this paper we examine the problem of factorizing each rule of an input SCFG to a generatively equivalent set of rules, each having the smallest possible length. Our algorithm works in time O(n log n), for each rule of length n. This improves upon previous results and solves an open problem about recognizing permutations that can be factored.
Two monolingual parses are better than one (synchronous parse)
 In Proc. of HLT-NAACL
, 2010
Cited by 4 (2 self)
We describe a synchronous parsing algorithm that is based on two successive monolingual parses of an input sentence pair. Although the worst-case complexity of this algorithm is and must be O(n^6) for binary SCFGs, its average-case runtime is far better. We demonstrate that for a number of common synchronous parsing problems, the two-parse algorithm substantially outperforms alternative synchronous parsing strategies, making it efficient enough to be utilized without resorting to a pruned search.
Parsing and Translation Algorithms Based on Weighted Extended Tree Transducers
Cited by 3 (3 self)
This paper proposes a uniform framework for the development of parsing and translation algorithms for weighted extended (top-down) tree transducers and input strings. The asymptotic time complexity of these algorithms can be improved in practice by exploiting an algorithm for rule factorization in the above transducers.
Worst-case synchronous grammar rules
 In Proceedings of the 2007 Meeting of the North American chapter of the Association for Computational Linguistics (NAACL-07)
, 2007
Cited by 2 (2 self)
We relate the problem of finding the best application of a Synchronous Context-Free Grammar (SCFG) rule during parsing to a Markov Random Field. This representation allows us to use the theory of expander graphs to show that the complexity of SCFG parsing of an input sentence of length N is Ω(N^{cn}), for a grammar with maximum rule length n and some constant c. This improves on the previous best result of Ω(N^{c√n}).
Empirical lower bounds on the complexity of translational equivalence
 In Proceedings of the 44th Annual Meeting of the Association for Computational Linguistics (ACL)
, 2006
Cited by 1 (1 self)
This paper describes a study of the patterns of translational equivalence exhibited by a variety of bitexts. The study found that the complexity of these patterns in every bitext was higher than suggested in the literature. These findings shed new light on why “syntactic” constraints have not helped to improve statistical translation models, including finite-state phrase-based models, tree-to-string models, and tree-to-tree models. The paper also presents evidence that inversion transduction grammars cannot generate some translational equivalence relations, even in relatively simple real bitexts in syntactically similar languages with rigid word order. Instructions for replicating our experiments are at
Enumeration of Factorizable Multi-Dimensional Permutations
Cited by 1 (1 self)
A d-dimensional permutation is a sequence of d + 1 permutations with the leading element being the identity permutation. It can be viewed as an alignment structure across d + 1 sequences, or visualized as the result of permuting n hypercubes of (d + 1) dimensions. We study the hierarchical decomposition of d-dimensional permutations. We show that when d ≥ 2, the ratio between non-decomposable or simple d-permutations and all d-permutations approaches 1. We also prove that the growth rate of the number of d-permutations that can be factorized into k-ary branching trees approaches (k/e)^d as k grows.