Results 1-7 of 7
Terascale translation models via pattern matching. In Proc. of COLING, 2008. Cited by 14 (3 self).

Abstract:
Translation model size is growing at a pace that outstrips improvements in computing power, and this hinders research on many interesting models. We show how an algorithmic scaling technique can be used to easily handle very large models. Using this technique, we explore several large model variants and show an improvement of 1.4 BLEU on the NIST 2006 Chinese-English task. This opens the door for work on a variety of models that are much less constrained by computational limitations.
Scalable purely-discriminative training for word and tree transducers, 2006. Cited by 8 (0 self).

Abstract:
Discriminative training methods have recently led to significant advances in the state of the art of machine translation (MT). Another promising trend is the incorporation of syntactic information into MT systems. Combining these trends is difficult for reasons of system complexity and computational complexity. The present study makes progress towards a syntax-aware MT system whose every component is trained discriminatively. Our main innovation is an approach to discriminative learning that is computationally efficient enough for large statistical MT systems, yet whose accuracy on translation subtasks is near the state of the art. Our source code is downloadable from
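One standard, computationally cheap discriminative learning rule for structured outputs like translations is the structured perceptron (the paper's actual method may differ; this is a generic sketch, and the feature names are hypothetical):

```python
def perceptron_update(weights, gold_feats, pred_feats, lr=1.0):
    """One structured-perceptron step: move the weight vector toward the
    features of the reference translation and away from the features of
    the model's current best hypothesis. Feature maps are name -> value."""
    for k, v in gold_feats.items():
        weights[k] = weights.get(k, 0.0) + lr * v
    for k, v in pred_feats.items():
        weights[k] = weights.get(k, 0.0) - lr * v
    return weights

# features shared by gold and prediction ("tm") cancel out;
# gold-only features ("lm") are boosted, prediction-only ("wp") demoted
w = perceptron_update({}, {"tm": 1.0, "lm": 2.0}, {"tm": 1.0, "wp": 3.0})
```

The appeal for large MT systems is that each update only touches the features of two hypotheses, not the full search space.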
Machine Translation by Pattern Matching, 2008. Cited by 5 (0 self).

Abstract:
The best systems for machine translation of natural language are based on statistical models learned from data. Conventional representation of a statistical translation model requires substantial offline computation and representation in main memory. Therefore, the principal bottlenecks to the amount of data we can exploit and the complexity of models we can use are available memory and CPU time, and the current state of the art already pushes these limits. With data size and model complexity continually increasing, a scalable solution to this problem is central to future improvement. Callison-Burch et al. (2005) and Zhang and Vogel (2005) proposed a solution that we call translation by pattern matching, which we bring to fruition in this dissertation. The training data itself serves as a proxy to the model; rules and parameters are computed on demand. It achieves our desiderata of minimal offline computation and compact representation, but is dependent on fast pattern matching algorithms on text. They demonstrated its application to a common model based on the translation of contiguous substrings, but left some open problems. Among these is a question: can this approach match the performance of conventional methods despite unavoidable differences that it induces in the model? We show how to answer this question affirmatively.
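"Rules and parameters are computed on demand" means the phrase table is never materialized: translation options and their probabilities are estimated by relative frequency over the matched training instances at query time. A toy sketch under an assumed data format (sentence pairs plus a word alignment as source-target index pairs; not the dissertation's actual code):

```python
from collections import Counter

def on_demand_phrase_table(src_phrase, bitext):
    """Estimate translation probabilities for `src_phrase` on demand.
    `bitext` is a list of (src_tokens, tgt_tokens, alignment) triples,
    where alignment is a list of (src_pos, tgt_pos) pairs."""
    counts = Counter()
    n = len(src_phrase)
    for src, tgt, align in bitext:
        for i in range(len(src) - n + 1):
            if src[i:i + n] != src_phrase:
                continue
            # target positions aligned to this occurrence of the phrase
            points = [j for s, j in align if i <= s < i + n]
            if points:
                counts[" ".join(tgt[min(points):max(points) + 1])] += 1
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()} if total else {}

bitext = [
    ("der Hund".split(), "the dog".split(), [(0, 0), (1, 1)]),
    ("der Hund bellt".split(), "the dog barks".split(), [(0, 0), (1, 1), (2, 2)]),
]
print(on_demand_phrase_table(["der", "Hund"], bitext))  # {'the dog': 1.0}
```

In a real system the linear scan over `bitext` is replaced by the suffix-array lookup, which is what makes the on-demand computation fast enough for decoding.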
Microsoft Research Treelet ... In Proceedings of the Workshop on Statistical Machine Translation, pages 158-161, 2006.

Abstract:
The Microsoft Research translation system is a syntactically informed phrasal SMT system that uses a phrase translation model based on dependency treelets and a global reordering model based on the source dependency tree. These models are combined with several other knowledge sources in a log-linear manner. The weights of the individual components in the log-linear model are set by an automatic parameter-tuning method. We give a brief overview of the components of the system and discuss our experience with the Europarl data translating from English to Spanish.
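A log-linear combination scores each candidate translation as a weighted sum of component model scores, and decoding picks the highest-scoring candidate. A minimal sketch (feature names and values here are hypothetical, not the system's actual components):

```python
def loglinear_score(features, weights):
    """Log-linear model: score(e) = sum_k w_k * h_k(e), where h_k are
    component model scores (log-probabilities, penalties, ...)."""
    return sum(weights[k] * h for k, h in features.items())

# two candidate translations with made-up component scores
candidates = {
    "hola mundo": {"treelet": -1.2, "order": -0.5},
    "mundo hola": {"treelet": -1.4, "order": -2.0},
}
weights = {"treelet": 1.0, "order": 0.6}  # set by automatic tuning in practice
best = max(candidates, key=lambda e: loglinear_score(candidates[e], weights))
print(best)  # hola mundo
```

The automatic parameter tuning the abstract mentions searches for the `weights` values that maximize a translation quality metric on held-out data.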
Reordering Grammar Induction

Abstract:
We present a novel approach for unsupervised induction of a Reordering Grammar using a modified form of permutation trees (Zhang and Gildea, 2007), which we apply to preordering in phrase-based machine translation. Unlike previous approaches, we induce in one step both the hierarchical structure and the transduction function over it from word-aligned parallel corpora. Furthermore, our model (1) handles non-ITG reordering patterns (up to 5-ary branching), (2) is learned from all derivations by treating not only labeling but also bracketing as a latent variable, (3) is entirely unlexicalized at the level of reordering rules, and (4) requires no linguistic annotation. Our model is evaluated both for accuracy in predicting target order and for its impact on translation quality. We report significant performance gains over phrase reordering and over two known preordering baselines for English-Japanese.
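The "non-ITG reordering patterns" the abstract refers to are permutations that binary ITG rules (keep or swap two adjacent spans) cannot produce; a classic result is that these are exactly the permutations containing the patterns 2413 or 3142. A small sketch of that check (an illustration of the concept, not the paper's induction algorithm):

```python
from itertools import combinations

def is_itg_permutation(perm):
    """True iff `perm` (0-indexed target positions of source words) can be
    generated by binary ITG rules, i.e. it avoids 2413 and 3142."""
    forbidden = {(1, 3, 0, 2), (2, 0, 3, 1)}  # 0-indexed 2413 and 3142
    for idx in combinations(range(len(perm)), 4):
        vals = [perm[i] for i in idx]
        order = sorted(vals)
        if tuple(order.index(v) for v in vals) in forbidden:
            return False
    return True

print(is_itg_permutation([1, 0, 3, 2]))  # True: adjacent swaps are ITG-expressible
print(is_itg_permutation([2, 0, 3, 1]))  # False: the classic "3142" case
```

Permutations like the second example are why the model allows branching factors above 2 (up to 5-ary): a wider rule can cover in one step what no sequence of binary keep/swap rules can.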
(title unavailable), 2009

Abstract:
The arguably best performing statistical machine translation systems are based on context-free formalisms or weakly equivalent ones. These models usually use a synchronous version of a context-free grammar (SCFG), which we argue is too rigid for the highly ambiguous task of human language translation. This is exacerbated by the fact that the imperfect methods available for aligning parallel texts make extracting an efficient grammar very hard. As a result, the context-free grammars extracted are usually very large in size after having already been restricted through a variety of constraints. We propose to use Combinatory Categorial Grammar (CCG) for machine translation models. CCG is a lexicalized, mildly context-sensitive formalism which is very well suited to capture long-distance dependencies that are not addressed very well by most current models. We believe that CCG is very well suited for the task of machine translation due to its ability to syntactically represent non-constituents, which occur frequently in parallel texts, as well as its high derivational flexibility. This allows us to use some of the advantages of non-syntactic phrase-based approaches within a syntactic framework, such as a relatively small grammar size compared to context-free-based approaches.
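In CCG, lexical categories encode how words combine: a transitive verb is (S\NP)/NP, meaning it takes an NP to its right, then an NP to its left, to yield S. A minimal sketch of the two application combinators over string-encoded categories (simplified for illustration; real CCG also uses composition and type-raising):

```python
def split_cat(cat):
    """Split a category at its outermost slash, e.g. '(S\\NP)/NP' ->
    ('S\\NP', '/', 'NP'); returns None for atomic categories like 'NP'."""
    depth = 0
    for i in range(len(cat) - 1, -1, -1):
        c = cat[i]
        if c == ")":
            depth += 1
        elif c == "(":
            depth -= 1
        elif c in "/\\" and depth == 0:
            return cat[:i].strip("()"), c, cat[i + 1:].strip("()")
    return None

def combine(left, right):
    """Forward application X/Y Y => X, then backward application Y X\\Y => X."""
    l = split_cat(left)
    if l and l[1] == "/" and l[2] == right.strip("()"):
        return l[0]
    r = split_cat(right)
    if r and r[1] == "\\" and r[2] == left.strip("()"):
        return r[0]
    return None

# "she sees him": NP  (S\NP)/NP  NP
vp = combine("(S\\NP)/NP", "NP")  # verb consumes its object -> S\NP
print(combine("NP", vp))          # subject combines with VP -> S
```

The "non-constituent" advantage mentioned above comes from exactly this machinery: fragments like "she sees" can also be given a single category via composition, which a context-free constituent grammar cannot do as directly.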