Results 1–10 of 79
Hierarchical phrase-based translation
 Computational Linguistics
, 2007
Abstract

Cited by 567 (9 self)
We present a statistical machine translation model that uses hierarchical phrases—phrases that contain subphrases. The model is formally a synchronous context-free grammar but is learned from a parallel text without any syntactic annotations. Thus it can be seen as combining fundamental ideas from both syntax-based translation and phrase-based translation. We describe our system’s training and decoding methods in detail, and evaluate it for translation speed and translation accuracy. Using BLEU as a metric of translation accuracy, we find that our system performs significantly better than the Alignment Template System, a state-of-the-art phrase-based system.
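As a toy illustration of the hierarchical phrases described above, the sketch below applies a single synchronous rule whose two sides contain aligned gaps (X1, X2). The `apply_rule` helper is invented for illustration, and the romanised rule is only loosely modelled on the kind of reordering rule the paper discusses:

```python
# Toy illustration of a hierarchical phrase pair: a synchronous rule
# whose source and target sides contain aligned gaps (X1, X2).
# The helper and example strings are invented for illustration.

def apply_rule(src_template, tgt_template, fillers):
    """Substitute aligned subphrase translations into both sides of a rule."""
    src, tgt = src_template, tgt_template
    for slot, (s_sub, t_sub) in fillers.items():
        src = src.replace(slot, s_sub)   # fill the gap on the source side
        tgt = tgt.replace(slot, t_sub)   # fill the aligned gap on the target side
    return src, tgt

# X -> <"yu X1 you X2", "have X2 with X1">: a rule with two gaps whose
# order is swapped between the two languages.
src, tgt = apply_rule(
    "yu X1 you X2", "have X2 with X1",
    {"X1": ("Aozhou", "Australia"),
     "X2": ("bangjiao", "diplomatic relations")},
)
print(src)  # yu Aozhou you bangjiao
print(tgt)  # have diplomatic relations with Australia
```

The point of the gaps is that a single learned rule captures a reordering pattern that plain contiguous phrase pairs cannot.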
Stochastic Inversion Transduction Grammars and Bilingual Parsing of Parallel Corpora
, 1997
Statistical syntax-directed translation with extended domain of locality
 In Proc. AMTA 2006
, 2006
Abstract

Cited by 119 (14 self)
A syntax-directed translator first parses the source-language input into a parse tree, and then recursively converts the tree into a string in the target language. We model this conversion by an extended tree-to-string transducer that has multi-level trees on the source side, which gives our system more expressive power and flexibility. We also define a direct probability model and use a linear-time dynamic programming algorithm to search for the best derivation. The model is then extended to the general log-linear framework in order to rescore with other features like n-gram language models. We devise a simple yet effective algorithm to generate non-duplicate k-best translations for n-gram rescoring. Initial experimental results on English-to-Chinese translation are presented.
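The core recursion of syntax-directed translation can be sketched in a few lines: convert a source parse tree bottom-up into a target string, reordering the children of a constituent according to a per-label rule. This is a minimal sketch, not the paper's extended transducer; the toy tree, lexicon, and reordering rule are invented:

```python
# Minimal sketch of syntax-directed translation: recursively convert a
# source parse tree to a target string, reordering children per rule.
# Tree, lexicon, and rules are invented for illustration.

from dataclasses import dataclass, field

@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)  # empty for leaf words

def translate(node, lexicon, reorder):
    """Translate a parse tree bottom-up, reordering children by label."""
    if not node.children:                          # leaf: look up the word
        return [lexicon.get(node.label, node.label)]
    parts = [translate(c, lexicon, reorder) for c in node.children]
    order = reorder.get(node.label, range(len(parts)))  # default: keep order
    out = []
    for i in order:
        out.extend(parts[i])
    return out

# German verb-final clause "ich ihn sehe" -> English "I see him":
# the VP's children (object NP, verb) are swapped in the target.
tree = Node("S", [Node("NP", [Node("ich")]),
                  Node("VP", [Node("NP", [Node("ihn")]),
                              Node("V", [Node("sehe")])])])
lexicon = {"ich": "I", "sehe": "see", "ihn": "him"}
reorder = {"VP": [1, 0]}
print(" ".join(translate(tree, lexicon, reorder)))  # I see him
```

The paper's multi-level source-side trees generalise exactly this step, letting one rule match deeper than a single parent-children layer.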
Word sense disambiguation improves statistical machine translation
 In 45th Annual Meeting of the Association for Computational Linguistics (ACL-07)
, 2007
Abstract

Cited by 89 (5 self)
Recent research presents conflicting evidence on whether word sense disambiguation (WSD) systems can help to improve the performance of statistical machine translation (MT) systems. In this paper, we successfully integrate a state-of-the-art WSD system into a state-of-the-art hierarchical phrase-based MT system, Hiero. We show for the first time that integrating a WSD system improves the performance of a state-of-the-art statistical MT system on an actual translation task. Furthermore, the improvement is statistically significant.
A survey of statistical machine translation
, 2007
Abstract

Cited by 84 (6 self)
Statistical machine translation (SMT) treats the translation of natural language as a machine learning problem. By examining many samples of human-produced translation, SMT algorithms automatically learn how to translate. SMT has made tremendous strides in less than two decades, and many popular techniques have only emerged within the last few years. This survey presents a tutorial overview of state-of-the-art SMT at the beginning of 2007. We begin with the context of the current research, and then move to a formal problem description and an overview of the four main subproblems: translational equivalence modeling, mathematical modeling, parameter estimation, and decoding. Along the way, we present a taxonomy of some different approaches within these areas. We conclude with an overview of evaluation and notes on future directions.
Learning Dependency Translation Models as Collections of Finite State Head Transducers
 Computational Linguistics
, 2000
Abstract

Cited by 77 (3 self)
The paper defines weighted head transducers, finite-state machines that perform middle-out string transduction. These transducers are strictly more expressive than the special case of standard left-to-right finite-state transducers. Dependency transduction models are then defined as collections of weighted head transducers that are applied hierarchically. A dynamic programming search algorithm is described for finding the optimal transduction of an input string with respect to a dependency transduction model. A method for automatically training a dependency transduction model from a set of input-output example strings is presented. The method first searches for hierarchical alignments of the training examples guided by correlation statistics, and then constructs the transitions of head transducers that are consistent with these alignments. Experimental results are given for applying the training method to translation from English to Spanish and Japanese.
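The middle-out idea can be caricatured in a few lines: translate the head word first, then place each dependent's translation to the head's left or right, where the target side may differ from the source side. This is a much simplified sketch of dependency transduction (a real head transducer is a weighted state machine, not a table); the tree, lexicon, and side map are invented:

```python
# Simplified sketch of dependency transduction: translate the head, then
# attach translated dependents on a target-side-specific side of it.
# Tree, lexicon, and side assignments are invented for illustration.

def transduce(node, lexicon, side_map):
    """node = (head_word, [(relation, subtree), ...]); return target words."""
    head, deps = node
    left, right = [], []
    for rel, sub in deps:
        words = transduce(sub, lexicon, side_map)
        # a dependent may end up on a different side of the head in the target
        (left if side_map.get(rel) == "left" else right).extend(words)
    return left + [lexicon[head]] + right

# Spanish "libro rojo" -> English "red book": the adjective dependent
# switches from the right of its head to the left.
tree = ("libro", [("amod", ("rojo", []))])
print(" ".join(transduce(tree, {"libro": "book", "rojo": "red"},
                         {"amod": "left"})))  # red book
```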
Forest rescoring: Faster decoding with integrated language models
 In ACL ’07
, 2007
Abstract

Cited by 71 (0 self)
Efficient decoding has been a fundamental problem in machine translation, especially with an integrated language model, which is essential for achieving good translation quality. We develop faster approaches for this problem based on k-best parsing algorithms and demonstrate their effectiveness on both phrase-based and syntax-based MT systems. In both cases, our methods achieve significant speed improvements, often by more than a factor of ten, over the conventional beam-search method at the same levels of search error and translation accuracy.
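The speedup from k-best parsing comes largely from lazy enumeration: when combining two cost-sorted lists of partial hypotheses, a heap can yield the k cheapest pairings without scoring all |a| × |b| combinations. The sketch below shows only that core enumeration pattern, with invented costs, not the paper's full decoder:

```python
# Lazy k-best combination of two cost-sorted hypothesis lists: pop the
# cheapest pairing, then push only its two "neighbour" pairings, so far
# fewer than len(a) * len(b) combinations are ever scored.

import heapq

def k_best_pairs(a, b, k):
    """a, b: ascending cost lists; return the k smallest sums a[i] + b[j]."""
    heap = [(a[0] + b[0], 0, 0)]   # start from the cheapest corner
    seen = {(0, 0)}
    out = []
    while heap and len(out) < k:
        cost, i, j = heapq.heappop(heap)
        out.append(cost)
        for ni, nj in ((i + 1, j), (i, j + 1)):   # frontier neighbours
            if ni < len(a) and nj < len(b) and (ni, nj) not in seen:
                seen.add((ni, nj))
                heapq.heappush(heap, (a[ni] + b[nj], ni, nj))
    return out

print(k_best_pairs([1.0, 2.0, 4.0], [0.5, 3.0], 3))  # [1.5, 2.5, 4.0]
```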
Context-Free Languages and Push-Down Automata
 Handbook of Formal Languages
, 1997
Abstract

Cited by 68 (0 self)
Contents:
1. Introduction (p. 2)
1.1 Grammars (p. 2)
1.2 Examples (p. 4)
2. Systems of equations (p. 5)
2.1 Systems (p. 6)
2.2 Resolution (p. 11)
2.3 Linear systems (p. 12)
2.4 Parikh's theorem
A discriminative latent variable model for statistical machine translation
 In Proc. of the 46th Annual Conference of the Association for Computational Linguistics: Human Language Technologies (ACL-08: HLT)
, 2008
Abstract

Cited by 62 (4 self)
Large-scale discriminative machine translation promises to further the state of the art, but has failed to deliver convincing gains over current heuristic frequency-count systems. We argue that a principal reason for this failure is not dealing with multiple, equivalent translations. We present a translation model which models derivations as a latent variable, in both training and decoding, and is fully discriminative and globally optimised. Results show that accounting for multiple derivations does indeed improve performance. Additionally, we show that regularisation is essential for maximum conditional likelihood models in order to avoid degenerate solutions.
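The difference between scoring a translation by its single best derivation and marginalising over all derivations that produce it can be shown with a tiny numeric example. The toy log-scores below are invented; the sketch only illustrates why the two objectives can prefer different translations:

```python
# Derivations as a latent variable: a translation's score marginalises
# (log-sum-exp) over all derivations yielding it, instead of taking only
# the single best derivation. Toy scores are invented for illustration.

import math

def logsumexp(xs):
    """Numerically stable log(sum(exp(x) for x in xs))."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

# log-scores of derivations, grouped by the translation they produce
derivations = {"t1": [-1.0, -1.2, -3.0],  # many moderate derivations
               "t2": [-0.9]}              # one slightly better derivation

best_by_max = max(derivations, key=lambda t: max(derivations[t]))
best_by_sum = max(derivations, key=lambda t: logsumexp(derivations[t]))
print(best_by_max, best_by_sum)  # t2 t1
```

Max-derivation picks t2 (its single derivation scores -0.9), but marginalising picks t1, whose several derivations jointly outweigh it; this is the effect the abstract argues matters in both training and decoding.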
Lexicalized Markov grammars for sentence compression
, 2007
Abstract

Cited by 48 (2 self)
We present a sentence compression system based on synchronous context-free grammars (SCFG), following the successful noisy-channel approach of (Knight and Marcu, 2000). We define a head-driven Markovization formulation of SCFG deletion rules, which allows us to lexicalize probabilities of constituent deletions. We also use a robust approach for tree-to-tree alignment between arbitrary document-abstract parallel corpora, which lets us train lexicalized models with much more data than previous approaches relying exclusively on scarcely available document-compression corpora. Finally, we evaluate different Markovized models, and find that our selected best model is one that exploits head-modifier bilexicalization to accurately distinguish adjuncts from complements, and that produces sentences that were judged more grammatical than those generated by previous work.
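At its simplest, compression by constituent deletion walks the parse tree and drops children that a deletion rule marks as removable (adjuncts), keeping heads and complements. The sketch below hard-codes that choice with a label set rather than the paper's lexicalized probabilities; the tree and labels are invented:

```python
# Toy sketch of compression by constituent deletion: drop children whose
# labels are marked deletable (adjuncts), keep the rest. The tree,
# labels, and deletion choices are invented for illustration.

def compress(node, deletable):
    """node = (label, children); leaves are words. Return kept words."""
    label, children = node
    if not children:                 # leaf: the word itself survives
        return [label]
    out = []
    for child in children:
        if child[0] in deletable:    # delete an adjunct constituent whole
            continue
        out.extend(compress(child, deletable))
    return out

sentence = ("S", [("NP", [("the", []), ("cat", [])]),
                  ("VP", [("slept", []),
                          ("PP-ADJ", [("on", []), ("the", []), ("mat", [])])])])
print(" ".join(compress(sentence, {"PP-ADJ"})))  # the cat slept
```

The paper's contribution is precisely in replacing the fixed `deletable` set with head-modifier bilexicalized probabilities, so the adjunct/complement decision depends on the words involved.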