A Systematic Comparison of Various Statistical Alignment Models
- Computational Linguistics
, 2003
Cited by 805 (22 self)
this article the problem of finding the word alignment of a bilingual sentence-aligned corpus by using language-independent statistical methods. There is a vast literature on this topic, and many different systems have been suggested to solve this problem. Our work follows and extends the methods introduced by Brown, Della Pietra, Della Pietra, and Mercer (1993) by using refined statistical models for the translation process. The basic idea of this approach is to develop a model of the translation process with the word alignment as a hidden variable of this process, to apply statistical estimation theory to compute the "optimal" model parameters, and to perform alignment search to compute the best word alignment
Fast Decoding and Optimal Decoding for Machine Translation
- In Proceedings of ACL 39
, 2001
"... A good decoding algorithm is critical ..."
Phrase-Based Statistical Machine Translation
, 2002
Cited by 64 (3 self)
This paper is based on the work carried out in the framework of the Verbmobil project, which is a limited-domain speech translation task (German-English). In the final evaluation, the statistical approach was found to perform best among five competing approaches. In this
A survey of statistical machine translation
, 2007
Cited by 30 (3 self)
Statistical machine translation (SMT) treats the translation of natural language as a machine learning problem. By examining many samples of human-produced translation, SMT algorithms automatically learn how to translate. SMT has made tremendous strides in less than two decades, and many popular techniques have only emerged within the last few years. This survey presents a tutorial overview of state-of-the-art SMT at the beginning of 2007. We begin with the context of the current research, and then move to a formal problem description and an overview of the four main subproblems: translational equivalence modeling, mathematical modeling, parameter estimation, and decoding. Along the way, we present a taxonomy of some different approaches within these areas. We conclude with an overview of evaluation and notes on future directions.
Forest rescoring: Faster decoding with integrated language models
- In ACL ’07
, 2007
Cited by 30 (0 self)
Efficient decoding has been a fundamental problem in machine translation, especially with an integrated language model which is essential for achieving good translation quality. We develop faster approaches for this problem based on k-best parsing algorithms and demonstrate their effectiveness on both phrase-based and syntax-based MT systems. In both cases, our methods achieve significant speed improvements, often by more than a factor of ten, over the conventional beam-search method at the same levels of search error and translation accuracy.
Maximum Entropy Based Phrase Reordering Model for Statistical Machine Translation
- In Proc. of COLING-ACL
, 2006
Cited by 28 (7 self)
We propose a novel reordering model for phrase-based statistical machine translation (SMT) that uses a maximum entropy (MaxEnt) model to predict reorderings of neighbor blocks (phrase pairs). The model provides content-dependent, hierarchical phrasal reordering with generalization based on features automatically learned from a real-world bitext. We present an algorithm to extract all reordering events of neighbor blocks from bilingual data. In our experiments on Chinese-to-English translation, this MaxEnt-based reordering model obtains significant improvements in BLEU score on the NIST MT-05 and IWSLT-04 tasks.
User-Friendly Text Prediction for Translators
- In Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing (EMNLP)
, 2002
Cited by 23 (4 self)
Text prediction is a form of interactive machine translation that is well suited to skilled translators. In principle it can assist in the production of a target text with minimal disruption to a translator's normal routine. However, recent evaluations of a prototype prediction system showed that it significantly decreased the productivity of most translators who used it. In this paper, we analyze the reasons for this and propose a solution which consists in seeking predictions that maximize the expected benefit to the translator, rather than just trying to anticipate some amount of upcoming text. Using a model of a "typical translator" constructed from data collected in the evaluations of the prediction prototype, we show that this approach has the potential to turn text prediction into a help rather than a hindrance to a translator.
Greedy Decoding for Statistical Machine Translation in Almost Linear Time
, 2003
Cited by 20 (2 self)
We present improvements to a greedy decoding algorithm for statistical machine translation that reduce its time complexity from at least cubic (O(n^6) when applied naively) to practically linear time without sacrificing translation quality. We achieve this by integrating hypothesis evaluation into hypothesis creation, tiling improvements over the translation hypothesis at the end of each search iteration, and by imposing restrictions on the amount of word reordering during decoding.
Some computational complexity results for synchronous context-free grammars
- In Proceedings of HLT/EMNLP-05
, 2005
Cited by 20 (2 self)
This paper investigates some computational problems associated with probabilistic translation models that have recently been adopted in the literature on machine translation. These models can be viewed as pairs of probabilistic context-free grammars working in a ‘synchronous’ way. Two hardness results for the class NP are reported, along with an exponential time lower-bound for certain classes of algorithms that are currently used in the literature.
Dependency tree translation: Syntactically informed phrasal SMT
- In ACL
, 2005
Cited by 19 (1 self)
We describe a novel approach to statistical machine translation that combines syntactic information in the source language with recent advances in phrasal translation. We depend on a source-language dependency parser and a word-aligned parallel corpus. The only target language resource assumed is a word breaker. These are used to produce treelet (“phrase”) translation pairs as well as several models, including a channel model, an order model, and a target language model. Together these models and the treelet translation pairs provide a powerful and promising approach to MT that incorporates the power of phrasal SMT with the linguistic generality available in a parser. We evaluate two decoding approaches, one inspired by dynamic programming and the

