A survey of statistical machine translation (2007)

by Adam Lopez
Citations: 30 (3 self)

BibTeX

@MISC{Lopez07asurvey,
    author = {Adam Lopez},
    title = {A survey of statistical machine translation},
    year = {2007}
}

Abstract

Statistical machine translation (SMT) treats the translation of natural language as a machine learning problem. By examining many samples of human-produced translation, SMT algorithms automatically learn how to translate. SMT has made tremendous strides in less than two decades, and new ideas are constantly introduced. This survey presents a tutorial overview of the state of the art. We describe the context of the current research and then move to a formal problem description and an overview of the main subproblems: translation modeling, parameter estimation, and decoding. Along the way, we present a taxonomy of some different approaches within these areas. We conclude with an overview of evaluation and a discussion of future directions.
