Robust Machine Translation Evaluation with Entailment Features (2009)

by Sebastian Padó, Michel Galley, Dan Jurafsky, Chris Manning
Venue: Proceedings of ACL-IJCNLP
Citations: 33 - 3 self

BibTeX

@INPROCEEDINGS{Padó09robustmachine,
    author = {Sebastian Padó and Michel Galley and Dan Jurafsky and Chris Manning},
    title = {Robust Machine Translation Evaluation with Entailment Features},
    booktitle = {Proceedings of ACL-IJCNLP},
    year = {2009},
    pages = {297--305}
}


Abstract

Existing evaluation metrics for machine translation lack crucial robustness: their correlations with human quality judgments vary considerably across languages and genres. We believe that the main reason is their inability to properly capture meaning: A good translation candidate means the same thing as the reference translation, regardless of formulation. We propose a metric that evaluates MT output based on a rich set of features motivated by textual entailment, such as lexical-semantic (in-)compatibility and argument structure overlap. We compare this metric against a combination metric of four state-of-the-art scores (BLEU, NIST, TER, and METEOR) in two different settings. The combination metric outperforms the individual scores, but is bested by the entailment-based metric. Combining the entailment and traditional features yields further improvements.
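The robustness the abstract refers to is typically measured as the correlation between an automatic metric's per-segment scores and human quality judgments. A minimal sketch of that computation, using a plain Pearson correlation over hypothetical scores (the metric and human values below are illustrative, not from the paper):

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical per-segment automatic metric scores (e.g. BLEU-like, 0..1)
# and corresponding human adequacy judgments (e.g. 1..5 scale).
metric_scores = [0.42, 0.55, 0.31, 0.78, 0.60]
human_scores = [3.0, 4.0, 2.0, 5.0, 4.0]

print(round(pearson(metric_scores, human_scores), 3))  # → 0.984
```

A metric is "robust" in the paper's sense if this correlation stays high when the evaluation set switches language pair or genre; the complaint is that for surface-overlap metrics it does not.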

Keyphrases

robust machine translation evaluation, entailment feature, machine translation, mt output, evaluation metric, main reason, state-of-the-art score, argument structure overlap, reference translation, crucial robustness, different setting, individual score, rich set, textual entailment, human quality judgment, good translation candidate, traditional feature yield improvement
