Results 1-10 of 13
Application-driven Statistical Paraphrase Generation
Cited by 9 (1 self)
Paraphrase generation (PG) is important in plenty of NLP applications. However, the research of PG is far from enough. In this paper, we propose a novel method for statistical paraphrase generation (SPG), which can (1) achieve various applications based on a uniform statistical model, and (2) naturally combine multiple resources to enhance the PG performance. In our experiments, we use the proposed method to generate paraphrases for three different applications. The results show that the method can be easily transformed from one application to another and generate valuable and interesting paraphrases.
Sentence-level MT Evaluation Without Reference Translations: Beyond Language Modeling
In European Association for Machine Translation (EAMT), 2005
Cited by 6 (0 self)
In this paper we investigate the possibility of evaluating MT quality and fluency at the sentence level in the absence of reference translations. We measure the correlation between automatically-generated scores and human judgments, and we evaluate the performance of our system when used as a classifier for identifying highly dysfluent and ill-formed sentences. We show that we can substantially improve on the correlation between language model perplexity scores and human judgment by combining these perplexity scores with class probabilities from a machine-learned classifier. The classifier uses linguistic features and has been trained to distinguish human translations from machine translations. We show that this approach also performs well in identifying dysfluent sentences.
Joint optimization for machine translation system combination
In Proc. EMNLP, 2009
Cited by 3 (0 self)
System combination has emerged as a powerful method for machine translation (MT). This paper pursues a joint optimization strategy for combining outputs from multiple MT systems, where word alignment, ordering, and lexical selection decisions are made jointly according to a set of feature functions combined in a single log-linear model. The decoding algorithm is described in detail and a set of new features that support this joint decoding approach is proposed. The approach is evaluated in comparison to state-of-the-art confusion-network-based system combination methods using equivalent features and shown to outperform them significantly.
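The log-linear combination of feature functions mentioned in the abstract above can be illustrated with a minimal sketch. It only shows how a weighted sum of feature values ranks candidate outputs; it does not attempt the paper's joint decoding over alignment, ordering, and lexical selection. The feature functions, weights, vocabulary, and candidate strings are all hypothetical and chosen purely for illustration.

```python
from typing import Callable, Dict, List

Feature = Callable[[str], float]

# Hypothetical toy vocabulary used by the illustrative "known words" feature.
VOCAB = {"the", "house", "is", "small"}


def log_linear_score(hyp: str, features: Dict[str, Feature],
                     weights: Dict[str, float]) -> float:
    """Score a hypothesis as a weighted sum of feature values.

    This sum is the log of the unnormalized log-linear model score.
    """
    return sum(weights[name] * f(hyp) for name, f in features.items())


def best_hypothesis(hyps: List[str], features: Dict[str, Feature],
                    weights: Dict[str, float]) -> str:
    """Pick the candidate with the highest log-linear score."""
    return max(hyps, key=lambda h: log_linear_score(h, features, weights))


# Assumed features (not from the paper): a word-count bonus and a penalty
# of -1 for every word outside the toy vocabulary.
features: Dict[str, Feature] = {
    "word_count": lambda h: float(len(h.split())),
    "unknown_penalty": lambda h: -float(sum(w not in VOCAB for w in h.split())),
}
weights = {"word_count": 0.1, "unknown_penalty": 1.0}

candidates = ["the house is small", "the house small is xx"]
print(best_hypothesis(candidates, features, weights))
```

In a real system the weights would be tuned on held-out data rather than set by hand, and the candidate space would be searched by a decoder instead of enumerated.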
Learning for Semantic Parsing and Natural Language Generation Using Statistical Machine Translation Techniques
2007
Improving Phrase-Based Translation via Word Alignments from Stochastic Inversion Transduction Grammars
Cited by 2 (2 self)
We argue that learning word alignments through a compositionally-structured, joint process yields higher phrase-based translation accuracy than the conventional heuristic of intersecting conditional models. Flawed word alignments can lead to flawed phrase translations that damage translation accuracy. Yet the IBM word alignments usually used today are known to be flawed, in large part because IBM models (1) model reordering by allowing unrestricted movement of words, rather than constrained movement of compositional units, and therefore must (2) attempt to compensate via directed, asymmetric distortion and fertility models. The conventional heuristics for attempting to recover from the resulting alignment errors involve estimating two directed models in opposite directions and then intersecting their alignments – to make up for the fact that, in reality, word alignment is an inherently joint relation. A natural alternative is provided by Inversion Transduction Grammars, which estimate the joint word alignment relation directly, eliminating the need for any of the conventional heuristics. We show that this alignment ultimately produces superior translation accuracy on BLEU, NIST, and METEOR metrics over three distinct language pairs.
Web-Based Machine Translation
2003
Cited by 2 (1 self)
This chapter has two main aims: (i) to present the state-of-the-art in Machine Translation (MT), namely Phrase-Based Statistical MT, together with the major competing paradigms used in MT research and development today; and (ii) to provide an overview of the MT research carried out by my team here at DCU, characterised here in terms of ‘hybrid MT’. In addition, we provide our views on the directions that MT research might take in the near future, and conclude the chapter with lists of further reading for the interested reader.
Detecting Inappropriate Use of Free Online Machine Translation by Language Students – A Special Case of Plagiarism Detection
Proceedings of the Eleventh Annual Conference of the European Association for Machine Translation
Cited by 1 (1 self)
The ready availability of free online machine translation (MT) systems has given rise to a problem in the world of language teaching in that students – especially weaker ones – use free online MT to do their translation homework. Apart from the pedagogic implications, one question of interest is whether we can devise any techniques for automatically detecting such use. This paper reports an experiment which aims to address this particular problem, using methods from the broader world of computational stylometry, plagiarism detection, text reuse, and MT evaluation. A pilot experiment comparing ‘honest’ and ‘derived’ translations produced by 25 intermediate learners of Spanish, Italian and German is reported.
Overview of NTCIR-9 RITE: Recognizing Inference in TExt
Cited by 1 (0 self)
This paper introduces an overview of the RITE (Recognizing Inference in TExt) task in NTCIR-9. We evaluate systems that automatically recognize entailment, paraphrase, and contradiction between two texts written in Japanese, Simplified Chinese, or Traditional Chinese. The task consists of four subtasks: Binary classification of entailment (BC); Multi-class classification including paraphrase and contradiction (MC); and two extrinsic application-oriented datasets: Entrance Exam and RITE4QA. This paper also describes how we built the test collection, evaluation metrics, and evaluation results of the submitted runs.
Statistical Post-Editing of a Rule-Based Machine Translation System
Cited by 1 (0 self)
Automatic post-editing (APE) systems aim at correcting the output of machine translation systems to produce better quality translations, i.e. produce translations that can be manually post-edited with an increase in productivity. In this work, we present an APE system that uses statistical models to enhance a commercial rule-based machine translation (RBMT) system. In addition, a procedure for effortless human evaluation has been established. We have tested the APE system with two corpora of different complexity. For the Parliament corpus, we show that the APE system significantly complements and improves the RBMT system. Results for the Protocols corpus, although less conclusive, are promising as well. Finally, several possible sources of errors have been identified which will help develop future system enhancements.
The Back-translation Score: Automatic MT Evaluation at the Sentence Level without Reference Translations
Automatic tools for machine translation (MT) evaluation such as BLEU are well established, but have the drawbacks that they do not perform well at the sentence level and that they presuppose manually translated reference texts. Assuming that the MT system to be evaluated can deal with both directions of a language pair, in this research we suggest conducting automatic MT evaluation by determining the orthographic similarity between a back-translation and the original source text. This way we eliminate the need for human translated reference texts. By correlating BLEU and back-translation scores with human judgments, it could be shown that the back-translation score gives an improved performance at the sentence level.
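The idea in the abstract above, scoring a translation by how closely its back-translation matches the original source, can be illustrated with a minimal sketch. The abstract does not name the orthographic similarity measure, so this sketch assumes the character-level ratio from Python's standard `difflib.SequenceMatcher` as a stand-in. The function name `back_translation_score` and the example sentences are hypothetical, and the forward and backward MT passes are assumed to have been run already.

```python
from difflib import SequenceMatcher


def back_translation_score(source: str, back_translation: str) -> float:
    """Return a similarity in [0, 1] between a source sentence and its back-translation.

    Assumption: SequenceMatcher's character-level ratio stands in for the
    orthographic similarity measure, which the abstract does not specify.
    """
    # Normalize case and whitespace so trivial differences do not lower the score.
    a = " ".join(source.lower().split())
    b = " ".join(back_translation.lower().split())
    return SequenceMatcher(None, a, b).ratio()


# Hypothetical data: each back-translation would come from translating the
# source into the target language and back with the MT system under test.
source = "The cat sat on the mat."
faithful = "The cat sat on the mat."
degraded = "On carpet the cat is seat."

print(back_translation_score(source, faithful))
print(back_translation_score(source, degraded))
```

No reference translation is needed here: only the source sentence and the system's own round-trip output enter the score.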

