Seoul, Korea

by Andrew Finch, Ezra Black, Epimenides Corp, Young-sook Hwang, Eiichiro Sumita
Cited by 1 (0 self)

Using Language and Translation Models to Select the Best among Outputs from Multiple MT Systems

by Yasuhiro Akiba, Taro Watanabe, Eiichiro Sumita
This paper addresses the problem of automatically selecting the best among outputs from multiple machine translation (MT) systems. Existing approaches select the output assigned the highest score according to a target language model. In some cases, the existing approaches do not work well. This paper proposes two methods to improve performance. The first method is based on a multiple comparison test and checks whether a score from language and translation models is significantly higher than the others. The second method is based on the probability that a translation is not inferior to the others, which is predicted from the above scores. Experimental results show that the proposed methods achieve an improvement of 2 to 6% in performance.

Toward a broad-coverage bilingual corpus for speech translation of travel conversations in the real world

by Toshiyuki Takezawa, Eiichiro Sumita, Fumiaki Sugaya, Hirofumi Yamamoto, Seiichi Yamamoto - in Proc. of the Third Int. Conf. on Language Resources and Evaluation (LREC), Las , 2002
Cited by 99 (16 self)
At ATR Spoken Language Translation Research Laboratories, we are building a broad-coverage bilingual corpus to study corpus-based speech translation technologies for the real world. There are three important points to consider in designing and constructing a corpus for future speech translation research. The first is to have a variety of speech samples, with a wide range of pronunciations and speakers. The second is to have data for a variety of situations. The third is to have a variety of expressions. This paper reports our trials and discusses the methodology. First, we introduce a bilingual travel conversation (TC) corpus of spoken languages and a broad-coverage bilingual basic expression (BE) corpus. TC and BE are designed to be complementary. TC is a collection of transcriptions of bilingual spoken dialogues, while BE is a collection of Japanese sentences and their English translations. Whereas TC covers a small domain, BE covers a wide variety of domains. We compare the characteristics of vocabulary and expressions between these two corpora and suggest that we need a much greater variety of expressions. One promising approach might be to collect paraphrases representing various different expressions generated by many people for similar concepts.

Experiments And Prospects Of Example-Based Machine Translation

by Eiichiro Sumita, Hitoshi Iida , 1991
Cited by 85 (10 self)
EBMT (Example-Based Machine Translation) is proposed. EBMT retrieves similar examples (pairs of source phrases, sentences, or texts and their translations) from a database of examples, adapting the examples to translate a new input. EBMT has the following features: (1) It is easily upgraded simply by inputting appropriate examples to the database; (2) It assigns a reliability factor to the translation result; (3) It is accelerated effectively by both indexing and parallel computing; (4) It is robust because of best-match reasoning; (5) It utilizes translator expertise well. A prototype system has been implemented to deal with a difficult translation problem for conventional Rule-Based Machine Translation (RBMT), i.e., translating Japanese noun phrases of the form "N1 no N2" into English. The system has achieved about a 78% success rate on average. This paper explains the basic idea of EBMT, illustrates the experiment in detail, explains the broad applicability of EBMT to several difficult translation problems for RBMT, and discusses the advantages of integrating EBMT with RBMT.

Using multiple edit distances to automatically rank machine translation output

by Yasuhiro Akiba, Kenji Imamura, Eiichiro Sumita - In Proc. MT Summit VIII , 2001
Cited by 34 (5 self)
This paper addresses the challenging problem of automatically evaluating output from machine translation (MT) systems in order to support the developers of these systems. Conventional approaches to the problem include methods that automatically assign a rank such as A, B, C, or D to MT output according to a single edit distance between this output and a correct translation example. The single edit distance can be differently designed, but changing its design makes assigning a certain rank more accurate, but another rank less accurate. This inhibits improving accuracy of rank assignment. To overcome this obstacle, this paper proposes an automatic ranking method that, by using multiple edit distances, encodes machine-translated sentences with a rank assigned by humans into multi-dimensional vectors from which a classifier of ranks is learned in the form of a decision tree (DT). The proposed method assigns a rank to MT output through the learned DT. The proposed method is evaluated using transcribed texts of real conversations in the travel arrangement domain. Experimental results show that the proposed method is more accurate than the single-edit-distance-based ranking methods, in both closed and open tests. Moreover, the proposed method could estimate MT quality within 3 % error in some cases.
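
The encoding step described in this abstract, turning one MT output into a vector of several edit distances, can be sketched roughly as follows. This is only an illustration: the function names and the three cost settings are assumptions chosen for the example, not the distance designs used in the paper, and the decision-tree learner that would consume the vectors is left out.

```python
from typing import List, Sequence


def edit_distance(hyp: Sequence[str], ref: Sequence[str],
                  sub_cost: float = 1.0, ins_cost: float = 1.0,
                  del_cost: float = 1.0) -> float:
    """Word-level weighted edit distance from hyp to ref (dynamic programming)."""
    # prev[j] = cost of turning an empty hypothesis prefix into ref[:j]
    prev = [j * ins_cost for j in range(len(ref) + 1)]
    for i, h in enumerate(hyp, start=1):
        curr = [i * del_cost]
        for j, r in enumerate(ref, start=1):
            curr.append(min(
                prev[j] + del_cost,                           # drop hypothesis word
                curr[j - 1] + ins_cost,                       # insert reference word
                prev[j - 1] + (0.0 if h == r else sub_cost),  # match or substitute
            ))
        prev = curr
    return prev[-1]


def distance_vector(hypothesis: str, reference: str) -> List[float]:
    """Encode one MT output as several edit distances (illustrative cost settings)."""
    h, r = hypothesis.split(), reference.split()
    return [
        edit_distance(h, r),                # plain Levenshtein
        edit_distance(h, r, sub_cost=2.0),  # substitution as costly as delete + insert
        edit_distance(h, r, ins_cost=0.5),  # missing words penalized less
    ]


if __name__ == "__main__":
    print(distance_vector("a b c", "a x c"))
```

Each human-ranked sentence would yield one such vector, and a classifier trained on the (vector, rank) pairs would then assign ranks to unseen output.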

Feedback Cleaning of Machine Translation Rules Using Automatic Evaluation

by Kenji Imamura, Eiichiro Sumita - In IPSJ , 2003
Cited by 22 (6 self)
When rules of transfer-based machine translation (MT) are automatically acquired from bilingual corpora, incorrect/redundant rules are generated due to acquisition errors or translation variety in the corpora. As a new countermeasure to this problem, we propose a feedback cleaning method using automatic evaluation of MT quality, which removes incorrect/redundant rules as a way to increase the evaluation score. BLEU is utilized for the automatic evaluation. The hill-climbing algorithm, which involves features of this task, is applied to searching for the optimal combination of rules. Our experiments show that the MT quality improves by 10% in test sentences according to a subjective evaluation. This is a considerable improvement over previous methods.
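
The search described in this abstract, removing rules whenever doing so raises an automatic score, can be sketched as a generic first-improvement hill climb. The `score` callback here is a hypothetical stand-in for translating a development set with the given rules and computing BLEU; the task-specific features of the paper's algorithm are not modeled.

```python
from typing import Callable, FrozenSet, Iterable, Tuple


def feedback_clean(rules: Iterable[str],
                   score: Callable[[FrozenSet[str]], float]
                   ) -> Tuple[FrozenSet[str], float]:
    """Greedily drop rules while each removal strictly improves the score."""
    current = frozenset(rules)
    best = score(current)
    improved = True
    while improved:
        improved = False
        for rule in sorted(current):
            candidate = current - {rule}
            candidate_score = score(candidate)
            if candidate_score > best:
                # Accept the first improving removal and restart the scan.
                current, best, improved = candidate, candidate_score, True
                break
    return current, best


if __name__ == "__main__":
    good, bad = {"g1", "g2"}, {"b1", "b2"}

    def toy_score(active: FrozenSet[str]) -> float:
        # Toy stand-in for an automatic metric: good rules help, bad rules hurt.
        return float(len(active & good) - len(active & bad))

    print(feedback_clean(good | bad, toy_score))
```

Hill climbing of this kind stops at a local optimum, so the result depends on the order in which removals are tried.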

Using machine translation evaluation techniques to determine sentence-level semantic equivalence

by Andrew Finch - In IWP2005 , 2005
Cited by 20 (0 self)
The task of machine translation (MT) evaluation is closely related to the task of sentence-level semantic equivalence classification. This paper investigates the utility of applying standard MT evaluation methods (BLEU, NIST, WER and PER) to building classifiers to predict semantic equivalence and entailment. We also introduce a novel classification method based on PER which leverages part of speech information of the words contributing to the word matches and non-matches in the sentence. Our results show that MT evaluation techniques are able to produce useful features for paraphrase classification and to a lesser extent entailment. Our technique gives a substantial improvement in paraphrase classification accuracy over all of the other models used in the experiments.
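
PER (position-independent word error rate), one of the metrics named in this abstract, treats both sentences as bags of words. The sketch below uses one common formulation and should be read as an assumption: the paper's exact definition and its part-of-speech extension are not reproduced here.

```python
from collections import Counter


def per(hypothesis: str, reference: str) -> float:
    """Position-independent error rate under a bag-of-words formulation."""
    hyp, ref = hypothesis.split(), reference.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Words shared by both sentences, ignoring order (multiset intersection).
    matches = sum((Counter(hyp) & Counter(ref)).values())
    # Extra hypothesis words beyond the reference length count as errors.
    surplus = max(0, len(hyp) - len(ref))
    return 1.0 - (matches - surplus) / len(ref)


if __name__ == "__main__":
    print(per("the cat sat", "sat the cat"))
    print(per("a b", "a c"))
```

A score like this, computed between two candidate paraphrases, could serve as one feature for an equivalence classifier.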

Automatic Paraphrasing Based on Parallel Corpus for Normalization

by Mitsuo Shimohata, Eiichiro Sumita - PROC. OF LREC , 2002
Cited by 14 (5 self)
Abstract not found

A unified approach in speech-to-speech translation: Integrating features of speech recognition and machine translation

by Ruiqiang Zhang, Genichiro Kikui, Hirofumi Yamamoto, Taro Watanabe, Frank Soong, Wai Kit Lo - Proc. of Coling 2004, Geneva , 2004
Cited by 21 (3 self)
Based upon a statistically trained speech translation system, in this study, we try to combine distinctive features derived from the two modules: speech recognition and statistical machine translation, in a log-linear model. The translation hypotheses are then rescored and translation performance is improved. The standard translation evaluation metrics, including BLEU, NIST, multiple reference word error rate and its position independent counterpart, were optimized to solve the weights of the features in the log-linear model. The experimental results have shown significant improvement over the baseline IBM model 4 in all automatic translation evaluation metrics. The largest was for BLEU, by 7.9% absolute.
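
Rescoring with a log-linear model, as this abstract describes, amounts to ranking each hypothesis by a weighted sum of its log-domain feature values. In the minimal sketch below the feature names (`asr`, `tm`) and the hand-set weights are hypothetical; in the paper the weights are tuned against metrics such as BLEU, which is not shown.

```python
from typing import Dict, List, Tuple

Hypothesis = Tuple[str, Dict[str, float]]  # (translation, log-domain feature values)


def loglinear_score(features: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of feature values, i.e. the log of the unnormalized model score."""
    return sum(weights[name] * value for name, value in features.items())


def rescore(nbest: List[Hypothesis], weights: Dict[str, float]) -> str:
    """Return the translation whose weighted feature sum is highest."""
    best_text, _ = max(nbest, key=lambda hyp: loglinear_score(hyp[1], weights))
    return best_text


if __name__ == "__main__":
    nbest = [
        ("a", {"asr": -1.0, "tm": -5.0}),
        ("b", {"asr": -2.0, "tm": -2.0}),
    ]
    print(rescore(nbest, {"asr": 1.0, "tm": 1.0}))
    print(rescore(nbest, {"asr": 1.0, "tm": 0.1}))
```

Changing the weights changes which hypothesis wins, which is why tuning them against an evaluation metric matters.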

A Translation Aid System Using Flexible Text Retrieval Based on Syntax-Matching

by Eiichiro Sumita, Yutaka Tsutsumi - TRL Research Report, IBM , 1988
Cited by 16 (3 self)
ETOC (Easy TO Consult) is a translation aid that provides a useful capability for flexible retrieval of texts from a bi-lingual dictionary or a translation database accumulated by the user or other users. The retrieval mechanism is based on syntax-matching driven by generalization rules. A practical response time is made possible by restricting the retrieval space, using a new data structure called a quick-look-up table. This method has the following advantages: (1) the user can input an appropriate text as a key, without using any special formal language, and (2) it is easy to produce domain-oriented systems by collecting pairs of typical source sentences and target translations that are specific to a particular domain,