Learning Transliteration Lexicons from the Web
Cited by 4 (1 self)
This paper presents an adaptive learning framework for Phonetic Similarity Modeling (PSM) that supports the automatic construction of transliteration lexicons. The learning algorithm starts with minimum prior knowledge about machine transliteration, and acquires knowledge iteratively from the Web. We study the active learning and the unsupervised learning strategies that minimize human supervision in terms of data labeling. The learning process refines the PSM and constructs a transliteration lexicon at the same time. We evaluate the proposed PSM and its learning algorithm through a series of systematic experiments, which show that the proposed framework is reliably effective on two independent databases.
Chinese-English Organization Name Translation Based on Correlative Expansion
Cited by 1 (0 self)
This paper presents an approach to translating Chinese organization names into English based on correlative expansion. First, candidate translations are generated using a statistical translation method, and several correlative named entities for the input are retrieved from a correlative named entity list. Second, three kinds of expansion methods are used to generate expanded queries. Finally, these queries are submitted to a search engine, and the refined translation results are mined and re-ranked using the returned web pages. Experimental results show that this approach outperforms the compared system in overall translation accuracy.
Toward Statistical Machine Translation without Parallel Corpora
Cited by 1 (1 self)
We estimate the parameters of a phrase-based statistical machine translation system from monolingual corpora instead of a bilingual parallel corpus. We extend existing research on bilingual lexicon induction to estimate both lexical and phrasal translation probabilities for MT-scale phrase tables. We propose a novel algorithm to estimate reordering probabilities from monolingual data. We report translation results for an end-to-end translation system using these monolingual features alone. Our method requires only monolingual corpora in source and target languages, a small bilingual dictionary, and a small bitext for tuning feature weights. In this paper, we examine an idealization where a phrase table is given. We examine the degradation in translation performance when bilingually estimated translation probabilities are removed and show that 80%+ of the loss can be recovered with monolingually estimated features alone. We further show that our monolingual features add 1.5 BLEU points when combined with standard bilingually estimated phrase table features.
Improving Named Entity Translation by Exploiting Comparable and Parallel Corpora
Translation of named entities (NEs), such as person, organization, country, and location names, is very important for several natural language processing applications. It plays a vital role in applications like cross-lingual information retrieval and machine translation. Web and news documents introduce new named entities on a regular basis. Those new names cannot be captured by ordinary machine translation systems. In this paper, we introduce a framework for extracting named entity translation pairs. The framework contains methods for exploiting both comparable and parallel corpora to generate a regularly updated list of named entity translation pairs. We evaluate the quality of the extracted translation pairs by showing that they improve the performance of a named entity translation system.
Mining the Web for Domain-Specific Translations
We introduce a method for learning to find domain-specific translations for a given term on the Web. In our approach, the source term is transformed into an expanded query aimed at maximizing the probability of retrieving translations from a very large collection of mixed-code documents. The method involves automatically generating sets of target-language words from training data in specific domains, and automatically selecting target words for effectiveness in retrieving documents containing the sought-after translations. At run time, the given term is transformed into an expanded query and submitted to a search engine, and ranked translations are extracted from the document snippets returned by the search engine. We present a prototype, TermMine, which applies the method to a Web search engine. Evaluations over a set of domains and terms show that TermMine outperforms state-of-the-art machine translation systems.
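The run-time step described in this abstract (expand the query, then rank translations found in the returned snippets) can be illustrated with a rough sketch. This is not TermMine's actual implementation: the function names `expand_query` and `rank_translations`, the choice of a Chinese source term with English targets, the frequency-based ranking, and the toy snippets are all assumptions made for illustration, and no real search engine is called.

```python
import re
from collections import Counter


def expand_query(term, domain_words, k=2):
    """Build an expanded query: the quoted source term plus the top-k
    target-language domain words (hypothetical expansion scheme)."""
    return " ".join([f'"{term}"'] + domain_words[:k])


def rank_translations(term, snippets):
    """Rank candidate translations by how often they occur in snippets
    that also contain the source term.

    Assumption: candidates are runs of Latin-script words, which only
    makes sense for a non-Latin source language.
    """
    counts = Counter()
    for snippet in snippets:
        if term not in snippet:
            continue  # ignore snippets that do not mention the source term
        for candidate in re.findall(r"[A-Za-z]+(?: [A-Za-z]+)*", snippet):
            counts[candidate.lower()] += 1
    return counts.most_common()


# Toy, hand-written snippets standing in for search engine results.
toy_snippets = [
    "傅立葉轉換 (Fourier transform) 是一種線性積分轉換",
    "傅立葉轉換 Fourier transform 的性質",
    "unrelated page without the source term",
]

print(expand_query("傅立葉轉換", ["signal", "frequency", "transform"]))
print(rank_translations("傅立葉轉換", toy_snippets))
```

Raw frequency is the simplest possible ranking signal; the abstract does not say how the extracted translations are actually scored.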
NLP Meets Library Science: Providing a Set of Enhanced Language Reference Tools for Online Translators
Introduction. We are developing an online translation aid tool that provides enhanced language reference tools, specifically designed for translators working online. This paper introduces the concepts and framework for enhancing and organising the reference tools to maximally help translators. Method. This paper is an essentially theoretical work, resorting to deduction from basic premises on language and translation. In order to design a maximally useful system, however, we also analysed

