Learning an English-Chinese lexicon from a parallel corpus
- In Proceedings of the First Conference of the Association for Machine Translation in the Americas, 1994
Cited by 27 (4 self)
We report experiments on automatic learning of an English-Chinese translation lexicon, through statistical training on a large parallel corpus. The learned vocabulary size is nontrivial at 6,517 English words averaging 2.33 Chinese translations per entry, with a manually filtered precision of 95.1% and a single-most-probable precision of 91.2%. We then introduce a significance filtering method that is fully automatic, yet still yields a weighted precision of 86.0%. Learning of translations is adaptive to the domain. To our knowledge, these are the first empirical results of the kind between an Indo-European and non-Indo-European language for any significant corpus size with a non-toy vocabulary.
Clause Alignment for Hong Kong Legal Texts: A Lexical-based Approach
- International Journal of Corpus Linguistics, 2004
Cited by 1 (0 self)
In this paper we report on our recent work in clause alignment for English-Chinese legal texts using available lexical resources including a bilingual legal glossary and a bilingual dictionary, for the purpose of acquiring examples at various linguistic levels for example-based machine translation. We present our formulation of an appropriate measure for the similarity of a candidate pair of clauses with respect to matched lexical items and the corresponding implementation of an effective algorithm for clause alignment based on this similarity measure. Experimental results show that the similarity measure and the lexical-based clause alignment algorithm, though very simple, are very effective, with a performance of 94.6% alignment accuracy. It confirms our intuition that lexical information gives a reliable indication of correct alignment. The significance of this lexical-based approach lies in both its simplicity and effectiveness.

