The Case for VM-based Cloudlets in Mobile Computing
Cited by 29 (4 self)
Mobile computing is at a fork in the road. After two decades of sustained effort by many researchers, we have developed the core concepts, techniques and mechanisms to provide a solid foundation for this still fast-growing
METIS-II. The German to English MT System
In Proceedings of the 11th Machine Translation Summit, 2007
Cited by 4 (2 self)
Within the METIS-II project, we have implemented a machine translation system which uses transfer and expander rules to build an AND/OR graph of partial translation hypotheses and a statistical ranker to find the best path through the graph. The paper gives an overview of the architecture and an evaluation of the system for several languages.
Compiling a Massive, Multilingual Dictionary via Probabilistic Inference
2009
Cited by 4 (1 self)
Can we automatically compose a large set of Wiktionaries and translation dictionaries to yield a massive, multilingual dictionary whose coverage is substantially greater than that of any of its constituent dictionaries? The composition of multiple translation dictionaries leads to a transitive inference problem: if word A translates to word B, which in turn translates to word C, what is the probability that C is a translation of A? The paper introduces a novel algorithm that solves this problem for 10,000,000 words in more than 1,000 languages. The algorithm yields PANDICTIONARY, a novel multilingual dictionary. PANDICTIONARY contains more than four times as many translations as the largest Wiktionary at precision 0.90, and over 200,000,000 pairwise translations in over 200,000 language pairs at precision 0.8.
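The transitive inference question in this abstract can be illustrated with a small sketch. It is not the paper's algorithm: the edge probabilities, the noisy-OR combination over two-step paths, and the names `build_graph` and `transitive_score` are all assumptions made for illustration.

```python
from collections import defaultdict


def build_graph(entries):
    """Build an undirected translation graph from (word_a, word_b, prob) triples.

    Each word is a string such as "en:spring"; prob is an assumed confidence
    that the dictionary entry is correct (hypothetical, not from the paper).
    """
    graph = defaultdict(dict)
    for a, b, p in entries:
        graph[a][b] = p
        graph[b][a] = p
    return graph


def transitive_score(graph, source, target):
    """Score 'target translates source' with a noisy-OR over two-step paths.

    Each path source -> pivot -> target contributes p1 * p2; independent
    paths through different pivots reinforce one another.
    """
    miss = 1.0  # probability that every path fails to support the translation
    for pivot, p1 in graph.get(source, {}).items():
        p2 = graph.get(pivot, {}).get(target)
        if p2 is not None:
            miss *= 1.0 - p1 * p2
    return 1.0 - miss


entries = [
    ("en:spring", "fr:printemps", 0.9),
    ("en:spring", "de:Fruehling", 0.8),
    ("en:spring", "fr:ressort", 0.9),
    ("fr:printemps", "es:primavera", 0.9),
    ("de:Fruehling", "es:primavera", 0.9),
    ("fr:ressort", "es:resorte", 0.9),
]
graph = build_graph(entries)
print(transitive_score(graph, "en:spring", "es:primavera"))
print(transitive_score(graph, "en:spring", "es:resorte"))
```

In this toy graph, "es:primavera" is reached through two pivots and so scores higher than "es:resorte", which is reached through one. A pivot word with several senses can still link unrelated words, so a simple path product like this one is only a starting point.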
Panlingual Lexical Translation via Probabilistic Inference
Cited by 2 (0 self)
The bare minimum lexical resource required to translate between a pair of languages is a translation dictionary. Unfortunately, dictionaries exist only between a tiny fraction of the 49 million possible language pairs, making machine translation virtually impossible between most languages. This paper summarizes the last four years of our research, motivated by the vision of panlingual communication. Our research comprises three key steps. First, we compile over 630 freely available dictionaries from the Web and convert this data into a single representation, the translation graph. Second, we build several inference algorithms that infer translations between word pairs even when no dictionary lists them as translations. Finally, we run our inference procedure offline to construct PANDICTIONARY, a sense-distinguished, massively multilingual dictionary that has translations in more than 1000 languages. Our experiments assess the quality of this dictionary and find that we have four times as many translations at a high precision of 0.9 compared to the English Wiktionary, which is the lexical resource closest to PANDICTIONARY.
Lemmatic Machine Translation
Cited by 2 (1 self)
Statistical MT is limited by reliance on large parallel corpora. We propose Lemmatic MT, a new paradigm that extends MT to a far broader set of languages, but requires substantial manual encoding effort. We present PANLINGUAL TRANSLATOR, a prototype Lemmatic MT system with high translation adequacy on 59% to 99% of sentences (average 84%) on a sample of 6 language pairs that Google Translate (GT) handles. GT ranged from 34% to 93%, average 65%. PANLINGUAL TRANSLATOR also had high translation adequacy on 27% to 82% of sentences (average 62%) from a sample of 5 language pairs not handled by GT.
Building a Sense-Distinguished Multilingual Lexicon from Monolingual Corpora and Bilingual Lexicons
2007
Cited by 2 (0 self)
Both lexical translation and knowledge-based translation systems require sense-distinguished translation lexicons, yet such lexicons are expensive to create manually. However, the abundance of untagged monolingual corpora and the availability of bilingual, machine-readable dictionaries (MRDs) suggest an opportunity. Our PanLexicon system takes advantage of these resources to automatically construct a sense-distinguished multilingual lexicon. The challenge for PanLexicon is that free, bilingual MRDs do not make sense distinctions, and often have spotty coverage. PanLexicon uses word contexts from monolingual corpora to guide it in finding translation sets – sets of words that share the same word sense across multiple languages. By maintaining word sense distinctions, PanLexicon finds translations between language pairs that are not supported by any of its bilingual source dictionaries. PanLexicon runs in time linear in the size of its input, and thus scales readily to large numbers of languages. We built a prototype of PanLexicon with inputs from Spanish-English and Chinese-English dictionaries. Our initial experimental results show that PanLexicon is able to find high-quality translation sets despite the limitations of its inputs.
Panlingual Lexical Translation via Probabilistic Inference
2010
The bare minimum lexical resource required to translate between a pair of languages is a translation dictionary. Unfortunately, dictionaries exist only between a tiny fraction of the 49 million possible language pairs, making machine translation virtually impossible between most languages. This paper summarizes the last four years of our research, motivated by the vision of panlingual communication. Our research comprises three key steps. First, we compile over 630 freely available dictionaries from the Web and convert this data into a single representation, the translation graph. Second, we build several inference algorithms that infer translations between word pairs even when no dictionary lists them as translations. Finally, we run our inference procedure offline to construct PANDICTIONARY, a sense-distinguished, massively multilingual dictionary that has translations in more than 1000 languages. Our experiments assess the quality of this dictionary and find that we have four times as many translations at a high precision of 0.9 compared to the English Wiktionary, which is the lexical resource closest to PANDICTIONARY.
Dealing with Bilingual Divergences . . .
2001
In this paper we present a prototype translation system that uses only a source-language (SL) tagger, a bilingual dictionary and a lemmatised target-language (TL) corpus. In our approach, the TL corpus is innovatively exploited both for lexical selection (selecting among the different translations proposed by the dictionary) and for structure building of the output. To that end, a series of n-gram models over lemmas and POS tags are built from the TL corpus, which are then searched at run-time. The system presented here uses Spanish as SL and English as TL, but the architecture is language independent and transferable to languages with very little NLP development.
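A minimal sketch can illustrate how n-gram counts over a lemmatised TL corpus might drive lexical selection. It assumes a plain bigram model over lemmas only (no POS tags) and an exhaustive search over candidate combinations; the functions `bigram_counts` and `select_lexical` and the toy corpus are hypothetical and are not taken from the system described in the abstract.

```python
from collections import Counter
from itertools import product


def bigram_counts(lemmatised_sentences):
    """Count adjacent lemma pairs in a lemmatised target-language corpus."""
    counts = Counter()
    for sentence in lemmatised_sentences:
        counts.update(zip(sentence, sentence[1:]))
    return counts


def select_lexical(candidates, counts):
    """Pick one TL lemma per SL word so that adjacent pairs are well attested.

    candidates: one list of candidate TL lemmas per SL word, as a bilingual
    dictionary might propose them. Exhaustive enumeration is exponential in
    sentence length, so this is only suitable for short toy inputs.
    """
    def score(sequence):
        return sum(counts[pair] for pair in zip(sequence, sequence[1:]))

    return max(product(*candidates), key=score)


corpus = [
    ["the", "bank", "of", "the", "river"],
    ["open", "a", "bank", "account"],
    ["sit", "on", "the", "bench"],
]
counts = bigram_counts(corpus)

# Spanish "banco" could be rendered as "bank" or "bench" by the dictionary.
choice = select_lexical([["the"], ["bank", "bench"], ["account"]], counts)
print(choice)
```

A realistic system would replace the exhaustive search with a beam or dynamic-programming search and would need smoothing for unseen n-grams.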

