The ISL phrase-based MT system for the 2007 ACL workshop on statistical MT
In Proc. of the Association for Computational Linguistics Workshop on Statistical Machine Translation, 2007
Cited by 2 (2 self)
Abstract
In this paper we describe the Interactive Systems Laboratories (ISL) phrase-based machine translation system used in the shared task "Machine Translation for European Languages" of the ACL 2007 Workshop on ...
Rapid Unsupervised Topic . . .
2009
Abstract
In open-domain language exploitation applications, a wide variety of topics with swift topic shifts has to be captured. Consequently, it is crucial to rapidly adapt all language components of a spoken language system. This thesis addresses unsupervised topic adaptation in both monolingual and crosslingual settings. For automatic speech recognition, we rapidly adapt a language model on a source language. For statistical machine translation, we adapt a language model of a target language, a translation lexicon, and a phrase table using a source text. For monolingual adaptation, we propose latent Dirichlet-Tree allocation for Bayesian latent semantic analysis. Our model enables rapid incremental language model adaptation by caching the fractional topic counts of word hypotheses decoded from previous speech utterances. Latent Dirichlet-Tree allocation models topic correlation in a tree-based hierarchy and thus addresses the model initialization issue. To address the "bag-of-words" assumption in latent semantic analysis, we extend our approach to N-gram latent Dirichlet-Tree allocation. We investigate a fractional Kneser-Ney smoothing approach to handle ...
Parallel combination of speech streams for improved ASR
Abstract
In a growing number of applications, such as simultaneous interpretation, audio or text may be available conveying the same information in different languages. These different views contain redundant information that can be explored to enhance the performance of speech and language processing applications. We propose a method that directly integrates ASR word graphs or lattices and phrase tables from an SMT system to combine such parallel speech data and improve ASR performance. We apply this technique to speeches from four European Parliament committees and obtain a 16.6% relative improvement (20.8% after a second iteration) in WER, when Portuguese and Spanish interpreted versions are combined with the original English speeches. Our results indicate that further improvements may be possible by including additional languages.
Index Terms: multistream combination, speech recognition, machine translation

