COMBINING KNOWLEDGE SOURCES TO REORDER N-BEST SPEECH HYPOTHESIS LISTS
, 1994
Cited by 40 (13 self)
A simple and general method is described that can combine different knowledge sources to reorder N-best lists of hypotheses produced by a speech recognizer. The method is automatically trainable, acquiring information from both positive and negative examples. In experiments, the method was tested on a 1000-utterance sample of unseen ATIS data.
Estimating Performance of Pipelined Spoken Language Translation Systems
- ICSLP'94. MULTILINGUAL EVALUATION
, 1994
Cited by 3 (1 self)
Most spoken language translation systems developed to date rely on a pipelined architecture, in which the main stages are speech recognition, linguistic analysis, transfer, generation and speech synthesis. When making projections of error rates for systems of this kind, it is natural to assume that the error rates for the individual components are independent, making the system accuracy the product of the component accuracies. The paper reports experiments carried out using the SRI-SICS-Telia Research Spoken Language Translator and a 1000-utterance sample of unseen data. The results suggest that the naive performance model leads to serious overestimates of system error rates, since there are in fact strong dependencies between the components. Predicting the system error rate on the independence assumption by simple multiplication resulted in a 16% proportional overestimate for all utterances,
Spoken-Language Machine Translation in Limited-Domain Tasks
- In Proceedings in Artificial Intelligence: CRIM/FORWISS Workshop on Progress and Prospects of Speech Research and Technology
, 1994
Cited by 2 (2 self)
Subsequential transducers constitute a formal model for translation that may be considered perhaps too simple to model translation between natural languages. However, their capability can suffice in limited-domain translation tasks. The finite-state nature of subsequential transducers makes their integration with well-known Continuous Speech Recognition technology both easy and efficient. A recent algorithm allows the automatic learning of these transducers, given a sufficiently large set of examples of sentences and their corresponding translations, and it also allows the incorporation of syntactic restrictions of the input and/or output languages. In this paper, we describe an implementation of a Speech Translation System for limited domains which is fully trainable and capable of real time translation from speech input.
Adaptation Of Hidden Markov Models Using Multiple Stochastic Transformations
- IEEE Trans. Speech and Audio Processing
, 1997
Cited by 2 (1 self)
The recognition accuracy in recent large vocabulary Automatic Speech Recognition (ASR) systems is highly related to the existing mismatch between the training and test sets. For example, dialect differences across the training and testing speakers result in a significant degradation in recognition performance. Some popular adaptation approaches improve the recognition performance of speech recognizers based on hidden Markov models with continuous mixture densities by using linear transforms to adapt the means, and possibly the covariances, of the mixture Gaussians. In this paper, we propose a novel adaptation technique that adapts the means and, optionally, the covariances of the mixture Gaussians by using multiple stochastic transformations. We perform both speaker and dialect adaptation experiments, and we show that our method significantly improves the recognition accuracy and the robustness of our system. The experiments are carried out with SRI's DECIPHER(TM) speech recognition sy...
Spoken-Language Machine Translation in Limited Domains: Can it be Achieved by Finite-State Models?
, 1995
Cited by 1 (0 self)
Subsequential transducers constitute a formal model for translation that may be considered perhaps too simple to model translation between natural languages. However, their capability can suffice in limited-domain translation tasks. The finite-state nature of subsequential transducers makes their integration with well-known Continuous Speech Recognition technology both easy and efficient. A recent algorithm allows the automatic learning of these transducers, given a sufficiently large set of examples of sentences and their corresponding translations, and it also allows the incorporation of syntactic restrictions of the input and/or output languages. In this paper, we describe an implementation of a Speech Translation System for limited domains which is fully trainable and capable of real time translation from speech input.
Abstract
, 2008
Most spoken language translation systems developed to date rely on a pipelined architecture, in which the main stages are speech recognition, linguistic analysis, transfer, generation and speech synthesis. When making projections of error rates for systems of this kind, it is natural to assume that the error rates for the individual components are independent, making the system accuracy the product of the component accuracies. The paper reports experiments carried out using the SRI-SICS-Telia Research Spoken Language Translator and a 1000-utterance sample of unseen data. The results suggest that the naive performance model leads to serious overestimates of system error rates, since there are in fact strong dependencies between the components. Predicting the system error rate on the independence assumption by simple multiplication resulted in a 16% proportional overestimate for all utterances, and a 19% overestimate when only utterances of length 1-10 words were considered.
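As a rough illustration of the independence model this abstract describes, the sketch below multiplies per-stage accuracies to predict a pipeline error rate and compares the prediction with an observed rate. The function names, the component error rates, and the observed rate are all hypothetical and are not taken from the paper. The definition of "proportional overestimate" as (predicted - observed) / observed is likewise an assumption about what the abstract means.

```python
from math import prod


def predicted_system_error(component_error_rates):
    """Error rate predicted by the naive independence model.

    System accuracy is taken to be the product of the component
    accuracies, so the predicted error is one minus that product.
    """
    system_accuracy = prod(1.0 - e for e in component_error_rates)
    return 1.0 - system_accuracy


def proportional_overestimate(predicted_error, observed_error):
    """Relative amount by which the prediction exceeds the observation.

    Assumed definition: (predicted - observed) / observed.
    """
    return (predicted_error - observed_error) / observed_error


if __name__ == "__main__":
    # Hypothetical per-stage error rates for a pipelined system
    # (illustrative values only, not figures from the paper).
    component_errors = [0.10, 0.05, 0.08]
    # Hypothetical end-to-end error rate measured on test data.
    observed = 0.18

    predicted = predicted_system_error(component_errors)
    print(f"predicted error: {predicted:.4f}")
    print(f"proportional overestimate: "
          f"{proportional_overestimate(predicted, observed):.1%}")
```

When component errors are positively correlated, so that the same utterances tend to fail at several stages, the observed end-to-end error falls below the product-based prediction, which is the kind of dependency the abstract reports.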

