Acoustic And Syntactical Modeling in the ATROS System
, 1999
Cited by 1 (1 self)
Current speech technology allows us to build efficient speech recognition systems. However, model learning of knowledge sources in a speech recognition system is not a closed problem. In addition, lower computational requirements are crucial to building real-time systems.
A New Verification-Based Fast Match Approach To Large Vocabulary Continuous Speech Recognition
- Proc. of European Conference on Speech Communication and Technology
, 2001
Cited by 1 (1 self)
Acoustic fast match is usually used to accelerate search in large vocabulary continuous speech recognition. This paper discusses a new acoustic fast match algorithm. This proposed fast match is based on incremental evaluation of the score and the use of normalized likelihood scores. This is in contrast to more traditional fast matches where a likelihood score is used. In addition, streaming SIMD extensions (SSE) for Intel machine instructions are used for fast Gaussian calculation. Results on a 20K Japanese broadcast news task show that the proposed fast match leads to about 30% improvement in speed with a slight performance degradation.
Fast Phoneme Look-Ahead in the ATROS system
- Accepted in VIII Spanish Symposium on Pattern Recognition and Image Analysis
, 1999
Current speech recognition systems require a lot of computational resources to decode an input utterance. Much effort has been devoted to reducing these requirements. One of the techniques being explored is fast phoneme look-ahead. The idea is to quickly compute approximate scores in order to prune unpromising hypotheses. These scores are computed by using simple phone-like units and analysing a look-ahead acoustic segment.
Assessing and Improving the Performance of Speech Recognition for Incremental Systems
In incremental spoken dialogue systems, partial hypotheses about what was said are required even while the utterance is still ongoing. We define measures for evaluating the quality of incremental ASR components with respect to the relative correctness of the partial hypotheses compared to hypotheses that can optimize over the complete input, the timing of hypothesis formation relative to the portion of the input they are about, and hypothesis stability, defined as the number of times they are revised. We show that simple incremental post-processing can improve stability dramatically, at the cost of timeliness (from 90% of edits of hypotheses being spurious down to 10% at a lag of 320 ms). The measures are not independent, and we show how system designers can find a desired operating point for their ASR. To our knowledge, we are the first to suggest and examine a variety of measures for assessing incremental ASR and improve performance on this basis.
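As an illustration of the stability measure mentioned in this abstract, the following minimal sketch counts word-level edits across a sequence of partial hypotheses and reports the share that were spurious. The edit model (revoke everything after the common prefix, then add the new words) and the function names are assumptions made for this example, not the paper's definitions.

```python
from typing import List, Sequence


def count_edits(prev: Sequence[str], curr: Sequence[str]) -> int:
    """Count add/revoke operations needed to turn prev into curr.

    Assumed edit model: words after the longest common prefix of prev
    are revoked one by one, then the remaining words of curr are added.
    """
    common = 0
    for a, b in zip(prev, curr):
        if a != b:
            break
        common += 1
    return (len(prev) - common) + (len(curr) - common)


def spurious_edit_ratio(partials: List[Sequence[str]]) -> float:
    """Fraction of edits that were not needed to build the final hypothesis.

    The necessary edits are taken to be one 'add' per word of the final
    hypothesis; every other edit is counted as spurious.
    """
    total = 0
    prev: Sequence[str] = []
    for curr in partials:
        total += count_edits(prev, curr)
        prev = curr
    if total == 0:
        return 0.0
    necessary = len(partials[-1])
    return (total - necessary) / total


# Hypothetical partial hypotheses emitted while an utterance unfolds.
partials = [["I"], ["I", "want"], ["I", "won't"], ["I", "want", "to"]]
ratio = spurious_edit_ratio(partials)
```

In this toy sequence, 7 edits are made in total and 3 are necessary, so 4 of the 7 edits count as spurious. Delaying output by some lag would remove some of these revisions at the cost of timeliness, which is the trade-off the abstract describes.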
unknown title
, 2006
lattice search technique for a long-contextual-span hidden trajectory model of speech
Online Adaptive Learning for Speech Recognition Decoding
We describe a new method for pruning in dynamic models based on running an adaptive filtering algorithm online during decoding to predict aspects of the scores in the near future. These predictions are used to make well-informed pruning decisions during model expansion. We apply this idea to the case of dynamic graphical models and test it on a speech recognition database derived from Switchboard. Results show that significant (approximately a factor of 2) speedups can be obtained without any increase in word error rate or increase in memory usage. Index Terms: graphical models, decoding, speech recognition, online learning
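The general idea of prediction-driven pruning can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which adaptive filter is used or which aspect of the scores is predicted, so the normalized LMS filter, the choice to predict the best score of the next frame, and the names `NLMSPredictor` and `prune` are all hypothetical. Scores are assumed to be log-likelihoods, where higher is better.

```python
from typing import List


class NLMSPredictor:
    """Normalized LMS filter predicting the next value of a score sequence."""

    def __init__(self, order: int = 3, step: float = 0.5, eps: float = 1e-8):
        self.w = [0.0] * order          # filter weights, adapted online
        self.history = [0.0] * order    # most recent observation first
        self.step = step
        self.eps = eps

    def predict(self) -> float:
        """Predict the next observation from the stored history."""
        return sum(w * x for w, x in zip(self.w, self.history))

    def update(self, observed: float) -> None:
        """Adapt the weights with the prediction error, then store the value."""
        error = observed - self.predict()
        norm = sum(x * x for x in self.history) + self.eps
        self.w = [w + self.step * error * x / norm
                  for w, x in zip(self.w, self.history)]
        self.history = [observed] + self.history[:-1]


def prune(scores: List[float], predicted_best: float, beam: float) -> List[float]:
    """Keep hypotheses whose score is within `beam` of the predicted best."""
    threshold = predicted_best - beam
    return [s for s in scores if s >= threshold]


# Toy usage: the filter learns a constant best score, then sets the threshold.
predictor = NLMSPredictor()
for _ in range(50):
    predictor.update(5.0)
survivors = prune([10.0, 7.0, 3.0], predicted_best=10.0, beam=4.0)
```

The design point is that the threshold is available before the next frame has been fully expanded, so hypotheses can be discarded early instead of after all scores are computed.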
The Time-Conditioned Approach in Dynamic Programming Search for LVCSR
This paper presents the time-conditioned approach in dynamic programming search for large-vocabulary continuous speech recognition. The following topics are presented: the baseline algorithm, a time-synchronous beam search version, a comparison with the word-conditioned approach, a comparison with stack decoding. The approach has been successfully tested on the NAB task using a vocabulary of 64 000 words. Index Terms: Beam search, dynamic programming, large vocabulary speech recognition, one-pass DP search, search organization, time-conditioned DP search.
Evaluation and Optimisation of Incremental Processors
Incremental spoken dialogue systems, which process user input as it unfolds, pose additional engineering challenges compared to more standard non-incremental systems: Their processing components must be able to accept partial, and possibly subsequently revised input, and must produce output that is at the same time as accurate as possible and delivered with as little delay as possible. In this article, we define metrics that measure how well a given processor meets these challenges, and we identify types of gold standards for evaluation. We exemplify these metrics in the evaluation of several incremental processors that we have developed. We also present generic means to optimise some of the measures, if certain trade-offs are accepted. We believe that this work will help enable principled comparison of components for incremental dialogue systems and portability of results.

