Hidden Model Sequence Models for Automatic Speech Recognition
, 2001
Cited by 4 (0 self)
Most modern automatic speech recognition systems make use of acoustic models based on hidden Markov models. To obtain reasonable recognition performance within a large vocabulary framework, the acoustic models usually include a pronunciation model, together with complex parameter tying schemes. In many cases the pronunciation model operates on a phoneme level and is derived independently of the underlying models. In contrast, this work is aimed at improving pronunciation modelling on a sub-phone level in a combined framework. The modelling of pronunciation variation is assumed to be of special importance for recognition of spontaneous speech.
Recognizing Sloppy Speech
, 2004
Cited by 1 (0 self)
As speech recognition moves from labs into the real world, the sloppy speech problem emerges as a major challenge. Sloppy speech, or conversational speech, refers to the speaking style people typically use in daily conversations. The recognition error rate for sloppy speech has been found to double that of read speech in many circumstances. Previous work on sloppy speech has focused on modeling pronunciation changes, primarily by adding pronunciation variants to the dictionary. The improvement, unfortunately, has been unsatisfactory. To improve
The State Based Mixture of Expert HMM with Applications to the Recognition of Spontaneous Speech
, 2001
Cited by 1 (0 self)
Dissertation submitted to the University of Cambridge for the degree of Doctor of Philosophy. Although the performance of speech recognition systems has increased substantially over the last decades, there still remain a number of tasks which pose considerable problems for current state-of-the-art techniques. One of these tasks is the recognition of spontaneous speech, which differs from read or planned speech in that its underlying dynamics change frequently over time. The negative effect of changes in acoustic background condition on recognition performance can also be observed in other situations as, for instance, in the case of speech that is corrupted by non-stationary noise. This thesis is concerned with the development of an acoustic model for speech recognition which automatically detects changes in the background condition of a signal and compensates for the model-data mismatch by combining the information of several expert models. These experts are specialised on the different acoustic conditions under consideration, and their influence on the recognition process is determined by how well their associated condition matches
Taming Recognition Errors with a Multimodal Interface
, 2000
e tion of this technology is the rate of errors and lack of graceful error handling [8]. Benchmark error rates reported for speech recognition systems are still too high to support many applications [4]; the amount of time users spend resolving errors can be substantial and frustrating. Although speech technology often performs adequately for native speakers of a language, for reading text, or when speaking in idealized laboratory conditions, current estimates indicate a 20%-50% decrease in recognition rates when speech is delivered under the following conditions: during natural spontaneous interaction; by diverse speakers (such as those with accents); or in a natural field environment. Word-error rates are known to vary directly with speaking style, such that the more natural the speech delivery, the higher the recognition system's word-error rate [11]. In a study by Mitch Weintraub and his colleagues at SRI International, speakers' word-error rates increased from 29% du
Searching Multimedia Content with a Spontaneous Conversational Speech Track
"... retrieval, multimedia access ..."

