Results 1 - 10 of 16,991
A maximum likelihood approach to continuous speech recognition
- IEEE Trans. Pattern Anal. Machine Intell., 1983
"... Speech recognition is formulated as a problem of maximum likelihood decoding. This formulation requires statistical models of the speech production process. In this paper, we describe a number of statistical models for use in speech recognition. We give special attention to determining the ..."
Cited by 477 (9 self)
Maximum Likelihood Linear Transformations for HMM-Based Speech Recognition
- Computer Speech and Language, 1998
"... This paper examines the application of linear transformations for speaker and environmental adaptation in an HMM-based speech recognition system. In particular, transformations that are trained in a maximum likelihood sense on adaptation data are investigated. Other than in the form of a simple bias ..."
Cited by 570 (68 self)
A tutorial on hidden Markov models and selected applications in speech recognition
- Proceedings of the IEEE, 1989
"... Although initially introduced and studied in the late 1960s and early 1970s, statistical methods of Markov source or hidden Markov modeling have become increasingly popular in the last several years. There are two strong reasons why this has occurred. First the models are very rich in mathematical s ..."
Cited by 5892 (1 self)
of statistical modeling and show how they have been applied to selected problems in machine recognition of speech.
The Aurora Experimental Framework for the Performance Evaluation of Speech Recognition Systems under Noisy Conditions
- in ISCA ITRW ASR2000, 2000
"... This paper describes a database designed to evaluate the performance of speech recognition algorithms in noisy conditions. The database may either be used to measure frontend feature extraction algorithms, using a defined HMM recognition back-end, or complete recognition systems. The source speech f ..."
Cited by 534 (6 self)
Self-organized language modeling for speech recognition
- Readings in Speech Recognition, 1990
Cited by 394 (6 self)
In the case of a trigram language model, the probability of the next word conditioned on the previous two words is estimated from a large corpus of text. The resulting static trigram language model (STLM) has fixed probabilities that are independent of the document being dictated. To improve the language model (LM), one can adapt the probabilities of the trigram language model to match the current document more closely. The partially dictated document provides significant clues about what words are more likely to be used next. Of many methods that can be used to adapt the LM, we describe in this paper a simple model based on the trigram frequencies estimated from the partially dictated document. We call this model a cache trigram language model (CTLM) since we are caching the recent history of words. We have found that the CTLM reduces the perplexity of a dictated document by 23%. The error rate of a 20,000-word isolated word recognizer decreases by about 5% at the beginning of a document and by about 24% after a few hundred words.
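The cache idea in this abstract can be illustrated with a minimal sketch. It assumes the cache estimate is combined with the static model by linear interpolation with a fixed weight; the abstract does not state how the two are combined, so that choice, the weight value, and all names here (`CacheTrigramLM`, `static_prob`, `cache_weight`) are hypothetical.

```python
from collections import Counter


class CacheTrigramLM:
    """Static trigram model adapted with trigram counts from the document so far."""

    def __init__(self, static_prob, cache_weight=0.2):
        # static_prob(w, u, v) returns P(w | u, v) under the static model.
        # It is a caller-supplied callable (hypothetical interface).
        self.static_prob = static_prob
        # Interpolation weight for the cache estimate (illustrative value).
        self.cache_weight = cache_weight
        self.words = []
        self.trigram_counts = Counter()
        self.context_counts = Counter()

    def observe(self, word):
        """Add one dictated word to the cache and update trigram counts."""
        self.words.append(word)
        if len(self.words) >= 3:
            u, v, w = self.words[-3:]
            self.trigram_counts[(u, v, w)] += 1
            self.context_counts[(u, v)] += 1

    def prob(self, word, u, v):
        """P(word | u, v), mixing static and cache estimates (assumed scheme)."""
        static = self.static_prob(word, u, v)
        seen = self.context_counts[(u, v)]
        if seen == 0:
            # The context has not occurred in this document: use the static model.
            return static
        cached = self.trigram_counts[(u, v, word)] / seen
        return (1 - self.cache_weight) * static + self.cache_weight * cached


# Toy usage: a uniform static model over a 4-word vocabulary.
lm = CacheTrigramLM(lambda w, u, v: 0.25, cache_weight=0.5)
for token in "the cat sat the cat sat".split():
    lm.observe(token)
print(lm.prob("sat", "the", "cat"))  # 0.625: the cache raises the static 0.25
print(lm.prob("sat", "a", "dog"))    # 0.25: unseen context, static model only
```

In this toy run the context "the cat" has been followed by "sat" both times it occurred, so the cache estimate is 1.0 and the mixed probability rises above the static value.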
Speech Recognition
- 1996
"... you to attend a mini-conference on Speech Recognition, being given by students in EE 8993 — Fundamentals of Speech Recognition. Papers will be presented on a wide range of topics including signal processing, Hidden Markov Models, search, and language modeling. Students will present their semester-lo ..."
Multi Stream Speech Recognition
- 1996
"... In this paper, we discuss a new automatic speech recognition (ASR) approach based on independent processing and recombination of several feature streams. In this framework, it is assumed that the speech signal is represented in terms of multiple input streams, each input stream representing a diff ..."
Cited by 150 (18 self)
Speech Recognition
"... The use of speech recognition in noisy automotive environments requires the application of speech enhancement algorithms to improve recognition performance. Deploying these enhancement techniques necessitates significant engineering to ensure algorithms are realisable in electronic hardwa ..."
The Kaldi speech recognition toolkit
- Proc. ASRU, 2011
"... We describe the design of Kaldi, a free, open-source toolkit for speech recognition research. Kaldi provides a speech recognition system based on finite-state transducers (using the freely available OpenFst), together with detailed documentation and scripts for building complete recognitio ..."
Cited by 147 (16 self)
Speech recognition by machines and humans
- 1997
"... This paper reviews past work comparing modern speech recognition systems and humans to determine how far recent dramatic advances in technology have progressed towards the goal of human-like performance. Comparisons use six modern speech corpora with vocabularies ranging from 10 to more than 65,000 ..."
Cited by 185 (0 self)