Results 1 - 10 of 303
Deep Neural Networks for Acoustic Modeling in Speech Recognition
"... Most current speech recognition systems use hidden Markov models (HMMs) to deal with the temporal variability of speech and Gaussian mixture models to determine how well each state of each HMM fits a frame or a short window of frames of coefficients that represents the acoustic input. An alternative ..."
Cited by 272 (47 self)
Multilingual Acoustic Modeling for Speech Recognition based on Subspace Gaussian Mixture Models
- in IEEE ICASSP
, 2010
"... Although research has previously been done on multilingual speech recognition, it has been found to be very difficult to improve over separately trained systems. The usual approach has been to use some kind of “universal phone set” that covers multiple languages. We report experiments on a differen ..."
Cited by 30 (5 self)
different approach to multilingual speech recognition, in which the phone sets are entirely distinct but the model has parameters not tied to specific states that are shared across languages. We use a model called a “Subspace Gaussian Mixture Model” where states’ distributions are Gaussian Mixture Models
SUBSPACE GAUSSIAN MIXTURE MODELS FOR SPEECH RECOGNITION
"... This technical report contains the details of an acoustic modeling approach based on subspace adaptation of a shared Gaussian Mixture Model. This refers to adaptation to a particular speech state; it is not a speaker adaptation technique, although we do later introduce a speaker adaptation technique ..."
Cited by 15 (5 self)
Subspace Gaussian Mixture Models for Automatic Speech Recognition
"... In most of state-of-the-art speech recognition systems, Gaussian mixture models (GMMs) are used to model the density of the emitting states in the hidden Markov models (HMMs). In a conventional system, the model parameters of each GMM are estimated directly and independently given the alignment. Thi ..."
Regularized Subspace Gaussian Mixture Models for Cross-lingual Speech Recognition
"... Abstract—We investigate cross-lingual acoustic modelling for low resource languages using the subspace Gaussian mixture model (SGMM). We assume the presence of acoustic models trained on multiple source languages, and use the global subspace parameters from those models for improved modelling in a t ..."
Cited by 16 (5 self)
The Kaldi speech recognition toolkit
- Proc. ASRU
, 2011
"... Abstract-We describe the design of Kaldi, a free, open-source toolkit for speech recognition research. Kaldi provides a speech recognition system based on finite-state transducers (using the freely available OpenFst), together with detailed documentation and scripts for building complete recognitio ..."
Cited by 147 (16 self)
recognition systems. Kaldi is written in C++, and the core library supports modeling of arbitrary phonetic-context sizes, acoustic modeling with subspace Gaussian mixture models (SGMM) as well as standard Gaussian mixture models, together with all commonly used linear and affine transforms. Kaldi is released
Comparison of Subspace Methods for Gaussian Mixture Models in Speech Recognition
"... Speech recognizers typically use high-dimensional feature vectors to capture the essential cues for speech recognition purposes. The acoustics are then commonly modeled with a Hidden Markov Model with Gaussian Mixture Models as observation probability density functions. Using unrestricted Gaussian p ..."
Cited by 2 (0 self)
Cross-Lingual Subspace Gaussian Mixture Models for Low-Resource Speech Recognition
- IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, 28(7):1116–1126, 2010
, 2013
"... This paper studies cross-lingual acoustic modelling in the context of subspace Gaussian mixture models (SGMMs). SGMMs factorize the acoustic model parameters into a set that is globally shared between all the states of a hidden Markov model (HMM) and another that is specific to the HMM states. We de ..."
Noise Compensation for Subspace Gaussian Mixture Models
"... Joint uncertainty decoding (JUD) is an effective model-based noise compensation technique for conventional Gaussian mixture model (GMM) based speech recognition systems. In this paper, we apply JUD to subspace Gaussian mixture model (SGMM) based acoustic models. The total number of Gaussians in the ..."
Cited by 3 (2 self)
Acoustic Modeling With Mixtures of Subspace Constrained Exponential Models
- IN PROC. EUROSPEECH
, 2003
"... Gaussian distributions are usually parameterized with their natural parameters: the mean μ and the covariance Σ. They can also be re-parameterized as exponential models with canonical parameters P = Σ⁻¹ and ψ = Pμ. In this paper we consider modeling acoustics with mixtures of Gaussians parameterized ..."
Cited by 5 (5 self)
likelihood estimation of the subspace and parameters within a fixed subspace. In speech recognition experiments, we show that this model improves upon all of the above classes of models with roughly the same number of parameters and with little computational overhead. In particular we get 30-40% relative