Results 1 - 10 of 303

Deep Neural Networks for Acoustic Modeling in Speech Recognition

by Geoffrey Hinton, Li Deng, Dong Yu, George Dahl, Abdel-rahman Mohamed, Navdeep Jaitly, Vincent Vanhoucke, Patrick Nguyen, Tara Sainath, Brian Kingsbury
"... Most current speech recognition systems use hidden Markov models (HMMs) to deal with the temporal variability of speech and Gaussian mixture models to determine how well each state of each HMM fits a frame or a short window of frames of coefficients that represents the acoustic input. An alternative ..."
Cited by 272 (47 self)
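As background for this and several of the entries below, the per-state score the snippet refers to is the standard HMM-GMM emission density; in generic notation (not quoted from the paper),

    p(x_t \mid s) = \sum_{m=1}^{M} c_{sm} \, \mathcal{N}(x_t;\, \mu_{sm}, \Sigma_{sm}), \qquad \sum_{m=1}^{M} c_{sm} = 1,

where x_t is the acoustic feature vector for frame t, s is an HMM state, and c_{sm}, \mu_{sm}, \Sigma_{sm} are the weight, mean and covariance of the state's m-th Gaussian component.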

Multilingual Acoustic Modeling for Speech Recognition based on Subspace Gaussian Mixture Models

by Petr Schwarz, Mohit Agarwal, Pinar Akyazi, Kai Feng, Arnab Ghoshal, Nagendra Goel, Martin Karafiát, Daniel Povey, Ariya Rastrow, Richard C. Rose, Samuel Thomas - in IEEE ICASSP, 2010
"... Although research has previously been done on multilingual speech recognition, it has been found to be very difficult to improve over separately trained systems. The usual approach has been to use some kind of “universal phone set ” that covers multiple languages. We report experiments on a differen ..."
Abstract - Cited by 30 (5 self) - Add to MetaCart
different approach to multilingual speech recognition, in which the phone sets are entirely distinct but the model has parameters not tied to specific states that are shared across languages. We use a model called a “Subspace Gaussian Mixture Model ” where states ’ distributions are Gaussian Mixture Models
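To make the parameter sharing concrete, a compact sketch of the basic subspace Gaussian mixture model (substate and speaker-subspace extensions omitted): the density of HMM state j is

    p(x \mid j) = \sum_{i=1}^{I} w_{ji} \, \mathcal{N}(x;\, \mu_{ji}, \Sigma_i), \qquad \mu_{ji} = M_i v_j, \qquad w_{ji} = \frac{\exp(w_i^T v_j)}{\sum_{i'=1}^{I} \exp(w_{i'}^T v_j)},

so only the low-dimensional state vector v_j is state-specific, while the projections M_i, w_i and the covariances \Sigma_i are global. It is these global parameters that the multilingual and cross-lingual papers in this list share across languages.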

Subspace Gaussian Mixture Models for Speech Recognition

by Daniel Povey
"... This technical report contains the details of an acoustic modeling approach based on subspace adaptation of a shared Gaussian Mixture Model. This refers to adaptation to a particular speech state; it is not a speaker adaptation technique, although we do later introduce a speaker adaptation technique ..."
Cited by 15 (5 self)

Subspace Gaussian Mixture Models for Automatic Speech Recognition

by Liang Lu
"... In most of state-of-the-art speech recognition systems, Gaussian mixture models (GMMs) are used to model the density of the emitting states in the hidden Markov models (HMMs). In a conventional system, the model parameters of each GMM are estimated directly and independently given the alignment. Thi ..."
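For contrast with the subspace approach, the "directly and independently" estimated parameters in a conventional HMM-GMM system are just per-state maximum-likelihood statistics; a generic EM re-estimation sketch (notation not taken from the thesis):

    \hat{\mu}_{jm} = \frac{\sum_t \gamma_{jm}(t)\, x_t}{\sum_t \gamma_{jm}(t)}, \qquad \hat{\Sigma}_{jm} = \frac{\sum_t \gamma_{jm}(t)\,(x_t - \hat{\mu}_{jm})(x_t - \hat{\mu}_{jm})^T}{\sum_t \gamma_{jm}(t)}, \qquad \hat{c}_{jm} = \frac{\sum_t \gamma_{jm}(t)}{\sum_{m'}\sum_t \gamma_{jm'}(t)},

where \gamma_{jm}(t) is the posterior of component m of state j at frame t under the current alignment, and no parameters are tied across states.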

Regularized Subspace Gaussian Mixture Models for Cross-lingual Speech Recognition

by Liang Lu, Arnab Ghoshal, Steve Renals
"... Abstract—We investigate cross-lingual acoustic modelling for low resource languages using the subspace Gaussian mixture model (SGMM). We assume the presence of acoustic models trained on multiple source languages, and use the global subspace parameters from those models for improved modelling in a t ..."
Cited by 16 (5 self)

The Kaldi Speech Recognition Toolkit

by Daniel Povey, Arnab Ghoshal, Gilles Boulianne, Lukáš Burget, Ondřej Glembek, Nagendra Goel, Mirko Hannemann, Petr Motlíček, Yanmin Qian, Petr Schwarz, Jan Silovský, Georg Stemmer, Karel Veselý - in Proc. ASRU, 2011
"... Abstract-We describe the design of Kaldi, a free, open-source toolkit for speech recognition research. Kaldi provides a speech recognition system based on finite-state transducers (using the freely available OpenFst), together with detailed documentation and scripts for building complete recognitio ..."
Abstract - Cited by 147 (16 self) - Add to MetaCart
recognition systems. Kaldi is written is C++, and the core library supports modeling of arbitrary phonetic-context sizes, acoustic modeling with subspace Gaussian mixture models (SGMM) as well as standard Gaussian mixture models, together with all commonly used linear and affine transforms. Kaldi is released

Comparison of Subspace Methods for Gaussian Mixture Models in Speech Recognition

by Matti Varjokallio, Mikko Kurimo
"... Speech recognizers typically use high-dimensional feature vectors to capture the essential cues for speech recognition purposes. The acoustics are then commonly modeled with a Hidden Markov Model with Gaussian Mixture Models as observation probability density functions. Using unrestricted Gaussian p ..."
Cited by 2 (0 self)
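A quick parameter count shows why unrestricted Gaussians are costly in high dimensions: a full-covariance Gaussian over d-dimensional features has d + d(d+1)/2 free parameters, so for a typical d = 39 feature vector that is 39 + 780 = 819 parameters per component, versus 78 for a diagonal covariance. Reducing this cost is what subspace-constrained parameterizations of the kind compared in this paper aim at.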

Cross-Lingual Subspace Gaussian Mixture Models for Low-Resource Speech Recognition

by Liang Lu, Arnab Ghoshal, Steve Renals - IEEE Journal on Selected Areas in Communications, 28(7):1116–1126, 2010, 2013
"... This paper studies cross-lingual acoustic modelling in the context of subspace Gaussian mixture models (SGMMs). SGMMs factorize the acoustic model parameters into a set that is globally shared between all the states of a hidden Markov model (HMM) and another that is specific to the HMM states. We de ..."

Noise Compensation for Subspace Gaussian Mixture Models

by Liang Lu, Kk Chin, Arnab Ghoshal, Steve Renals
"... Joint uncertainty decoding (JUD) is an effective model-based noise compensation technique for conventional Gaussian mixture model (GMM) based speech recognition systems. In this paper, we apply JUD to subspace Gaussian mixture model (SGMM) based acoustic models. The total number of Gaussians in the ..."
Cited by 3 (2 self)
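As context for this entry: in its usual GMM formulation, joint uncertainty decoding compensates the likelihood of a noisy observation y_t with a per-regression-class affine transform plus a variance bias; a sketch in standard notation (not quoted from the paper),

    p(y_t \mid m) \approx |A_r| \, \mathcal{N}(A_r y_t + b_r;\, \mu_m, \Sigma_m + \Sigma_{b,r}),

where Gaussian component m is assigned to regression class r and the parameters {A_r, b_r, \Sigma_{b,r}} are derived from the joint distribution of clean and noisy speech for that class. The paper's point, per the snippet, is to carry this style of compensation over to SGMM acoustic models, whose covariances are globally shared.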

Acoustic Modeling With Mixtures of Subspace Constrained Exponential Models

by Karthik Visweswariah, Scott Axelrod, Ramesh Gopinath - in Proc. Eurospeech, 2003
"... Gaussian distributions are usually parameterized with their natural parameters: the mean and the covariance #. They can also be re-parameterized as exponential models with canonical parameters P = # -1 and # = P. In this paper we consider modeling acoustics with mixtures of Gaussians parameterized ..."
Abstract - Cited by 5 (5 self) - Add to MetaCart
likelihood estimation of the subspace and parameters within a fixed subspace. In speech recognition experiments, we show that this model improves upon all of the above classes of models with roughly the same number of parameters and with little computational overhead. In particular we get 30-40% relative
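Spelling out the re-parameterization the abstract refers to: with precision P = \Sigma^{-1} and \psi = P\mu, the Gaussian log-density is linear in the canonical parameters,

    \log \mathcal{N}(x;\, \mu, \Sigma) = \psi^T x - \tfrac{1}{2} x^T P x - \tfrac{1}{2}\left( \psi^T P^{-1} \psi + \log |2\pi P^{-1}| \right),

so a subspace constraint of the kind named in the title can be placed directly on the canonical parameters (\psi, P) of every mixture component rather than on the means and covariances.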