From HMM's to Segment Models: A Unified View of Stochastic Modeling for Speech Recognition
, 1996
"... ..."
Hybrid HMM/ANN Systems for Speech Recognition: Overview and New Research Directions
in Adaptive Processing of Sequences and Data Structures, ser. Lecture Notes in Artificial Intelligence, vol. 1387
, 1998
"... ..."
(Show Context)
Discriminative Training of Hidden Markov Models
, 1998
"... vi Abbreviations vii Notation viii 1 Introduction 1 2 Hidden Markov Models 4 2.1 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 2.2 HMM Modelling Assumptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 2.3 HMM Topology . . . . . . . . . ..."
Abstract

Cited by 27 (0 self)
Contents: Abbreviations; Notation; 1 Introduction; 2 Hidden Markov Models: 2.1 Definition, 2.2 HMM Modelling Assumptions, 2.3 HMM Topology, 2.4 Finding the Best Transcription, 2.5 Setting the Parameters, 2.6 Summary; 3 Objective Functions: 3.1 Properties of Maximum Likelihood Estimators, 3.2 Maximum Likelihood, 3.3 Maximum Mutual Information, 3.4 Frame Discrimination ...
Using Self-Organizing Maps and Learning Vector Quantization for Mixture Density Hidden Markov Models
, 1997
"... This work presents experiments to recognize pattern sequences using hidden Markov models (HMMs). The pattern sequences in the experiments are computed from speech signals and the recognition task is to decode the corresponding phoneme sequences. The training of the HMMs of the phonemes using the col ..."
Abstract

Cited by 22 (9 self)
This work presents experiments to recognize pattern sequences using hidden Markov models (HMMs). The pattern sequences in the experiments are computed from speech signals and the recognition task is to decode the corresponding phoneme sequences. The training of the HMMs of the phonemes using the collected speech samples is a difficult task because of the natural variation in the speech. Two neural computing paradigms, the Self-Organizing Map (SOM) and Learning Vector Quantization (LVQ), are used in the experiments to improve the recognition performance of the models. An HMM consists of sequential states which are trained to model the feature changes in the signal produced during the modeled process. The output densities applied in this work are mixtures of Gaussian density functions. SOMs are applied to initialize and train the mixtures to give a smooth and faithful presentation of the feature vector space defined by the corresponding training samples. The SOM maps similar feature vect...
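The SOM-based mixture initialization described in this abstract can be sketched as follows. This is a minimal, hypothetical illustration (toy grid size, random stand-in features, simplified learning-rate and neighborhood schedules), not the configuration used in the paper:

```python
import numpy as np

def train_som(data, grid_w=4, grid_h=4, epochs=20, seed=0):
    """Train a tiny 2-D self-organizing map; returns the unit weight vectors.

    Hypothetical illustration of the SOM idea, not the authors' setup."""
    rng = np.random.default_rng(seed)
    dim = data.shape[1]
    weights = rng.normal(size=(grid_w * grid_h, dim))
    # Grid coordinates of each unit, used for the neighborhood kernel.
    coords = np.array([(i, j) for i in range(grid_w) for j in range(grid_h)], float)
    for epoch in range(epochs):
        lr = 0.5 * (1.0 - epoch / epochs)                 # decaying learning rate
        sigma = max(1e-3, 2.0 * (1.0 - epoch / epochs))   # shrinking neighborhood
        for x in data[rng.permutation(len(data))]:
            bmu = np.argmin(((weights - x) ** 2).sum(axis=1))  # best-matching unit
            h = np.exp(-((coords - coords[bmu]) ** 2).sum(axis=1) / (2 * sigma ** 2))
            weights += lr * h[:, None] * (x - weights)    # pull neighbors toward x
    return weights

# Use the SOM unit vectors as initial Gaussian mixture means: a smooth,
# data-faithful spread over the feature space.
rng = np.random.default_rng(1)
features = rng.normal(size=(500, 12))   # stand-in for speech feature vectors
means = train_som(features)
print(means.shape)  # (16, 12)
```

The trained unit vectors, spread smoothly over the data by the neighborhood function, would then serve as initial means for the Gaussian mixture components of each HMM state.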
Mixture of Experts Regression Modeling by Deterministic Annealing
 IEEE Transactions on Signal Processing
, 1997
"... We propose a new learning algorithm for regression modeling. The method is especially suitable for optimizing neural network structures that are amenable to a statistical description as mixture models. These include mixture of experts, hierarchical mixture of experts (HME), and normalized radial bas ..."
Abstract

Cited by 19 (3 self)
We propose a new learning algorithm for regression modeling. The method is especially suitable for optimizing neural network structures that are amenable to a statistical description as mixture models. These include mixture of experts, hierarchical mixture of experts (HME), and normalized radial basis functions (NRBF). Unlike recent maximum likelihood (ML) approaches, we directly minimize the (squared) regression error. We use the probabilistic framework as a means to define an optimization method that avoids many shallow local minima on the complex cost surface. Our method is based on deterministic annealing (DA), where the entropy of the system is gradually reduced, with the expected regression cost (energy) minimized at each entropy level. The corresponding Lagrangian is the system's "free energy," and this annealing process is controlled by variation of the Lagrange multiplier, which acts as a "temperature" parameter. The new method consistently and substantially outperformed the com...
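The annealing mechanism itself can be illustrated on a much simpler problem. The sketch below applies a generic deterministic-annealing loop to squared-error clustering; the temperature schedule, perturbation, and data are illustrative assumptions, and the paper's actual mixture-of-experts regression objective is more elaborate:

```python
import numpy as np

def deterministic_annealing(data, k=3, t_start=5.0, t_end=0.05, steps=30):
    """Generic deterministic-annealing loop for squared-error clustering.

    Illustrates only the temperature/free-energy mechanism; the paper's
    mixture-of-experts regression objective is more elaborate."""
    rng = np.random.default_rng(0)
    centers = data[rng.choice(len(data), size=k, replace=False)].copy()
    for t in np.geomspace(t_start, t_end, steps):
        # Energy: squared distortion of each point w.r.t. each center.
        d = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        # Gibbs assignments: near-uniform (high entropy) at high T,
        # near-hard (zero entropy) as T -> 0.
        p = np.exp(-(d - d.min(axis=1, keepdims=True)) / t)
        p /= p.sum(axis=1, keepdims=True)
        # Minimize expected distortion at the current entropy level.
        centers = (p.T @ data) / p.sum(axis=0)[:, None]
        # Tiny perturbation so coincident centers can split as T drops.
        centers += 1e-6 * rng.normal(size=centers.shape)
    return centers

rng = np.random.default_rng(1)
data = np.concatenate([rng.normal(loc=c, scale=0.4, size=(100, 2))
                       for c in ([0.0, 0.0], [4.0, 0.0], [0.0, 4.0])])
centers = deterministic_annealing(data)
print(centers.shape)  # (3, 2)
```

At high temperature all centers collapse toward the global mean (maximum entropy); as the temperature is lowered, the soft assignments harden and the solution splits, which is the mechanism DA uses to avoid shallow local minima.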
The Impact Of Speech Recognition On Speech Synthesis
, 2002
"... Speech synthesis has changed dramatically in the past few years to have a corpusbased focus, borrowing heavily from advances in automatic speech recognition. In this paper, we survey technology in speech recognition systems and how it translates (or doesn't translate) to speech synthesis syste ..."
Abstract

Cited by 15 (0 self)
Speech synthesis has changed dramatically in the past few years to have a corpus-based focus, borrowing heavily from advances in automatic speech recognition. In this paper, we survey technology in speech recognition systems and how it translates (or doesn't translate) to speech synthesis systems. We further speculate on future areas where ASR may impact synthesis and vice versa.
Advanced Training Methods And New Network Topologies For Hybrid MMI-Connectionist/HMM Speech Recognition Systems
in IEEE Int. Conference on Acoustics, Speech, and Signal Processing (ICASSP)
, 1997
"... This paper deals with the construction and optimization of a hybrid speech recognition system that consists of a combination of a neural vector quantizer (VQ) and discrete HMMs. In our investigations an integration of VQ based classification in the continuous classifier framework is given and some c ..."
Abstract

Cited by 13 (4 self)
This paper deals with the construction and optimization of a hybrid speech recognition system that consists of a combination of a neural vector quantizer (VQ) and discrete HMMs. In our investigations an integration of VQ-based classification in the continuous classifier framework is given and some constraints are derived that must hold for the pdfs in the discrete pattern classifier context. Furthermore it is shown that for ML training of the whole system the VQ parameters must be estimated according to the MMI criterion. A novel training method based on gradient search for Neural Networks that serve as optimal VQ is derived. This allows faster training of arbitrary network topologies compared to the traditional MMI-NN training. An integration of multilayer MMI-NNs as VQ in the hybrid discrete-HMM-based speech recognizer leads to a large improvement compared to other supervised and unsupervised single-layer VQ systems. For the speaker-independent Resource Management database the constr...
Bayesian inference for wind field retrieval
 Neurocomputing
, 2000
"... In many problems in spatial statistics it is necessary to infer a global problem solution by combining local models. A principled approach to this problem is to develop a global probabilistic model for the relationships between local variables and to use this as the prior in a Bayesian inference pro ..."
Abstract

Cited by 9 (1 self)
In many problems in spatial statistics it is necessary to infer a global problem solution by combining local models. A principled approach to this problem is to develop a global probabilistic model for the relationships between local variables and to use this as the prior in a Bayesian inference procedure. We show how a Gaussian process with hyperparameters estimated from Numerical Weather Prediction Models yields meteorologically convincing wind fields. We use neural networks to make local estimates of wind vector probabilities. The resulting inference problem cannot be solved analytically, but Markov Chain Monte Carlo methods allow us to retrieve accurate wind fields.
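A stripped-down version of this inference setup might look as follows. The sketch replaces the neural-network local estimates with noisy Gaussian observations of a 1-D field and uses random-walk Metropolis sampling; the kernel, noise levels, and sizes are all assumptions for illustration:

```python
import numpy as np

# Minimal sketch: a Gaussian-process prior over a 1-D "field", noisy local
# observations standing in for the local (neural-network) estimates, and
# Metropolis sampling for the analytically intractable posterior.
rng = np.random.default_rng(0)
n = 30
x = np.linspace(0.0, 10.0, n)

# Squared-exponential GP prior covariance (hyperparameters assumed known).
K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / 1.5 ** 2) + 1e-8 * np.eye(n)
K_inv = np.linalg.inv(K)

true_field = np.linalg.cholesky(K) @ rng.normal(size=n)
obs = true_field + 0.3 * rng.normal(size=n)     # noisy local estimates

def log_post(f):
    # log GP prior + log likelihood of independent Gaussian local models
    return -0.5 * f @ K_inv @ f - 0.5 * ((obs - f) ** 2).sum() / 0.3 ** 2

# Random-walk Metropolis over the field values.
f = np.zeros(n)
lp = log_post(f)
samples = []
for step in range(5000):
    prop = f + 0.05 * rng.normal(size=n)
    lp_prop = log_post(prop)
    if np.log(rng.uniform()) < lp_prop - lp:
        f, lp = prop, lp_prop
    if step >= 2500:                # keep samples after a burn-in period
        samples.append(f)

post_mean = np.mean(samples, axis=0)
print(post_mean.shape)  # (30,)
```

The posterior mean of the sampled fields plays the role of the retrieved wind field; the real problem is two-dimensional and vector-valued, with far richer local models.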
Combining Neural Networks and Belief Networks for Image Segmentation
, 1998
"... In this paper we are concerned with segmenting an image into a number of predefined classes. We show how to fuse together local predictions for the class labels with a prior model of segmentations using the scaledlikelihood method. The prior model is based on a treestructured belief network. B ..."
Abstract

Cited by 8 (5 self)
In this paper we are concerned with segmenting an image into a number of predefined classes. We show how to fuse together local predictions for the class labels with a prior model of segmentations using the scaled-likelihood method. The prior model is based on a tree-structured belief network. Both the neural network and belief network were trained on a set of training images, and then the combined system was used to make predictions on a set of test images. We show that the combined neural network/belief network classifier gives improved prediction accuracy on 9 out of the 11 classes.

1 Introduction

Neural networks have been used very successfully in a wide variety of domains for performing classification or regression tasks. A characteristic of most currently successful applications is that the input patterns are either independent (as in static pattern classification) or related over time, rather than being spatially distributed. To extend the use of neural networks to sp...
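The scaled-likelihood method itself is compact enough to show numerically. Dividing a discriminative model's posteriors by the class priors yields quantities proportional to class-conditional likelihoods, which can then be combined with an external prior model; the numbers below are made up for illustration:

```python
import numpy as np

# Scaled-likelihood trick: a discriminative model gives posteriors
# P(class | x); dividing by the class priors P(class) yields a quantity
# proportional to the likelihood p(x | class), which can then be combined
# with a separate prior model over label configurations.
posteriors = np.array([0.70, 0.20, 0.10])    # network output P(class | x)
class_priors = np.array([0.50, 0.30, 0.20])  # training-set label frequencies

scaled_likelihoods = posteriors / class_priors  # proportional to p(x | class)

# Combine with an external prior (e.g. from a belief network) and renormalize.
structural_prior = np.array([0.20, 0.60, 0.20])
joint = scaled_likelihoods * structural_prior
posterior_with_prior = joint / joint.sum()
print(posterior_with_prior.round(3))  # [0.359 0.513 0.128]
```

Note how the external prior overrules the raw network decision here: class 1 wins even though the network preferred class 0, which is exactly the kind of fusion the paper exploits.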
An Incremental Speaker-Adaptation Technique For Hybrid HMM-MLP Recognizer
in Proceedings ICSLP 96
, 1996
"... One of the problems of the speakerindependent continuous speech recognition systems is their inability to cope with the interspeaker variability. When we find test speakers with different characteristics from the ones presented in the training pool we observe a large degradation on the system perf ..."
Abstract

Cited by 8 (4 self)
One of the problems of speaker-independent continuous speech recognition systems is their inability to cope with inter-speaker variability. When we find test speakers with characteristics different from the ones present in the training pool, we observe a large degradation in system performance. To overcome this problem, speaker-adaptation techniques may be used to provide near speaker-dependent accuracy. In this work we present a speaker-adaptation technique applied to a hybrid HMM-MLP system for large-vocabulary, continuous speech recognition. This technique is based on an architecture that employs a trainable Linear Input Network (LIN) to map the speaker-specific input feature vectors to the speaker-independent system. This speaker-adaptation technique will be evaluated in an incremental speaker-adaptation task using the Wall Street Journal (WSJ) database. Both supervised and unsupervised modes are evaluated. The results show that speaker-adaptation within the hybrid framew...
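The LIN idea can be sketched with a frozen model and a purely linear adaptation layer trained by gradient descent. Everything below (the simulated speaker distortion, feature dimensionality, learning rate) is a hypothetical stand-in for illustration:

```python
import numpy as np

# Sketch of the Linear Input Network (LIN) idea: a trainable linear map W is
# placed in front of a frozen speaker-independent (SI) model and trained to
# map a new speaker's feature vectors back toward the feature space the SI
# model was trained on; the SI recognizer then consumes the mapped features
# unchanged.
rng = np.random.default_rng(0)
dim = 13                                      # e.g. MFCC-sized feature vectors

# Simulated speaker mismatch: a linear distortion of canonical features.
A_true = np.eye(dim) + 0.05 * rng.normal(size=(dim, dim))
canonical = rng.normal(size=(2000, dim))      # SI-space adaptation targets
speaker_feats = canonical @ A_true.T          # what the new speaker produces

W = np.eye(dim)                               # LIN starts as the identity map
lr = 0.2
for _ in range(300):
    resid = speaker_feats @ W.T - canonical   # adaptation error in SI space
    grad_W = 2.0 / len(canonical) * resid.T @ speaker_feats
    W -= lr * grad_W                          # only the LIN is updated; the
                                              # SI model stays frozen

# At convergence W inverts the distortion, so W @ A_true is near identity.
err = np.abs(W @ A_true - np.eye(dim)).max()
print(err < 0.05)  # True
```

In the real system the training signal comes from the recognizer's error criterion rather than known canonical features, but the frozen-backend, trainable-front-end structure is the same.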