An Application of Recurrent Nets to Phone Probability Estimation
- IEEE Transactions on Neural Networks
, 1994
Cited by 165 (8 self)
This paper presents an application of recurrent networks for phone probability estimation in large vocabulary speech recognition. The need for efficient exploitation of context information is discussed
Large Vocabulary Continuous Speech Recognition: a Review
- of INCIS Project, Schedule 6 in (Small
, 1996
Cited by 62 (1 self)
This article will discuss the principles and architecture of current LVR systems and identify the key issues affecting their future deployment. To illustrate the various points raised, the Cambridge University HTK system will be described. This is a modern design giving state-of-the-art performance and it is typical of the current generation of recognition systems.
Large margin hidden markov models for speech recognition
, 2005
Cited by 12 (1 self)
In this work, motivated by large margin classifiers in machine learning, we propose a novel method to estimate continuous density hidden Markov models (CDHMMs) for speech recognition according to the principle of maximizing the minimum multi-class separation margin. The approach is called large margin HMM. First, we show that this type of large margin HMM estimation problem can be formulated as a constrained minimax optimization problem. Second, by imposing different constraints on the minimax problem, we propose three solutions to the large margin HMM estimation problem, namely the iterative localized optimization method, the constrained joint optimization method and the semidefinite programming (SDP) method. These new training methods are evaluated on the isolated E-set recognition task using the ISOLET database and on the TIDIGITS connected digit string recognition task. Experimental results clearly show that the large margin HMMs consistently outperform the conventional HMM training methods. It has been consistently observed that the large margin training method yields significant recognition error rate reduction even on top of some popular discriminative training methods.
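The criterion named in this abstract can be written out roughly as follows. This is only a sketch: the symbols $X_i$ (training utterance), $W_i$ (its true class), $F$ (a log-likelihood discriminant) and $\mathcal{S}$ (the set of training tokens considered) are assumed notation, not taken from the abstract, and the constraints that distinguish the three proposed solutions are not specified here.

```latex
% Assumed discriminant: log-likelihood of utterance X under the model of class W
F(X \mid \lambda_W) = \log p(X \mid \lambda_W)

% Multi-class separation margin of training token X_i with true class W_i
d(X_i) = F(X_i \mid \lambda_{W_i}) - \max_{j \neq W_i} F(X_i \mid \lambda_j)

% Large margin estimation: maximize the minimum margin over the token set S
\tilde{\Lambda} = \arg\max_{\Lambda} \; \min_{X_i \in \mathcal{S}} d(X_i)

% Equivalent minimax form (constraints on \Lambda omitted)
\tilde{\Lambda} = \arg\min_{\Lambda} \; \max_{X_i \in \mathcal{S},\; j \neq W_i}
  \bigl[ F(X_i \mid \lambda_j) - F(X_i \mid \lambda_{W_i}) \bigr]
```

The last two lines are equivalent because negating the margin turns the inner minimum into a maximum and the outer maximization into a minimization.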
Prosody Takes Over: Towards A Prosodically Guided Dialog System
, 1994
Cited by 5 (2 self)
The domain of the speech recognition and dialog system EVAR is train timetable inquiry. We observed that in real human-human dialogs, when the officer transmits the information, the customer very often interrupts. Many of these interruptions are just repetitions of the time of day given by the officer. The functional role of these interruptions is often determined by prosodic cues only. An important result of experiments in which naive persons used the EVAR system is that it is hard to follow the train connection given via speech synthesis. In this case it is even more important than in human-human dialogs that the user has the opportunity to interact during the answer phase. Therefore we extended the dialog module to allow the user to repeat the time of day, and we added a prosody module that guides the continuation of the dialog by analyzing the intonation contour of this utterance.
In spite of the fact that speech exhibits features that cannot be represented by a first-order Markov model, Hidden Markov Models (HMMs) of speech units
In this paper, semi-continuous HMMs (SCHMMs) (Bellegarda & Nahamoo 89; Huang & Jack 89) and continuous density HMMs (CDHMMs) will be considered in conjunction with networks trained with the generalized delta rule (Rumelhart et al 86). It will be shown how to perform a joint global optimization of both the ANN and the HMM parameter estimation. In the proposed algorithm, the gradient of the optimization criterion with respect to the transformed observations is computed for the HMM system. The HMM can be trained with traditional methods (Rabiner 89) with which the gradient of an optimization criterion is computed. This gradient is sent to the ANN for the estimation of the weight associated with each connection of the network. No assumptions need to be made or constraints imposed on the network outputs, except that the network output distribution should be modeled by a mixture of multivariate Gaussians. Since training of HMMs is usually much faster than ANN training, we consider how to initialize the ANN in order to start from parameter values that are not too far from those obtained after training. Multiple ANNs are combined and an incremental design method is described in which specialized networks are integrated into the recognition system in order to improve its performance.
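The gradient hand-off described in this abstract can be illustrated with a heavily simplified sketch. It is not the authors' algorithm: it assumes a single tanh layer as the "network", a single diagonal-covariance Gaussian mixture in place of a full HMM, and log-likelihood as the criterion. All function names are hypothetical. The mixture computes the gradient of the criterion with respect to the transformed observation, and the network turns that into weight gradients via the chain rule.

```python
import math

def gmm_loglik_and_grad(y, weights, means, variances):
    """Log-likelihood of y under a diagonal Gaussian mixture (assumed stand-in
    for the HMM output density) and its gradient with respect to y."""
    log_comp = []
    for w, mu, var in zip(weights, means, variances):
        lc = math.log(w)
        for yd, md, vd in zip(y, mu, var):
            lc -= 0.5 * (math.log(2.0 * math.pi * vd) + (yd - md) ** 2 / vd)
        log_comp.append(lc)
    top = max(log_comp)  # log-sum-exp for numerical stability
    loglik = top + math.log(sum(math.exp(lc - top) for lc in log_comp))
    post = [math.exp(lc - loglik) for lc in log_comp]  # component posteriors
    grad_y = [
        sum(p * (mu[d] - y[d]) / var[d]
            for p, mu, var in zip(post, means, variances))
        for d in range(len(y))
    ]
    return loglik, grad_y

def network_forward(W, x):
    """Hypothetical one-layer network: y = tanh(W x)."""
    return [math.tanh(sum(w * xj for w, xj in zip(row, x))) for row in W]

def network_weight_grad(W, x, grad_y):
    """Chain rule (delta rule for one layer): d criterion / d W."""
    y = network_forward(W, x)
    deltas = [g * (1.0 - yi ** 2) for g, yi in zip(grad_y, y)]
    return [[d * xj for xj in x] for d in deltas]

# Toy values, chosen arbitrarily for illustration.
W = [[0.2, -0.1, 0.4], [0.3, 0.5, -0.2]]
x = [1.0, -0.5, 0.25]
weights = [0.4, 0.6]
means = [[0.0, 0.5], [0.3, -0.2]]
variances = [[1.0, 0.5], [0.8, 1.2]]

y = network_forward(W, x)
loglik, grad_y = gmm_loglik_and_grad(y, weights, means, variances)
grad_W = network_weight_grad(W, x, grad_y)

# One gradient-ascent step on the network weights.
step = 0.1
W_new = [[w + step * g for w, g in zip(rw, rg)] for rw, rg in zip(W, grad_W)]
```

In a full system the gradient with respect to each transformed frame would be accumulated over HMM states and time, which this sketch leaves out.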

