A tutorial on hidden Markov models and selected applications in speech recognition
 Proceedings of the IEEE
, 1989
Cited by 4251 (1 self)
Although initially introduced and studied in the late 1960s and early 1970s, statistical methods of Markov source or hidden Markov modeling have become increasingly popular in the last several years. There are two strong reasons why this has occurred. First, the models are very rich in mathematical structure and hence can form the theoretical basis for use in a wide range of applications. Second, the models, when applied properly, work very well in practice for several important applications. In this paper we attempt to carefully and methodically review the theoretical aspects of this type of statistical modeling and show how they have been applied to selected problems in machine recognition of speech.
Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models
, 1995
Maximum A Posteriori Estimation for Multivariate Gaussian Mixture Observations of Markov Chains
 IEEE Transactions on Speech and Audio Processing
, 1994
Cited by 491 (39 self)
In this paper a framework for maximum a posteriori (MAP) estimation of hidden Markov models (HMM) is presented. Three key issues of MAP estimation, namely the choice of prior distribution family, the specification of the parameters of prior densities and the evaluation of the MAP estimates, are addressed. Using HMMs with Gaussian mixture state observation densities as an example, it is assumed that the prior densities for the HMM parameters can be adequately represented as a product of Dirichlet and normal-Wishart densities. The classical maximum likelihood estimation algorithms, namely the forward-backward algorithm and the segmental k-means algorithm, are expanded and MAP estimation formulas are developed. Prior density estimation issues are discussed for two classes of applications: parameter smoothing and model adaptation, and some experimental results are given illustrating the practical interest of this approach. Because of its adaptive nature, Bayesian learning is shown to serve as a unified approach for a wide range of speech recognition applications.
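The difference between maximum likelihood and MAP estimation can be illustrated on a single row of HMM transition probabilities. The sketch below is not taken from the paper: it assumes the standard posterior-mode formula for a Dirichlet prior over a discrete distribution, and the names `ml_estimate`, `map_estimate`, `counts`, and `alphas` as well as the numbers are hypothetical values chosen for illustration.

```python
def ml_estimate(counts):
    """Maximum likelihood estimate: normalized transition counts."""
    total = sum(counts)
    return [c / total for c in counts]


def map_estimate(counts, alphas):
    """Mode of the Dirichlet posterior for one row of probabilities.

    Assumes every hyperparameter is >= 1 so that the mode lies inside
    the probability simplex (an assumption of this sketch).
    """
    if any(a < 1 for a in alphas):
        raise ValueError("this sketch assumes all alphas >= 1")
    k = len(counts)
    denom = sum(counts) + sum(alphas) - k
    return [(c + a - 1) / denom for c, a in zip(counts, alphas)]


# Hypothetical expected transition counts out of one state.
counts = [3, 0, 1]
# Hypothetical Dirichlet hyperparameters (one per destination state).
alphas = [2.0, 2.0, 2.0]

print("ML :", ml_estimate(counts))
print("MAP:", map_estimate(counts, alphas))
```

With all hyperparameters equal to 1 the prior is flat and the MAP estimate coincides with the ML estimate. Larger values pull unseen transitions away from zero, which corresponds to the parameter-smoothing use mentioned in the abstract.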
Real-time American Sign Language recognition using desk and wearable computer based video
 IEEE Transactions on Pattern Analysis and Machine Intelligence
, 1998
Cited by 444 (23 self)
We present two real-time hidden Markov model-based systems for recognizing sentence-level continuous American Sign Language (ASL) using a single camera to track the user’s unadorned hands. The first system observes the user from a desk-mounted camera and achieves 92 percent word accuracy. The second system mounts the camera in a cap worn by the user and achieves 98 percent accuracy (97 percent with an unrestricted grammar). Both experiments use a 40-word lexicon.
Visual Recognition of American Sign Language Using Hidden Markov Models
, 1995
Cited by 281 (14 self)
Using hidden Markov models (HMMs), an unobtrusive single-view camera system is developed that can recognize hand gestures, namely, a subset of American Sign Language (ASL). Previous systems have concentrated on finger spelling or isolated word recognition, often using tethered electronic gloves for input. We achieve high recognition rates for full-sentence ASL using only visual cues. A forty-word lexicon consisting of personal pronouns, verbs, nouns, and adjectives is used to create 494 randomly constructed five-word sentences that are signed by the subject to the computer. The data is separated into a 395-sentence training set and an independent 99-sentence test set. While signing, the 2D position, orientation, and eccentricity of bounding ellipses of the hands are tracked in real time with the assistance of solidly colored gloves. Simultaneous recognition and segmentation of the resultant stream of feature vectors occurs five times faster than real time on an HP 735. With a strong ...
Semi-Tied Covariance Matrices for Hidden Markov Models
 IEEE Transactions on Speech and Audio Processing
, 1999
Cited by 181 (27 self)
There is normally a simple choice made in the form of the covariance matrix to be used with continuous-density HMMs. Either a diagonal covariance matrix is used, with the underlying assumption that elements of the feature vector are independent, or a full or block-diagonal matrix is used, where all or some of the correlations are explicitly modelled. Unfortunately, when using full or block-diagonal covariance matrices there tends to be a dramatic increase in the number of parameters per Gaussian component, limiting the number of components which may be robustly estimated. This paper introduces a new form of covariance matrix which allows a few "full" covariance matrices to be shared over many distributions, whilst each distribution maintains its own "diagonal" covariance matrix. In contrast to other schemes which have hypothesised a similar form, this technique fits within the standard maximum-likelihood criterion used for training HMMs. The new form of covariance matrix is evaluated on a large-vocabulary speech-recognition task. In initial experiments the performance of the standard system was achieved using approximately half the number of parameters. Moreover, a 10% reduction in word error rate compared to a standard system can be achieved with less than a 1% increase in the number of parameters and little increase in recognition time.
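The parameter growth described in this abstract can be made concrete with a small count of covariance parameters. The sketch below is illustrative only: the feature dimension, the number of Gaussian components, and the number of shared matrices are hypothetical, and counting each shared matrix as an unconstrained d x d matrix is an assumption of this sketch, not a detail confirmed by the abstract. Mean and weight parameters are ignored.

```python
def diagonal_params(d: int) -> int:
    """Free parameters in a d-dimensional diagonal covariance matrix."""
    return d


def full_params(d: int) -> int:
    """Free parameters in a d-dimensional symmetric (full) covariance matrix."""
    return d * (d + 1) // 2


def shared_scheme_params(num_components: int, d: int, num_shared: int) -> int:
    """Covariance parameters when each component keeps its own diagonal
    matrix and a few d x d matrices are shared across components.

    Counting d * d parameters per shared matrix is an assumption made
    for this illustration.
    """
    return num_components * diagonal_params(d) + num_shared * d * d


# Hypothetical system size, chosen only for illustration.
d = 39
num_components = 10_000

print("diagonal:", num_components * diagonal_params(d))
print("full    :", num_components * full_params(d))
print("shared  :", shared_scheme_params(num_components, d, num_shared=1))
```

Under these assumed sizes, moving every component to a full covariance multiplies the covariance parameter count by (d + 1) / 2, while adding a single shared matrix increases it only slightly.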
Learning Topological Maps with Weak Local Odometric Information
 In Proceedings of IJCAI-97. IJCAI, Inc.
, 1997
Cited by 133 (4 self)
Topological maps provide a useful abstraction for robotic navigation and planning. Although stochastic maps can theoretically be learned using the Baum-Welch algorithm, without a strong prior constraint on the structure of the model it is slow to converge, requires a great deal of data, and often gets stuck in local minima. In this paper, we consider a special case of hidden Markov models for robot-navigation environments, in which states are associated with points in a metric configuration space. We assume that the robot has some odometric ability to measure relative transformations between its configurations. Such odometry is typically not precise enough to suffice for building a global map, but it does give valuable local information about relations between adjacent states. We present an extension of the Baum-Welch algorithm that takes advantage of this local odometric information, yielding faster convergence to better solutions with less data.
A Maximum-Likelihood Approach to Stochastic Matching for Robust Speech Recognition
 IEEE Transactions on Speech and Audio Processing
, 1996
Cited by 107 (14 self)
Ananth Sankar and Chin-Hui Lee, Speech Research Department, AT&T Bell Laboratories, Murray Hill, NJ 07974. Introduction: Recently there has been much interest in the problem of improving the performance of automatic speech recognition (ASR) systems in adverse environments. When there is a mismatch between the training and testing environments, ASR systems suffer a degradation in performance. The goal of robust speech recognition is to remove the effect of this mismatch so as to bring the recognition performance as close as possible to the matched conditions. In speech recognition, the speech is usually modeled by a set of hidden Markov models (HMM) X. During recognition the observed utterance Y is decoded using these models. Due to the mismatch between training and testing conditions, this often results in a degradation in performance compared to the matched conditions. The mismatch b...
Speaker Adaptation Using Constrained Estimation of Gaussian Mixtures
 IEEE Transactions on Speech and Audio Processing
, 1995
Cited by 90 (2 self)
A recent trend in automatic speech recognition systems is the use of continuous mixture-density hidden Markov models (HMMs). Despite the good recognition performance that these systems achieve on average in large-vocabulary applications, there is a large variability in performance across speakers. Performance degrades dramatically when the user is radically different from the training population. A popular technique that can improve the performance and robustness of a speech recognition system is adapting speech models to the speaker, and more generally to the channel and the task. In continuous mixture-density HMMs the number of component densities is typically very large, and it may not be feasible to acquire a sufficient amount of adaptation data for robust maximum-likelihood estimates. To solve this problem, we propose a constrained estimation technique for Gaussian mixture densities. The algorithm is evaluated on the large-vocabulary Wall Street Journal corpus for both ...
Modeling and Prediction of Human Behavior
 Neural Computation
, 1995
Cited by 60 (8 self)
We describe our research toward building systems that include a complex, multistate model of human dynamic behavior. This can allow us to predict human behavior over short periods of time, in order to create control systems that intelligently complement the human's action. Accomplishing this requires inferring the internal state of the human, and then correctly adapting the remainder of the system to achieve optimal performance. We describe methods for achieving this goal, and report an initial experiment in which we were able to achieve 95% accuracy at predicting automobile drivers' actions from their initial preparatory movements. Introduction: Our approach to modeling human behavior is to consider the human as a Markov device with a (possibly large) number of internal 'mental' states, each with its own particular control behavior, and interstate transition probabilities (e.g., in a car the states might be passing, following, turning, etc.). A simple example of this type of h...
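The idea of a human modeled as a Markov device with interstate transition probabilities can be sketched as a small Markov chain over the driving states named above. This is a minimal illustration, not the paper's model: the transition probabilities are hypothetical, the names `STATES`, `TRANSITIONS`, `step`, and `predict` are invented for this sketch, and the states are treated as directly observable, whereas the paper describes inferring the internal state.

```python
# Driving states taken from the example in the text.
STATES = ["following", "passing", "turning"]

# Hypothetical transition probabilities; each row sums to 1.
TRANSITIONS = {
    "following": {"following": 0.80, "passing": 0.15, "turning": 0.05},
    "passing":   {"following": 0.60, "passing": 0.35, "turning": 0.05},
    "turning":   {"following": 0.70, "passing": 0.00, "turning": 0.30},
}


def step(dist):
    """Propagate a distribution over states by one time step."""
    return {
        s: sum(dist[p] * TRANSITIONS[p][s] for p in STATES)
        for s in STATES
    }


def predict(start, n_steps):
    """Distribution over states n_steps after starting in `start`."""
    dist = {s: 1.0 if s == start else 0.0 for s in STATES}
    for _ in range(n_steps):
        dist = step(dist)
    return dist


dist = predict("passing", n_steps=3)
most_likely = max(dist, key=dist.get)
print(dist)
print("most likely state after 3 steps:", most_likely)
```

Short-horizon prediction in this sketch amounts to repeated multiplication by the transition matrix; a hidden-state version would additionally need an observation model for each state.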