Results 1 - 7 of 7
Maximum A Posteriori Estimation for Multivariate Gaussian Mixture Observations of Markov Chains
IEEE Transactions on Speech and Audio Processing, 1994
Cited by 372 (36 self)
In this paper a framework for maximum a posteriori (MAP) estimation of hidden Markov models (HMM) is presented. Three key issues of MAP estimation, namely the choice of prior distribution family, the specification of the parameters of prior densities and the evaluation of the MAP estimates, are addressed. Using HMMs with Gaussian mixture state observation densities as an example, it is assumed that the prior densities for the HMM parameters can be adequately represented as a product of Dirichlet and normal-Wishart densities. The classical maximum likelihood estimation algorithms, namely the forward-backward algorithm and the segmental k-means algorithm, are expanded and MAP estimation formulas are developed. Prior density estimation issues are discussed for two classes of applications: parameter smoothing and model adaptation, and some experimental results are given illustrating the practical interest of this approach. Because of its adaptive nature, Bayesian learning is shown to serve as a unified approach for a wide range of speech recognition applications.
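To make the idea of MAP estimation concrete, here is a minimal sketch for the simplest case only: the mean of a single scalar Gaussian with a normal prior on that mean. It is not the paper's full normal-Wishart or Dirichlet formulation. The function name `map_mean` and the parameters `prior_mean`, `tau` (prior weight) and `weights` (per-sample occupancy) are illustrative assumptions, not names from the paper.

```python
def map_mean(samples, prior_mean, tau, weights=None):
    """MAP estimate of a Gaussian mean under a normal prior (scalar sketch).

    tau acts as a pseudo-count: tau = 0 gives the maximum likelihood mean,
    and a large tau keeps the estimate close to prior_mean.
    weights are optional per-sample occupancy values (assumed 1.0 if omitted).
    """
    if weights is None:
        weights = [1.0] * len(samples)
    occupancy = sum(weights)
    weighted_sum = sum(w * x for w, x in zip(weights, samples))
    if tau + occupancy == 0:
        raise ValueError("need a positive prior weight or at least one sample")
    # Interpolate between the prior mean and the data according to their weights.
    return (tau * prior_mean + weighted_sum) / (tau + occupancy)
```

With little data the estimate stays near the prior mean, and with more data it moves toward the sample mean, which is the behaviour that makes this kind of estimate usable for both parameter smoothing and model adaptation.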
Bayesian Learning for Hidden Markov Model with Gaussian Mixture State Observation Densities
Cited by 32 (16 self)
An investigation into the use of Bayesian learning of the parameters of a multivariate Gaussian mixture density has been carried out. In a framework of continuous density hidden Markov model (CDHMM), Bayesian learning serves as a unified approach for parameter smoothing, speaker adaptation, speaker clustering and corrective training. The goal is to enhance model robustness in a CDHMM-based speech recognition system so as to improve performance. Our approach is to use Bayesian learning to incorporate prior knowledge into the training process in the form of prior densities of the HMM parameters. The theoretical basis for this procedure is presented and results applying it to parameter smoothing, speaker adaptation, speaker clustering, and corrective training are given.
Bayesian Learning of Gaussian Mixture Densities for Hidden Markov Models
Proc. DARPA Speech and Natural Language Workshop, 1991
Cited by 22 (8 self)
An investigation into the use of Bayesian learning of the parameters of a multivariate Gaussian mixture density has been carried out. In a continuous density hidden Markov model (CDHMM) framework, Bayesian learning serves as a unified approach for parameter smoothing, speaker adaptation, speaker clustering, and corrective training. The goal of this study is to enhance model robustness in a CDHMM-based speech recognition system so as to improve performance. Our approach is to use Bayesian learning to incorporate prior knowledge into the CDHMM training process in the form of prior densities of the HMM parameters. The theoretical basis for this procedure is presented and preliminary results applying to HMM parameter smoothing, speaker adaptation, and speaker clustering are given. Performance improvements were observed on tests using the DARPA RM task. For speaker adaptation, under a supervised learning mode with 2 minutes of speaker-specific training data, a 31% reduction in word error r...
uWave: Accelerometer-based Personalized Gesture Recognition and Its Applications
Cited by 10 (1 self)
The proliferation of accelerometers on consumer electronics has brought an opportunity for interaction based on gestures or physical manipulation of the devices. We present uWave, an efficient recognition algorithm for such interaction using a single three-axis accelerometer. Unlike statistical methods, uWave requires a single training sample for each gesture pattern and allows users to employ personalized gestures and physical manipulations. We evaluate uWave using a large gesture library with over 4000 samples collected from eight users over an extended period of time for a gesture vocabulary with eight gesture patterns identified by Nokia research. The evaluation shows that uWave achieves 98.6% accuracy, competitive with statistical methods that require significantly more training samples. Our evaluation data set is the largest and most extensive in published studies, to the best of our knowledge. We also present applications of uWave in gesture-based user authentication and interaction with three-dimensional mobile user interfaces using user-created gestures.
Keywords: gesture recognition, acceleration, dynamic time warping, personalized gesture
A Framework for Indexing Human Actions in Video
2008
Cited by 1 (1 self)
Several researchers have addressed the problem of human action recognition using a variety of algorithms. An underlying assumption in most of these algorithms is that action boundaries are already known in a test video sequence. In this paper, we propose a fast method for continuous human action recognition in a video sequence. We propose the use of a low dimensional feature vector which consists of (a) the projections of the width profile of the actor onto a Discrete Cosine Transform (DCT) basis and (b) simple spatio-temporal features. We use an earlier proposed average-template with multiple features for modelling human actions and combine it with the One-pass Dynamic Programming (DP) algorithm for continuous action recognition. This model accounts for intra-class variability in the way an action is performed. Furthermore, we demonstrate a way to perform noise robust recognition by creating a noise match condition between the train and the test data. The effectiveness of our method is demonstrated by conducting experiments on the IXMAS dataset of persons performing various actions and an outdoor Action database collected by us.
An Unsupervised Framework for Action Recognition Using Actemes
Cited by 1 (0 self)
In speech recognition, phonemes have demonstrated their efficacy to model the words of a language. While they are well defined for languages, their extension to human actions is not straightforward. In this paper, we study such an extension and propose an unsupervised framework to find phoneme-like units for actions, which we call actemes, using 3D data and without any prior assumptions. To this end, we build on a framework proposed earlier in the speech literature to automatically find actemes in the training data. We experimentally show that actions defined in terms of actemes and actions defined by whole units give similar recognition results. We define actions out of the training set in terms of these actemes to see whether the actemes generalize to unseen actions. The results show that although the acteme definitions of the actions are not always semantically meaningful, they yield optimal recognition accuracy and constitute a promising direction of research for action modeling.
*with equal contribution
The proliferation of low-power, low-cost accelerometers on consumer electronics has brought an opportunity to personalize gesture-based interaction. We present uWave, an efficient personalized gesture recognizer based on a 3-D accelerometer. The core technical components of uWave include quantization of accelerometer readings, dynamic time warping and template adaptation. Unlike statistical methods, uWave requires a single training sample and allows users to employ personalized gestures. Our evaluation is based on a large gesture library with over 4000 samples collected from eight users. It shows that uWave achieves 98.6% accuracy, competitive with statistical methods which require significantly more training samples.
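The template-matching idea named in this abstract can be illustrated with a generic dynamic time warping (DTW) sketch. This is textbook DTW over three-axis samples with one stored template per gesture, not uWave's actual implementation: the quantization and template adaptation steps are omitted, and the names `dtw_distance`, `recognize` and the template dictionary layout are assumptions made for illustration.

```python
import math


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW cost between two sequences of (x, y, z) samples."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean distance between samples
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def recognize(sample, templates):
    """Return the name of the template with the smallest DTW cost to the sample."""
    return min(templates, key=lambda name: dtw_distance(sample, templates[name]))
```

A short usage example with made-up readings:

```python
templates = {
    "left": [(-1, 0, 0), (-2, 0, 0)],
    "right": [(1, 0, 0), (2, 0, 0)],
}
print(recognize([(1, 0, 0), (1, 0, 0), (2, 0, 0)], templates))
```

Because DTW allows a sample to be stretched or compressed in time, a gesture performed more slowly than its template can still align with it at low cost, which is why a single template per gesture can be enough.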

