Results 1  10
of
110
Hidden Markov processes
 IEEE Trans. Inform. Theory
, 2002
"... Abstract—An overview of statistical and informationtheoretic aspects of hidden Markov processes (HMPs) is presented. An HMP is a discretetime finitestate homogeneous Markov chain observed through a discretetime memoryless invariant channel. In recent years, the work of Baum and Petrie on finite ..."
Abstract

Cited by 170 (3 self)
 Add to MetaCart
Abstract—An overview of statistical and informationtheoretic aspects of hidden Markov processes (HMPs) is presented. An HMP is a discretetime finitestate homogeneous Markov chain observed through a discretetime memoryless invariant channel. In recent years, the work of Baum and Petrie on finitestate finitealphabet HMPs was expanded to HMPs with finite as well as continuous state spaces and a general alphabet. In particular, statistical properties and ergodic theorems for relative entropy densities of HMPs were developed. Consistency and asymptotic normality of the maximumlikelihood (ML) parameter estimator were proved under some mild conditions. Similar results were established for switching autoregressive processes. These processes generalize HMPs. New algorithms were developed for estimating the state, parameter, and order of an HMP, for universal coding and classification of HMPs, and for universal decoding of hidden Markov channels. These and other related topics are reviewed in this paper. Index Terms—Baum–Petrie algorithm, entropy ergodic theorems, finitestate channels, hidden Markov models, identifiability, Kalman filter, maximumlikelihood (ML) estimation, order estimation, recursive parameter estimation, switching autoregressive processes, Ziv inequality. I.
Convergence of a stochastic approximation version of the EM algorithm
, 1997
"... The Expectation Maximization (EM) algorithm is a powerful computational technique for locating maxima of functions... ..."
Abstract

Cited by 86 (8 self)
 Add to MetaCart
The Expectation Maximization (EM) algorithm is a powerful computational technique for locating maxima of functions...
Bayesian Methods for Hidden Markov Models  Recursive Computing in the 21st Century
 JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
, 2002
"... Markov chain Monte Carlo (MCMC) sampling strategies can be used to simulate hidden Markov model (HMM) parameters from their posterior distribution given observed data. Some MCMC methods (for computing likelihood, conditional probabilities of hidden states, and the most likely sequence of states) use ..."
Abstract

Cited by 86 (8 self)
 Add to MetaCart
Markov chain Monte Carlo (MCMC) sampling strategies can be used to simulate hidden Markov model (HMM) parameters from their posterior distribution given observed data. Some MCMC methods (for computing likelihood, conditional probabilities of hidden states, and the most likely sequence of states) used in practice can be improved by incorporating established recursive algorithms. The most important is a set of forwardbackward recursions calculating conditional distributions of the hidden states given observed data and model parameters. We show how to use the recursive algorithms in an MCMC context and demonstrate mathematical and empirical results showing a Gibbs sampler using the forwardbackward recursions mixes more rapidly than another sampler often used for HMM's. We introduce an augmented variables technique for obtaining unique state labels in HMM's and finite mixture models. We show how recursive computing allows statistically efficient use of MCMC output when estimating the hidden states. We directly calculate the posterior distribution of the hidden chain's state space size by MCMC, circumventing asymptotic arguments underlying the Bayesian information criterion, which is shown to be inappropriate for a frequently analyzed data set in the HMM literature. The use of loglikelihood for assessing MCMC convergence is illustrated, and posterior predictive checks are used to investigate application specific questions of model adequacy.
Mixed memory Markov models: decomposing complex stochastic processes as mixtures of simpler ones
, 1998
"... . We study Markov models whose state spaces arise from the Cartesian product of two or more discrete random variables. We show how to parameterize the transition matrices of these models as a convex combinationor mixtureof simpler dynamical models. The parameters in these models admit a simple ..."
Abstract

Cited by 62 (1 self)
 Add to MetaCart
. We study Markov models whose state spaces arise from the Cartesian product of two or more discrete random variables. We show how to parameterize the transition matrices of these models as a convex combinationor mixtureof simpler dynamical models. The parameters in these models admit a simple probabilistic interpretation and can be fitted iteratively by an ExpectationMaximization (EM) procedure. We derive a set of generalized BaumWelch updates for factorial hidden Markov models that make use of this parameterization. We also describe a simple iterative procedure for approximately computing the statistics of the hidden states. Throughout, we give examples where mixed memory models provide a useful representation of complex stochastic processes. Keywords: Markov models, mixture models, discrete time series 1. Introduction The modeling of time series is a fundamental problem in machine learning, with widespread applications. These include speech recognition (Rabiner, 1989), natu...
Bayesian Clustering by Dynamics
, 2001
"... This paper introduces a Bayesian method for clustering dynamic processes. The method models dynamics as Markov chains and then applies an agglomerative clustering procedure to discover the most probable set of clusters capturing different dynamics. To increase efficiency, the method uses an entropy ..."
Abstract

Cited by 56 (7 self)
 Add to MetaCart
This paper introduces a Bayesian method for clustering dynamic processes. The method models dynamics as Markov chains and then applies an agglomerative clustering procedure to discover the most probable set of clusters capturing different dynamics. To increase efficiency, the method uses an entropybased heuristic search strategy. A controlled experiment suggests that the method is very accurate when applied to articial time series in a broad range of conditions and, when applied to clustering sensor data from mobile robots, it produces clusters that are meaningful in the domain of application.
Bayesian inference in hidden Markov models through reversible jump Markov chain Monte Carlo
 Journal of the Royal Statistical Society, Series B
"... Hidden Markov models form an extension of mixture models providing a flexible class of models exhibiting dependence and a possibly large degree of variability. In this paper we show how reversible jump Markov chain Monte Carlo techniques can be used to estimate the parameters as well as the number o ..."
Abstract

Cited by 50 (4 self)
 Add to MetaCart
Hidden Markov models form an extension of mixture models providing a flexible class of models exhibiting dependence and a possibly large degree of variability. In this paper we show how reversible jump Markov chain Monte Carlo techniques can be used to estimate the parameters as well as the number of components of a hidden Markov model in a Bayesian framework. We employ a mixture of zero mean normal distributions as our main example and apply this model to three sets of data from finance, meteorology and geomagnetism, respectively. AMS 1991 classification. Primary 62M09. Secondary 62F15, 62P05. Key words. Hidden Markov model, Bayesian inference, model selection, Markov chain Monte Carlo. 1 Introduction Hidden Markov models (HMMs) have been used in many areas as convenient representations of weakly dependent heterogeneous phenomena; a few examples are econometrics (Hamilton, 1989; Chib, 1996; Krolzig, 1997; Billio et al., 1998), finance (Ryd'en et al., 1998), biology (Leroux and Puter...
Estimating a StateSpace Model from Point Process Observations
, 2003
"... A widely used signal processing paradigm is the statespace model. The statespace model is defined by two equations: an observation equation that describes how the hidden state or latent process is observed and a state equation that defines the evolution of the process through time. Inspired by neu ..."
Abstract

Cited by 39 (4 self)
 Add to MetaCart
A widely used signal processing paradigm is the statespace model. The statespace model is defined by two equations: an observation equation that describes how the hidden state or latent process is observed and a state equation that defines the evolution of the process through time. Inspired by neurophysiology experiments in which neural spiking activity is induced by an implicit (latent) stimulus, we develop an algorithm to estimate a statespace model observed through point process measurements. We represent the latent process modulating the neural spiking activity as a gaussian autoregressive model driven by an external stimulus. Given the latent process, neural spiking activity is characterized as a general point process defined by its conditional intensity function. We develop an approximate expectationmaximization (EM) algorithm to estimate the unobservable statespace process, its parameters, and the parameters of the point process. The EM algorithm combines a point process recursive nonlinear filter algorithm, the fixed interval smoothing algorithm, and the statespace covariance algorithm to compute the complete data log likelihood efficiently. We use a KolmogorovSmirnov test based on the timerescaling theorem to evaluate agreement between the model and point process data. We illustrate the model with two simulated data examples: an ensemble of Poisson neurons driven by a common stimulus and a single neuron whose conditional intensity function is approximated as a local Bernoulli process.
Asymptotic properties of the maximum likelihood estimator in autoregressive models with Markov regime
 ANN. STATIST
, 2004
"... An autoregressive process with Markov regime is an autoregressive process for which the regression function at each time point is given by a nonobservable Markov chain. In this paper we consider the asymptotic properties of the maximum likelihood estimator in a possibly nonstationary process of this ..."
Abstract

Cited by 34 (6 self)
 Add to MetaCart
An autoregressive process with Markov regime is an autoregressive process for which the regression function at each time point is given by a nonobservable Markov chain. In this paper we consider the asymptotic properties of the maximum likelihood estimator in a possibly nonstationary process of this kind for which the hidden state space is compact but not necessarily finite. Consistency and asymptotic normality are shown to follow from uniform exponential forgetting of the initial distribution for the hidden Markov chain conditional on the observations.
What HMMs can do
, 2002
"... Since their inception over thirty years ago, hidden Markov models (HMMs) have have become the predominant methodology for automatic speech recognition (ASR) systems — today, most stateoftheart speech systems are HMMbased. There have been a number of ways to explain HMMs and to list their capabil ..."
Abstract

Cited by 30 (4 self)
 Add to MetaCart
Since their inception over thirty years ago, hidden Markov models (HMMs) have have become the predominant methodology for automatic speech recognition (ASR) systems — today, most stateoftheart speech systems are HMMbased. There have been a number of ways to explain HMMs and to list their capabilities, each of these ways having both advantages and disadvantages. In an effort to better understand what HMMs can do, this tutorial analyzes HMMs by exploring a novel way in which an HMM can be defined, namely in terms of random variables and conditional independence assumptions. We prefer this definition as it allows us to reason more throughly about the capabilities of HMMs. In particular, it is possible to deduce that there are, in theory at least, no theoretical limitations to the class of probability distributions representable by HMMs. This paper concludes that, in search of a model to supersede the HMM for ASR, we should rather than trying to correct for HMM limitations in the general case, new models should be found based on their potential for better parsimony, computational requirements, and noise insensitivity.