Variational learning for switching state-space models
 Neural Computation
, 1998
"... We introduce a new statistical model for time series which iteratively segments data into regimes with approximately linear dynamics and learns the parameters of each of these linear regimes. This model combines and generalizes two of the most widely used stochastic time series models  hidden Ma ..."
Abstract

Cited by 142 (6 self)
 Add to MetaCart
We introduce a new statistical model for time series which iteratively segments data into regimes with approximately linear dynamics and learns the parameters of each of these linear regimes. This model combines and generalizes two of the most widely used stochastic time series models, hidden Markov models and linear dynamical systems, and is closely related to models that are widely used in the control and econometrics literatures. It can also be derived by extending the mixture of experts neural network (Jacobs et al., 1991) to its fully dynamical version, in which both expert and gating networks are recurrent. Inferring the posterior probabilities of the hidden states of this model is computationally intractable, and therefore the exact Expectation Maximization (EM) algorithm cannot be applied. However, we present a variational approximation that maximizes a lower bound on the log likelihood and makes use of both the forward-backward recursions for hidden Markov models and the Kalman filter recursions for linear dynamical systems. We tested the algorithm both on artificial data sets and on a natural data set of respiration force from a patient with sleep apnea. The results suggest that variational approximations are a viable method for inference and learning in switching state-space models.
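The generative model this abstract describes is easy to state concretely: a hidden Markov switch selects which linear-Gaussian regime evolves the continuous state at each step. The sketch below samples from such a model; all parameter values are illustrative, not the paper's.

```python
import numpy as np

def sample_switching_ssm(T, A_list, C, Q, R, P, rng):
    """Draw T steps from a toy switching state-space model: a hidden Markov
    switch s_t (transition matrix P) selects one of the linear regimes
    x_t = A[s_t] x_{t-1} + w_t, and observations are y_t = C x_t + v_t."""
    M = len(A_list)
    d = A_list[0].shape[0]
    s = rng.integers(M)                 # initial regime
    x = np.zeros(d)                     # initial continuous state
    switches, obs = [], []
    for _ in range(T):
        s = rng.choice(M, p=P[s])       # discrete Markov transition
        x = A_list[s] @ x + rng.multivariate_normal(np.zeros(d), Q)
        y = C @ x + rng.multivariate_normal(np.zeros(C.shape[0]), R)
        switches.append(s)
        obs.append(y)
    return np.array(switches), np.array(obs)

# Two 1-D regimes: slow drift vs. fast mean reversion (values invented).
rng = np.random.default_rng(0)
A_list = [np.array([[0.99]]), np.array([[-0.5]])]
C = np.array([[1.0]])
Q = np.array([[0.01]])
R = np.array([[0.1]])
P = np.array([[0.95, 0.05], [0.10, 0.90]])   # "sticky" switch dynamics
s, y = sample_switching_ssm(200, A_list, C, Q, R, P, rng)
```

Inference in the reverse direction (recovering s and x from y) is exactly what the paper shows to be intractable, motivating the variational approximation.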
Learning dynamic Bayesian networks
 Adaptive Processing of Sequences and Data Structures
, 1998
"... Bayesian networks are directed acyclic graphs that represent dependencies between variables in a probabilistic model. Many time series models, including the hidden Markov models (HMMs) used in speech recognition and Kalman filter models used in filtering and control applications, can be viewed as ex ..."
Abstract

Cited by 124 (0 self)
 Add to MetaCart
Bayesian networks are directed acyclic graphs that represent dependencies between variables in a probabilistic model. Many time series models, including the hidden Markov models (HMMs) used in speech recognition and Kalman filter models used in filtering and control applications, can be viewed as examples of dynamic Bayesian networks. We first provide a brief tutorial on learning and Bayesian networks. We then present some dynamic Bayesian networks that can capture much richer structure than HMMs and Kalman filters, including spatial and temporal multiresolution structure, distributed hidden state representations, and multiple switching linear regimes. While exact probabilistic inference is intractable in these networks, one can obtain tractable variational approximations which call as subroutines the forward-backward and Kalman filter recursions. These approximations can be used to learn the model parameters...
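Both tractable special cases named here admit exact recursive inference. As a concrete example, a minimal Kalman filter for the linear-Gaussian dynamic Bayesian network x_t = A x_{t-1} + w_t, y_t = C x_t + v_t; names and parameter values below are illustrative, not from the paper.

```python
import numpy as np

def kalman_filter(ys, A, C, Q, R, mu0, V0):
    """Minimal Kalman filter: exact inference in the linear-Gaussian DBN
    x_t = A x_{t-1} + w_t, y_t = C x_t + v_t.  Returns the filtered means
    E[x_t | y_1..y_t]."""
    mu, V = mu0, V0
    means = []
    for y in ys:
        # predict step
        mu_p = A @ mu
        V_p = A @ V @ A.T + Q
        # update step with Kalman gain K
        S = C @ V_p @ C.T + R
        K = V_p @ C.T @ np.linalg.inv(S)
        mu = mu_p + K @ (y - C @ mu_p)
        V = (np.eye(len(mu)) - K @ C) @ V_p
        means.append(mu.copy())
    return np.array(means)

# Track a 1-D level from noisy constant observations (toy numbers).
A = C = np.array([[1.0]])
Q = np.array([[0.01]])
R = np.array([[1.0]])
ys = [np.array([1.0])] * 50
means = kalman_filter(ys, A, C, Q, R, np.array([0.0]), np.array([[1.0]]))
```

The forward-backward recursions for HMMs play the analogous role in the discrete case; the richer networks in the paper call both as subroutines.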
Two methods for improving performance of an HMM and their application for gene finding
, 1997
"... A hidden Markov model for gene finding consists of submodels for coding regions, splice sites, introns, intergenic regions and possibly more. It is described how to estimate the model as a whole from labeled sequences instead of estimating the individual parts independently from subsequences. It is ..."
Abstract

Cited by 120 (7 self)
 Add to MetaCart
A hidden Markov model for gene finding consists of submodels for coding regions, splice sites, introns, intergenic regions and possibly more. It is described how to estimate the model as a whole from labeled sequences instead of estimating the individual parts independently from subsequences. It is argued that the standard maximum likelihood estimation criterion is not optimal for training such a model. Instead of maximizing the probability of the DNA sequence, one should maximize the probability of the correct prediction. Such a criterion, called conditional maximum likelihood, is used for the gene finder 'HMMgene'. A new (approximative) algorithm is described, which finds the most probable prediction summed over all paths yielding the same prediction. We show that these methods contribute significantly to the high performance of HMMgene.
Keywords: Hidden Markov model, gene finding, maximum likelihood, statistical sequence analysis.
Introduction
As the genome projects evolve autom...
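The conditional maximum likelihood criterion can be illustrated with the standard scaled forward algorithm: restricting the forward variables to label-consistent states gives log P(sequence, labels), and subtracting an unrestricted run gives log P(labels | sequence). The sketch below uses a toy two-state model with one state per label; it is an illustration of the criterion, not HMMgene's actual architecture.

```python
import numpy as np

def forward_loglik(pi, A, B, obs, allowed=None):
    """Scaled forward algorithm for a discrete HMM.  If allowed[t] lists the
    states consistent with the label at time t, mass is restricted to them,
    giving log P(obs, labels); with allowed=None it gives log P(obs)."""
    N = len(pi)
    ll = 0.0
    alpha = pi * B[:, obs[0]]
    for t in range(len(obs)):
        if t > 0:
            alpha = (alpha @ A) * B[:, obs[t]]
        if allowed is not None:
            mask = np.zeros(N)
            mask[allowed[t]] = 1.0      # zero out label-inconsistent states
            alpha = alpha * mask
        c = alpha.sum()
        alpha = alpha / c
        ll += np.log(c)
    return ll

# Toy gene-finder-like HMM: state 0 ~ "coding", state 1 ~ "intergenic".
pi = np.array([0.5, 0.5])
A = np.array([[0.9, 0.1], [0.1, 0.9]])
B = np.array([[0.8, 0.2], [0.2, 0.8]])
obs = [0, 0, 1, 1]
labels = [[0], [0], [1], [1]]           # one allowed state per position
cml = forward_loglik(pi, A, B, obs, labels) - forward_loglik(pi, A, B, obs)
```

Maximizing `cml` over parameters, rather than log P(obs) alone, is the training criterion the abstract advocates.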
Exploiting Tractable Substructures in Intractable Networks
 Advances in Neural Information Processing Systems 8
, 1995
"... We develop a refined mean field approximation for inference and learning in probabilistic neural networks. Our mean field theory, unlike most, does not assume that the units behave as independent degrees of freedom; instead, it exploits in a principled way the existence of large substructures that a ..."
Abstract

Cited by 99 (12 self)
 Add to MetaCart
We develop a refined mean field approximation for inference and learning in probabilistic neural networks. Our mean field theory, unlike most, does not assume that the units behave as independent degrees of freedom; instead, it exploits in a principled way the existence of large substructures that are computationally tractable. To illustrate the advantages of this framework, we show how to incorporate weak higher order interactions into a first-order hidden Markov model, treating the corrections (but not the first order structure) within mean field theory.
1 INTRODUCTION
Learning the parameters in a probabilistic neural network may be viewed as a problem in statistical estimation. In networks with sparse connectivity (e.g. trees and chains), there exist efficient algorithms for the exact probabilistic calculations that support inference and learning. In general, however, these calculations are intractable, and approximations are required. Mean field theory provides a framework for app...
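For contrast with the structured approximation the paper develops, the fully factorized ("naive") mean-field baseline for a Boltzmann-style network is just a fixed-point iteration on the unit means. The weights and biases below are invented for illustration.

```python
import numpy as np

def mean_field(W, b, iters=200):
    """Naive fully factorized mean-field approximation for a Boltzmann-style
    network with symmetric weights W and biases b: iterate the fixed-point
    equations m_i <- tanh(b_i + sum_j W_ij m_j) for the unit means."""
    m = np.zeros(len(b))
    for _ in range(iters):
        m = np.tanh(b + W @ m)
    return m

# Small illustrative network; weak weights guarantee convergence.
W = np.array([[0.0, 0.2], [0.2, 0.0]])
b = np.array([0.1, -0.3])
m = mean_field(W, b)
```

The paper's refinement keeps tractable substructures (chains, trees) exact and applies mean-field corrections only to the interactions between them.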
Bayesian Methods for Hidden Markov Models: Recursive Computing in the 21st Century
 JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
, 2002
"... Markov chain Monte Carlo (MCMC) sampling strategies can be used to simulate hidden Markov model (HMM) parameters from their posterior distribution given observed data. Some MCMC methods (for computing likelihood, conditional probabilities of hidden states, and the most likely sequence of states) use ..."
Abstract

Cited by 86 (8 self)
 Add to MetaCart
Markov chain Monte Carlo (MCMC) sampling strategies can be used to simulate hidden Markov model (HMM) parameters from their posterior distribution given observed data. Some MCMC methods (for computing likelihood, conditional probabilities of hidden states, and the most likely sequence of states) used in practice can be improved by incorporating established recursive algorithms. The most important is a set of forward-backward recursions calculating conditional distributions of the hidden states given observed data and model parameters. We show how to use the recursive algorithms in an MCMC context and demonstrate mathematical and empirical results showing a Gibbs sampler using the forward-backward recursions mixes more rapidly than another sampler often used for HMMs. We introduce an augmented variables technique for obtaining unique state labels in HMMs and finite mixture models. We show how recursive computing allows statistically efficient use of MCMC output when estimating the hidden states. We directly calculate the posterior distribution of the hidden chain's state space size by MCMC, circumventing asymptotic arguments underlying the Bayesian information criterion, which is shown to be inappropriate for a frequently analyzed data set in the HMM literature. The use of log-likelihood for assessing MCMC convergence is illustrated, and posterior predictive checks are used to investigate application-specific questions of model adequacy.
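The block Gibbs update built on the forward-backward recursions is often called forward-filtering backward-sampling: filter forward, then draw the entire hidden path from its exact posterior in one sweep. A minimal sketch for a discrete HMM, with illustrative parameters rather than the paper's examples:

```python
import numpy as np

def ffbs(pi, A, B, obs, rng):
    """Forward-filtering backward-sampling for a discrete HMM: run the scaled
    forward recursion, then sample the whole hidden path from its exact
    posterior given obs and parameters -- the block Gibbs update."""
    T, N = len(obs), len(pi)
    alpha = np.zeros((T, N))
    a = pi * B[:, obs[0]]
    alpha[0] = a / a.sum()
    for t in range(1, T):
        a = (alpha[t - 1] @ A) * B[:, obs[t]]
        alpha[t] = a / a.sum()
    x = np.zeros(T, dtype=int)
    x[T - 1] = rng.choice(N, p=alpha[T - 1])
    for t in range(T - 2, -1, -1):
        w = alpha[t] * A[:, x[t + 1]]      # posterior weight of state i at t
        x[t] = rng.choice(N, p=w / w.sum())
    return x

rng = np.random.default_rng(1)
pi = np.array([0.5, 0.5])
A = np.array([[0.9, 0.1], [0.1, 0.9]])
B = np.array([[0.9, 0.1], [0.1, 0.9]])
obs = [0] * 5 + [1] * 5
path = ffbs(pi, A, B, obs, rng)
```

Updating the whole path jointly, instead of one state at a time, is what yields the faster mixing the abstract reports.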
A non-homogeneous hidden Markov model for precipitation occurrence
 Applied Statistics
, 1999
"... Improvements in computing power, data gathering and our understanding of atmospheric dynamics have lead to the availability of spatially and temporally extensive sets of data on the atmospheric processes that a ect precipitation. However, these two processes (atmospheric circulation and precipitatio ..."
Abstract

Cited by 49 (1 self)
 Add to MetaCart
Improvements in computing power, data gathering and our understanding of atmospheric dynamics have led to the availability of spatially and temporally extensive sets of data on the atmospheric processes that affect precipitation. However, these two processes (atmospheric circulation and precipitation) operate on very different spatial scales. Recently, considerable effort has been devoted to developing "downscaling" models which condition local precipitation on broad-scale atmospheric circulation. In this article, we develop a stochastic model for relating precipitation occurrences at multiple rain gauge stations to atmospheric circulation patterns. The proposed model is an example of a non-homogeneous hidden Markov model, and generalizes existing downscaling models in the literature. The model assumes that atmospheric circulation can be classified into a small number of (unobserved) discrete patterns (called "weather states"). The weather states are assumed to follow a Markov chain in which the transition probabilities depend on observable characteristics of the atmosphere (e.g. mean sea-level pressure). Precipitation is assumed to be conditionally temporally, but not spatially, independent given the weather state. An autologistic ...
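One common way to let transition probabilities depend on atmospheric covariates is a multinomial-logistic link: shift each row's logits linearly in the covariates, then normalize. The parameterization below is an illustrative sketch, not necessarily the paper's exact specification.

```python
import numpy as np

def nhmm_transitions(z_t, base_logits, beta):
    """Time-varying transition matrix for a non-homogeneous HMM: shift each
    row's logits linearly in the covariates z_t (e.g. mean sea-level
    pressure), then softmax so every row sums to one."""
    logits = base_logits + np.tensordot(z_t, beta, axes=1)   # (K, K)
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.default_rng(2)
K, d = 3, 2                                 # 3 weather states, 2 covariates
base_logits = np.zeros((K, K))
beta = rng.standard_normal((d, K, K))       # covariate effects (invented)
P_t = nhmm_transitions(np.array([1.0, -0.5]), base_logits, beta)
```

With beta = 0 this reduces to an ordinary homogeneous HMM, which is the sense in which the model generalizes existing downscaling approaches.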
Switching State-Space Models
 King’s College Road, Toronto M5S 3H5
, 1996
"... We introduce a statistical model for times series data with nonlinear dynamics which iteratively segments the data into regimes with approximately linear dynamics and learns the parameters of each of those regimes. This model combines and generalizes two of the most widely used stochastic time se ..."
Abstract

Cited by 41 (2 self)
 Add to MetaCart
We introduce a statistical model for time series data with nonlinear dynamics which iteratively segments the data into regimes with approximately linear dynamics and learns the parameters of each of those regimes. This model combines and generalizes two of the most widely used stochastic time series models, the hidden Markov model and the linear dynamical system, and is related to models that are widely used in the control and econometrics literatures. It can also be derived by extending the mixture of experts neural network model (Jacobs et al., 1991) to its fully dynamical version, in which both expert and gating networks are recurrent. Inferring the posterior probabilities of the hidden states of this model is computationally intractable, and therefore the exact Expectation Maximization (EM) algorithm cannot be applied. However, we present a variational approximation which maximizes a lower bound on the log likelihood and makes use of both the forward-backward recursio...
Learning pedestrian models for silhouette refinement
 In Proc. IEEE Conf. on Computer Vision
, 2003
"... We present a modelbased method for accurate extraction of pedestrian silhouettes from video sequences. Our approach is based on two assumptions, 1) there is a common appearance to all pedestrians, and 2) each individual looks like him/herself over a short amount of time. These assumptions allow us ..."
Abstract

Cited by 37 (1 self)
 Add to MetaCart
We present a model-based method for accurate extraction of pedestrian silhouettes from video sequences. Our approach is based on two assumptions: 1) there is a common appearance to all pedestrians, and 2) each individual looks like him/herself over a short amount of time. These assumptions allow us to learn pedestrian models that encompass both a pedestrian population appearance and the individual appearance variations. Using our models, we are able to produce pedestrian silhouettes that have fewer noise pixels and missing parts. We apply our silhouette extraction approach to the NIST gait data set and show that under the gait recognition task, our model-based silhouettes result in much higher recognition rates than silhouettes directly extracted from background subtraction, or any non-model-based smoothing schemes.
Asymptotic properties of the maximum likelihood estimator in autoregressive models with Markov regime
 ANN. STATIST
, 2004
"... An autoregressive process with Markov regime is an autoregressive process for which the regression function at each time point is given by a nonobservable Markov chain. In this paper we consider the asymptotic properties of the maximum likelihood estimator in a possibly nonstationary process of this ..."
Abstract

Cited by 34 (6 self)
 Add to MetaCart
An autoregressive process with Markov regime is an autoregressive process for which the regression function at each time point is given by a nonobservable Markov chain. In this paper we consider the asymptotic properties of the maximum likelihood estimator in a possibly nonstationary process of this kind for which the hidden state space is compact but not necessarily finite. Consistency and asymptotic normality are shown to follow from uniform exponential forgetting of the initial distribution for the hidden Markov chain conditional on the observations.
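Simulating from an autoregressive process with Markov regime makes the model class concrete: the hidden chain selects the AR coefficient and noise scale at each step. All coefficients in this sketch are invented for illustration.

```python
import numpy as np

def sample_ms_ar(T, phi, sigma, P, rng):
    """Simulate an autoregressive process with Markov regime: a hidden chain
    s_t (transition matrix P) picks the AR coefficient phi[s_t] and noise
    scale sigma[s_t] driving y_t = phi[s_t] * y_{t-1} + sigma[s_t] * e_t."""
    s, y = 0, 0.0
    states, ys = [], []
    for _ in range(T):
        s = rng.choice(len(phi), p=P[s])
        y = phi[s] * y + sigma[s] * rng.standard_normal()
        states.append(s)
        ys.append(y)
    return np.array(states), np.array(ys)

rng = np.random.default_rng(3)
phi = np.array([0.9, -0.5])                    # two AR(1) regimes
sigma = np.array([0.1, 1.0])
P = np.array([[0.95, 0.05], [0.10, 0.90]])
states, ys = sample_ms_ar(300, phi, sigma, P, rng)
```

The paper's results concern the reverse problem: maximum likelihood estimation of such parameters when `states` is never observed and the hidden state space need not be finite.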
Asymptotics of the Maximum Likelihood Estimator for general Hidden Markov Models
 Bernoulli
, 2001
"... In this paper, we consider the consistency and asymptotic normality of the maximum likelihood estimator for a possibly non stationary Hidden Markov Model where the hidden state space is a separable and compact space non necessarily finite, and both the transition kernel of the hidden chain and the c ..."
Abstract

Cited by 30 (4 self)
 Add to MetaCart
In this paper, we consider the consistency and asymptotic normality of the maximum likelihood estimator for a possibly non-stationary Hidden Markov Model where the hidden state space is a separable and compact space, not necessarily finite, and both the transition kernel of the hidden chain and the conditional distribution of the observations depend on a parameter θ. For identifiable models, consistency and asymptotic normality of the maximum likelihood estimator are shown to follow from exponential forgetting properties of the state prediction filter and geometric ergodicity of suitably extended Markov chains.
Keywords: asymptotic normality, consistency, geometric ergodicity, Hidden Markov Models, identifiability, Maximum Likelihood Estimation.
1 Introduction
Hidden Markov Models (HMMs) form a wide class of discrete-time stochastic processes, used in different areas such as speech recognition (Juang and Rabiner 1989), neurophysiology (Fredkin and Rice 1987), biology (Churchill 1989), ...
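The state prediction filter that drives these asymptotics also yields the likelihood itself: its normalizers are the one-step predictive densities, and their log-sum is the log-likelihood being maximized. A minimal finite-state sketch with toy parameters:

```python
import numpy as np

def hmm_loglik(pi, A, B, obs):
    """Log-likelihood of a discrete HMM via the normalized state prediction
    filter: each normalizer is the one-step predictive density p(y_t | y_<t),
    and the filter forgets its initialization pi exponentially fast."""
    p = pi
    ll = 0.0
    for y in obs:
        joint = p * B[:, y]          # p(x_t, y_t | y_1..y_{t-1})
        c = joint.sum()              # p(y_t | y_1..y_{t-1})
        ll += np.log(c)
        p = (joint / c) @ A          # prediction filter for the next step
    return ll

pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.5, 0.5], [0.1, 0.9]])
obs = [0, 1, 1, 0]
ll = hmm_loglik(pi, A, B, obs)
```

Exponential forgetting means the filter, and hence the likelihood surface, is insensitive to the initial distribution pi, which is the property the consistency proof exploits.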