Results 1  10
of
183
A tutorial on particle filters for online nonlinear/nonGaussian Bayesian tracking
 IEEE TRANSACTIONS ON SIGNAL PROCESSING
, 2002
"... Increasingly, for many application areas, it is becoming important to include elements of nonlinearity and nonGaussianity in order to model accurately the underlying dynamics of a physical system. Moreover, it is typically crucial to process data online as it arrives, both from the point of view o ..."
Abstract

Cited by 1137 (2 self)
 Add to MetaCart
Increasingly, for many application areas, it is becoming important to include elements of nonlinearity and nonGaussianity in order to model accurately the underlying dynamics of a physical system. Moreover, it is typically crucial to process data online as it arrives, both from the point of view of storage costs as well as for rapid adaptation to changing signal characteristics. In this paper, we review both optimal and suboptimal Bayesian algorithms for nonlinear/nonGaussian tracking problems, with a focus on particle filters. Particle filters are sequential Monte Carlo methods based on point mass (or “particle”) representations of probability densities, which can be applied to any statespace model and which generalize the traditional Kalman filtering methods. Several variants of the particle filter such as SIR, ASIR, and RPF are introduced within a generic framework of the sequential importance sampling (SIS) algorithm. These are discussed and compared with the standard EKF through an illustrative example.
A Unifying Review of Linear Gaussian Models
, 1999
"... Factor analysis, principal component analysis, mixtures of gaussian clusters, vector quantization, Kalman filter models, and hidden Markov models can all be unified as variations of unsupervised learning under a single basic generative model. This is achieved by collecting together disparate observa ..."
Abstract

Cited by 260 (17 self)
 Add to MetaCart
Factor analysis, principal component analysis, mixtures of gaussian clusters, vector quantization, Kalman filter models, and hidden Markov models can all be unified as variations of unsupervised learning under a single basic generative model. This is achieved by collecting together disparate observations and derivations made by many previous authors and introducing a new way of linking discrete and continuous state models using a simple nonlinearity. Through the use of other nonlinearities, we show how independent component analysis is also a variation of the same basic generative model. We show that factor analysis and mixtures of gaussians can be implemented in autoencoder neural networks and learned using squared error plus the same regularization term. We introduce a new model for static data, known as sensible principal component analysis, as well as a novel concept of spatially adaptive observation noise. We also review some of the literature involving global and local mixtures of the basic models and provide pseudocode for inference and learning for all the basic models.
Hidden Markov processes
 IEEE Trans. Inform. Theory
, 2002
"... Abstract—An overview of statistical and informationtheoretic aspects of hidden Markov processes (HMPs) is presented. An HMP is a discretetime finitestate homogeneous Markov chain observed through a discretetime memoryless invariant channel. In recent years, the work of Baum and Petrie on finite ..."
Abstract

Cited by 170 (3 self)
 Add to MetaCart
Abstract—An overview of statistical and informationtheoretic aspects of hidden Markov processes (HMPs) is presented. An HMP is a discretetime finitestate homogeneous Markov chain observed through a discretetime memoryless invariant channel. In recent years, the work of Baum and Petrie on finitestate finitealphabet HMPs was expanded to HMPs with finite as well as continuous state spaces and a general alphabet. In particular, statistical properties and ergodic theorems for relative entropy densities of HMPs were developed. Consistency and asymptotic normality of the maximumlikelihood (ML) parameter estimator were proved under some mild conditions. Similar results were established for switching autoregressive processes. These processes generalize HMPs. New algorithms were developed for estimating the state, parameter, and order of an HMP, for universal coding and classification of HMPs, and for universal decoding of hidden Markov channels. These and other related topics are reviewed in this paper. Index Terms—Baum–Petrie algorithm, entropy ergodic theorems, finitestate channels, hidden Markov models, identifiability, Kalman filter, maximumlikelihood (ML) estimation, order estimation, recursive parameter estimation, switching autoregressive processes, Ziv inequality. I.
Post'87 Crash Fears in the S&P 500 Futures Option Market
, 1998
"... Postcrash distributions inferred from S ..."
Parameter estimation for linear dynamical systems
, 1996
"... Linear systems have been used extensively in engineering to model and control the behavior of dynamical systems. In this note, we present the Expectation Maximization (EM) algorithm for estimating the parameters of linear systems (Shumway and Sto er, 1982). We also point out the relationship between ..."
Abstract

Cited by 156 (7 self)
 Add to MetaCart
Linear systems have been used extensively in engineering to model and control the behavior of dynamical systems. In this note, we present the Expectation Maximization (EM) algorithm for estimating the parameters of linear systems (Shumway and Sto er, 1982). We also point out the relationship between linear dynamical systems, factor analysis, and hidden Markov models.
Variational learning for switching statespace models
 Neural Computation
, 1998
"... We introduce a new statistical model for time series which iteratively segments data into regimes with approximately linear dynamics and learns the parameters of each of these linear regimes. This model combines and generalizes two of the most widely used stochastic time series models  hidden Ma ..."
Abstract

Cited by 142 (6 self)
 Add to MetaCart
We introduce a new statistical model for time series which iteratively segments data into regimes with approximately linear dynamics and learns the parameters of each of these linear regimes. This model combines and generalizes two of the most widely used stochastic time series models  hidden Markov models and linear dynamical systems  and is closely related to models that are widely used in the control and econometrics literatures. It can also be derived by extending the mixture of experts neural network (Jacobs et al., 1991) to its fully dynamical version, in which both expert and gating networks are recurrent. Inferring the posterior probabilities of the hidden states of this model is computationally intractable, and therefore the exact Expectation Maximization (EM) algorithm cannot be applied. However, we present a variational approximation that maximizes a lower bound on the log likelihood and makes use of both the forwardbackward recursions for hidden Markov models and the Kalman lter recursions for linear dynamical systems. We tested the algorithm both on artificial data sets and on a natural data set of respiration force from a patient with sleep apnea. The results suggest that variational approximations are a viable method for inference and learning in switching statespace models.
A Monte Carlo Approach to Nonnormal and Nonlinear StateSpace Modeling
, 1992
"... this article then is to develop methodology for modeling the nonnormality of the ut, the vt, or both. A second departure from the model specification ( 1 ) is to allow for unknown variances in the state or observational equation, as well as for unknown parameters in the transition matrices Ft and Ht ..."
Abstract

Cited by 126 (14 self)
 Add to MetaCart
this article then is to develop methodology for modeling the nonnormality of the ut, the vt, or both. A second departure from the model specification ( 1 ) is to allow for unknown variances in the state or observational equation, as well as for unknown parameters in the transition matrices Ft and Ht. As a third generalization we allow for nonlinear model structures; that is, X t = ft(Xtl) q Ut, and Yt = ht(xt) + vt, t = 1, ..., n, (2) whereft( ) and ht(. ) are given, but perhaps also depend on some unknown parameters. The experimenter may wish to entertain a variety of error distributions. Our goal throughout the article is an analysis for general statespace models that does not resort to convenient assumptions at the expense of model adequacy
Learning dynamic Bayesian networks
 Adaptive Processing of Sequences and Data Structures
, 1998
"... Bayesian networks are directed acyclic graphs that represent dependencies between variables in a probabilistic model. Many time series models, including the hidden Markov models (HMMs) used in speech recognition and Kalman filter models used in filtering and control applications, can be viewed as ex ..."
Abstract

Cited by 124 (0 self)
 Add to MetaCart
Bayesian networks are directed acyclic graphs that represent dependencies between variables in a probabilistic model. Many time series models, including the hidden Markov models (HMMs) used in speech recognition and Kalman filter models used in filtering and control applications, can be viewed as examples of dynamic Bayesian networks. We first provide a brief tutorial on learning and Bayesian networks. We then present some dynamic Bayesian networks that can capture much richer structure than HMMs and Kalman filters, including spatial and temporal multiresolution structure, distributed hidden state representations, and multiple switching linear regimes. While exact probabilistic inference is intractable in these networks, one can obtain tractable variational approximations which call as subroutines the forwardbackward and Kalman filter recursions. These approximations can be used to learn the model parameters...
Propagation Algorithms for Variational Bayesian Learning
 In Advances in Neural Information Processing Systems 13
, 2001
"... Variational approximations are becoming a widespread tool for Bayesian learning of graphical models. We provide some theoretical results for the variational updates in a very general family of conjugateexponential graphical models. We show how the belief propagation and the junction tree algorithms ..."
Abstract

Cited by 110 (14 self)
 Add to MetaCart
Variational approximations are becoming a widespread tool for Bayesian learning of graphical models. We provide some theoretical results for the variational updates in a very general family of conjugateexponential graphical models. We show how the belief propagation and the junction tree algorithms can be used in the inference step of variational Bayesian learning. Applying these results to the Bayesian analysis of linearGaussian statespace models we obtain a learning procedure that exploits the Kalman smoothing propagation, while integrating over all model parameters. We demonstrate how this can be used to infer the hidden state dimensionality of the statespace model in a variety of synthetic problems and one real highdimensional data set.
Probability product kernels
 Journal of Machine Learning Research
, 2004
"... The advantages of discriminative learning algorithms and kernel machines are combined with generative modeling using a novel kernel between distributions. In the probability product kernel, data points in the input space are mapped to distributions over the sample space and a general inner product i ..."
Abstract

Cited by 104 (7 self)
 Add to MetaCart
The advantages of discriminative learning algorithms and kernel machines are combined with generative modeling using a novel kernel between distributions. In the probability product kernel, data points in the input space are mapped to distributions over the sample space and a general inner product is then evaluated as the integral of the product of pairs of distributions. The kernel is straightforward to evaluate for all exponential family models such as multinomials and Gaussians and yields interesting nonlinear kernels. Furthermore, the kernel is computable in closed form for latent distributions such as mixture models, hidden Markov models and linear dynamical systems. For intractable models, such as switching linear dynamical systems, structured meanfield approximations can be brought to bear on the kernel evaluation. For general distributions, even if an analytic expression for the kernel is not feasible, we show a straightforward sampling method to evaluate it. Thus, the kernel permits discriminative learning methods, including support vector machines, to exploit the properties, metrics and invariances of the generative models we infer from each datum. Experiments are shown using multinomial models for text, hidden Markov models for biological data sets and linear dynamical systems for time series data.