Dynamic Bayesian Networks: Representation, Inference and Learning
2002
Cited by 393 (4 self)
Modelling sequential data is important in many areas of science and engineering. Hidden Markov models (HMMs) and Kalman filter models (KFMs) are popular for this because they are simple and flexible. For example, HMMs have been used for speech recognition and bio-sequence analysis, and KFMs have been used for problems ranging from tracking planes and missiles to predicting the economy. However, HMMs and KFMs are limited in their “expressive power”. Dynamic Bayesian Networks (DBNs) generalize HMMs by allowing the state space to be represented in factored form, instead of as a single discrete random variable. DBNs generalize KFMs by allowing arbitrary probability distributions, not just (unimodal) linear-Gaussian. In this thesis, I will discuss how to represent many different kinds of models as DBNs, how to perform exact and approximate inference in DBNs, and how to learn DBN models from sequential data.
In particular, the main novel technical contributions of this thesis are as follows: a way of representing Hierarchical HMMs as DBNs, which enables inference to be done in O(T) time instead of O(T^3), where T is the length of the sequence; an exact smoothing algorithm that takes O(log T) space instead of O(T); a simple way of using the junction tree algorithm for online inference in DBNs; new complexity bounds on exact online inference in DBNs; a new deterministic approximate inference algorithm called factored frontier; an analysis of the relationship between the BK algorithm and loopy belief propagation; a way of applying Rao-Blackwellised particle filtering to DBNs in general, and the SLAM (simultaneous localization and mapping) problem in particular; a way of extending the structural EM algorithm to DBNs; and a variety of different applications of DBNs. However, perhaps the main value of the thesis is its catholic presentation of the field of sequential data modelling.
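As a point of reference for the abstract above, the following is a minimal sketch of forward filtering in a plain HMM, the simplest model the abstract says DBNs generalize. It is not taken from the thesis. The function name `forward_filter` and the toy parameters are assumptions made for illustration. The loop visits each observation once, so the cost is linear in the sequence length T.

```python
import math


def forward_filter(prior, transition, emission, observations):
    """Return the filtered beliefs P(x_t | y_1..t) and the log-likelihood.

    prior[i]         : P(x_1 = i)
    transition[i][j] : P(x_{t+1} = j | x_t = i)
    emission[i][y]   : P(y_t = y | x_t = i)
    All names are illustrative assumptions, not the thesis's notation.
    """
    n = len(prior)
    belief = list(prior)
    filtered = []
    log_lik = 0.0
    for t, y in enumerate(observations):
        if t > 0:
            # Prediction step: push the previous belief through the dynamics.
            belief = [
                sum(belief[i] * transition[i][j] for i in range(n))
                for j in range(n)
            ]
        # Update step: weight by the likelihood of the observation.
        unnorm = [belief[j] * emission[j][y] for j in range(n)]
        z = sum(unnorm)
        if z == 0.0:
            raise ValueError("observation has zero probability under the model")
        belief = [u / z for u in unnorm]
        log_lik += math.log(z)
        filtered.append(belief)
    return filtered, log_lik


# Toy two-state model (hypothetical numbers).
PRIOR = [0.5, 0.5]
TRANSITION = [[0.9, 0.1], [0.1, 0.9]]
EMISSION = [[0.8, 0.2], [0.2, 0.8]]

beliefs, log_likelihood = forward_filter(PRIOR, TRANSITION, EMISSION, [0, 0, 1])
```

A factored DBN would replace the single state index with several variables per time slice. The sketch covers only the unfactored case.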
An Unsupervised Ensemble Learning Method for Nonlinear Dynamic State-Space Models
Neural Computation, 2001
Cited by 77 (32 self)
A Bayesian ensemble learning method is introduced for unsupervised extraction of dynamic processes from noisy data. The data are assumed to be generated by an unknown nonlinear mapping from unknown factors. The dynamics of the factors are modeled using a nonlinear state-space model. The nonlinear mappings in the model are represented using multilayer perceptron networks. The proposed method is computationally demanding, but it allows the use of higher-dimensional nonlinear latent variable models than other existing approaches. Experiments with chaotic data show that the new method is able to blindly estimate the factors and the dynamic process which have generated the data. It clearly outperforms currently available nonlinear prediction techniques in this very difficult test problem.
Inference in Hybrid Networks: Theoretical Limits and Practical Algorithms
In UAI, 2001
"... An important subclass of hybrid Bayesian networks ..."
Monte Carlo Methods for Tempo Tracking and Rhythm Quantization
Journal of Artificial Intelligence Research, 2003
Cited by 44 (7 self)
We present a probabilistic generative model for timing deviations in expressive music performance. The structure of the proposed model is equivalent to a switching state space model. The switch variables correspond to discrete note locations as in a musical score. The continuous hidden variables denote the tempo. We formulate two well-known music recognition problems, namely tempo tracking and automatic transcription (rhythm quantization), as filtering and maximum a posteriori (MAP) state estimation tasks. Exact computation of posterior features such as the MAP state is intractable in this model class, so we introduce Monte Carlo methods for integration and optimization. We compare Markov Chain Monte Carlo (MCMC) methods (such as Gibbs sampling, simulated annealing and iterative improvement) and sequential Monte Carlo methods (particle filters). Our simulation results suggest better results with sequential methods. The methods can be applied in both online and batch scenarios such as tempo tracking and transcription and are thus potentially useful in a number of music applications such as adaptive automatic accompaniment, score typesetting and music information retrieval.
Expectation propagation for approximate inference in dynamic Bayesian networks
In Proceedings UAI, 2002
Cited by 41 (9 self)
We describe expectation propagation for approximate inference in dynamic Bayesian networks as a natural extension of Pearl's exact belief propagation. Expectation propagation is a greedy algorithm that converges in many practical cases, but not always. We derive a double-loop algorithm, guaranteed to converge to a local minimum of a Bethe free energy. Furthermore, we show that stable fixed points of (damped) expectation propagation correspond to local minima of this free energy, but that the converse need not be the case. We illustrate the algorithms by applying them to switching linear dynamical systems and discuss implications for approximate inference in general Bayesian networks.
Hybrid Bayesian Networks for Reasoning about Complex Systems
2002
Cited by 37 (0 self)
Many real-world systems are naturally modeled as hybrid stochastic processes, i.e., stochastic processes that contain both discrete and continuous variables. Examples include speech recognition, target tracking, and monitoring of physical systems. The task is usually to perform probabilistic inference, i.e., infer the hidden state of the system given some noisy observations. For example, we can ask what is the probability that a certain word was pronounced given the readings of our microphone, what is the probability that a submarine is trying to surface given our sonar data, and what is the probability of a valve being open given our pressure and flow readings. Bayesian networks are
Planning by Probabilistic Inference
Proc. of the 9th Int. Workshop on Artificial Intelligence and Statistics, 2003
Cited by 32 (0 self)
This paper presents and demonstrates a new approach to the problem of planning under uncertainty. Actions are treated as hidden variables, with their own prior distributions, in a probabilistic generative model involving actions and states. Planning is done by computing the posterior distribution over actions, conditioned on reaching the goal state within a specified number of steps. Under the new formulation, the toolbox of inference techniques can be brought to bear on the planning problem. This paper focuses on problems with discrete actions and states, and discusses some extensions.
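The idea in the abstract above, computing a posterior over actions conditioned on reaching the goal, can be sketched on a toy problem. The example below is an illustrative assumption, not the paper's model. It uses a hypothetical one-dimensional corridor with two actions (0 = left, 1 = right), a slip probability, a uniform action prior, and an absorbing goal, so that being at the goal on the final step is the same as having reached it within the horizon. All function names are made up for this sketch.

```python
def transition(state, action, n_states, goal, slip=0.1):
    """Return {next_state: probability} for the toy corridor (hypothetical model)."""
    if state == goal:
        return {goal: 1.0}  # the goal is absorbing
    step = 1 if action == 1 else -1
    target = min(max(state + step, 0), n_states - 1)
    probs = {}
    probs[target] = probs.get(target, 0.0) + (1.0 - slip)  # intended move
    probs[state] = probs.get(state, 0.0) + slip            # slip: stay put
    return probs


def expected_message(state, action, beta, n_states, goal):
    """Sum over next states of P(next | state, action) * beta[next]."""
    return sum(p * beta[s2]
               for s2, p in transition(state, action, n_states, goal).items())


def first_action_posterior(start, goal, n_states, horizon, prior=(0.5, 0.5)):
    """P(a_0 | s_0 = start, s_horizon = goal), with later actions drawn from the prior."""
    n_actions = len(prior)
    # beta[s] = P(goal at the final step | state s now); initialise at the final step.
    beta = [1.0 if s == goal else 0.0 for s in range(n_states)]
    # Backward recursion from the final step down to step 1.
    for _ in range(horizon - 1):
        beta = [
            sum(prior[a] * expected_message(s, a, beta, n_states, goal)
                for a in range(n_actions))
            for s in range(n_states)
        ]
    # Bayes' rule for the first action.
    scores = [prior[a] * expected_message(start, a, beta, n_states, goal)
              for a in range(n_actions)]
    total = sum(scores)
    if total == 0.0:
        raise ValueError("goal cannot be reached within the horizon")
    return [s / total for s in scores]


tight = first_action_posterior(start=0, goal=3, n_states=4, horizon=3)
loose = first_action_posterior(start=1, goal=3, n_states=4, horizon=6)
```

With a horizon of exactly three steps from state 0, only three successful moves to the right reach the goal, so all posterior mass falls on the "right" action. With a longer horizon the posterior still favours "right" but leaves some mass on "left".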
Observability and Identifiability of Jump Linear Systems
In Proc. of IEEE Conference on Decision and Control, 2002
Cited by 30 (7 self)
We analyze the observability of the continuous and discrete states of a class of linear hybrid systems. We derive rank conditions that the structural parameters of the model must satisfy in order for filtering and smoothing algorithms to operate correctly. We also study the identifiability of the model parameters by characterizing the set of models that produce the same output measurements. Finally, when the data are generated by a model in the class, we give conditions under which the true model can be identified.
Modeling, clustering, and segmenting video with mixtures of dynamic textures
PAMI, 2008
Cited by 30 (12 self)
A dynamic texture is a spatio-temporal generative model for video, which represents video sequences as observations from a linear dynamical system. This work studies the mixture of dynamic textures, a statistical model for an ensemble of video sequences that is sampled from a finite collection of visual processes, each of which is a dynamic texture. An expectation-maximization (EM) algorithm is derived for learning the parameters of the model, and the model is related to previous works in linear systems, machine learning, time-series clustering, control theory, and computer vision. Through experimentation, it is shown that the mixture of dynamic textures is a suitable representation for both the appearance and dynamics of a variety of visual processes that have traditionally been challenging for computer vision (for example, fire, steam, water, vehicle and pedestrian traffic, and so forth). When compared with state-of-the-art methods in motion segmentation, including both temporal texture methods and traditional representations (for example, optical flow or other localized motion representations), the mixture of dynamic textures achieves superior performance in the problems of clustering and segmenting video of such processes.
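For readers unfamiliar with the model class, the abstract above can be written out in the standard linear-dynamical-system form. This is a sketch under stated assumptions: the symbols A, C, Q, R, K and the mixing weights are common notation, not quoted from the abstract.

```latex
% One dynamic texture: hidden state x_t, observed frame y_t.
\begin{align}
  x_{t+1} &= A\,x_t + v_t, & v_t &\sim \mathcal{N}(0, Q), \\
  y_t     &= C\,x_t + w_t, & w_t &\sim \mathcal{N}(0, R).
\end{align}
% Mixture of K dynamic textures: draw a component z, then draw the whole
% sequence from that component's linear dynamical system.
\begin{equation}
  p(y_{1:\tau}) = \sum_{j=1}^{K} \alpha_j \, p(y_{1:\tau} \mid z = j),
  \qquad \alpha_j \ge 0, \quad \sum_{j=1}^{K} \alpha_j = 1 .
\end{equation}
```

Under this reading, EM alternates between assigning sequences to components and refitting each component's parameters.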
Hierarchical Models of Variance Sources
Signal Processing, 2003
Cited by 28 (12 self)
In many models, variances are assumed to be constant although this assumption is often unrealistic in practice. Joint modelling of means and variances is difficult in many learning approaches, because it can lead to infinite probability densities. We show that a Bayesian variational technique which is sensitive to probability mass instead of density is able to jointly model both variances and means. We consider a model structure where a Gaussian variable, called a variance node, controls the variance of another Gaussian variable. Variance nodes make it possible to build hierarchical models for both variances and means. We report experiments with artificial data which demonstrate the ability of the learning algorithm to find variance sources that explain and characterize well the variances in the multidimensional data. Experiments with biomedical MEG data show that variance sources are present in real-world signals.

