Results 1 - 10 of 35
Dynamic Bayesian Networks: Representation, Inference and Learning
, 2002
Cited by 563 (3 self)
Modelling sequential data is important in many areas of science and engineering. Hidden Markov models (HMMs) and Kalman filter models (KFMs) are popular for this because they are simple and flexible. For example, HMMs have been used for speech recognition and biosequence analysis, and KFMs have been used for problems ranging from tracking planes and missiles to predicting the economy. However, HMMs and KFMs are limited in their “expressive power”. Dynamic Bayesian Networks (DBNs) generalize HMMs by allowing the state space to be represented in factored form, instead of as a single discrete random variable. DBNs generalize KFMs by allowing arbitrary probability distributions, not just (unimodal) linear-Gaussian. In this thesis, I will discuss how to represent many different kinds of models as DBNs, how to perform exact and approximate inference in DBNs, and how to learn DBN models from sequential data.
In particular, the main novel technical contributions of this thesis are as follows: a way of representing Hierarchical HMMs as DBNs, which enables inference to be done in O(T) time instead of O(T^3), where T is the length of the sequence; an exact smoothing algorithm that takes O(log T) space instead of O(T); a simple way of using the junction tree algorithm for online inference in DBNs; new complexity bounds on exact online inference in DBNs; a new deterministic approximate inference algorithm called factored frontier; an analysis of the relationship between the BK algorithm and loopy belief propagation; a way of applying Rao-Blackwellised particle filtering to DBNs in general, and the SLAM (simultaneous localization and mapping) problem in particular; a way of extending the structural EM algorithm to DBNs; and a variety of different applications of DBNs. However, perhaps the main value of the thesis is its catholic presentation of the field of sequential data modelling.
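As a point of reference for the O(T) figure quoted in this abstract, here is a minimal sketch of forward filtering in an ordinary flat HMM. It is not the thesis's hierarchical HMM construction. The function name, the list-based parameter layout, and the toy numbers are assumptions made for illustration. With a fixed number of states, each time step costs a fixed amount of work, so the total cost grows linearly with the sequence length T.

```python
import math

def forward_filter(prior, trans, emit, observations):
    """Forward filtering for a discrete HMM (illustrative sketch).

    prior[i]    : P(z_1 = i)
    trans[i][j] : P(z_t = j | z_{t-1} = i)
    emit[j][k]  : P(x_t = k | z_t = j)
    Returns the filtered distributions P(z_t | x_1..x_t) and the
    log-likelihood of the observation sequence.
    """
    n = len(prior)
    belief = list(prior)
    filtered = []
    loglik = 0.0
    for t, obs in enumerate(observations):
        if t > 0:
            # Predict: push the previous belief through the transition model.
            belief = [sum(belief[i] * trans[i][j] for i in range(n))
                      for j in range(n)]
        # Update: weight by the likelihood of the current observation.
        unnorm = [belief[j] * emit[j][obs] for j in range(n)]
        norm = sum(unnorm)
        loglik += math.log(norm)
        belief = [u / norm for u in unnorm]
        filtered.append(belief)
    return filtered, loglik

# Hypothetical two-state model with binary observations.
prior = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]
emit = [[0.8, 0.2], [0.3, 0.7]]
filtered, loglik = forward_filter(prior, trans, emit, [0, 0, 1])
```

The loop body does O(n^2) work for n states, independent of T, which is where the linear dependence on sequence length comes from.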
Mean Field Theory for Sigmoid Belief Networks
 Journal of Artificial Intelligence Research
, 1996
Cited by 116 (12 self)
We develop a mean field theory for sigmoid belief networks based on ideas from statistical mechanics.
Tutorial on Variational Approximation Methods
 In Advanced Mean Field Methods: Theory and Practice
, 2000
Cited by 73 (1 self)
We provide an introduction to the theory and use of variational methods for inference and estimation in the context of graphical models. Variational methods become useful as efficient approximate methods when the structure of the graph model no longer admits feasible exact probabilistic calculations. The emphasis of this tutorial is on illustrating how inference and estimation problems can be transformed into variational form along with describing the resulting approximation algorithms and their properties insofar as these are currently known.
1 Introduction
The term variational methods refers to a large collection of optimization techniques. The classical context for these methods involves finding the extremum of an integral depending on an unknown function and its derivatives. This classical definition, however, and the accompanying calculus of variation no longer adequately characterizes modern variational methods. Modern variational approaches have become indispensable tools in...
Gaussian Processes - A Replacement for Supervised Neural Networks?
Cited by 51 (0 self)
These lecture notes are based on the work of Neal (1996), Williams and
Tree-reweighted belief propagation algorithms and approximate ML estimation by pseudo-moment matching
 In AISTATS
, 2003
Cited by 51 (4 self)
In previous work [10], we presented a class of upper bounds on the log partition function of an arbitrary undirected graphical model based on solving a convex variational problem. Here we develop a class of local message-passing algorithms, which we call tree-reweighted belief propagation, for efficiently computing the value of these upper bounds, as well as the associated pseudomarginals.
Ensemble learning for independent component analysis
 in Advances in Independent Component Analysis
, 2000
Cited by 49 (2 self)
This thesis is concerned with the problem of Blind Source Separation. Specifically, we consider the Independent Component Analysis (ICA) model in which a set of observations are modelled by x_t = A s_t (1), where A is an unknown mixing matrix and s_t is a vector of hidden source components at time t. The ICA problem is to find the sources given only a set of observations. In chapter 1, the blind source separation problem is introduced. In chapter 2 the method of Ensemble Learning is explained. Chapter 3 applies Ensemble Learning to the ICA model and chapter 4 assesses the use of Ensemble Learning for model selection. Chapters 5-7 apply the Ensemble Learning ICA algorithm to data sets from physics (a medical imaging data set consisting of images of a tooth), biology (data sets from cDNA microarrays) and astrophysics (Planck image separation and galaxy spectra separation).
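To make the mixing model in this abstract concrete, the sketch below generates synthetic observations from x_t = A s_t. It illustrates only the generative side of ICA, not the Ensemble Learning algorithm the thesis describes. The function names, the uniform source distribution, and the 2x2 mixing matrix are all assumptions chosen for illustration.

```python
import random

def mix(A, s):
    """Apply the mixing matrix: returns x = A s for one time step."""
    return [sum(A[i][j] * s[j] for j in range(len(s))) for i in range(len(A))]

def generate_ica_data(A, T, seed=0):
    """Draw T source vectors and mix them with A (illustrative sketch).

    Sources are drawn independently from Uniform(-1, 1); this choice of a
    non-Gaussian source distribution is an assumption, not from the abstract.
    """
    rng = random.Random(seed)
    n_sources = len(A[0])
    sources = [[rng.uniform(-1.0, 1.0) for _ in range(n_sources)]
               for _ in range(T)]
    observations = [mix(A, s) for s in sources]
    return sources, observations

# Hypothetical mixing matrix; in the ICA problem A is unknown and only
# the observations would be available.
A = [[1.0, 0.5],
     [0.2, 1.0]]
sources, observations = generate_ica_data(A, T=5)
```

An ICA method would receive only `observations` and attempt to recover `sources` (up to the usual scaling and permutation ambiguities).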
A variational approach to Bayesian logistic regression models and their extensions
, 1996
Cited by 45 (2 self)
We consider a logistic regression model with a Gaussian prior distribution over the parameters. We show that accurate variational techniques can be used to obtain a closed form posterior distribution over the parameters given the data, thereby yielding a posterior predictive model. The results are straightforwardly extended to (binary) belief networks. For the belief networks we also derive closed form parameter posteriors in the presence of missing values. We show finally that the dual of the regression problem gives a latent variable density model, the variational formulation of which leads to exactly solvable EM updates.
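The abstract does not spell out which variational technique is used. As an illustration of how a closed form can arise, the sketch below evaluates one standard quadratic lower bound on the logistic sigmoid, sigma(x) >= sigma(xi) * exp((x - xi)/2 - lambda(xi) * (x^2 - xi^2)) with lambda(xi) = tanh(xi/2) / (4 xi). Treating this as the bound in question is an assumption, and the function names are hypothetical. Because the exponent is quadratic in x, multiplying such a bound by a Gaussian prior keeps the result Gaussian in form.

```python
import math

def sigmoid(x):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-x))

def lam(xi):
    """Curvature coefficient lambda(xi) = tanh(xi / 2) / (4 xi); xi must be nonzero."""
    return math.tanh(xi / 2.0) / (4.0 * xi)

def sigmoid_lower_bound(x, xi):
    """Quadratic-exponent lower bound on sigmoid(x) with variational parameter xi."""
    return sigmoid(xi) * math.exp((x - xi) / 2.0 - lam(xi) * (x * x - xi * xi))

# Compare the bound with the exact sigmoid on a small grid (xi = 2.0 is arbitrary).
xi = 2.0
grid = [k / 2.0 for k in range(-10, 11)]
gaps = [sigmoid(x) - sigmoid_lower_bound(x, xi) for x in grid]
```

The gap is zero at x = xi and x = -xi and positive elsewhere, so xi is typically tuned per data point to keep the bound tight near the relevant values of x.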
Gaussian Process Classification for Segmenting and Annotating Sequences
 In Proceedings of the International Conference on Machine Learning (ICML)
, 2004
Cited by 34 (5 self)
Many real-world classification tasks involve the prediction of multiple, interdependent class labels. A prototypical case of this sort deals with prediction of a sequence of labels for a sequence of observations. Such problems arise naturally in the context of annotating and segmenting observation sequences.