Results 1  10
of
17
Dynamic Bayesian Networks: Representation, Inference and Learning
, 2002
"... Modelling sequential data is important in many areas of science and engineering. Hidden Markov models (HMMs) and Kalman filter models (KFMs) are popular for this because they are simple and flexible. For example, HMMs have been used for speech recognition and biosequence analysis, and KFMs have bee ..."
Abstract

Cited by 564 (3 self)
 Add to MetaCart
Modelling sequential data is important in many areas of science and engineering. Hidden Markov models (HMMs) and Kalman filter models (KFMs) are popular for this because they are simple and flexible. For example, HMMs have been used for speech recognition and biosequence analysis, and KFMs have been used for problems ranging from tracking planes and missiles to predicting the economy. However, HMMs
and KFMs are limited in their “expressive power”. Dynamic Bayesian Networks (DBNs) generalize HMMs by allowing the state space to be represented in factored form, instead of as a single discrete random variable. DBNs generalize KFMs by allowing arbitrary probability distributions, not just (unimodal) linearGaussian. In this thesis, I will discuss how to represent many different kinds of models as DBNs, how to perform exact and approximate inference in DBNs, and how to learn DBN models from sequential data.
In particular, the main novel technical contributions of this thesis are as follows: a way of representing
Hierarchical HMMs as DBNs, which enables inference to be done in O(T) time instead of O(T 3), where T is the length of the sequence; an exact smoothing algorithm that takes O(log T) space instead of O(T); a simple way of using the junction tree algorithm for online inference in DBNs; new complexity bounds on exact online inference in DBNs; a new deterministic approximate inference algorithm called factored frontier; an analysis of the relationship between the BK algorithm and loopy belief propagation; a way of
applying RaoBlackwellised particle filtering to DBNs in general, and the SLAM (simultaneous localization
and mapping) problem in particular; a way of extending the structural EM algorithm to DBNs; and a variety of different applications of DBNs. However, perhaps the main value of the thesis is its catholic presentation of the field of sequential data modelling.
A Guide to the Literature on Learning Probabilistic Networks From Data
, 1996
"... This literature review discusses different methods under the general rubric of learning Bayesian networks from data, and includes some overlapping work on more general probabilistic networks. Connections are drawn between the statistical, neural network, and uncertainty communities, and between the ..."
Abstract

Cited by 172 (0 self)
 Add to MetaCart
This literature review discusses different methods under the general rubric of learning Bayesian networks from data, and includes some overlapping work on more general probabilistic networks. Connections are drawn between the statistical, neural network, and uncertainty communities, and between the different methodological communities, such as Bayesian, description length, and classical statistics. Basic concepts for learning and Bayesian networks are introduced and methods are then reviewed. Methods are discussed for learning parameters of a probabilistic network, for learning the structure, and for learning hidden variables. The presentation avoids formal definitions and theorems, as these are plentiful in the literature, and instead illustrates key concepts with simplified examples. Keywords Bayesian networks, graphical models, hidden variables, learning, learning structure, probabilistic networks, knowledge discovery. I. Introduction Probabilistic networks or probabilistic gra...
Probabilistic independence networks for hidden Markov probability models
, 1996
"... Graphical techniques for modeling the dependencies of random variables have been explored in a variety of different areas including statistics, statistical physics, artificial intelligence, speech recognition, image processing, and genetics. Formalisms for manipulating these models have been develop ..."
Abstract

Cited by 167 (12 self)
 Add to MetaCart
Graphical techniques for modeling the dependencies of random variables have been explored in a variety of different areas including statistics, statistical physics, artificial intelligence, speech recognition, image processing, and genetics. Formalisms for manipulating these models have been developed relatively independently in these research communities. In this paper we explore hidden Markov models (HMMs) and related structures within the general framework of probabilistic independence networks (PINs). The paper contains a selfcontained review of the basic principles of PINs. It is shown that the wellknown forwardbackward (FB) and Viterbi algorithms for HMMs are special cases of more general inference algorithms for arbitrary PINs. Furthermore, the existence of inference and estimation algorithms for more general graphical models provides a set of analysis tools for HMM practitioners who wish to explore a richer class of HMM structures. Examples of relatively complex models to handle sensor fusion and coarticulation in speech recognition are introduced and treated within the graphical model framework to illustrate the advantages of the general approach.
Thin Junction Trees
 Advances in Neural Information Processing Systems 14
, 2001
"... We present an algorithm that induces a class of models with thin junction treesmodels that are characterized by an upper bound on the size of the maximal cliques of their triangulated graph. By ensuring that the junction tree is thin, inference in our models remains tractable throughout the l ..."
Abstract

Cited by 50 (1 self)
 Add to MetaCart
We present an algorithm that induces a class of models with thin junction treesmodels that are characterized by an upper bound on the size of the maximal cliques of their triangulated graph. By ensuring that the junction tree is thin, inference in our models remains tractable throughout the learning process. This allows both an efficient implementation of an iterative scaling parameter estimation algorithm and also ensures that inference can be performed efficiently with the final model. We illustrate the approach with applications in handwritten digit recognition and DNA splice site detection.
Approximate inference and constrained optimization
 In 19th UAI
, 2003
"... Loopy and generalized belief propagation are popular algorithms for approximate inference in Markov random fields and Bayesian networks. Fixed points of these algorithms correspond to extrema of the Bethe and Kikuchi free energy (Yedidia et al., 2001). However, belief propagation does not always con ..."
Abstract

Cited by 50 (7 self)
 Add to MetaCart
Loopy and generalized belief propagation are popular algorithms for approximate inference in Markov random fields and Bayesian networks. Fixed points of these algorithms correspond to extrema of the Bethe and Kikuchi free energy (Yedidia et al., 2001). However, belief propagation does not always converge, which motivates approaches that explicitly minimize the Kikuchi/Bethe free energy, such as CCCP (Yuille, 2002) and UPS (Teh and Welling, 2002). Here we describe a class of algorithms that solves this typically nonconvex constrained minimization problem through a sequence of convex constrained minimizations of upper bounds on the Kikuchi free energy. Intuitively one would expect tighter bounds to lead to faster algorithms, which is indeed convincingly demonstrated in our simulations. Several ideas are applied to obtain tight convex bounds that yield dramatic speedups over CCCP. 1
Passing And Bouncing Messages For Generalized Inference
, 2001
"... this paper we will propose an algorithm which combines BP and IPF into one message 1 More precisely, the fixed points of loopy BP are the stationary points of Bethe free energy. However, in practise it turns out that it converges to minima 1 passing algorithm on a tree. The new set of messages re ..."
Abstract

Cited by 2 (1 self)
 Add to MetaCart
this paper we will propose an algorithm which combines BP and IPF into one message 1 More precisely, the fixed points of loopy BP are the stationary points of Bethe free energy. However, in practise it turns out that it converges to minima 1 passing algorithm on a tree. The new set of messages reduces to BP messages on the internal nodes of the tree. However, when a messages reaches a constrained marginal (other than a "hard evidence" node), it will be "bounced back", while being changed in the process. When the constrained marginal is a delta function (hard observation), the returned message is independent of the incoming message in the usual way. The algorithm requires a scheduling of messages such that the information from a bounced message has reached the node where the next bounce takes place. Unlike BP on a tree (but like IPF on a tree), the combined algorithm does not converge within a finite number of iterations
Bethe Free Energy and Contrastive Divergence Approximations for Undirected Graphical Models
, 2003
"... Bethe Free Energy and Contrastive Divergence Approximations for Undirected Graphical Models Yee Whye Teh Doctorate of Philosophy Graduate Department of Computer Science University of Toronto 2003 As the machine learning community tackles more complex and harder problems, the graphical models ..."
Abstract

Cited by 1 (0 self)
 Add to MetaCart
Bethe Free Energy and Contrastive Divergence Approximations for Undirected Graphical Models Yee Whye Teh Doctorate of Philosophy Graduate Department of Computer Science University of Toronto 2003 As the machine learning community tackles more complex and harder problems, the graphical models needed to solve such problems become larger and more complicated. As a result performing inference and learning exactly for such graphical models become ever more expensive, and approximate inference and learning techniques become ever more prominent.
Proposed design for gR, a graphical models toolkit for R
, 2003
"... This document is an extension of the talk I gave at the gR meeting in Aalborg on 19 September 2003. It outlines a proposed design for gR, a graphical models library for R. This design is similar to the design of BNT, but is much more general, in that it supports undirected models and chain graphs, a ..."
Abstract
 Add to MetaCart
This document is an extension of the talk I gave at the gR meeting in Aalborg on 19 September 2003. It outlines a proposed design for gR, a graphical models library for R. This design is similar to the design of BNT, but is much more general, in that it supports undirected models and chain graphs, and allows parameters to be represented as random variables (Bayesian modeling).