Results 1  10
of
19
Dynamic Bayesian Networks: Representation, Inference and Learning
, 2002
"... Modelling sequential data is important in many areas of science and engineering. Hidden Markov models (HMMs) and Kalman filter models (KFMs) are popular for this because they are simple and flexible. For example, HMMs have been used for speech recognition and biosequence analysis, and KFMs have bee ..."
Abstract

Cited by 564 (3 self)
 Add to MetaCart
Modelling sequential data is important in many areas of science and engineering. Hidden Markov models (HMMs) and Kalman filter models (KFMs) are popular for this because they are simple and flexible. For example, HMMs have been used for speech recognition and biosequence analysis, and KFMs have been used for problems ranging from tracking planes and missiles to predicting the economy. However, HMMs
and KFMs are limited in their “expressive power”. Dynamic Bayesian Networks (DBNs) generalize HMMs by allowing the state space to be represented in factored form, instead of as a single discrete random variable. DBNs generalize KFMs by allowing arbitrary probability distributions, not just (unimodal) linearGaussian. In this thesis, I will discuss how to represent many different kinds of models as DBNs, how to perform exact and approximate inference in DBNs, and how to learn DBN models from sequential data.
In particular, the main novel technical contributions of this thesis are as follows: a way of representing
Hierarchical HMMs as DBNs, which enables inference to be done in O(T) time instead of O(T 3), where T is the length of the sequence; an exact smoothing algorithm that takes O(log T) space instead of O(T); a simple way of using the junction tree algorithm for online inference in DBNs; new complexity bounds on exact online inference in DBNs; a new deterministic approximate inference algorithm called factored frontier; an analysis of the relationship between the BK algorithm and loopy belief propagation; a way of
applying RaoBlackwellised particle filtering to DBNs in general, and the SLAM (simultaneous localization
and mapping) problem in particular; a way of extending the structural EM algorithm to DBNs; and a variety of different applications of DBNs. However, perhaps the main value of the thesis is its catholic presentation of the field of sequential data modelling.
The Bayesian Structural EM Algorithm
, 1998
"... In recent years there has been a flurry of works on learning Bayesian networks from data. One of the hard problems in this area is how to effectively learn the structure of a belief network from incomplete datathat is, in the presence of missing values or hidden variables. In a recent paper, I in ..."
Abstract

Cited by 220 (12 self)
 Add to MetaCart
In recent years there has been a flurry of works on learning Bayesian networks from data. One of the hard problems in this area is how to effectively learn the structure of a belief network from incomplete datathat is, in the presence of missing values or hidden variables. In a recent paper, I introduced an algorithm called Structural EM that combines the standard Expectation Maximization (EM) algorithm, which optimizes parameters, with structure search for model selection. That algorithm learns networks based on penalized likelihood scores, which include the BIC/MDL score and various approximations to the Bayesian score. In this paper, I extend Structural EM to deal directly with Bayesian model selection. I prove the convergence of the resulting algorithm and show how to apply it for learning a large class of probabilistic models, including Bayesian networks and some variants thereof.
The Bayes Net Toolbox for MATLAB
 Computing Science and Statistics
, 2001
"... The Bayes Net Toolbox (BNT) is an opensource Matlab package for directed graphical models. BNT supports many kinds of nodes (probability distributions), exact and approximate inference, parameter and structure learning, and static and dynamic models. BNT is widely used in teaching and research: the ..."
Abstract

Cited by 176 (2 self)
 Add to MetaCart
The Bayes Net Toolbox (BNT) is an opensource Matlab package for directed graphical models. BNT supports many kinds of nodes (probability distributions), exact and approximate inference, parameter and structure learning, and static and dynamic models. BNT is widely used in teaching and research: the web page has received over 28,000 hits since May 2000. In this paper, we discuss a broad spectrum of issues related to graphical models (directed and undirected), and describe, at a highlevel, how BNT was designed to cope with them all. We also compare BNT to other software packages for graphical models, and to the nascent OpenBayes effort.
Modelling gene expression data using dynamic bayesian networks
, 1999
"... Recently, there has been much interest in reverse engineering genetic networks from time series data. In this paper, we show that most of the proposed discrete time models — including the boolean network model [Kau93, SS96], the linear model of D’haeseleer et al. [DWFS99], and the nonlinear model of ..."
Abstract

Cited by 157 (1 self)
 Add to MetaCart
Recently, there has been much interest in reverse engineering genetic networks from time series data. In this paper, we show that most of the proposed discrete time models — including the boolean network model [Kau93, SS96], the linear model of D’haeseleer et al. [DWFS99], and the nonlinear model of Weaver et al. [WWS99] — are all special cases of a general class of models called Dynamic Bayesian Networks (DBNs). The advantages of DBNs include the ability to model stochasticity, to incorporate prior knowledge, and to handle hidden variables and missing data in a principled way. This paper provides a review of techniques for learning DBNs. Keywords: Genetic networks, boolean networks, Bayesian networks, neural networks, reverse engineering, machine learning. 1
Learning shapeclasses using a mixture of treeunions
 IEEE Trans. PAMI
, 2006
"... Abstract—This paper poses the problem of treeclustering as that of fitting a mixture of tree unions to a set of sample trees. The treeunions are structures from which the individual data samples belonging to a cluster can be obtained by edit operations. The distribution of observed tree nodes in ea ..."
Abstract

Cited by 20 (6 self)
 Add to MetaCart
Abstract—This paper poses the problem of treeclustering as that of fitting a mixture of tree unions to a set of sample trees. The treeunions are structures from which the individual data samples belonging to a cluster can be obtained by edit operations. The distribution of observed tree nodes in each cluster sample is assumed to be governed by a Bernoulli distribution. The clustering method is designed to operate when the correspondences between nodes are unknown and must be inferred as part of the learning process. We adopt a minimum description length approach to the problem of fitting the mixture model to data. We make maximumlikelihood estimates of the Bernoulli parameters. The treeunions and the mixing proportions are sought so as to minimize the description length criterion. This is the sum of the negative logarithm of the Bernoulli distribution, and a messagelength criterion that encodes both the complexity of the uniontrees and the number of mixture components. We locate node correspondences by minimizing the edit distance with the current tree unions, and show that the edit distance is linked to the description length criterion. The method can be applied to both unweighted and weighted trees. We illustrate the utility of the resulting algorithm on the problem of classifying 2D shapes using a shock graph representation. Index Terms—Structural learning, tree clustering, mixture modelinq, minimum description length, model codes, shock graphs. 1
Learning Bayes net structure from sparse data sets
, 2001
"... There are essentially two kinds of approaches for learning the structure of Bayesian Networks (BNs) from data. The first approach tries to find a graph which satis es all the constraints implied by the empirical conditional independencies measured in the data [PV91, SGS00a, Shi00]. The second approa ..."
Abstract

Cited by 12 (2 self)
 Add to MetaCart
There are essentially two kinds of approaches for learning the structure of Bayesian Networks (BNs) from data. The first approach tries to find a graph which satis es all the constraints implied by the empirical conditional independencies measured in the data [PV91, SGS00a, Shi00]. The second approach searches through the space of models (either DAGs or PDAGs), and uses some scoring metric (typically Bayesian or some approximation, such as BIC/MDL) to evaluate the models [CH92, Hec95, Hec98, Kra98], typically returning the highest scoring model found. Our main interest is in learning BN structure from gene expression data [FLNP00, HGJY01, MM99, SGS00b]. In domains such as this, where the ratio of the number of observations to the number of variables is low (i.e., when we have sparse data), selecting a threshold for the conditional independence (CI) tests can be tricky, and repeated use of such tests can lead to inconsistencies [DD99]. Bayesian s...
Exploiting parameter domain knowledge for learning in Bayesian networks
 Carnegie Mellon University
, 2005
"... implied, of any sponsoring institution, the U.S. government or any other entity. ..."
Abstract

Cited by 9 (1 self)
 Add to MetaCart
implied, of any sponsoring institution, the U.S. government or any other entity.
GraftingLight: Fast, Incremental Feature Selection and Structure Learning of Markov Random Fields
"... Feature selection is an important task in order to achieve better generalizability in high dimensional learning, and structure learning of Markov random fields (MRFs) can automatically discover the inherent structures underlying complex data. Both problems can be cast as solving an ℓ1norm regulariz ..."
Abstract

Cited by 6 (1 self)
 Add to MetaCart
Feature selection is an important task in order to achieve better generalizability in high dimensional learning, and structure learning of Markov random fields (MRFs) can automatically discover the inherent structures underlying complex data. Both problems can be cast as solving an ℓ1norm regularized parameter estimation problem. To solve such an ℓ1regularized estimation problem, the existing Grafting [16] method can avoid doing inference on dense graphs in structure learning by incrementally selecting new features. However, Grafting performs a greedy step of optimizing over free parameters once new features are included. This greedy strategy results in low efficiency when parameter learning is itself nontrivial, such as in MRFs, in which parameter learning depends on an expensive subroutine to calculate gradients. The complexity of calculating gradients in MRFs is typically exponential to the size of maximal cliques. In this paper, we present a fast algorithm called GraftingLight to solve the ℓ1norm regularized maximum likelihood estimation of MRFs for efficient feature selection and structure learning. GraftingLight iteratively performs onestep of orthantwise gradient descent over free parameters and selects new features. This lazy strategy is guaranteed to converge to the global optimum and can effectively select significant features. On both synthetic and real data sets, we show that GraftingLight is much more efficient than Grafting for both feature selection and structure learning, and performs comparably with the optimal batch method for feature selection but is much more efficient and accurate for structure learning of MRFs.