Results 1  10
of
227
Dynamic Bayesian Networks: Representation, Inference and Learning
, 2002
"... Modelling sequential data is important in many areas of science and engineering. Hidden Markov models (HMMs) and Kalman filter models (KFMs) are popular for this because they are simple and flexible. For example, HMMs have been used for speech recognition and biosequence analysis, and KFMs have bee ..."
Abstract

Cited by 598 (3 self)
 Add to MetaCart
Modelling sequential data is important in many areas of science and engineering. Hidden Markov models (HMMs) and Kalman filter models (KFMs) are popular for this because they are simple and flexible. For example, HMMs have been used for speech recognition and biosequence analysis, and KFMs have been used for problems ranging from tracking planes and missiles to predicting the economy. However, HMMs
and KFMs are limited in their “expressive power”. Dynamic Bayesian Networks (DBNs) generalize HMMs by allowing the state space to be represented in factored form, instead of as a single discrete random variable. DBNs generalize KFMs by allowing arbitrary probability distributions, not just (unimodal) linearGaussian. In this thesis, I will discuss how to represent many different kinds of models as DBNs, how to perform exact and approximate inference in DBNs, and how to learn DBN models from sequential data.
In particular, the main novel technical contributions of this thesis are as follows: a way of representing
Hierarchical HMMs as DBNs, which enables inference to be done in O(T) time instead of O(T 3), where T is the length of the sequence; an exact smoothing algorithm that takes O(log T) space instead of O(T); a simple way of using the junction tree algorithm for online inference in DBNs; new complexity bounds on exact online inference in DBNs; a new deterministic approximate inference algorithm called factored frontier; an analysis of the relationship between the BK algorithm and loopy belief propagation; a way of
applying RaoBlackwellised particle filtering to DBNs in general, and the SLAM (simultaneous localization
and mapping) problem in particular; a way of extending the structural EM algorithm to DBNs; and a variety of different applications of DBNs. However, perhaps the main value of the thesis is its catholic presentation of the field of sequential data modelling.
Learning the structure of dynamic probabilistic networks
, 1998
"... Dynamic probabilistic networks are a compact representation of complex stochastic processes. In this paper we examine how to learn the structure of a DPN from data. We extend structure scoring rules for standard probabilistic networks to the dynamic case, and show how to search for structure when so ..."
Abstract

Cited by 229 (13 self)
 Add to MetaCart
Dynamic probabilistic networks are a compact representation of complex stochastic processes. In this paper we examine how to learn the structure of a DPN from data. We extend structure scoring rules for standard probabilistic networks to the dynamic case, and show how to search for structure when some of the variables are hidden. Finally, we examine two applications where such a technology might be useful: predicting and classifying dynamic behaviors, and learning causal orderings in biological processes. We provide empirical results that demonstrate the applicability of our methods in both domains. 1
The Bayes Net Toolbox for MATLAB
 Computing Science and Statistics
, 2001
"... The Bayes Net Toolbox (BNT) is an opensource Matlab package for directed graphical models. BNT supports many kinds of nodes (probability distributions), exact and approximate inference, parameter and structure learning, and static and dynamic models. BNT is widely used in teaching and research: the ..."
Abstract

Cited by 186 (1 self)
 Add to MetaCart
The Bayes Net Toolbox (BNT) is an opensource Matlab package for directed graphical models. BNT supports many kinds of nodes (probability distributions), exact and approximate inference, parameter and structure learning, and static and dynamic models. BNT is widely used in teaching and research: the web page has received over 28,000 hits since May 2000. In this paper, we discuss a broad spectrum of issues related to graphical models (directed and undirected), and describe, at a highlevel, how BNT was designed to cope with them all. We also compare BNT to other software packages for graphical models, and to the nascent OpenBayes effort.
Modelling gene expression data using dynamic bayesian networks
, 1999
"... Recently, there has been much interest in reverse engineering genetic networks from time series data. In this paper, we show that most of the proposed discrete time models — including the boolean network model [Kau93, SS96], the linear model of D’haeseleer et al. [DWFS99], and the nonlinear model of ..."
Abstract

Cited by 170 (1 self)
 Add to MetaCart
(Show Context)
Recently, there has been much interest in reverse engineering genetic networks from time series data. In this paper, we show that most of the proposed discrete time models — including the boolean network model [Kau93, SS96], the linear model of D’haeseleer et al. [DWFS99], and the nonlinear model of Weaver et al. [WWS99] — are all special cases of a general class of models called Dynamic Bayesian Networks (DBNs). The advantages of DBNs include the ability to model stochasticity, to incorporate prior knowledge, and to handle hidden variables and missing data in a principled way. This paper provides a review of techniques for learning DBNs. Keywords: Genetic networks, boolean networks, Bayesian networks, neural networks, reverse engineering, machine learning. 1
Inferring Parameters and Structure of Latent Variable Models by Variational Bayes
, 1999
"... Current methods for learning graphical models with latent variables and a fixed structure estimate optimal values for the model parameters. Whereas this approach usually produces overfitting and suboptimal generalization performance, carrying out the Bayesian program of computing the full posterior ..."
Abstract

Cited by 149 (1 self)
 Add to MetaCart
(Show Context)
Current methods for learning graphical models with latent variables and a fixed structure estimate optimal values for the model parameters. Whereas this approach usually produces overfitting and suboptimal generalization performance, carrying out the Bayesian program of computing the full posterior distributions over the parameters remains a difficult problem. Moreover, learning the structure of models with latent variables, for which the Bayesian approach is crucial, is yet a harder problem. In this paper I present the Variational Bayes framework, which provides a solution to these problems. This approach approximates full posterior distributions over model parameters and structures, as well as latent variables, in an analytical manner without resorting to sampling methods. Unlike in the Laplace approximation, these posteriors are generally nonGaussian and no Hessian needs to be computed. The resulting algorithm generalizes the standard Expectation Maximization a...
Learning with mixtures of trees
 Journal of Machine Learning Research
, 2000
"... This paper describes the mixturesoftrees model, a probabilistic model for discrete multidimensional domains. Mixturesoftrees generalize the probabilistic trees of Chow and Liu [6] in a different and complementary direction to that of Bayesian networks. We present efficient algorithms for learnin ..."
Abstract

Cited by 117 (2 self)
 Add to MetaCart
(Show Context)
This paper describes the mixturesoftrees model, a probabilistic model for discrete multidimensional domains. Mixturesoftrees generalize the probabilistic trees of Chow and Liu [6] in a different and complementary direction to that of Bayesian networks. We present efficient algorithms for learning mixturesoftrees models in maximum likelihood and Bayesian frameworks. We also discuss additional efficiencies that can be obtained when data are “sparse, ” and we present data structures and algorithms that exploit such sparseness. Experimental results demonstrate the performance of the model for both density estimation and classification. We also discuss the sense in which treebased classifiers perform an implicit form of feature selection, and demonstrate a resulting insensitivity to irrelevant attributes.
Probabilistic classification and clustering in relational data
 In Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence
, 2001
"... Supervised and unsupervised learning methods have traditionally focused on data consisting of independent instances of a single type. However, many realworld domains are best described by relational models in which instances of multiple types are related to each other in complex ways. For example, ..."
Abstract

Cited by 111 (4 self)
 Add to MetaCart
Supervised and unsupervised learning methods have traditionally focused on data consisting of independent instances of a single type. However, many realworld domains are best described by relational models in which instances of multiple types are related to each other in complex ways. For example, in a scientific paper domain, papers are related to each other via citation, and are also related to their authors. In this case, the label of one entity (e.g., the topic of the paper) is often correlated with the labels of related entities. We propose a general class of models for classification and clustering in relational domains that capture probabilistic dependencies between related instances. We show how to learn such models efficiently from data. We present empirical results on two real world data sets. Our experiments in a transductive classification setting indicate that accuracy can be significantly improved by modeling relational dependencies. Our algorithm automatically induces a very natural behavior, where our knowledge about one instance helps us classify related ones, which in turn help us classify others. In an unsupervised setting, our models produced coherent clusters with a very natural interpretation, even for instance types that do not have any attributes. 1
Learning Bayesian Networks from Data: An InformationTheory Based Approach
, 2001
"... This paper provides algorithms that use an informationtheoretic analysis to learn Bayesian network structures from data. Based on our threephase learning framework, we develop efficient algorithms that can effectively learn Bayesian networks, requiring only polynomial numbers of conditional indepe ..."
Abstract

Cited by 101 (4 self)
 Add to MetaCart
This paper provides algorithms that use an informationtheoretic analysis to learn Bayesian network structures from data. Based on our threephase learning framework, we develop efficient algorithms that can effectively learn Bayesian networks, requiring only polynomial numbers of conditional independence (CI) tests in typical cases. We provide precise conditions that specify when these algorithms are guaranteed to be correct as well as empirical evidence (from real world applications and simulation tests) that demonstrates that these systems work efficiently and reliably in practice.
Modeling Dependencies in ProteinDNA Binding Sites
, 2003
"... The availability of whole genome sequences and highthroughput genomic assays opens the door for in silico analysis of transcription regulation. This includes methods for discovering and characterizing the binding sites of DNAbinding proteins, such as transcription factors. A common representation ..."
Abstract

Cited by 92 (2 self)
 Add to MetaCart
(Show Context)
The availability of whole genome sequences and highthroughput genomic assays opens the door for in silico analysis of transcription regulation. This includes methods for discovering and characterizing the binding sites of DNAbinding proteins, such as transcription factors. A common representation of transcription factor binding sites is aposition specific score matrix (PSSM). This representation makes the strong assumption that binding site positions are independent of each other. In this work, we explore Bayesian network representations of binding sites that provide different tradeoffs between complexity (number of parameters) and the richness of dependencies between positions. We develop the formal machinery for learning such models from data and for estimating the statistical significance of putative binding sites. We then evaluate the ramifications of these richer representations in characterizing binding site motifs and predicting their genomic locations. We show that these richer representations improve over the PSSM model in both tasks.
The maxmin hillclimbing bayesian network structure learning algorithm
 Machine Learning
, 2006
"... Abstract. We present a new algorithm for Bayesian network structure learning, called MaxMin HillClimbing (MMHC). The algorithm combines ideas from local learning, constraintbased, and searchandscore techniques in a principled and effective way. It first reconstructs the skeleton of a Bayesian n ..."
Abstract

Cited by 81 (7 self)
 Add to MetaCart
(Show Context)
Abstract. We present a new algorithm for Bayesian network structure learning, called MaxMin HillClimbing (MMHC). The algorithm combines ideas from local learning, constraintbased, and searchandscore techniques in a principled and effective way. It first reconstructs the skeleton of a Bayesian network and then performs a Bayesianscoring greedy hillclimbing search to orient the edges. In our extensive empirical evaluation MMHC outperforms on average and in terms of various metrics several prototypical and stateoftheart algorithms, namely the PC, Sparse Candidate, Three Phase Dependency Analysis, Optimal Reinsertion, Greedy Equivalence Search, and Greedy Search. These are the first empirical results simultaneously comparing most of the major Bayesian network algorithms against each other. MMHC offers certain theoretical advantages, specifically over the Sparse Candidate algorithm, corroborated by our experiments. MMHC and detailed results of our study are publicly available at