Asymptotic model selection for directed networks with hidden variables (1996)

by Dan Geiger, David Heckerman, Christopher Meek

Results 1 - 10 of 32

A tutorial on learning with Bayesian networks

by David Heckerman - Learning in Graphical Models, 1995
Cited by 710 (4 self)
A companion set of lecture slides is available at

Dynamic Bayesian Networks: Representation, Inference and Learning

by Kevin Patrick Murphy, 2002
Cited by 393 (4 self)
Modelling sequential data is important in many areas of science and engineering. Hidden Markov models (HMMs) and Kalman filter models (KFMs) are popular for this because they are simple and flexible. For example, HMMs have been used for speech recognition and bio-sequence analysis, and KFMs have been used for problems ranging from tracking planes and missiles to predicting the economy. However, HMMs and KFMs are limited in their “expressive power”. Dynamic Bayesian Networks (DBNs) generalize HMMs by allowing the state space to be represented in factored form, instead of as a single discrete random variable. DBNs generalize KFMs by allowing arbitrary probability distributions, not just (unimodal) linear-Gaussian. In this thesis, I will discuss how to represent many different kinds of models as DBNs, how to perform exact and approximate inference in DBNs, and how to learn DBN models from sequential data. In particular, the main novel technical contributions of this thesis are as follows: a way of representing Hierarchical HMMs as DBNs, which enables inference to be done in O(T) time instead of O(T^3), where T is the length of the sequence; an exact smoothing algorithm that takes O(log T) space instead of O(T); a simple way of using the junction tree algorithm for online inference in DBNs; new complexity bounds on exact online inference in DBNs; a new deterministic approximate inference algorithm called factored frontier; an analysis of the relationship between the BK algorithm and loopy belief propagation; a way of applying Rao-Blackwellised particle filtering to DBNs in general, and the SLAM (simultaneous localization and mapping) problem in particular; a way of extending the structural EM algorithm to DBNs; and a variety of different applications of DBNs. However, perhaps the main value of the thesis is its catholic presentation of the field of sequential data modelling.

Efficient approximations for the marginal likelihood of Bayesian networks with hidden variables

by David Maxwell Chickering, David Heckerman - Machine Learning, 1997
Cited by 155 (9 self)
We discuss Bayesian methods for learning Bayesian networks when data sets are incomplete. In particular, we examine asymptotic approximations for the marginal likelihood of incomplete data given a Bayesian network. We consider the Laplace approximation and the less accurate but more efficient BIC/MDL approximation. We also consider approximations proposed by Draper (1993) and Cheeseman and Stutz (1995). These approximations are as efficient as BIC/MDL, but their accuracy has not been studied in any depth. We compare the accuracy of these approximations under the assumption that the Laplace approximation is the most accurate. In experiments using synthetic data generated from discrete naive-Bayes models having a hidden root node, we find that (1) the BIC/MDL measure is the least accurate, having a bias in favor of simple models, and (2) the Draper and CS measures are the most accurate.
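
As a rough illustration of the BIC/MDL approximation named in the abstract above, the sketch below uses the standard textbook form, log p(D | M) ≈ log p(D | θ̂, M) − (d/2) log N, on a fully observed coin-flip model. The function names, the coin example, and the counts are assumptions made for illustration only; they are not taken from the paper, and the sketch does not address the incomplete-data setting the paper studies.

```python
import math


def bic_score(log_likelihood, n_params, n_samples):
    """Standard BIC approximation to the log marginal likelihood.

    log_likelihood: log p(D | theta_hat, M) at the maximum-likelihood parameters
    n_params:       number of free parameters d in the model
    n_samples:      number of data cases N
    """
    return log_likelihood - 0.5 * n_params * math.log(n_samples)


def bernoulli_max_log_likelihood(heads, tails):
    """Maximized log-likelihood of a Bernoulli model given observed counts."""
    n = heads + tails
    total = 0.0
    for count in (heads, tails):
        if count > 0:  # a zero count contributes nothing to the sum
            total += count * math.log(count / n)
    return total


# Hypothetical data: 70 heads and 30 tails.
heads, tails = 70, 30
n = heads + tails

# Model A: one free parameter (the heads probability), fitted by maximum likelihood.
free_model = bic_score(bernoulli_max_log_likelihood(heads, tails), 1, n)

# Model B: a fair coin with no free parameters.
fixed_model = bic_score(n * math.log(0.5), 0, n)

print(free_model, fixed_model)
```

With these counts the one-parameter model scores higher despite its penalty term, because the data are far from a 50/50 split. The penalty grows with the parameter count d, which is where the bias toward simple models mentioned in the abstract comes from.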

Adaptive Probabilistic Networks with Hidden Variables

by John Binder, Daphne Koller, Stuart Russell, Keiji Kanazawa, Padhraic Smyth - Machine Learning, 1997
Cited by 133 (10 self)
Probabilistic networks (also known as Bayesian belief networks) allow a compact description of complex stochastic relationships among several random variables. They are rapidly becoming the tool of choice for uncertain reasoning in artificial intelligence. In this paper, we investigate the problem of learning probabilistic networks with known structure and hidden variables. This is an important problem, because structure is much easier to elicit from experts than numbers, and the world is rarely fully observable. We present a gradient-based algorithm and show that the gradient can be computed locally, using information that is available as a byproduct of standard probabilistic network inference algorithms. Our experimental results demonstrate that using prior knowledge about the structure, even with hidden variables, can significantly improve the learning rate of probabilistic networks. We extend the method to networks in which the conditional probability tables are described using a ...

Modelling gene expression data using dynamic bayesian networks

by Kevin Murphy, Saira Mian, 1999
Cited by 119 (1 self)
Recently, there has been much interest in reverse engineering genetic networks from time series data. In this paper, we show that most of the proposed discrete time models, including the boolean network model [Kau93, SS96], the linear model of D’haeseleer et al. [DWFS99], and the nonlinear model of Weaver et al. [WWS99], are all special cases of a general class of models called Dynamic Bayesian Networks (DBNs). The advantages of DBNs include the ability to model stochasticity, to incorporate prior knowledge, and to handle hidden variables and missing data in a principled way. This paper provides a review of techniques for learning DBNs. Keywords: Genetic networks, boolean networks, Bayesian networks, neural networks, reverse engineering, machine learning.

A Bayesian Approach to Causal Discovery

by David Heckerman, Christopher Meek, Gregory Cooper, 1997
Cited by 64 (1 self)
We examine the Bayesian approach to the discovery of directed acyclic causal models and compare it to the constraint-based approach. Both approaches rely on the Causal Markov assumption, but the two differ significantly in theory and practice. An important difference between the approaches is that the constraint-based approach uses categorical information about conditional-independence constraints in the domain, whereas the Bayesian approach weighs the degree to which such constraints hold. As a result, the Bayesian approach has three distinct advantages over its constraint-based counterpart. One, conclusions derived from the Bayesian approach are not susceptible to incorrect categorical decisions about independence facts that can occur with data sets of finite size. Two, using the Bayesian approach, finer distinctions among model structures (both quantitative and qualitative) can be made. Three, information from several models can be combined to make better inferences and to better ...

Discriminative vs Informative Learning

by Y. Dan Rubinstein, 1998
Cited by 57 (0 self)
Abstract not found

Stratified Exponential Families: Graphical Models and Model Selection

by Dan Geiger, David Heckerman, Henry King, Christopher Meek - Annals of Statistics, 1998
Cited by 41 (3 self)
We provide a classification of graphical models according to their representation as exponential families. Undirected graphical models with no hidden variables are linear exponential families (LEFs); directed acyclic graphical (DAG) models and chain graphs with no hidden variables, including DAG models with several families of local distributions, are curved exponential families (CEFs); and graphical models with hidden variables are stratified exponential families (SEFs). A SEF is a finite union of CEFs of various dimensions satisfying some regularity conditions. The main results of this paper are that graphical models are SEFs and that many graphical models are not CEFs. That is, roughly speaking, graphical models when viewed as exponential families correspond to a set of smooth manifolds of various dimensions and usually not to a single smooth manifold. These results are discussed in the context of model selection. Keywords: Bayesian networks, graphical models, hidden variables, cur...

Optimization by learning and simulation of Bayesian and Gaussian networks

by P. Larrañaga, R. Etxeberria, J. A. Lozano, J. M. Peña, 1999
Cited by 34 (6 self)
Estimation of Distribution Algorithms (EDAs) are stochastic heuristics based on populations of individuals, each of which encodes a possible solution to the optimization problem. These populations evolve in successive generations as the search progresses, organized in the same way as most evolutionary computation heuristics. In contrast to most evolutionary computation paradigms, which consider the crossover and mutation operators essential tools for generating new populations, EDAs replace those operators with the estimation and simulation of the joint probability distribution of the selected individuals. In this work, after reviewing the different EDA-based approaches to combinatorial optimization problems and to optimization problems in continuous domains, we propose new approaches based on the theory of probabilistic graphical models to solve problems in both domains. More precisely, we propose to adapt algorit...
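
The estimate-then-sample loop described in the abstract above can be sketched in a few lines. The example below is a deliberately simplified variant that models each bit with an independent marginal probability (often called a univariate marginal distribution algorithm); it is not the Bayesian-network or Gaussian-network approach the paper proposes. The function name, the OneMax objective (count of 1 bits), and all parameter values are assumptions chosen for illustration.

```python
import random


def umda_onemax(n_bits=20, pop_size=60, n_select=30, generations=40, seed=0):
    """Maximize the number of 1 bits with a univariate EDA (illustrative sketch)."""
    rng = random.Random(seed)
    probs = [0.5] * n_bits  # initial model: every bit is 1 with probability 0.5
    best = None
    for _ in range(generations):
        # Simulation step: sample a new population from the current model.
        population = [
            [1 if rng.random() < p else 0 for p in probs]
            for _ in range(pop_size)
        ]
        # Selection step: keep the individuals with the highest fitness.
        population.sort(key=sum, reverse=True)
        selected = population[:n_select]
        if best is None or sum(population[0]) > sum(best):
            best = population[0]
        # Estimation step: refit each marginal from the selected individuals,
        # clamped so that no bit becomes permanently fixed.
        probs = [
            min(0.95, max(0.05, sum(ind[i] for ind in selected) / n_select))
            for i in range(n_bits)
        ]
    return best, probs


best, probs = umda_onemax()
print(sum(best), "ones out of", len(best))
```

No crossover or mutation operator appears anywhere in the loop: new individuals come only from sampling the fitted distribution. Replacing the independent marginals with a learned graphical model is the kind of extension the abstract points toward.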

Statistical Themes and Lessons for Data Mining

by Clark Glymour, David Madigan, Daryl Pregibon, Padhraic Smyth, 1997
Cited by 30 (3 self)
Data mining is on the interface of Computer Science and Statistics, utilizing advances in both disciplines to make progress in extracting information from large databases. It is an emerging field that has attracted much attention in a very short period of time. This article highlights some statistical themes and lessons that are directly relevant to data mining and attempts to identify opportunities where close cooperation between the statistical and computational communities might reasonably provide synergy for further progress in data analysis.