Asymptotic model selection for directed networks with hidden variables (1996)

by Dan Geiger, David Heckerman, Christopher Meek

Results 11 - 20 of 32

Learning mixtures of DAG models

by Bo Thiesson, Christopher Meek, David Maxwell Chickering, David Heckerman, 1997
Abstract - Cited by 24 (2 self)
We describe computationally efficient methods for learning mixtures in which each component is a directed acyclic graphical model (mixtures of DAGs or MDAGs). We argue that simple search-and-score algorithms are infeasible for a variety of problems, and introduce a feasible approach in which parameter and structure search is interleaved and expected data is treated as real data. Our approach can be viewed as a combination of (1) the Cheeseman–Stutz asymptotic approximation for model posterior probability and (2) the Expectation–Maximization algorithm. We evaluate our procedure for selecting among MDAGs on synthetic and real examples.

Asymptotic Model Selection for Naive Bayesian Networks

by Dmitry Rusakov, Dan Geiger - In Proc. of the 18th Conference on Uncertainty in Artificial Intelligence (UAI-02), 2002
Abstract - Cited by 23 (1 self)
We develop a closed form asymptotic formula to compute the marginal likelihood of data given a naive Bayesian network model with two hidden states and binary features.

Models and Selection Criteria for Regression and Classification

by David Heckerman, Christopher Meek - Uncertainty in Artificial Intelligence 13, 1997
Abstract - Cited by 20 (2 self)
When performing regression or classification, we are interested in the conditional probability distribution for an outcome or class variable Y given a set of explanatory or input variables X. We consider Bayesian models for this task. In particular, we examine a special class of models, which we call Bayesian regression/classification (BRC) models, that can be factored into independent conditional (y|x) and input (x) models. These models are convenient, because the conditional model (the portion of the full model that we care about) can be analyzed by itself. We examine the practice of transforming arbitrary Bayesian models to BRC models, and argue that this practice is often inappropriate because it ignores prior knowledge that may be important for learning. In addition, we examine Bayesian methods for learning models from data. We discuss two criteria for Bayesian model selection that are appropriate for regression/classification: one described by Spiegelhalter et al. (1993), and an...

Bayesian Estimation and Testing of Structural Equation Models

by Richard Scheines, Herbert Hoijtink, Anne Boomsma - Psychometrika, 1999
Abstract - Cited by 20 (4 self)
The Gibbs sampler can be used to obtain samples of arbitrary size from the posterior distribution over the parameters of a structural equation model (SEM) given covariance data and a prior distribution over the parameters. Point estimates, standard deviations and interval estimates for the parameters can be computed from these samples. If the prior distribution over the parameters is uninformative, the posterior is proportional to the likelihood, and asymptotically the inferences based on the Gibbs sample are the same as those based on the maximum likelihood solution, e.g., output from LISREL or EQS. In small samples, however, the likelihood surface is not Gaussian and in some cases contains local maxima. Nevertheless, the Gibbs sample comes from the correct posterior distribution over the parameters regardless of the sample size and the shape of the likelihood surface. With an informative prior distribution over the parameters, the posterior can be used to make inferences about the parameters of underidentified models, as we illustrate on a simple errors-in-variables model.

Latent Variable Models for Neural Data Analysis

by Maneesh Sahani, 1999
Abstract - Cited by 17 (3 self)
The brain is perhaps the most complex system to have ever been subjected to rigorous scientific investigation. The scale is staggering: over 10^11 neurons, each making an average of 10^3 synapses, with computation occurring on scales ranging from a single dendritic spine to an entire cortical area. Slowly, we are beginning to acquire experimental tools that can gather the massive amounts of data needed to characterize this system. However, to understand and interpret these data will also require substantial strides in inferential and statistical techniques. This dissertation attempts to meet this need, extending and applying the modern tools of latent variable modeling to problems in neural data analysis. It is divided

Graphical Models and Exponential Families

by Dan Geiger, Christopher Meek, 1998
Abstract - Cited by 16 (1 self)
We provide a classification of graphical models according to their representation as subfamilies of exponential families. Undirected graphical models with no hidden variables are linear exponential families (LEFs), directed acyclic graphical models and chain graphs with no hidden variables, including Bayesian networks with several families of local distributions, are curved exponential families (CEFs) and graphical models with hidden variables are stratified exponential families (SEFs). An SEF is a finite union of CEFs satisfying a frontier condition. In addition, we illustrate how one can automatically generate independence and non-independence constraints on the distributions over the observable variables implied by a Bayesian network with hidden variables. The relevance of these results for model selection is examined. 1 Introduction A graphical model is a family of probability distributions. The set of distributions associated with a graphical model are usually define...

The Posterior Probability of Bayes Nets with Strong Dependences

by Gernot D. Kleiter - Soft Computing, 1999
Abstract - Cited by 14 (1 self)
Stochastic independence is an idealized relationship located at one end of a continuum of values measuring degrees of dependence. Modeling real world systems, we are often not interested in the distinction between exact independence and any degree of dependence, but between weak ignorable and strong substantial dependence. Good models map significant deviance from independence and neglect approximate independence or dependence weaker than a noise threshold. This intuition is applied to learning the structure of Bayes nets from data. We determine the conditional posterior probabilities of structures given that the degree of dependence at each of their nodes exceeds a critical noise level. Deviance from independence is measured by mutual information. Arc probabilities are determined by the amount of mutual information the neighbors contribute to a node, is greater than a critical minimum deviance from independence. A χ² approximation for the probability density function of mutual info...
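
The abstract above measures deviance from independence by mutual information. As a minimal sketch of that quantity only (the standard textbook definition, not the paper's posterior computation), the following Python computes the mutual information of two discrete variables from their joint probability table. The function name and the example tables are illustrative assumptions.

```python
from math import log

def mutual_information(joint):
    """Mutual information (in nats) of two discrete variables.

    `joint` is a list of rows holding the joint probabilities P(x, y);
    the entries are assumed to sum to 1.
    """
    row_marginals = [sum(row) for row in joint]          # P(x)
    col_marginals = [sum(col) for col in zip(*joint)]    # P(y)
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:  # zero cells contribute nothing to the sum
                mi += p * log(p / (row_marginals[i] * col_marginals[j]))
    return mi

# Hypothetical tables: exact independence and perfect dependence.
independent = [[0.25, 0.25], [0.25, 0.25]]
dependent = [[0.5, 0.0], [0.0, 0.5]]
print(mutual_information(independent))
print(mutual_information(dependent))
```

Mutual information is zero exactly when the two variables are independent and grows with the strength of dependence, which is why it can serve as a continuous measure to compare against a noise threshold.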

Dimension Correction for Hierarchical Latent Class Models

by Tomas Kocka, Nevin L. Zhang, 2002
Abstract - Cited by 13 (5 self)
Model complexity is an important factor to consider when selecting among graphical models. When all variables are observed, the complexity of a model can be measured by its standard dimension, i.e. the number of independent parameters. When hidden variables are present, however, standard dimension might no longer be appropriate.
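
The standard dimension mentioned above, the number of independent parameters, can be counted directly for a discrete Bayesian network: each node with r states and q joint parent configurations contributes (r - 1) * q free parameters. The sketch below assumes that usual counting rule. The function name and the example network are hypothetical, and it does not implement the corrected (effective) dimension the paper is about.

```python
from math import prod

def standard_dimension(cardinality, parents):
    """Count independent parameters of a discrete Bayesian network.

    cardinality: dict mapping each node to its number of states.
    parents: dict mapping a node to a list of its parents
             (nodes without an entry are treated as roots).
    """
    total = 0
    for node, r in cardinality.items():
        # Number of joint parent configurations (1 for a root node).
        q = prod(cardinality[p] for p in parents.get(node, ()))
        total += (r - 1) * q
    return total

# Hypothetical latent class model: binary hidden H with three binary children.
cardinality = {"H": 2, "A": 2, "B": 2, "C": 2}
parents = {"A": ["H"], "B": ["H"], "C": ["H"]}
print(standard_dimension(cardinality, parents))  # 1 + 3 * 2 = 7
```

When H is hidden, this count treats its parameters like any others, which is the situation in which the abstract notes that standard dimension might no longer be appropriate.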

Towards a More Efficient Evolutionary Induction of Bayesian Networks

by Carlos Cotta, Jorge Muruzabal - Parallel Problem Solving from Nature VII, 2002
Abstract - Cited by 10 (6 self)
Bayesian networks (BNs) constitute a useful tool to model the joint distribution of a set of random variables of interest. This paper is concerned with the network induction problem. We propose a number of hybrid recombination operators for extracting BNs from data. These hybrid operators make use of phenotypic information in order to guide the processing of information during recombination. The performance of these new operators is analyzed with respect to that of their genotypic counterparts. It is shown that these hybrid operators provide notably improved and rather robust results. Some remarks on the future of the area are also laid out.

Learning hybrid Bayesian networks from data

by Stefano Monti, Gregory F. Cooper, 1998
Abstract - Cited by 9 (1 self)
We illustrate two different methodologies for learning Hybrid Bayesian networks, that is, Bayesian networks containing both continuous and discrete variables, from data. The two methodologies differ in the way of handling continuous data when learning the Bayesian network structure. The first methodology uses discretized data to learn the Bayesian network structure, and the original non-discretized data for the parameterization of the learned structure. The second methodology uses non-discretized data both to learn the Bayesian network structure and its parameterization. For the direct handling of continuous data, we propose the use of artificial neural networks as probability estimators, to be used as an integral part of the scoring metric defined to search the space of Bayesian network structures. With both methodologies, we assume the availability of a complete dataset, with no missing values or hidden variables. We report experimental results aimed at comparing the two methodologies. These results provide evidence that learning with discretized data presents advantages both in terms of efficiency and in terms of accuracy of the learned models over the alternative approach of using non-discretized data.