Results 1  10
of
79
A Bayesian method for the induction of probabilistic networks from data
 MACHINE LEARNING
, 1992
"... This paper presents a Bayesian method for constructing probabilistic networks from databases. In particular, we focus on constructing Bayesian belief networks. Potential applications include computerassisted hypothesis testing, automated scientific discovery, and automated construction of probabili ..."
Abstract

Cited by 1140 (28 self)
 Add to MetaCart
This paper presents a Bayesian method for constructing probabilistic networks from databases. In particular, we focus on constructing Bayesian belief networks. Potential applications include computerassisted hypothesis testing, automated scientific discovery, and automated construction of probabilistic expert systems. We extend the basic method to handle missing data and hidden (latent) variables. We show how to perform probabilistic inference by averaging over the inferences of multiple belief networks. Results are presented of a preliminary evaluation of an algorithm for constructing a belief network from a database of cases. Finally, we relate the methods in this paper to previous work, and we discuss open problems.
Knowledge Discovery in Databases: an Overview
, 1992
"... this article. 07384602/92/$4.00 1992 AAAI 58 AI MAGAZINE for the 1990s (Silberschatz, Stonebraker, and Ullman 1990) ..."
Abstract

Cited by 400 (3 self)
 Add to MetaCart
this article. 07384602/92/$4.00 1992 AAAI 58 AI MAGAZINE for the 1990s (Silberschatz, Stonebraker, and Ullman 1990)
From data mining to knowledge discovery in databases
 AI Magazine
, 1996
"... ■ Data mining and knowledge discovery in databases have been attracting a significant amount of research, industry, and media attention of late. What is all the excitement about? This article provides an overview of this emerging field, clarifying how data mining and knowledge discovery in databases ..."
Abstract

Cited by 354 (0 self)
 Add to MetaCart
(Show Context)
■ Data mining and knowledge discovery in databases have been attracting a significant amount of research, industry, and media attention of late. What is all the excitement about? This article provides an overview of this emerging field, clarifying how data mining and knowledge discovery in databases are related both to each other and to related fields, such as machine learning, statistics, and databases. The article mentions particular realworld applications, specific datamining techniques, challenges involved in realworld applications of knowledge discovery, and current and future research directions in the field. Across a wide variety of fields, data are
Model selection and accounting for model uncertainty in graphical models using Occam's window
, 1993
"... We consider the problem of model selection and accounting for model uncertainty in highdimensional contingency tables, motivated by expert system applications. The approach most used currently is a stepwise strategy guided by tests based on approximate asymptotic Pvalues leading to the selection o ..."
Abstract

Cited by 293 (46 self)
 Add to MetaCart
We consider the problem of model selection and accounting for model uncertainty in highdimensional contingency tables, motivated by expert system applications. The approach most used currently is a stepwise strategy guided by tests based on approximate asymptotic Pvalues leading to the selection of a single model; inference is then conditional on the selected model. The sampling properties of such a strategy are complex, and the failure to take account of model uncertainty leads to underestimation of uncertainty about quantities of interest. In principle, a panacea is provided by the standard Bayesian formalism which averages the posterior distributions of the quantity of interest under each of the models, weighted by their posterior model probabilities. Furthermore, this approach is optimal in the sense of maximising predictive ability. However, this has not been used in practice because computing the posterior model probabilities is hard and the number of models is very large (often greater than 1011). We argue that the standard Bayesian formalism is unsatisfactory and we propose an alternative Bayesian approach that, we contend, takes full account of the true model uncertainty byaveraging overamuch smaller set of models. An efficient search algorithm is developed for nding these models. We consider two classes of graphical models that arise in expert systems: the recursive causal models and the decomposable
A Theory Of Inferred Causation
, 1991
"... This paper concerns the empirical basis of causation, and addresses the following issues: 1. the clues that might prompt people to perceive causal relationships in uncontrolled observations. 2. the task of inferring causal models from these clues, and 3. whether the models inferred tell us anything ..."
Abstract

Cited by 215 (35 self)
 Add to MetaCart
This paper concerns the empirical basis of causation, and addresses the following issues: 1. the clues that might prompt people to perceive causal relationships in uncontrolled observations. 2. the task of inferring causal models from these clues, and 3. whether the models inferred tell us anything useful about the causal mechanisms that underly the observations. We propose a minimalmodel semantics of causation, and show that, contrary to common folklore, genuine causal influences can be distinguished from spurious covariations following standard norms of inductive reasoning. We also establish a sound characterization of the conditions under which such a distinction is possible. We provide an effective algorithm for inferred causation and show that, for a large class of data the algorithm can uncover the direction of causal influences as defined above. Finally, we address the issue of nontemporal causation.
A Guide to the Literature on Learning Probabilistic Networks From Data
, 1996
"... This literature review discusses different methods under the general rubric of learning Bayesian networks from data, and includes some overlapping work on more general probabilistic networks. Connections are drawn between the statistical, neural network, and uncertainty communities, and between the ..."
Abstract

Cited by 179 (0 self)
 Add to MetaCart
This literature review discusses different methods under the general rubric of learning Bayesian networks from data, and includes some overlapping work on more general probabilistic networks. Connections are drawn between the statistical, neural network, and uncertainty communities, and between the different methodological communities, such as Bayesian, description length, and classical statistics. Basic concepts for learning and Bayesian networks are introduced and methods are then reviewed. Methods are discussed for learning parameters of a probabilistic network, for learning the structure, and for learning hidden variables. The presentation avoids formal definitions and theorems, as these are plentiful in the literature, and instead illustrates key concepts with simplified examples. Keywords Bayesian networks, graphical models, hidden variables, learning, learning structure, probabilistic networks, knowledge discovery. I. Introduction Probabilistic networks or probabilistic gra...
Graphical Models, Causality, And Intervention
, 1993
"... tion of belief networks is given in [4]. 2 In [3], the graphs were called "causal networks," for which the authors were criticised; they have agreed to refrain from using the word "causal." In the current paper, Spiegelhalter etal. deemphasize the causal interpretation of the a ..."
Abstract

Cited by 102 (34 self)
 Add to MetaCart
tion of belief networks is given in [4]. 2 In [3], the graphs were called "causal networks," for which the authors were criticised; they have agreed to refrain from using the word "causal." In the current paper, Spiegelhalter etal. deemphasize the causal interpretation of the arcs in favor of the "irrelevance" interpretation (page 4). I think this retreat is regrettable for two reasons: first, causal associations are the primary source of judgments about irrelevance and, second, rejecting the causal interpretation of arcs prevents us from using graphical models for making legitimate predictions about the effect of actions. Such predictions are indispensable in applications such as treatment management and patient monitoring. the causal model also tells us how these probabilities would change as a result of external interventions in the system. For this reason, causal models (or "structural models" as they are often called) have been the target of relent
Systems for Knowledge Discovery in Databases
 IEEE Transactions On Knowledge And Data Engineering
, 1993
"... The automated discovery of knowledge in databases is becoming increasingly important as the world's wealth of data continues to grow exponentially. Knowledgediscovery systems face challenging problems from realworld databases which tend to be dynamic, incomplete, redundant, noisy, sparse, and ..."
Abstract

Cited by 101 (8 self)
 Add to MetaCart
The automated discovery of knowledge in databases is becoming increasingly important as the world's wealth of data continues to grow exponentially. Knowledgediscovery systems face challenging problems from realworld databases which tend to be dynamic, incomplete, redundant, noisy, sparse, and very large. This paper addresses these problems and describes some techniques for handling them. A model of an idealized knowledgediscovery system is presented as a reference for studying and designing new systems. This model is used in the comparison of three systems: CoverStory, EXPLORA, and the Knowledge Discovery Workbench. The deficiencies of existing systems relative to the model reveal several open problems for future research.
Construction of Bayesian Network Structures From Data: A Brief Survey and an Efficient Algorithm
, 1995
"... Previous algorithms for the recovery of Bayesian belief network structures from data have been either highly dependent on conditional independence (CI) tests, or have required on ordering on the nodes to be supplied by the user. We present an algorithm that integrates these two approaches: CI tests ..."
Abstract

Cited by 78 (8 self)
 Add to MetaCart
Previous algorithms for the recovery of Bayesian belief network structures from data have been either highly dependent on conditional independence (CI) tests, or have required on ordering on the nodes to be supplied by the user. We present an algorithm that integrates these two approaches: CI tests are used to generate an ordering on the nodes from the database, which is then used to recover the underlying Bayesian network structure using a nonCltestbased method. Results of the evaluation of the algorithm on a number of databases (e.g., ALARM, LED, and SOYBEAN) are presented. We also discuss some algorithm performance issues and open problems.
Structure Identification in Relational Data
, 1997
"... This paper presents several investigations into the prospects for identifying meaningful structures in empirical data, namely, structures permitting effective organization of the data to meet requirements of future queries. We propose a general framework whereby the notion of identifiability is give ..."
Abstract

Cited by 78 (2 self)
 Add to MetaCart
This paper presents several investigations into the prospects for identifying meaningful structures in empirical data, namely, structures permitting effective organization of the data to meet requirements of future queries. We propose a general framework whereby the notion of identifiability is given a precise formal definition similar to that of learnability. Using this framework, we then explore if a tractable procedure exists for deciding whether a given relation is decomposable into a constraint network or a CNF theory with desirable topology and, if the answer is positive, identifying the desired decomposition. Finally, we