Statistical Themes and Lessons for Data Mining
, 1997
Cited by 30 (3 self)
Data mining is on the interface of Computer Science and Statistics, utilizing advances in both disciplines to make progress in extracting information from large databases. It is an emerging field that has attracted much attention in a very short period of time. This article highlights some statistical themes and lessons that are directly relevant to data mining and attempts to identify opportunities where close cooperation between the statistical and computational communities might reasonably provide synergy for further progress in data analysis.
Aspects Of Graphical Models Connected With Causality
, 1993
Cited by 13 (10 self)
This paper demonstrates the use of graphs as a mathematical tool for expressing independencies, and as a formal language for communicating and processing causal information in statistical analysis. We show how complex information about external interventions can be organized and represented graphically and, conversely, how the graphical representation can be used to facilitate quantitative predictions of the effects of interventions. We first review the Markovian account of causation and show that directed acyclic graphs (DAGs) offer an economical scheme for representing conditional independence assumptions and for deducing and displaying all the logical consequences of such assumptions. We then introduce the manipulative account of causation and show that any DAG defines a simple transformation which tells us how the probability distribution will change as a result of external interventions in the system. Using this transformation it is possible to quantify, from non-experimental data...
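The "simple transformation" mentioned in this abstract is commonly written in the causal graphical models literature as a truncated factorization. The block below is a sketch of that standard form; the notation (do, pa_j, x_i') is an assumption and is not taken from the abstract itself, which is cut off before any formula appears.

```latex
% Pre-intervention distribution factorized along the DAG
% (pa_j denotes the values of the parents of X_j):
P(x_1, \dots, x_n) = \prod_{j=1}^{n} P(x_j \mid \mathrm{pa}_j)

% After an external intervention that sets X_i to x_i',
% the factor for X_i is dropped and all other factors are kept:
P\bigl(x_1, \dots, x_n \mid \mathrm{do}(X_i = x_i')\bigr) =
\begin{cases}
  \prod_{j \neq i} P(x_j \mid \mathrm{pa}_j) & \text{if } x_i = x_i', \\
  0 & \text{otherwise.}
\end{cases}
```

Every factor on the right-hand side is a conditional probability of the pre-intervention distribution, which is why the effect of the intervention can, under the DAG assumptions, be estimated from non-experimental data.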
Automating Path Analysis for Building Causal Models from Data
- Proc. 10th Intl. Conf. on Machine Learning
, 1993
Cited by 7 (2 self)
Path analysis is a generalization of multiple linear regression that builds models with causal interpretations. It is an exploratory or discovery procedure for finding causal structure in correlational data. Recently, we have applied statistical methods such as path analysis to the problem of building models of AI programs, which are generally complex and poorly understood.
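As a rough illustration of how path analysis extends multiple linear regression, the sketch below fits standardized regressions along a hypothesized causal structure X -> M -> Y with an additional direct path X -> Y. The variable names, the synthetic data, and the helper `path_coefficients` are all hypothetical; this is not the procedure from the paper, only a minimal instance of the general technique.

```python
import numpy as np

def standardize(v):
    """Rescale a variable to zero mean and unit variance."""
    return (v - v.mean()) / v.std()

def path_coefficients(y, *predictors):
    """Least-squares fit on standardized variables.

    Returns one standardized coefficient (path coefficient) per predictor.
    No intercept is needed because every variable has zero mean.
    """
    Z = np.column_stack([standardize(p) for p in predictors])
    coef, *_ = np.linalg.lstsq(Z, standardize(y), rcond=None)
    return coef

# Synthetic data for an assumed model: X -> M, X -> Y, M -> Y.
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
m = 0.6 * x + rng.normal(size=n)
y = 0.3 * x + 0.5 * m + rng.normal(size=n)

# One regression per endogenous variable in the hypothesized diagram.
(p_mx,) = path_coefficients(m, x)        # path X -> M
p_yx, p_ym = path_coefficients(y, x, m)  # paths X -> Y and M -> Y

# The correlation between X and Y decomposes into a direct
# and an indirect (via M) component.
r_xy = np.corrcoef(x, y)[0, 1]
direct, indirect = p_yx, p_mx * p_ym
print(f"r_xy={r_xy:.3f} direct={direct:.3f} indirect={indirect:.3f}")
```

For this diagram the decomposition r_xy = direct + indirect holds exactly in the sample, which is what lets the coefficients be read as contributions of individual paths, provided the assumed causal ordering is correct.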
State Transition Diagram Dependency Detection
, 2004
Abstract of thesis. I present an algorithm that builds state transition diagrams out of event traces, in order to find causal relationships between the various events in these traces. The main application of this algorithm is high-level debugging (for situations where it is difficult or impossible to replicate a specific instance of a failure). Many other uses, such as market prediction, credit card fraud tracking, and data mining, are also possible. The algorithm is the latest in a family of statistics-based techniques for modeling process behavior, called Dependency Detection. It collects relatively short, significant sequences (snapshots) to generate an integrated, abstract overview model of the analyzed process. Detailed performance and accuracy evaluations of the algorithm are also presented.
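To make the idea of building a state transition diagram from event traces concrete, here is a minimal sketch that counts first-order transitions between consecutive events. It is not the thesis's Dependency Detection algorithm (the abstract does not give its details); the function name `build_transition_diagram` and the example traces are hypothetical.

```python
from collections import Counter, defaultdict

def build_transition_diagram(traces):
    """Estimate transition probabilities between consecutive events.

    Each trace is a list of event names. The result maps every source
    event to a dict of {next_event: relative_frequency}.
    """
    counts = defaultdict(Counter)
    for trace in traces:
        # Pair each event with its immediate successor in the same trace.
        for src, dst in zip(trace, trace[1:]):
            counts[src][dst] += 1
    return {
        src: {dst: c / sum(dsts.values()) for dst, c in dsts.items()}
        for src, dsts in counts.items()
    }

# Hypothetical event traces, e.g. from program logs.
traces = [
    ["open", "read", "close"],
    ["open", "read", "read", "error"],
    ["open", "write", "close"],
]
diagram = build_transition_diagram(traces)
for src, dsts in diagram.items():
    print(src, "->", dsts)
```

Transitions that occur far more often than chance would suggest are candidates for dependencies between events; a statistical significance test, which this sketch omits, would be needed before reading them as such.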
Likelihood-based Causal Inference
, 34
A method is given which uses subject matter assumptions to discriminate recursive models and thus point toward possible causal explanations. The assumptions alone do not specify any order among the variables, but rather just a theoretical absence of direct association. We show how these assumptions, while not specifying any ordering, can, when combined with the data through the likelihood function, yield information about an underlying recursive order. We derive details of the method for multi-normal random variables.
4.1 INTRODUCTION
Starting from Sewall Wright (1934), directed graphs have been used to represent structures in which variables 'cause' or 'influence' other variables. Nodes of the graph are used to represent variables and an arrow from one variable to another indicates that the first has a direct causal influence on the second, an influence not blocked by holding constant others considered. If the graphs are restricted to directed acyclic graphs (DAGs) by prohibiting direct...
Viewing and Updating Belief Networks via World Wide Web
, 1999
The paper presents a new method, and the corresponding program, for presenting Bayesian belief networks. The belief network can be viewed and updated via the World Wide Web. Consistency checks are possible. Edge removal and insertion operations are done in an 'intelligent way', that is, corrections of valuations are carried out automatically in a user-friendly way. The corresponding program is implemented as a Java applet at the front end, and is backed by some Java applications at the server site.
Keywords: knowledge representation, knowledge acquisition from the user, belief networks.
1 Introduction
Bayesian networks (also called belief networks or Bayesian belief networks) encode properties of probability distributions using directed acyclic graphs (DAGs). They are used in many disciplines such as Artificial Intelligence [14], Decision Analysis [9], [17], Economics [31], Genetics [32], Philosophy [7], and Statistics [11], [22]. Bayesian networks are popular due to the existence of...
A Causal Calculus
Given an arbitrary causal graph, some of whose nodes are observable and some unobservable, the problem is to determine whether the causal effect of one variable on another can be computed from the joint distribution over the observables and, if the answer is positive, to derive a formula for the causal effect. We introduce a calculus which, using a step by step reduction of probabilistic expressions, derives the desired formulas.
1 Introduction
Networks employing directed acyclic graphs (DAGs) can be used to provide either (1) an economical scheme for representing conditional independence assumptions and joint distribution functions, or (2) a graphical language for representing causal influences. Although the professed motivation for investigating such models lies primarily in the second category [Wright, 1921; Blalock, 1971; Simon, 1954; Pearl, 1988], causal inferences have been treated very cautiously in the statistical literature [Lauritzen & Spiegelhalter, 1988; Cox, 1992,...
Data Signatures For Validation And Evaluation Of Temporal Associations
Discovering relations among domain variables from data plays an important role in automating and/or assisting the process of constructing domain models and validating existing ones. An important kind of relation is the temporal association between domain variables. While a straightforward application of correlation analysis may be insufficient for uncovering these relationships, we propose an approach that attempts to identify similarities in the patterns across data sets. These patterns enable us to capture the temporal associations among the variables of a model.
1 INTRODUCTION
Modeling temporal associations among a set of variables requires an explicit representation of both covariation and temporal precedence among the variables. Building such a dynamic model is more suitable than a static model for the representation of a dynamic system. Nevertheless, the level of difficulty involved in the construction of a dynamic model increases dramatically over the construction of a static ...
Partial Dependency Separation - A
- Demonstratio Mathematica. Vol XXXII No
, 1999
Spirtes, Glymour and Scheines [19] formulated a conjecture that a direct dependence test and a head-to-head meeting test would suffice to construe directed acyclic graph decompositions of a joint probability distribution (Bayesian network) for which Pearl's d-separation [2] applies. This conjecture was later shown to be a direct consequence of a result of Pearl and Verma [21] (cited as Theorem 1 in [13]; see also Theorem 3.4 in [20]).

