Results 1-10 of 13
Statistical Themes and Lessons for Data Mining
, 1997
Cited by 32 (3 self)
Data mining is on the interface of Computer Science and Statistics, utilizing advances in both disciplines to make progress in extracting information from large databases. It is an emerging field that has attracted much attention in a very short period of time. This article highlights some statistical themes and lessons that are directly relevant to data mining and attempts to identify opportunities where close cooperation between the statistical and computational communities might reasonably provide synergy for further progress in data analysis.
Independency relationships and learning algorithms for singly connected networks
, 1998
Cited by 18 (10 self)
Graphical structures such as Bayesian networks or Markov networks are very useful tools for representing irrelevance or independency relationships, and they may be used to efficiently perform reasoning tasks. Singly connected networks are important specific cases where there is no more than one undirected path connecting each pair of variables. The aim of this paper is to investigate the kind of properties that a dependency model must verify in order to be equivalent to a singly connected graph structure, as a way of driving automated discovery and construction of singly connected networks in data. The main results are the characterizations of those dependency models which are isomorphic to singly connected graphs (either via the d-separation criterion for directed acyclic graphs or via the separation criterion for undirected graphs), as well as the development of efficient algorithms for learning singly connected graph representations of dependency models.
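The definition of a singly connected network given in this abstract (no more than one undirected path between each pair of variables) can be checked mechanically. The following is a minimal sketch, not the paper's learning algorithm: the function name `is_singly_connected` and the edge-list representation are assumptions made for illustration. It ignores edge direction and uses union-find to test whether the undirected skeleton is a forest.

```python
def is_singly_connected(nodes, edges):
    """Return True if at most one undirected path joins every pair of nodes.

    `nodes` is an iterable of hashable labels; `edges` is an iterable of
    (u, v) pairs. Direction is ignored, so a directed graph is judged by
    its undirected skeleton. (Hypothetical helper, for illustration only.)
    """
    parent = {n: n for n in nodes}

    def find(x):
        # Follow parent links to the component root, halving paths on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Deduplicate so that (a, b) and (b, a) count as one skeleton edge.
    skeleton = {frozenset(e) for e in edges}
    for edge in skeleton:
        if len(edge) != 2:
            # A self-loop is treated here as violating the condition.
            return False
        u, v = tuple(edge)
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            # u and v are already connected, so this edge adds a second path.
            return False
        parent[root_u] = root_v
    return True


chain = [("a", "b"), ("b", "c")]
diamond = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
print(is_singly_connected("abc", chain))
print(is_singly_connected("abcd", diamond))
```

The chain has a single path between every pair of nodes, while the diamond offers two routes from `a` to `d`, so only the first graph satisfies the definition.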
Aspects Of Graphical Models Connected With Causality
, 1993
Cited by 13 (10 self)
This paper demonstrates the use of graphs as a mathematical tool for expressing independencies, and as a formal language for communicating and processing causal information in statistical analysis. We show how complex information about external interventions can be organized and represented graphically and, conversely, how the graphical representation can be used to facilitate quantitative predictions of the effects of interventions. We first review the Markovian account of causation and show that directed acyclic graphs (DAGs) offer an economical scheme for representing conditional independence assumptions and for deducing and displaying all the logical consequences of such assumptions. We then introduce the manipulative account of causation and show that any DAG defines a simple transformation which tells us how the probability distribution will change as a result of external interventions in the system. Using this transformation it is possible to quantify, from nonexperimental data...
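The transformation mentioned in this abstract is commonly written as a truncated factorization. The abstract does not give the formula, so the notation below ($pa_j$ for the values of the parents of $X_j$ in the DAG, $\mathrm{do}(\cdot)$ for an external intervention) is an assumption based on the standard form, not a quotation from the paper:

```latex
P\bigl(x_1,\dots,x_n \mid \mathrm{do}(X_i = x_i')\bigr) =
\begin{cases}
\displaystyle\prod_{j \neq i} P(x_j \mid pa_j) & \text{if } x_i = x_i', \\[1ex]
0 & \text{otherwise.}
\end{cases}
```

Compared with the ordinary DAG factorization $\prod_{j} P(x_j \mid pa_j)$, the only change is that the factor for the intervened variable $X_i$ is dropped, reflecting that its value is set from outside rather than by its parents.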
Automating Path Analysis for Building Causal Models from Data
 Proc. 10th Intl. Conf. on Machine Learning
, 1993
Cited by 8 (3 self)
Path analysis is a generalization of multiple linear regression that builds models with causal interpretations. It is an exploratory or discovery procedure for finding causal structure in correlational data. Recently, we have applied statistical methods such as path analysis to the problem of building models of AI programs, which are generally complex and poorly understood.
Courses of Action Development and Evaluation
 Proceedings for the 1998 Command and Control Research and Technology Symposium, Naval Postgraduate School
, 1998
Cited by 7 (2 self)
This paper describes a set of procedures that will enhance the analysis, synthesis, and execution of courses of action (COA). The paper presents a set of formal methods for extending the capability of probabilistic models (influence nets) to produce rigorous mathematical models that reveal the impact of the sequence and timing of actionable events on the outcome and effects desired in a situation. By incorporating timing information, such a model can be converted to a Discrete Event System (DES) model in the form of a Colored Petri Net. The DES model, when run as a simulation, can reveal the changes in the likelihood of the desired effects over time for any timed sequence of actionable events that comprise a COA. The paper presents DES analysis techniques that can generate all of the possible sequences of probability values of the outcome given any COA without simulation. Procedures are presented to select desirable sequences from the set of all sequences and determine the temporal relationship among the actionable events that will generate a selected sequence of probability values.
State Transition Diagram Dependency Detection
, 2004
I present an algorithm that builds state transition diagrams out of event traces, in order to find causal relationships between the various events in these traces. The main application of this algorithm is high-level debugging (for situations where it is difficult or impossible to replicate a specific instance of a failure), but many other uses, such as market prediction, credit card fraud tracking, and data mining, are also possible. The algorithm is the latest in a family of statistics-based techniques for modeling process behavior, called Dependency Detection. It collects relatively short, significant sequences (snapshots) to generate an integrated, abstract overview model of the analyzed process. Detailed performance and accuracy evaluations of the algorithm are also presented.
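To make the idea of building a state transition diagram out of event traces concrete, here is a minimal sketch. It is not the thesis's algorithm, whose details the abstract does not give: it simply counts first-order transitions between consecutive events, and the function names and the sample traces are hypothetical.

```python
from collections import Counter, defaultdict


def build_transition_diagram(traces):
    """Count how often each event is immediately followed by each other event.

    `traces` is an iterable of event sequences. The result maps each event
    to a dict of {next_event: count}. (Illustrative helper, not the
    thesis's method.)
    """
    counts = defaultdict(Counter)
    for trace in traces:
        # Pair every event with its immediate successor in the same trace.
        for current, following in zip(trace, trace[1:]):
            counts[current][following] += 1
    return {state: dict(successors) for state, successors in counts.items()}


def transition_probabilities(diagram):
    """Normalize the transition counts of each state into relative frequencies."""
    probabilities = {}
    for state, successors in diagram.items():
        total = sum(successors.values())
        probabilities[state] = {nxt: n / total for nxt, n in successors.items()}
    return probabilities


# Hypothetical event traces, invented for this example.
traces = [
    ["open", "read", "close"],
    ["open", "write", "close"],
    ["open", "read", "read", "close"],
]
diagram = build_transition_diagram(traces)
print(diagram)
print(transition_probabilities(diagram))
```

Frequent transitions in such a diagram only suggest candidate dependencies between events; deciding which of them are statistically significant would need a further test that this sketch does not attempt.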
Likelihood-based Causal Inference
, 34
A method is given which uses subject matter assumptions to discriminate recursive models and thus point toward possible causal explanations. The assumptions alone do not specify any order among the variables, but rather just a theoretical absence of direct association. We show how these assumptions, while not specifying any ordering, can, when combined with the data through the likelihood function, yield information about an underlying recursive order. We derive details of the method for multinormal random variables.
4.1 INTRODUCTION Starting from Sewall Wright (1934), directed graphs have been used to represent structures in which variables 'cause' or 'influence' other variables. Nodes of the graph are used to represent variables and an arrow from one variable to another indicates that the first has a direct causal influence on the second, an influence not blocked by holding constant others considered. If the graphs are restricted to directed acyclic graphs (DAGs) by prohibiting direct...
Viewing and Updating Belief Networks via World Wide Web
, 1999
The paper presents a new method, and a corresponding program, for presenting Bayesian belief networks. The belief network can be viewed and updated via the World Wide Web. Consistency checks are possible. Edge removal and insertion operations are done in an 'intelligent' way, that is, corrections of valuations are carried out automatically in a user-friendly way. The corresponding program is implemented as a Java applet at the front end, and is backed by some Java applications at the server site.
Keywords: knowledge representation, knowledge acquisition from the user, belief networks.
1 Introduction Bayesian networks (also called belief networks or Bayesian belief networks) encode properties of probability distributions using directed acyclic graphs (DAGs). Their usage is spread among many disciplines such as Artificial Intelligence [14], Decision Analysis [9], [17], Economics [31], Genetics [32], Philosophy [7], and Statistics [11], [22]. Bayesian networks are popular due to existence of...
A Causal Calculus
Given an arbitrary causal graph, some of whose nodes are observable and some unobservable, the problem is to determine whether the causal effect of one variable on another can be computed from the joint distribution over the observables and, if the answer is positive, to derive a formula for the causal effect. We introduce a calculus which, using a step by step reduction of probabilistic expressions, derives the desired formulas.
1 Introduction Networks employing directed acyclic graphs (DAGs) can be used to provide either 1. an economical scheme for representing conditional independence assumptions and joint distribution functions, or 2. a graphical language for representing causal influences. Although the professed motivation for investigating such models lies primarily in the second category [Wright, 1921, Blalock, 1971, Simon, 1954, Pearl 1988], causal inferences have been treated very cautiously in the statistical literature [Lauritzen & Spiegelhalter 1988, Cox 1992,...
Data Signatures For Validation And Evaluation Of Temporal Associations
Discovering relations among domain variables from data plays an important role in automating and/or assisting the process of constructing domain models and validating existing ones. An important kind of relation is the temporal association between domain variables. While a straightforward application of correlation analysis may be insufficient for uncovering these relationships, we propose an approach that attempts to identify similarities in the patterns across data sets. These patterns enable us to capture the temporal associations among the variables of a model. 1 INTRODUCTION Modeling temporal associations among a set of variables requires an explicit representation of both covariation and temporal precedence among the variables. Building such a dynamic model is more suitable than a static model for the representation of a dynamic system. Nevertheless, the level of difficulty involved in the construction of a dynamic model increases dramatically over the construction of a static ...