Results 1-10 of 19
Structure and Strength in Causal Induction
Cited by 106 (32 self)
We present a framework for the rational analysis of elemental causal induction – learning about the existence of a relationship between a single cause and effect – based upon causal graphical models. This framework makes precise the distinction between causal structure and causal strength: the difference between asking whether a causal relationship exists and asking how strong that causal relationship might be. We show that two leading rational models of elemental causal induction, ∆P and causal power, both estimate causal strength, and introduce a new rational model, causal support, that assesses causal structure. Causal support predicts several key phenomena of causal induction that cannot be accounted for by other rational models, which we explore through a series of experiments. These phenomena include the complex interaction between ∆P and the base-rate probability of the effect in the absence of the cause, sample size effects, inferences from incomplete contingency tables, and causal learning from rates. Causal support also provides a better account of a number of existing datasets than either ∆P or causal power.
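The two strength measures named in this abstract can be sketched from a 2x2 contingency table. The formulas below are the standard textbook definitions (∆P as a difference of conditional probabilities, causal power as ∆P rescaled by the room left above the base rate); they are not quoted in the abstract, and the function and argument names are illustrative assumptions.

```python
def delta_p(e_c, n_c, e_nc, n_nc):
    """Delta-P = P(effect | cause) - P(effect | no cause), from raw counts.

    e_c / n_c:   trials with the effect / all trials where the cause is present
    e_nc / n_nc: trials with the effect / all trials where the cause is absent
    """
    return e_c / n_c - e_nc / n_nc


def causal_power(e_c, n_c, e_nc, n_nc):
    """Generative causal power = Delta-P / (1 - P(effect | no cause))."""
    base_rate = e_nc / n_nc
    if base_rate == 1.0:
        # Assumption of this sketch: treat the ceiling case as undefined.
        raise ValueError("causal power is undefined when the base rate is 1")
    return delta_p(e_c, n_c, e_nc, n_nc) / (1.0 - base_rate)


# Same Delta-P, different base rates, different causal power.
low_base = (6, 8, 2, 8)    # P(e|c) = 0.75, P(e|not c) = 0.25
high_base = (8, 8, 4, 8)   # P(e|c) = 1.00, P(e|not c) = 0.50
print(delta_p(*low_base), causal_power(*low_base))
print(delta_p(*high_base), causal_power(*high_base))
```

Both tables have the same ∆P, but causal power differs because it depends on the base rate of the effect; this is one simple way to see why the abstract singles out the interaction between ∆P and the base rate.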
Theory-based causal induction
 In
, 2003
Cited by 37 (15 self)
Inducing causal relationships from observations is a classic problem in scientific inference, statistics, and machine learning. It is also a central part of human learning, and a task that people perform remarkably well given its notorious difficulties. People can learn causal structure in various settings, from diverse forms of data: observations of the co-occurrence frequencies between causes and effects, interactions between physical objects, or patterns of spatial or temporal coincidence. These different modes of learning are typically thought of as distinct psychological processes and are rarely studied together, but at heart they present the same inductive challenge: identifying the unobservable mechanisms that generate observable relations between variables, objects, or events, given only sparse and limited data. We present a computational-level analysis of this inductive problem and a framework for its solution, which allows us to model all these forms of causal learning in a common language. In this framework, causal induction is the product of domain-general statistical inference guided by domain-specific prior knowledge, in the form of an abstract causal theory. We identify 3 key aspects of abstract prior knowledge (the ontology of entities, properties, and relations that organizes a domain; the plausibility of specific causal relationships; and the functional form of those relationships) and show how they provide the constraints that people need to induce useful causal models from sparse data.
Randomized Quantile Residuals
 J. Computat. Graph. Statist
, 1996
Cited by 26 (6 self)
In this paper we give a general definition of residuals for regression models with independent responses. Our definition produces residuals which are exactly normal, apart from sampling variability in the estimated parameters, by inverting the fitted distribution function for each response value and finding the equivalent standard normal quantile. Our definition includes some randomization to achieve continuous residuals when the response variable is discrete. Quantile residuals are easily computed in computer packages such as SAS, S-Plus, GLIM or Lisp-Stat, and allow residual analyses to be carried out in many commonly occurring situations in which the customary definitions of residuals fail. Quantile residuals are applied in this paper to three example data sets.
Keywords: deviance residual; exponential regression; generalized linear model; logistic regression; normal probability plot; Pearson residual.
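The construction described in this abstract can be sketched for a discrete response. The example assumes a Poisson model with a known fitted mean `mu`; the helper names are hypothetical and the code is an illustration of the idea, not the paper's own implementation.

```python
import math
import random
from statistics import NormalDist


def poisson_cdf(k, mu):
    """P(Y <= k) for Y ~ Poisson(mu), by direct summation."""
    if k < 0:
        return 0.0
    term = math.exp(-mu)
    total = term
    for i in range(1, k + 1):
        term *= mu / i
        total += term
    return min(total, 1.0)


def randomized_quantile_residual(y, mu, rng=random):
    """Draw u uniformly between F(y - 1) and F(y), then map u to a normal quantile."""
    lower = poisson_cdf(y - 1, mu)
    upper = poisson_cdf(y, mu)
    u = rng.uniform(lower, upper)
    # Keep u strictly inside (0, 1) so the inverse normal CDF is defined.
    eps = 1e-12
    u = min(max(u, eps), 1.0 - eps)
    return NormalDist().inv_cdf(u)


rng = random.Random(0)
counts = [0, 1, 3, 3, 7]
print([round(randomized_quantile_residual(y, 3.0, rng), 3) for y in counts])
```

For a continuous response the interval collapses to a single point and no randomization is needed; the uniform draw is only there to spread each discrete value across its own slice of the fitted distribution function.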
Extending the Cochran rule for the comparison of word frequencies between corpora
 In Proceedings of the 7th International Conference on Statistical Analysis of Textual Data (JADT 2004)
, 2004
Cited by 9 (0 self)
We first describe a number of interrelated issues that need to be considered by the researcher when comparing frequencies of linguistic features in two or more corpora. We then describe the chi-squared and log-likelihood tests used in previous research for the comparison of word frequencies. Our focus, in this paper, is on the issue of reliability of the statistical tests, and we describe simulation experiments to compare the reliability of the chi-squared and log-likelihood statistics under conditions of different-sized corpora and probability of a word occurring in text. We observe that the Cochran rule provides a good guide to accuracy of both statistics in general, but in some cases it needs to be extended. We conclude by recommending higher cut-off values for the Cochran rule at the 5%, 1% and 0.1% levels. In order to extend applicability of the frequency comparisons to expected values of 1 or more, use of the log-likelihood statistic is preferred over the chi-squared statistic, at the 0.01% level. The trade-off for corpus linguists is that the new critical value is 15.13.
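A minimal sketch of the two statistics compared in this abstract, computed for one word across two corpora. It uses the usual 2x2 contingency-table forms of Pearson's chi-squared and the log-likelihood ratio (G²); whether the paper uses exactly this table layout is an assumption, and the function name is hypothetical. The critical value 15.13 is the one quoted in the abstract.

```python
import math

CRITICAL_VALUE_0_01_PERCENT = 15.13  # quoted in the abstract (0.01% level)


def compare_frequencies(freq1, size1, freq2, size2):
    """Return (chi_squared, log_likelihood, smallest_expected_value).

    freq1, freq2: occurrences of the word in corpus 1 and corpus 2
    size1, size2: total number of words in each corpus
    """
    if freq1 + freq2 == 0:
        raise ValueError("the word must occur at least once overall")
    observed = [[freq1, size1 - freq1], [freq2, size2 - freq2]]
    rows = [size1, size2]
    cols = [freq1 + freq2, size1 + size2 - freq1 - freq2]
    total = size1 + size2
    chi2, g2, min_expected = 0.0, 0.0, float("inf")
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / total
            obs = observed[i][j]
            min_expected = min(min_expected, expected)
            chi2 += (obs - expected) ** 2 / expected
            if obs > 0:  # a zero cell contributes nothing to G^2
                g2 += 2.0 * obs * math.log(obs / expected)
    return chi2, g2, min_expected


chi2, g2, min_expected = compare_frequencies(10, 1000, 30, 1000)
print(round(chi2, 3), round(g2, 3), min_expected)
print("significant at 0.01%:", g2 > CRITICAL_VALUE_0_01_PERCENT)
```

Returning the smallest expected value alongside the statistics makes it easy to check a Cochran-style condition before trusting either test.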
An approximation to the distribution of finite sample size mutual information estimates
 ICC
, 2004
Cited by 7 (1 self)
In this paper, the distribution of mutual information between two discrete random variables is approximated by means of a second-order Taylor series expansion. Approximative expressions for the distribution of mutual information (MI) between independent random variables, conditional MI between conditionally independent variables, and MI between (weakly) dependent random variables are derived. These distributions are functions of the available sample size and the number of realisations of the random variables only; knowledge of the variables' PMF is not required. The results are verified numerically for various cases. Exemplary application ideas in statistics and communications engineering are proposed.
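As a rough illustration of the quantity being studied, the sketch below computes the plug-in (empirical) mutual information of a table of joint counts, in nats. It also returns 2N times that estimate, which under independence is approximately chi-squared distributed with (rows - 1)(columns - 1) degrees of freedom; that is the familiar large-sample result and may not match the exact expressions derived in the paper. Function names are assumptions.

```python
import math


def mutual_information(table):
    """Plug-in MI estimate (in nats) from a table of joint counts."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    mi = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            if count > 0:  # empty cells contribute nothing
                mi += (count / n) * math.log(count * n / (row_totals[i] * col_totals[j]))
    return mi


def scaled_mi_statistic(table):
    """2 * N * MI, together with the degrees of freedom of its reference distribution."""
    n = sum(sum(row) for row in table)
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return 2.0 * n * mutual_information(table), dof


print(mutual_information([[5, 0], [0, 5]]))      # perfectly dependent binary variables
print(scaled_mi_statistic([[12, 8], [9, 11]]))   # weak dependence, small sample
```

Note that the reference distribution depends only on the sample size and the table dimensions, which mirrors the abstract's point that the variables' PMF need not be known.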
A Framework for Exploring Categorical Data
Cited by 3 (2 self)
In this paper, we present a framework for categorical data analysis which allows such data sets to be explored using a rich set of techniques that are only applicable to continuous data sets. We introduce the concept of separability statistics in the context of exploratory categorical data analysis. We show how these statistics can be used as a way to map categorical data to continuous space given a labeled reference data set. This mapping enables visualization of categorical data using techniques that are applicable to continuous data. We show that in the transformed continuous space, the performance of the standard kNN-based outlier detection technique is comparable to the performance of the kNN-based outlier detection technique using the best of the similarity measures designed for categorical data. The proposed framework can also be used to devise similarity measures best suited for a particular type of data set.
Correlated pattern mining in quantitative databases
 In Proceedings of the 9th IEEE International Conference on Data Mining
, 2009
Cited by 3 (0 self)
We study mining correlations from quantitative databases and show that this is a more effective approach than mining associations to discover useful patterns. We propose the novel notion of Quantitative Correlated Pattern (QCP), which is founded on two formal concepts, mutual information and all-confidence. We first devise a normalization on mutual information and apply it to the problem of QCP mining to capture the dependency between the attributes. We further adopt all-confidence as a quality measure to ensure, at a finer granularity, the dependency between the attributes with specific quantitative intervals. We also propose an effective supervised method that combines the consecutive intervals of the quantitative attributes based on mutual information, such that the interval combining is guided by the dependency between the attributes. We develop an algorithm, QCoMine, to mine QCPs efficiently by utilizing normalized mutual information and all-confidence to perform bi-level pruning. We also identify the redundancy existing in the set of QCPs and propose effective techniques to eliminate the redundancy. Our extensive experiments on both real and synthetic datasets verify the efficiency of QCoMine and the quality of the QCPs. The experimental results also justify the effectiveness of our proposed techniques for redundancy elimination. To further demonstrate the usefulness and the quality of QCPs, we study an application of QCPs to classification. We demonstrate that the classifier built on the QCPs achieves higher classification accuracy than the state-of-the-art classifiers built on association rules.
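Of the two measures this abstract builds on, all-confidence is simple enough to sketch directly. The code uses the usual definition (support of the whole itemset divided by the largest support of any single item in it) on plain sets of items; treating each attribute-interval pair of a quantitative database as one item is an assumption of this sketch, and the function name is hypothetical.

```python
def all_confidence(transactions, itemset):
    """all-confidence(X) = supp(X) / max over items i in X of supp({i})."""
    itemset = set(itemset)
    n = len(transactions)
    if n == 0 or not itemset:
        raise ValueError("need at least one transaction and one item")
    support_all = sum(1 for t in transactions if itemset <= set(t)) / n
    max_item_support = max(
        sum(1 for t in transactions if item in t) / n for item in itemset
    )
    return support_all / max_item_support if max_item_support > 0 else 0.0


# Items are written as attribute=interval labels purely for illustration.
records = [
    {"age=30-40", "income=high"},
    {"age=30-40"},
    {"age=30-40", "income=high", "city=A"},
    {"income=high"},
]
print(all_confidence(records, {"age=30-40", "income=high"}))
```

Because the denominator is the largest single-item support, a high all-confidence means every item in the pattern implies the rest of the pattern with at least that confidence.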
Elemental Causal Induction
We present a framework for the rational analysis of elemental causal induction – learning about the existence of a relationship between a single cause and effect – based upon causal graphical models. This framework makes precise the intuitive distinction between causal structure and causal strength: the difference between asking whether or not a causal relationship exists, and asking how strong that causal relationship might be. We show that the two leading rational models of elemental causal induction, ∆P and causal power, both estimate causal strength, and introduce a new rational model, causal support, that assesses causal structure. Causal support provides a better account of a large number of existing datasets than either ∆P or causal power. It also predicts several phenomena that cannot be accounted for by other models, which we explore through a series of experiments. These phenomena include the complex interaction between ∆P and the base-rate probability of the effect in the absence of the cause, sample size effects, inferences from incomplete contingency tables, and causal learning from rates.
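The structure question in this abstract can be illustrated with a small Bayesian model comparison. The sketch below assumes the formulation usually associated with causal support: a graph with only a background cause (rate `w0`) is compared against a graph where background and candidate cause combine by noisy-OR (`w0`, `w1`), with uniform priors on both parameters and a midpoint grid for the integrals. None of these details are stated in the abstract, so treat them as assumptions.

```python
import math


def causal_support(e_c, n_c, e_nc, n_nc, grid=200):
    """Log marginal-likelihood ratio: graph with a causal link vs. graph without.

    e_c / n_c:   effect count / trial count with the cause present
    e_nc / n_nc: effect count / trial count with the cause absent
    """
    points = [(i + 0.5) / grid for i in range(grid)]  # midpoint rule on (0, 1)

    def lik_no_link(w0):
        # One background rate explains both conditions.
        return w0 ** (e_c + e_nc) * (1 - w0) ** ((n_c - e_c) + (n_nc - e_nc))

    def lik_link(w0, w1):
        # Noisy-OR: the effect occurs if either the background or the cause produces it.
        p_c = w0 + w1 - w0 * w1
        return (w0 ** e_nc * (1 - w0) ** (n_nc - e_nc)
                * p_c ** e_c * (1 - p_c) ** (n_c - e_c))

    m_no_link = sum(lik_no_link(w0) for w0 in points) / grid
    m_link = sum(lik_link(w0, w1) for w0 in points for w1 in points) / grid ** 2
    return math.log(m_link / m_no_link)


print(causal_support(8, 8, 0, 8))    # effect only with the cause: positive support
print(causal_support(4, 8, 4, 8))    # no contingency: negative support
print(causal_support(2, 4, 2, 4), causal_support(20, 40, 20, 40))
```

Unlike ∆P or causal power, the result changes when the same proportions are observed with a different number of trials, which is how a structure measure of this kind can respond to sample size.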
USE OF CONTINGENCY TABLES TO VALUE VARIABLES FOR SPATIAL MODELS
An expressive and comprehensive situation picture is necessary for reliable decision making in various application fields. The domain knowledge, however, is often too complex to be handled individually, and thus geographical information systems (GIS) with powerful modeling tools are now used to support the process. Data to be considered are becoming available at an increasing speed and level of detail, so the challenge of obtaining a useful model lies in using suitable methods. In our research, we deal with a systematic risk assessment model for Helsinki Fire & Rescue Services. The model shall serve as a basis for the preparedness of fire brigades. In this paper, we aim to use contingency tables, which are known from statistics, to assist in valuing new variables for the developing risk model. In the case study, we analyze spatial relationships between the incident data points and the distribution of population age. Derived information shall be implemented into the spatial model, which is the basis for the further risk modeling process. The methods for the analysis of spatial data suggested in this paper support the reliability of the risk model and advance understanding of how GIScientists can contribute to the process of decision making.
Corresponding author:
effect of strabismus on object detection in the ring scotoma of a monocular bioptic telescope