Approximate Bayes Factors and Accounting for Model Uncertainty in Generalized Linear Models
, 1993
Cited by 79 (28 self)
Ways of obtaining approximate Bayes factors for generalized linear models are described, based on the Laplace method for integrals. I propose a new approximation which uses only the output of standard computer programs such as GLIM; this appears to be quite accurate. A reference set of proper priors is suggested, both to represent the situation where there is not much prior information, and to assess the sensitivity of the results to the prior distribution. The methods can be used when the dispersion parameter is unknown, when there is overdispersion, to compare link functions, and to compare error distributions and variance functions. The methods can be used to implement the Bayesian approach to accounting for model uncertainty. I describe an application to inference about relative risks in the presence of control factors where model uncertainty is large and important. Software to implement the
The Prediction of Faulty Classes Using Object-Oriented Design Metrics
, 1999
Cited by 35 (2 self)
Contemporary evidence suggests that most field faults in software applications are found in a small percentage of the software's components. This means that if these faulty software components can be detected early in the development project's life cycle, mitigating actions can be taken, such as a redesign. For object-oriented applications, prediction models using design metrics can be used to identify faulty classes early on. In this paper we report on a study that used object-oriented design metrics to construct such prediction models. The study used data collected from one version of a commercial Java application for constructing a prediction model. The model was then validated on a subsequent release of the same application. Our results indicate that the prediction model has a high accuracy. Furthermore, we found that an export coupling metric had the strongest association with fault-proneness, indicating a structural feature that may be symptomatic of a class with a high probability of latent faults.
Probabilities of Causation: Bounds and Identification
- Annals of Mathematics and Artificial Intelligence
, 2000
Cited by 12 (10 self)
This paper deals with the problem of estimating the probability of causation, that is, the probability that one event was the real cause of another, in a given scenario. Starting from structural-semantical definitions of the probabilities of necessary or sufficient causation (or both), we show how to bound these quantities from data obtained in experimental and observational studies, under general assumptions concerning the data-generating process. In particular, we strengthen the results of Pearl (1999) by presenting sharp bounds based on combined experimental and nonexperimental data under no process assumptions, as well as under the mild assumptions of exogeneity (no confounding) and monotonicity (no prevention). These results delineate more precisely the basic assumptions that must be made before statistical measures such as the excess-risk-ratio could be used for assessing attributional quantities such as the probability of causation.
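As a rough illustration of the quantities named in this abstract, the sketch below computes the excess risk ratio from two conditional probabilities, together with bounds on the probability of necessity (PN) as they are commonly stated for the exogenous case. The formulas are not given in the abstract and are an assumption drawn from the wider literature; the function names and input values are hypothetical. Here x' denotes absence of the exposure and y' absence of the outcome.

```python
def excess_risk_ratio(p_y_given_x: float, p_y_given_not_x: float) -> float:
    """Excess risk ratio: [P(y|x) - P(y|x')] / P(y|x)."""
    if not 0.0 < p_y_given_x <= 1.0:
        raise ValueError("P(y|x) must lie in (0, 1]")
    if not 0.0 <= p_y_given_not_x <= 1.0:
        raise ValueError("P(y|x') must lie in [0, 1]")
    return (p_y_given_x - p_y_given_not_x) / p_y_given_x


def pn_bounds_exogenous(p_y_given_x: float, p_y_given_not_x: float) -> tuple[float, float]:
    """Bounds on PN assuming exogeneity (no confounding) only.

    Assumed form: max(0, ERR) <= PN <= min(1, P(y'|x') / P(y|x)).
    If monotonicity is also assumed, PN is taken to equal the lower bound.
    """
    lower = max(0.0, excess_risk_ratio(p_y_given_x, p_y_given_not_x))
    upper = min(1.0, (1.0 - p_y_given_not_x) / p_y_given_x)
    return lower, upper


if __name__ == "__main__":
    # Hypothetical risks: 80% among the exposed, 40% among the unexposed.
    print(excess_risk_ratio(0.8, 0.4))
    print(pn_bounds_exogenous(0.8, 0.4))
```

With these hypothetical inputs the excess risk ratio only pins down PN if monotonicity is accepted; under exogeneity alone it serves as a lower bound.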
Why There Is No Statistical Test For Confounding, Why Many Think There Is, And Why They Are Almost Right
, 1998
Cited by 9 (4 self)
... this paper is to bring to the attention of investigators several basic limitations of the associational criterion. We will show that the associational criterion does not ensure unbiased effect estimates, nor does it follow from the requirement of unbiasedness. After demonstrating, by examples, the absence of logical connections between the statistical and the causal notions of confounding, we will define a stronger notion of unbiasedness, called stable unbiasedness, relative to which a modified statistical criterion will be shown necessary and sufficient. The necessary part will then yield a practical test for stable unbiasedness which, remarkably, does not require knowledge of all potential confounders in a problem. Finally, we will argue that the prevailing practice of substituting statistical criteria for the effect-based definition of confounding is not entirely misguided, because stable unbiasedness is in fact what investigators have been and should be aiming to achieve, and stable unbiasedness is what statistical criteria can test.
Probabilities of causation: Three counterfactual interpretations and their identification
- SYNTHESE
, 1999
Cited by 8 (3 self)
According to common judicial standard, judgment in favor of plaintiff should be made if and only if it is "more probable than not" that the defendant's action was the cause for the plaintiff's damage (or death). This paper provides formal semantics, based on structural models of counterfactuals, for the probability that event x was a necessary or sufficient cause (or both) of another event y. The paper then explicates conditions under which the probability of necessary (or sufficient) causation can be learned from statistical data, and shows how data from both experimental and nonexperimental studies can be combined to yield information that neither study alone can provide. Finally, we show that necessity and sufficiency are two independent aspects of causation, and that both should be invoked in the construction of causal explanations for specific scenarios.
Validation Of Object-Oriented Metrics
, 1999
Cited by 7 (2 self)
Many object-oriented metrics have been proposed, and at least fourteen empirical validations of these metrics have been performed. However, recently it was noted that without controlling for the effect of class size in a validation study, the impact of a metric may be exaggerated. It thus becomes necessary to re-validate contemporary object-oriented metrics after controlling for size. In this paper we perform a validation study on a telecommunications C++ system. We investigate 24 metrics proposed by Chidamber and Kemerer and Briand et al. Our dependent variable was the incidence of faults due to field failures (fault-proneness). Our results indicate that out of the 24 metrics (covering coupling, cohesion, inheritance, and complexity), only four are actually related to faults after controlling for class size, and that only two of these are useful for the construction of prediction models. The two selected metrics measure coupling. The best prediction model exhibits high accuracy.
A metabolome pipeline: from concept to data to knowledge
, 2005
Cited by 5 (1 self)
Metabolomics, like other omics methods, produces huge datasets of biological variables, often accompanied by the necessary metadata. However, regardless of the form in which these are produced they are merely the ground substance for assisting us in answering biological questions. In this short tutorial review and position paper we seek to set out some of the elements of "best practice" in the optimal acquisition of such data, and in the means by which they may be turned into reliable knowledge. Many of these steps involve the solution of what amount to combinatorial optimization problems, and methods developed for these, especially those based on evolutionary computing, are proving valuable. This is done in terms of a "pipeline" that goes from the design of good experiments, through instrumental optimization, data storage and manipulation, the chemometric data processing methods in common use, and the necessary means of validation and cross-validation for giving conclusions that are credible and likely to be robust when applied in comparable circumstances to samples not used in their generation.
Causal Inference in the Health Sciences: A Conceptual Introduction
- Health Services and Outcomes Research Methodology
, 2001
Cited by 4 (0 self)
This paper provides a conceptual introduction to causal inference, aimed to assist health services researchers benefit from recent advances in this area. The paper stresses the paradigmatic shifts that must be undertaken in moving from traditional statistical analysis to causal analysis of multivariate data. Special emphasis is placed on the assumptions that underlie all causal inferences, the languages used in formulating those assumptions, and the conditional nature of causal claims inferred from nonexperimental studies. These emphases are illustrated through a brief survey of recent results, including the control of confounding, corrections for noncompliance, and a symbiosis between counterfactual and graphical methods of analysis.
Leukaemia and residence near electricity transmission equipment: a case-control study
- Br. J. Cancer
, 1989
Cited by 4 (0 self)
Summary: A population-based case-control study of leukaemia and residential proximity to electricity supply equipment has been carried out in south-east England. A total of 771 leukaemias was studied, matched for age, sex, year of diagnosis and district of residence to 1,432 controls registered with a solid tumour excluding lymphoma; 231 general population controls aged 18 and over from one part of the study area were also used. The potential for residential exposure to power frequency magnetic fields from power-lines and transformer substations was assessed indirectly from the distance, type and loading of the equipment near each subject's residence. Only 0.6% of subjects lived within 100 m of an overhead power-line, and the risk of leukaemia relative to cancer controls for residence within 100 m was 1.45 (95% confidence interval (CI) 0.54-3.88); within 50 m the relative risk was 2.0 but with a wider confidence interval (95% CI 0.4-9.0). Over 40% of subjects lived within 100 m of a substation, for which the relative risk of leukaemia was 0.99. Residence within 25 m carried a risk of 1.3 (95% CI 0.8-2.0). Weighted exposure indices incorporating measures of the current load carried by the substations did not materially alter these risk estimates. For persons aged less than 18 the relative risk of leukaemia from residence within 50 m of a substation was higher than in adults (RR = 1.5, 95% CI 0.7-3.4). Epidemiological evidence suggests a possible leukaemogenic effect in man from exposure to electromagnetic fields in the extremely low frequency range (ELF, 0-300 Hz), which includes the usual public electricity power supply frequencies (50-60 Hz). Three case-control studies have shown a two- to three-fold increase in leukaemia risk in persons who lived close to electricity power-lines and supply equipment (Wertheimer
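The intervals quoted in the summary above can be sanity-checked with a little arithmetic. If an interval was built symmetrically on the log scale (a common construction for relative risks, assumed here rather than stated in the source), the point estimate should sit near the geometric mean of its limits. The sketch below applies that check to the four intervals reported in the summary; the helper names are hypothetical, and the 1.96 multiplier assumes a normal-approximation 95% interval.

```python
import math


def geometric_midpoint(lower: float, upper: float) -> float:
    """Geometric mean of the interval limits (the midpoint on the log scale)."""
    return math.sqrt(lower * upper)


def implied_log_se(lower: float, upper: float, z: float = 1.96) -> float:
    """Standard error of log(RR) implied by a log-symmetric interval (assumed form)."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


# (label, reported relative risk, lower limit, upper limit), as quoted in the summary.
REPORTED = [
    ("power-line, within 100 m", 1.45, 0.54, 3.88),
    ("power-line, within 50 m", 2.0, 0.4, 9.0),
    ("substation, within 25 m", 1.3, 0.8, 2.0),
    ("substation, within 50 m, aged under 18", 1.5, 0.7, 3.4),
]

if __name__ == "__main__":
    for label, rr, lower, upper in REPORTED:
        mid = geometric_midpoint(lower, upper)
        se = implied_log_se(lower, upper)
        print(f"{label}: reported RR {rr}, log-scale midpoint {mid:.2f}, implied SE {se:.2f}")
```

Agreement to within rounding is consistent with a log-scale construction but does not establish how the original intervals were computed.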

