Results 1 – 9 of 9
Updating Probabilities
, 2002
Abstract

Cited by 52 (6 self)
As examples such as the Monty Hall puzzle show, applying conditioning to update a probability distribution on a "naive space", which does not take into account the protocol used, can often lead to counterintuitive results. Here we examine why. A criterion known as CAR ("coarsening at random") in the statistical literature characterizes when "naive" conditioning in a naive space works. We show that the CAR condition holds rather infrequently, and we provide a procedural characterization of it, by giving a randomized algorithm that generates all and only distributions for which CAR holds. This substantially extends previous characterizations of CAR. We also consider more generalized notions of update such as Jeffrey conditioning and minimizing relative entropy (MRE). We give a generalization of the CAR condition that characterizes when Jeffrey conditioning leads to appropriate answers, and show that there exist some very simple settings in which MRE essentially never gives the right results. This generalizes and interconnects previous results obtained in the literature on CAR and MRE.
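The naive-space failure this abstract centers on can be reproduced with a short simulation (our own illustration, not code from the paper): naive conditioning on the coarse event "the car is not behind the opened door" suggests the two remaining doors are equally likely, but modeling the host's actual protocol shows that switching wins with probability 2/3.

```python
import random

def monty_hall(trials=200_000, seed=1):
    """Simulate Monty Hall with the host's protocol modeled explicitly."""
    rng = random.Random(seed)
    stay_wins = switch_wins = 0
    for _ in range(trials):
        car = rng.randrange(3)
        pick = rng.randrange(3)
        # Protocol: the host opens a goat door different from the pick.
        opened = rng.choice([d for d in range(3) if d != pick and d != car])
        stay_wins += (pick == car)
        # Switching means taking the one door that is neither picked nor opened.
        switch = next(d for d in range(3) if d not in (pick, opened))
        switch_wins += (switch == car)
    return stay_wins / trials, switch_wins / trials
```

Naive conditioning in the coarse space predicts 0.5 for each remaining door; the simulation instead returns roughly 1/3 for staying and 2/3 for switching, matching the protocol-aware analysis.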
Ignorability for categorical data
 The Annals of Statistics
Abstract

Cited by 20 (3 self)
We study the problem of ignorability in likelihood-based inference from incomplete categorical data. Two versions of the coarsened at random assumption (car) are distinguished, their compatibility with the parameter distinctness assumption is investigated, and several conditions for ignorability that do not require an extra parameter distinctness assumption are established. It is shown that car assumptions have quite different implications depending on whether the underlying complete-data model is saturated or parametric. In the latter case, car assumptions can become inconsistent with observed data. 1. Introduction. In a sequence of papers, Rubin [15], Heitjan and Rubin [11] and Heitjan [9, 10] have investigated the question of under what conditions a mechanism that causes observed data to be incomplete or, more generally, coarse, can be ignored in the statistical analysis of the data. The key condition that has been identified is that the data should be missing at random.
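Both car versions the paper distinguishes build on the Heitjan–Rubin condition, which can be stated concretely for a finite coarsening mechanism (a sketch of our own, with hypothetical example mechanisms): a mechanism is car when the probability of reporting a coarse set A is the same for every true value inside A.

```python
def is_car(mech, tol=1e-9):
    """mech[x][A] = P(report coarse set A | true value x), A a frozenset.
    car (Heitjan-Rubin): for each reported A, P(A | x) is constant over x in A."""
    sets = {A for row in mech.values() for A in row}
    for A in sets:
        probs = [mech[x].get(A, 0.0) for x in A if x in mech]
        if probs and max(probs) - min(probs) > tol:
            return False
    return True

# car: the set {a, b} is always reported, whatever the truth is.
car_mech = {"a": {frozenset({"a", "b"}): 1.0},
            "b": {frozenset({"a", "b"}): 1.0}}

# not car: {a, b} is reported more often when the true value is "a".
non_car = {"a": {frozenset({"a", "b"}): 0.9, frozenset({"a"}): 0.1},
           "b": {frozenset({"a", "b"}): 0.5, frozenset({"b"}): 0.5}}
```

Here `is_car(car_mech)` holds while `is_car(non_car)` fails: in the second mechanism, observing {a, b} carries evidence about the true value, so naive conditioning on the coarse observation would be biased.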
Conditional Independence
, 1997
Abstract

Cited by 14 (0 self)
This article has been prepared as an entry for the Wiley Encyclopedia of Statistical Sciences (Update). It gives a brief overview of fundamental properties and applications of conditional independence. Keywords: ancillarity; axioms; graphical models; Markov properties; sufficiency. The concepts of independence and conditional independence (CI) between random variables originate in Probability Theory, where they are introduced as properties of an underlying probability measure P on the sample space (see CONDITIONAL PROBABILITY AND EXPECTATION). Much of traditional Probability Theory and Statistics involves analysis of distributions having such properties: for example, limit theorems for independent and identically distributed variables, or the theory of MARKOV PROCESSES. More recently, it has become apparent that it is fruitful to treat conditional independence (and its special case independence) as a primitive concept, with an intuitive meaning, ...
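As a small numerical illustration of the CI property described here (our own sketch, not from the encyclopedia entry): for a finite joint distribution, X is independent of Y given Z exactly when P(x, y, z) · P(z) = P(x, z) · P(y, z) for every cell, a form that avoids dividing by possibly-zero conditioning probabilities.

```python
import itertools

def cond_indep(joint, tol=1e-9):
    """joint[(x, y, z)] = P(X=x, Y=y, Z=z); test X independent of Y given Z."""
    xs = {k[0] for k in joint}
    ys = {k[1] for k in joint}
    zs = {k[2] for k in joint}
    for z in zs:
        pz = sum(joint.get((x, y, z), 0.0) for x in xs for y in ys)
        for x, y in itertools.product(xs, ys):
            pxz = sum(joint.get((x, b, z), 0.0) for b in ys)
            pyz = sum(joint.get((a, y, z), 0.0) for a in xs)
            # Cell identity: P(x,y,z) * P(z) == P(x,z) * P(y,z).
            if abs(joint.get((x, y, z), 0.0) * pz - pxz * pyz) > tol:
                return False
    return True

# X and Y are conditionally independent given Z by construction:
# P(x, y, z) = P(z) * P(x | z) * P(y | z).
joint = {}
for z in (0, 1):
    for x in (0, 1):
        for y in (0, 1):
            px = 0.2 + 0.6 * z if x else 0.8 - 0.6 * z
            py = 0.3 + 0.4 * z if y else 0.7 - 0.4 * z
            joint[(x, y, z)] = 0.5 * px * py
```

`cond_indep(joint)` returns True here, even though X and Y are marginally dependent (they are correlated through the mixture over Z).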
On testing the missing at random assumption
 In Proceedings of the 17th European Conference on Machine Learning (ECML 2006)
, 2006
Abstract

Cited by 12 (0 self)
Most approaches to learning from incomplete data are based on the assumption that unobserved values are missing at random (mar). While the mar assumption, as such, is not testable, it can become testable in the context of other distributional assumptions, e.g. the naive Bayes assumption. In this paper we investigate a method for testing the mar assumption in the presence of other distributional constraints. We present methods to (approximately) compute a test statistic consisting of the ratio of two profile likelihood functions. This requires the optimization of the likelihood under no assumptions on the missingness mechanism, for which we use our recently proposed AI & M algorithm. We present experimental results on synthetic data that show that our approximate test statistic is a good indicator for whether data is mar relative to the given distributional assumptions.
The AI & M procedure for learning from incomplete data
 In R. Dechter and T. Richardson (Eds.), Proceedings of the Twenty-Second Conference on Uncertainty in Artificial Intelligence (UAI 2006)
, 2006
Abstract

Cited by 7 (1 self)
We investigate methods for parameter learning from incomplete data that is not missing at random. Likelihood-based methods then require the optimization of a profile likelihood that takes all possible missingness mechanisms into account. Optimizing this profile likelihood poses two main difficulties: multiple (local) maxima, and its very high-dimensional parameter space. In this paper a new method is presented for optimizing the profile likelihood that addresses the second difficulty: in the proposed AI&M (adjusting imputation and maximization) procedure the optimization is performed by operations in the space of data completions, rather than directly in the parameter space of the profile likelihood. We apply the AI&M method to learning parameters for Bayesian networks. The method is compared against conservative inference, which takes into account each possible data completion, and against EM. The results indicate that likelihood-based inference is still feasible in the case of unknown missingness mechanisms, and that conservative inference is unnecessarily weak. On the other hand, our results also provide evidence that the EM algorithm is still quite effective when the data is not missing at random.
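The EM baseline mentioned at the end behaves as follows in the simplest possible setting (our own sketch for a single Bernoulli parameter, assuming the mar mechanism that the paper deliberately drops): the E-step imputes each missing entry with the current parameter value, and the fixed point is simply the mean of the observed entries.

```python
def em_bernoulli(data, iters=100, theta=0.5):
    """EM for one Bernoulli parameter; None marks a missing entry (mar assumed)."""
    obs = [x for x in data if x is not None]
    n_miss = len(data) - len(obs)
    for _ in range(iters):
        # E-step: each missing entry has expected value theta.
        expected_ones = sum(obs) + n_miss * theta
        # M-step: maximize the expected complete-data log-likelihood.
        theta = expected_ones / len(data)
    return theta
```

With data `[1, 1, 1, 0, None, None]` this converges to 0.75, the observed-data mean; when the data is not mar, this is exactly the single point estimate that the profile-likelihood approach refuses to commit to.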
Ignorability in statistical and probabilistic inference
 Journal of Artificial Intelligence Research
, 2005
Abstract

Cited by 4 (0 self)
When dealing with incomplete data in statistical learning, or incomplete observations in probabilistic inference, one needs to distinguish the fact that a certain event is observed from the fact that the observed event has happened. Since the modeling and computational complexities entailed by maintaining this proper distinction are often prohibitive, one asks for conditions under which it can be safely ignored. Such conditions are given by the missing at random (mar) and coarsened at random (car) assumptions. In this paper we provide an in-depth analysis of several questions relating to mar/car assumptions. The main purpose of our study is to provide criteria by which one may evaluate whether a car assumption is reasonable for a particular data collecting or observational process. This question is complicated by the fact that several distinct versions of mar/car assumptions exist. We therefore first provide an overview of these different versions, in which we highlight the distinction between distributional and coarsening-variable-induced versions. We show that distributional versions are less restrictive and sufficient for most applications. We then address from two different perspectives the question of when the mar/car assumption is warranted. First we provide a "static" analysis that characterizes the admissibility of the car assumption in terms of the support structure of the joint probability distribution of complete data and incomplete observations. Here we obtain an equivalence characterization that improves and extends a recent result by Grünwald and Halpern. We then turn to a "procedural" analysis that characterizes the admissibility of the car assumption in terms of procedural models for the actual data (or observation) generating process. The main result of this analysis is that the stronger coarsened completely at random (ccar) condition is arguably the most reasonable assumption, as it alone corresponds to data coarsening procedures that satisfy a natural robustness property.