Updating Probabilities
2002
Cited by 53 (6 self)
As examples such as the Monty Hall puzzle show, applying conditioning to update a probability distribution on a "naive space", which does not take into account the protocol used, can often lead to counterintuitive results. Here we examine why. A criterion known as CAR ("coarsening at random") in the statistical literature characterizes when "naive" conditioning in a naive space works. We show that the CAR condition holds rather infrequently, and we provide a procedural characterization of it, by giving a randomized algorithm that generates all and only distributions for which CAR holds. This substantially extends previous characterizations of CAR. We also consider more generalized notions of update such as Jeffrey conditioning and minimizing relative entropy (MRE). We give a generalization of the CAR condition that characterizes when Jeffrey conditioning leads to appropriate answers, and show that there exist some very simple settings in which MRE essentially never gives the right results. This generalizes and interconnects previous results obtained in the literature on CAR and MRE.
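The gap between naive conditioning and protocol-aware updating in the Monty Hall puzzle can be checked with exact arithmetic. The following is a minimal sketch, not taken from the paper: the function names and the host protocol (the host always opens a goat door other than the contestant's pick, choosing uniformly when two such doors exist) are assumptions made for illustration.

```python
from fractions import Fraction

DOORS = (1, 2, 3)
PRIOR = {d: Fraction(1, 3) for d in DOORS}  # car is equally likely behind each door


def naive_posterior(opened):
    """Naive space: condition only on the event 'the car is not behind the opened door'."""
    mass = {d: p for d, p in PRIOR.items() if d != opened}
    total = sum(mass.values())
    return {d: p / total for d, p in mass.items()}


def host_prob(car, pick, opened):
    """Assumed protocol: the host opens a non-picked goat door, uniformly if two qualify."""
    options = [d for d in DOORS if d != pick and d != car]
    return Fraction(1, len(options)) if opened in options else Fraction(0)


def protocol_posterior(pick, opened):
    """Bayes update on the richer space that includes the host's choice."""
    joint = {d: PRIOR[d] * host_prob(d, pick, opened) for d in DOORS}
    total = sum(joint.values())
    return {d: p / total for d, p in joint.items() if p > 0}


if __name__ == "__main__":
    print("naive:   ", naive_posterior(opened=3))
    print("protocol:", protocol_posterior(pick=1, opened=3))
```

With the contestant picking door 1 and the host opening door 3, the naive update assigns probability 1/2 to each remaining door, while the protocol-aware update assigns 1/3 to door 1 and 2/3 to door 2. Under a different host protocol the second answer would change, which is the point of modeling the protocol explicitly.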
Adjusting for nonignorable dropout using semiparametric nonresponse models (with discussion)
Journal of the American Statistical Association, 1999
Cited by 39 (10 self)
Consider a study whose design calls for the study subjects to be followed from enrollment (time $t = 0$) to time $t = T$, at which point a primary endpoint of interest $Y$ is to be measured. The design of the study also calls for measurements on a vector $V(t)$ of covariates to be made at one or more times $t$ during the interval $[0, T)$. We are interested in making inferences about the marginal mean $\mu_0$ of $Y$ when some subjects drop out of the study at random times $Q$ prior to the common fixed end of follow-up time $T$. The purpose of this article is to show how to make inferences about $\mu_0$ when the continuous dropout time $Q$ is modeled semiparametrically and no restrictions are placed on the joint distribution of the outcome and other measured variables. In particular, we consider two models for the conditional hazard of dropout given $(\bar{V}(T), Y)$, where $\bar{V}(t)$ denotes the history of the process $V(t)$ through time $t$, $t \in [0, T)$. In the first model, we assume that $\lambda_Q(t \mid \bar{V}(T), Y) = \lambda_0(t \mid \bar{V}(t)) \exp(\alpha_0 Y)$, where $\alpha_0$ is a scalar parameter and $\lambda_0(t \mid \bar{V}(t))$ is an unrestricted positive function of $t$ and the process $\bar{V}(t)$. When the process $\bar{V}(t)$ is high dimensional, estimation in this model is not feasible with moderate sample sizes, due to the curse of dimensionality. For such situations, we consider a second model that imposes the additional restriction that $\lambda_0(t \mid \bar{V}(t)) = \lambda_0(t) \exp(\gamma_0' W(t))$, where $\lambda_0(t)$ is an unspecified baseline hazard function, $W(t) = w(t, \bar{V}(t))$, $w(\cdot, \cdot)$ is a known function that maps $(t, \bar{V}(t))$ to $\mathbb{R}^q$, and $\gamma_0$ is a $q \times 1$ unknown parameter vector. When $\alpha_0 \neq 0$, dropout is nonignorable. On account of identifiability problems, joint estimation of the mean $\mu_0$ of $Y$ and the selection bias parameter $\alpha_0$ may be difficult or impossible. Therefore, we propose regarding the selection bias parameter $\alpha_0$ as known, rather than estimating it from the data. We then perform a sensitivity analysis to see how inference about $\mu_0$ changes as we vary $\alpha_0$ over a plausible range of values. We apply our approach to the analysis of ACTG 175, an AIDS clinical trial. KEY WORDS: Augmented inverse probability of censoring weighted estimators; Cox proportional hazards model; Identification;
Learning Reliable Classifiers from Small or Incomplete Data Sets: the Naive Credal Classifier 2
Journal of Machine Learning Research, 2008
Cited by 18 (12 self)
In this paper, the naive credal classifier, which is a set-valued counterpart of naive Bayes, is extended to a general and flexible treatment of incomplete data, yielding a new classifier called naive credal classifier 2 (NCC2). The new classifier delivers classifications that are reliable even in the presence of small sample sizes and missing values. Extensive empirical evaluations show that, by issuing set-valued classifications, NCC2 is able to isolate and properly deal with instances that are hard to classify (on which naive Bayes accuracy drops considerably), and to perform as well as naive Bayes on the other instances. The experiments point to a general problem: they show that with missing values, empirical evaluations may not reliably estimate the accuracy of a traditional classifier, such as naive Bayes. This phenomenon adds even more value to the robust approach to classification implemented by NCC2.
Inverse Probability Weighted Estimation for General Missing Data Problems
Cited by 13 (1 self)
I study inverse probability weighted M-estimation under a general missing data scheme. Examples include M-estimation with missing data due to a censored survival time, propensity score estimation of the average treatment effect in the linear exponential family, and variable probability sampling with observed retention frequencies. I extend an important result known to hold in special cases: estimating the selection probabilities is generally more efficient than if the known selection probabilities could be used in estimation. For the treatment effect case, the setup allows a general characterization of a "double robustness" result due to Scharfstein, Rotnitzky, and Robins (1999).
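The basic idea of inverse probability weighting can be illustrated with the simplest M-estimator, a population mean. The sketch below is an illustration under stated assumptions, not the paper's estimator: the simulated data, the selection probabilities (0.9 and 0.3), and all function names are hypothetical, and the selection probabilities are treated as known rather than estimated.

```python
import random


def simulate(n, seed=0):
    """Simulated data (hypothetical): y depends on a binary covariate x,
    and units with x = 1 are observed far less often than units with x = 0."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.random() < 0.5
        y = 1.0 + 2.0 * x + rng.gauss(0.0, 1.0)  # true mean of y is 2.0
        p = 0.3 if x else 0.9                     # assumed known selection probability
        observed = rng.random() < p
        rows.append((y, p, observed))
    return rows


def complete_case_mean(rows):
    """Unweighted mean over observed units; biased when selection depends on x."""
    ys = [y for y, _, observed in rows if observed]
    return sum(ys) / len(ys)


def ipw_mean(rows):
    """Weight each observed y by 1 / p and divide by the full sample size."""
    return sum(y / p for y, p, observed in rows if observed) / len(rows)


if __name__ == "__main__":
    rows = simulate(20000)
    print("complete-case mean:", round(complete_case_mean(rows), 3))
    print("IPW mean:          ", round(ipw_mean(rows), 3))
```

In this setup the complete-case mean is pulled toward the over-represented group (its large-sample limit is 1.5), while the weighted mean targets the true value 2.0. The efficiency comparison between known and estimated selection probabilities discussed in the abstract is not reproduced here.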
What Do We Learn from Recall Consumption Data?
2000
Cited by 11 (3 self)
In this paper we use two complementary Italian data sources (the 1995 Istat and Bank of Italy household surveys) to generate household-specific nondurable expenditure and savings measures in the Bank of Italy sample that contains relatively high-quality income data. We show that food expenditure data are of comparable quality and informational content across the two surveys, once heaping, rounding and time averaging are properly accounted for. We therefore depart from standard practice and rely on structural estimation of an inverse Engel curve on Istat data to impute nondurable and total expenditure to Bank of Italy observations, and show how these estimates can be used to analyse saving and consumption age profiles conditional on demographics. Acknowledgments: We are grateful for helpful discussions with Enrico Rettore, Nicoletta Rosati and particularly Jean-Marc Robin, and for comments by audiences at ESEM99, UCL, UCY, Università di Padova and INSEE. We would like to tha...
The AI&M procedure for learning from incomplete data
In R. Dechter and T. Richardson (Eds.), Proceedings of the Twenty-Second Conference on Uncertainty in Artificial Intelligence (UAI 2006), 2006
Cited by 7 (1 self)
We investigate methods for parameter learning from incomplete data that is not missing at random. Likelihood-based methods then require the optimization of a profile likelihood that takes all possible missingness mechanisms into account. Optimizing this profile likelihood poses two main difficulties: multiple (local) maxima, and its very high-dimensional parameter space. In this paper a new method is presented for optimizing the profile likelihood that addresses the second difficulty: in the proposed AI&M (adjusting imputation and maximization) procedure the optimization is performed by operations in the space of data completions, rather than directly in the parameter space of the profile likelihood. We apply the AI&M method to learning parameters for Bayesian networks. The method is compared against conservative inference, which takes into account each possible data completion, and against EM. The results indicate that likelihood-based inference is still feasible in the case of unknown missingness mechanisms, and that conservative inference is unnecessarily weak. On the other hand, our results also provide evidence that the EM algorithm is still quite effective when the data is not missing at random.
Nonparametric locally efficient estimation of the treatment specific survival distribution with right censored data and covariates in observational studies
the 1997 Proceedings of workshop on Causal Inference in Observational Studies, 1999
Cited by 6 (3 self)
In many observational studies one is concerned with comparing treatment specific survival distributions in the presence of confounding factors and censoring. In this paper we develop locally efficient point and interval estimators of these survival distributions which adjust for confounding by using an estimate of the propensity score and concurrently allow for dependent censoring. The proposed methodology is an application of a general methodology for construction of locally efficient estimators as presented in Robins (1993) and Robins and Rotnitzky (1992). The practical performance of the methods is tested with a simulation study. Some key words: Right-censored data, asymptotically efficient, asymptotically linear estimator, confounding, Cox proportional hazards model, influence curve.
Decision making under incomplete data using the imprecise Dirichlet model
2006
Cited by 6 (3 self)
The paper presents an efficient solution to decision problems where direct partial information on the distribution of the states of nature is available, either by observations of previous repetitions of the decision problem or by direct expert judgements. To process this information we use a recent generalization of Walley's imprecise Dirichlet model, allowing us also to handle incomplete observations or imprecise judgements. We derive efficient algorithms and discuss properties of the optimal solutions. In the case of precise data and pure actions we are surprisingly led to a frequency-based variant of the Hodges-Lehmann criterion, which was developed in classical decision theory as a compromise between Bayesian and minimax procedures.
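For orientation, the standard (precise-data) imprecise Dirichlet model assigns each category a probability interval rather than a point estimate. The sketch below covers only that standard model, not the generalization to incomplete observations used in the paper; the function name, the example counts, and the choice s = 2 are assumptions for illustration.

```python
from fractions import Fraction


def idm_bounds(counts, s=2):
    """Lower and upper probabilities under the standard imprecise Dirichlet model:
    lower = n_i / (N + s), upper = (n_i + s) / (N + s),
    where n_i is the count for category i, N the total count,
    and s the model's hyperparameter (s = 2 is an assumed choice here)."""
    total = sum(counts.values())
    return {
        state: (Fraction(n, total + s), Fraction(n + s, total + s))
        for state, n in counts.items()
    }


if __name__ == "__main__":
    # Hypothetical observations of two states of nature.
    for state, (lower, upper) in idm_bounds({"a": 6, "b": 2}).items():
        print(state, lower, upper)
```

For the hypothetical counts above, state "a" receives the interval [3/5, 4/5] and state "b" the interval [1/5, 2/5]. The intervals narrow as the total count grows relative to s.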
Multiple Imputation for Multivariate Data with Missing and Below-Threshold Measurements: Time-Series Concentrations of Pollutants in the Arctic
Biometrics, 2001
Cited by 6 (2 self)
Many chemical and environmental data sets are complicated by the existence of fully missing values or censored values known to lie below detection thresholds. For example, weeklong samples of airborne particulate matter were obtained at Alert, N.W.T., Canada between 1980 and 1991, where some of the concentrations of 24 particulate constituents were coarsened in the sense of being either fully missing or below detection limits. To facilitate scientific analysis, it is appealing to create complete data by filling in missing values so that standard complete-data methods can be applied. We briefly review commonly used strategies for handling missing values and focus on the multiple imputation approach, which generally leads to valid inferences when faced with missing data. Three statistical models are developed for multiply imputing the missing values of airborne particulate matter. We expect that these models are useful for creating multiple imputations in a variety of incomplete multiv...