Stochastic Complexity Based Estimation of Missing Elements in Questionnaire Data
Annual American Educational Research Association Meeting, SIG Educational Statisticians, 1998
Abstract

Cited by 2 (0 self)
In this paper we study a new information-theoretically justified approach to missing data estimation for multivariate categorical data. The approach discussed is a model-based imputation procedure relative to a model class (i.e., a functional form for the probability distribution of the complete data matrix), which in our case is the set of multinomial models with some independence assumptions. Based on the given model class assumption, an information-theoretic criterion can be derived to select between the different complete data matrices. Intuitively, this general criterion, called stochastic complexity, represents the shortest code length needed for coding the complete data matrix relative to the model class chosen. Using this information-theoretic criterion, the missing data problem is reduced to a search problem, i.e., finding the data completion with minimal stochastic complexity. In the experimental part of the paper we present empirical results of the approach using two real data sets, and compare these results to those achieved by commonly used techniques such as case deletion and imputing sample averages.
PUB TYPE: Reports, Evaluative (142); Speeches/Meeting Papers (150)
Abstract
A new information-theoretically justified approach to missing data estimation for multivariate categorical data was studied. The approach is a model-based imputation procedure relative to a model class (i.e., a functional form for the probability distribution of the complete data matrix), which in this case is the set of multinomial models with some independence assumptions. Based on the given model class assumption, an information-theoretic criterion can be derived to select between the different complete data matrices. Intuitively, this general criterion, called stochastic complexity, represents the shortest code length needed for coding the complete data matrix relative to the model class chosen. Using this information-theoretic criterion, the missing data problem is reduced to a search problem, that of finding the data completion with minimal stochastic complexity. The results of two empirical studies of the approach, using educational data sets of 478 elementary school students ("Popular kids", POPKIDS, in Michigan) and 500 Irish schoolchildren ("Irish educational transitions data", Irish), are presented and compared to those achieved with commonly used techniques such as case deletion and imputation of sample averages. (Contains 3 figures, 6 tables, and 36 references.) (Author/SLD)
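The idea of treating imputation as a search for the completion with the shortest code length can be illustrated with a small sketch. This is not the paper's procedure: it assumes the simplest possible model class (every column is an independent multinomial), takes the code length to be the negative log of the Bayesian marginal likelihood under a uniform Dirichlet prior, and uses a greedy cell-by-cell search. The function names (`column_code_length`, `stochastic_complexity`, `impute_min_sc`) and the encoding of categories as integers `0..K-1` with `None` for missing cells are all assumptions made for illustration.

```python
from collections import Counter
from math import lgamma, log


def column_code_length(values, n_categories):
    """Code length in bits of one complete categorical column.

    Assumption: multinomial model with a uniform Dirichlet(1, ..., 1) prior,
    so the marginal likelihood is Gamma(K) / Gamma(N + K) * prod_k Gamma(n_k + 1).
    """
    counts = Counter(values)
    n = len(values)
    log_marginal = lgamma(n_categories) - lgamma(n + n_categories)
    log_marginal += sum(lgamma(c + 1) for c in counts.values())
    return -log_marginal / log(2)


def stochastic_complexity(matrix, n_categories):
    """Total code length of a complete matrix, columns assumed independent."""
    return sum(
        column_code_length([row[j] for row in matrix], k)
        for j, k in enumerate(n_categories)
    )


def impute_min_sc(matrix, n_categories, max_sweeps=20):
    """Greedy search for a completion with low stochastic complexity."""
    data = [list(row) for row in matrix]  # work on a copy
    holes = [
        (i, j)
        for i, row in enumerate(data)
        for j, v in enumerate(row)
        if v is None
    ]
    for i, j in holes:  # arbitrary starting completion
        data[i][j] = 0
    for _ in range(max_sweeps):
        improved = False
        for i, j in holes:
            best_value = data[i][j]
            best_sc = stochastic_complexity(data, n_categories)
            for candidate in range(n_categories[j]):
                data[i][j] = candidate
                sc = stochastic_complexity(data, n_categories)
                if sc < best_sc - 1e-12:  # accept strict improvements only
                    best_value, best_sc = candidate, sc
                    improved = True
            data[i][j] = best_value
        if not improved:  # local minimum reached
            break
    return data


# Hypothetical questionnaire: 4 respondents, 2 binary items, 2 missing answers.
answers = [[1, 0], [1, 0], [1, None], [None, 0]]
completed = impute_min_sc(answers, n_categories=[2, 2])
```

Under full column independence the search simply favours the most frequent category in each column, so this toy version behaves much like mode imputation. The criterion only becomes more discriminating with richer model classes that capture dependencies between items, which this sketch deliberately leaves out.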