Analysis of incomplete climate data: Estimation of mean values and covariance matrices and imputation of missing values
, 2001
Cited by 54 (3 self)
Estimating the mean and the covariance matrix of an incomplete dataset and filling in missing values with imputed values is generally a nonlinear problem, which must be solved iteratively. The expectation maximization (EM) algorithm for Gaussian data, an iterative method both for the estimation of mean values and covariance matrices from incomplete datasets and for the imputation of missing values, is taken as the point of departure for the development of a regularized EM algorithm. In contrast to the conventional EM algorithm, the regularized EM algorithm is applicable to sets of climate data, in which the number of variables typically exceeds the sample size. The regularized EM algorithm is based on iterated analyses of linear regressions of variables with missing values on variables with available values, with regression coefficients estimated by ridge regression, a regularized regression method in which a continuous regularization parameter controls the filtering of the noise in the data. The regularization parameter is determined by generalized cross-validation, such as to minimize, approximately, the expected mean squared error of the imputed values. The regularized EM algorithm can estimate, and exploit for the imputation of missing values, both synchronic and diachronic covariance matrices, which may contain information on spatial covariability, stationary temporal covariability, or cyclostationary temporal covariability. A test of the regularized EM algorithm with simulated surface temperature data demonstrates that the algorithm is applicable to typical sets of climate data and that it leads to more accurate estimates of the missing values than a conventional noniterative imputation technique.
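The iteration described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `regularized_em_impute` is hypothetical, the ridge parameter is a fixed assumed value rather than one chosen by generalized cross-validation, and the residual covariance term is a simplified form.

```python
import numpy as np

def regularized_em_impute(X, ridge=0.1, n_iter=100, tol=1e-8):
    """Fill NaNs in X (n samples x p variables) by iterated ridge regressions.

    Assumption: `ridge` is fixed by hand; the abstract instead selects the
    regularization parameter by generalized cross-validation.
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    n, p = X.shape
    # Initial guess: replace each missing value by its column mean.
    filled = np.where(missing, np.nanmean(X, axis=0), X)
    mu = filled.mean(axis=0)
    cov = np.cov(filled, rowvar=False, bias=True)
    for _ in range(n_iter):
        previous = filled.copy()
        resid = np.zeros((p, p))  # accumulated residual covariance
        for i in range(n):
            m = missing[i]
            if not m.any():
                continue
            a = ~m
            if not a.any():
                # Nothing observed in this record: fall back to the mean.
                filled[i] = mu
                resid += cov
                continue
            S_aa = cov[np.ix_(a, a)]
            S_am = cov[np.ix_(a, m)]
            # Ridge regression of missing on available variables:
            # inflate the diagonal of S_aa before solving.
            D = np.diag(np.diag(S_aa))
            B = np.linalg.solve(S_aa + ridge**2 * D, S_am)
            filled[i, m] = mu[m] + (X[i, a] - mu[a]) @ B
            # Simplified residual covariance of the imputed values.
            resid[np.ix_(m, m)] += cov[np.ix_(m, m)] - S_am.T @ B
        # Re-estimate mean and covariance from the completed data.
        mu = filled.mean(axis=0)
        centered = filled - mu
        cov = (centered.T @ centered + resid) / n
        if np.max(np.abs(filled - previous)) < tol:
            break
    return filled, mu, cov
```

Calling `regularized_em_impute(X)` on an array with `NaN` entries returns the completed array together with the estimated mean vector and covariance matrix. Observed entries are left unchanged.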
On Structural Equation Modeling with Data that are not Missing Completely at Random
 Psychometrika
, 1987
Cited by 24 (2 self)
A general latent variable model is given which includes the specification of a missing data mechanism. This framework allows for an elucidating discussion of existing general multivariate theory bearing on maximum likelihood estimation with missing data. Here, missing completely at random is not a prerequisite for unbiased estimation in large samples, as when using the traditional listwise or pairwise present data approaches. The theory is connected with old and new results in the area of selection and factorial invariance. It is pointed out that in many applications, maximum likelihood estimation with missing data may be carried out by existing structural equation modeling software, such as LISREL and LISCOMP. Several sets of artificial data are generated within the general model framework. The proposed estimator is compared to the two traditional ones and found superior. Key words: maximum likelihood, ignorability, selectivity, factor analysis, factorial invariance,
Modular Neural Networks for Medical Prognosis: Quantifying the Benefits of Combining Neural Networks for Survival Prediction
 Connection Science 9
, 1997
Cited by 6 (0 self)
This paper describes a medical application of modular neural networks for temporal pattern recognition. In order to increase the reliability of prognostic indices for patients living with the Acquired Immunodeficiency Syndrome (AIDS), survival prediction was performed in a system composed of modular neural networks that classified cases according to death in a certain year of follow-up. The output of each neural network module corresponded to the probability of survival in a given year. Inputs were the values of demographic, clinical, and laboratory variables. The results of the modules were combined to produce survival curves for individuals. The neural networks were trained by backpropagation and the results were evaluated in test sets of previously unseen cases. We showed that, for certain combinations of neural network modules, the performance of the prognostic index, measured by the area under the receiver operating characteristic (ROC) curve, was significantly improved (p<0.05). We...
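As a rough illustration of the two quantities named in this abstract, the sketch below assembles per-year module outputs into a survival curve and computes the area under the ROC curve. The abstract does not say how the modules were combined, so the combination rule here (clip to [0, 1], then force the curve to be non-increasing) is an assumption, and the function names `survival_curve` and `roc_auc` are hypothetical.

```python
import numpy as np

def survival_curve(yearly_probs):
    """Turn per-year survival probabilities into a survival curve.

    Assumption: output k is read as the probability of being alive at the
    end of year k; a running minimum keeps the curve non-increasing.
    """
    p = np.clip(np.asarray(yearly_probs, dtype=float), 0.0, 1.0)
    curve = np.minimum.accumulate(p, axis=-1)
    # Prepend time zero, where survival is 1 by definition.
    start = np.ones(p.shape[:-1] + (1,))
    return np.concatenate([start, curve], axis=-1)

def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    Equals the probability that a positive case scores higher than a
    negative case, counting ties as one half.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos = scores[labels]
    neg = scores[~labels]
    diff = pos[:, None] - neg[None, :]
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size
```

For example, `survival_curve([0.9, 0.95, 0.6])` gives a curve that starts at 1 and never rises, and `roc_auc` can be applied to the predicted survival at any single year against the observed outcomes.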
Mixed Effects Model Analyses of Incomplete Longitudinal . . .
, 1984
Cited by 5 (2 self)
Incomplete longitudinal data are a common problem in clinical and epidemiological studies. This work was motivated by longitudinal studies of pulmonary function in young children characterized by both missing and mistimed observations as well as time-varying covariates. The objectives of this work were to develop a model and an analysis approach which 1) accommodated both missing and mistimed data and covariates which changed over time, 2) allowed for testing of hypotheses about both the fixed and random effects, and 3) were computationally feasible and practical. A general Mixed Effects Model was developed which relaxed some assumptions used by previous authors. In particular, the restriction that the within-subject errors are uncorrelated and homoscedastic ($\sigma^2 I$) was generalized to the form $\sigma^2 V_i$, where $V_i$ is any known positive definite matrix. Maximum Likelihood Estimators were derived and the EM algorithm and the Method of Scoring were used to solve the maximum likelihood equations. Randomly generated data were used in a preliminary exploration of the
Group-based estimation of missing hydrological data. I. Approach and general
, 2000
Cited by 2 (2 self)
In this first paper in a set of two, the problem of estimating missing segments in streamflow records is described. The group approach, different from the traditional single-valued approach, is proposed and explained. The approach perceives the hydrological data as a sequence of groups rather than single-valued observations. The techniques suggested to handle the group approach are regression, time series analysis, partitioning modelling, and artificial neural networks. Pertinent literature is reviewed and background material is used to support the group approach. Implementation and comparisons of the models' performance are deferred to the second paper.
Structured Point Distribution Models:
 12th British Machine Vision Conference
, 2001
Point distribution models have been successful in describing the shape constraints on two-dimensional objects for shape description and image search. It is often the case that a class of objects to be modelled contains certain features which may be wholly present or absent in different instances. Moustaches on faces are a common example. Here we describe a method of coding the presence or absence of a feature within the PDM framework. We show that the method captures the intermittent nature of the feature as one of the modes of variation, and demonstrate that, where features are intermittently present, greater model specificity is achieved.
Presented at 1981 Joint Statistical Meetings
, 1981
Missing data items are a practical reality of most statistical investigations. The reasons for an item being missing are numerous and frequently out of the control of the investigator. Experimental animals may die during the course of the experiment; a field plot may be ravaged by pests; subjects may move
STRATEGIES FOR MULTIVARIATE RANDOMIZATION ANALYSES AND APPLICATIONS TO HEALTH SCIENCES DATA
, 1982
THE THEORY AND APPLICATION OF A GENERAL ITERATIVE MAXIMUM LIKELIHOOD PROCEDURE TO RANDOMLY CENSORED UNIVARIATE AND BIVARIATE NORMAL AND LINEAR MODELS
, 1978
A general iterative maximum likelihood procedure for estimation of the parameters of a randomly censored univariate or bivariate normal distribution is developed, based on Orchard and Woodbury's "Missing Information Principle" (MIP). This procedure is applied as a general solution to univariate and bivariate k-sample estimation problems and multiple linear regression estimation problems in the presence of random censoring on one or both variables. The procedure is applied in the univariate cases to data sets from the literature for which specific methods have been developed and for which solutions are therefore known. Simulations are run in several bivariate cases to establish the small sample characteristics of the estimates under several censoring regimens, and to demonstrate the general applicability of the procedure. Likelihood ratio tests are derived for use under random censorship conditions for both univariate and bivariate normal problems, and,
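The kind of iteration described here can be sketched for the simplest case, a right-censored univariate normal sample. This is an EM-style illustration in the spirit of the Missing Information Principle, not the procedure from the report: the function names `em_censored_normal` and `_hazard` are hypothetical, and only right censoring of a single variable is handled.

```python
import math

def _hazard(alpha):
    """Standard normal density divided by the upper tail probability."""
    pdf = math.exp(-0.5 * alpha * alpha) / math.sqrt(2.0 * math.pi)
    tail = 0.5 * math.erfc(alpha / math.sqrt(2.0))
    return pdf / tail

def em_censored_normal(values, censored, n_iter=500, tol=1e-10):
    """Estimate (mu, sigma) of a normal sample with right-censored entries.

    values[i] is the observed value, or the censoring point when
    censored[i] is true (the true value is only known to exceed it).
    """
    n = len(values)
    # Start from the naive moments that ignore censoring.
    mu = sum(values) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / n) or 1.0
    for _ in range(n_iter):
        s1 = s2 = 0.0
        for v, c in zip(values, censored):
            if c:
                # Replace x and x^2 by their conditional expectations
                # given x > v under the current parameter estimates.
                lam = _hazard((v - mu) / sigma)
                s1 += mu + sigma * lam
                s2 += mu * mu + sigma * sigma + sigma * lam * (v + mu)
            else:
                s1 += v
                s2 += v * v
        # Update the parameters from the completed sufficient statistics.
        new_mu = s1 / n
        new_sigma = math.sqrt(max(s2 / n - new_mu ** 2, 1e-300))
        done = abs(new_mu - mu) < tol and abs(new_sigma - sigma) < tol
        mu, sigma = new_mu, new_sigma
        if done:
            break
    return mu, sigma
```

With no censored entries the function returns the ordinary sample mean and the maximum likelihood (divide-by-n) standard deviation after one pass. With censoring, the estimated mean moves above the naive mean of the recorded values.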
ANALYSIS OF CATEGORICAL DATA FROM LONGITUDINAL STUDIES OF SUBJECTS WITH POSSIBLY CLUSTERED STRUCTURES
 Institute of Statistics Mimeo Series No. 1844T
, January 1988
Many references in the health sciences dealing with categorical data from studies with repeated measurements and possibly clustering present analyses that do not attempt to capture either of these special aspects of the data. Sometimes there may be analyses taking into consideration the longitudinal feature of the data; but for the most part, little attention has been given to any clustering of the attribute data. Studies which involve a data structure with clustering should be analyzed first by viewing the cluster as the basic unit of analysis. Only if correlation among cluster subunits is found negligible can cluster subunits be appropriately viewed as the basic analytical units. In this way, care is taken so that variances are neither underestimated nor overestimated. Applications of existing methods of analysis for longitudinal categorical data are