Analysis of incomplete climate data: Estimation of mean values and covariance matrices and imputation of missing values
, 2001
Cited by 19 (2 self)
Estimating the mean and the covariance matrix of an incomplete dataset and filling in missing values with imputed values is generally a nonlinear problem, which must be solved iteratively. The expectation maximization (EM) algorithm for Gaussian data, an iterative method both for the estimation of mean values and covariance matrices from incomplete datasets and for the imputation of missing values, is taken as the point of departure for the development of a regularized EM algorithm. In contrast to the conventional EM algorithm, the regularized EM algorithm is applicable to sets of climate data, in which the number of variables typically exceeds the sample size. The regularized EM algorithm is based on iterated analyses of linear regressions of variables with missing values on variables with available values, with regression coefficients estimated by ridge regression, a regularized regression method in which a continuous regularization parameter controls the filtering of the noise in the data. The regularization parameter is determined by generalized cross-validation, such as to minimize, approximately, the expected mean squared error of the imputed values. The regularized EM algorithm can estimate, and exploit for the imputation of missing values, both synchronic and diachronic covariance matrices, which may contain information on spatial covariability, stationary temporal covariability, or cyclostationary temporal covariability. A test of the regularized EM algorithm with simulated surface temperature data demonstrates that the algorithm is applicable to typical sets of climate data and that it leads to more accurate estimates of the missing values than a conventional non-iterative imputation technique.
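The iteration described in this abstract can be sketched roughly as follows. This is a simplified illustration, not the paper's algorithm: the function name `ridge_em_impute` is hypothetical, the ridge penalty is a fixed constant added to the diagonal (the abstract selects the regularization parameter by generalized cross-validation, which is omitted here), and every column is assumed to have at least one observed value.

```python
import numpy as np

def ridge_em_impute(X, ridge=0.1, n_iter=100, tol=1e-8):
    """EM-style imputation of NaNs with ridge-regularized regressions.

    Assumption: `ridge` is a fixed penalty, not chosen by cross-validation.
    Returns the filled data, the estimated mean, and the estimated covariance.
    """
    X = np.asarray(X, dtype=float)
    miss = np.isnan(X)
    n, p = X.shape
    # Initial guess: replace missing values with column means.
    filled = np.where(miss, np.nanmean(X, axis=0), X)
    mean = filled.mean(axis=0)
    centered = filled - mean
    cov = centered.T @ centered / n
    for _ in range(n_iter):
        previous = filled.copy()
        resid = np.zeros((p, p))  # accumulated conditional covariance of imputed values
        for i in np.flatnonzero(miss.any(axis=1)):
            m = miss[i]
            a = ~m
            if not a.any():
                # Nothing observed in this row: fall back to the current mean.
                filled[i] = mean
                resid += cov
                continue
            S_aa = cov[np.ix_(a, a)]
            S_am = cov[np.ix_(a, m)]
            # Ridge regression of the missing variables on the available ones.
            B = np.linalg.solve(S_aa + ridge * np.eye(int(a.sum())), S_am)
            filled[i, m] = mean[m] + (X[i, a] - mean[a]) @ B
            resid[np.ix_(m, m)] += cov[np.ix_(m, m)] - S_am.T @ B
        # Re-estimate mean and covariance from the completed data.
        mean = filled.mean(axis=0)
        centered = filled - mean
        cov = (centered.T @ centered + resid) / n
        if np.max(np.abs(filled - previous)) < tol:
            break
    return filled, mean, cov
```

Calling `ridge_em_impute(data)` on an array with `NaN` entries returns the completed array together with the mean and covariance estimates; observed entries are left unchanged.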
Modular Neural Networks for Medical Prognosis: Quantifying the Benefits of Combining Neural Networks for Survival Prediction
- Connection Science 9
, 1997
Cited by 5 (0 self)
This paper describes a medical application of modular neural networks for temporal pattern recognition. In order to increase the reliability of prognostic indices for patients living with the Acquired Immunodeficiency Syndrome (AIDS), survival prediction was performed in a system composed of modular neural networks that classified cases according to death in a certain year of follow-up. The output of each neural network module corresponded to the probability of survival in a given year. Inputs were the values of demographic, clinical, and laboratory variables. The results of the modules were combined to produce survival curves for individuals. The neural networks were trained by backpropagation and the results were evaluated in test sets of previously unseen cases. We showed that, for certain combinations of neural network modules, the performance of the prognostic index, measured by the area under the receiver operating characteristic (ROC) curve, was significantly improved (p<0.05). We...
Mixed Effects Model Analyses of Incomplete Longitudinal . . .
, 1984
Cited by 5 (2 self)
Incomplete longitudinal data are a common problem in clinical and epidemiological studies. This work was motivated by longitudinal studies of pulmonary function in young children characterized by both missing and mistimed observations as well as time-varying covariates. The objectives of this work were to develop a model and an analysis approach which 1) accommodated both missing and mistimed data and covariates which changed over time, 2) allowed for testing of hypotheses about both the fixed and random effects, and 3) were computationally feasible and practical. A generalized Mixed Effects Model was developed which relaxed some assumptions used by previous authors. In particular, the restriction that the within-subject variance is uncorrelated and homoscedastic ($\sigma^2 I$) was generalized to the form $\sigma^2 V_i$, where $V_i$ is any known positive definite matrix. Maximum Likelihood Estimators were derived and the EM algorithm and the Method of Scoring were used to solve the maximum likelihood equations. Randomly generated data were used in a preliminary exploration of the
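For orientation, one common way to write a linear mixed effects model with the within-subject covariance form mentioned in this abstract is sketched below. Only the error covariance $\sigma^2 V_i$ comes from the abstract; the symbols $X_i$, $Z_i$, $\beta$, $b_i$, and $D$ are assumed here, following the usual two-stage formulation, and may not match the notation of the work itself.

```latex
% Assumed notation: y_i is the response vector for subject i,
% X_i and Z_i are known design matrices, \beta are fixed effects,
% b_i are random effects with covariance D.
\begin{align}
  y_i &= X_i \beta + Z_i b_i + e_i, \\
  b_i &\sim N(0, D), \qquad e_i \sim N(0, \sigma^2 V_i), \\
  \operatorname{Cov}(y_i) &= Z_i D Z_i^{\top} + \sigma^2 V_i .
\end{align}
```

Setting $V_i = I$ recovers the uncorrelated, homoscedastic special case that the abstract describes as the earlier restriction.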
Structured Point Distribution Models:
- 12th British Machine Vision Conference
, 2001
Point distribution models have been successful in describing the shape constraints on two dimensional objects for shape description and image search. It is often the case that a class of objects to be modelled contains certain features which may be wholly present or absent in different instances. Moustaches on faces are a common example. Here we describe a method of coding the presence or absence of a feature within the PDM framework. We show that the method captures the intermittent nature of the feature as one of the modes of variation, and demonstrate that, where features are intermittently present, greater model specificity is achieved.
Presented at 1981 Joint Statistical Meetings
, 1981
Missing data items are a practical reality of most statistical investigations. The reasons for an item being missing are numerous and frequently out of the control of the investigator. Experimental animals may die during the course of the experiment; a field plot may be ravaged by pests; subjects may move
STRATEGIES FOR MULTIVARIATE RANDOMIZATION ANALYSES AND APPLICATIONS TO HEALTH SCIENCES DATA
, 1982
THE THEORY AND APPLICATION OF A GENERAL ITERATIVE MAXIMUM LIKELIHOOD PROCEDURE TO RANDOMLY CENSORED UNIVARIATE AND BIVARIATE NORMAL LINEAR MODELS
, 1978
A general iterative maximum likelihood procedure for estimation of the parameters of a randomly censored univariate or bivariate normal distribution is developed, based on Orchard and Woodbury's "Missing Information Principle" (MIP). This procedure is applied as a general solution to univariate and bivariate k-sample estimation problems and multiple linear regression estimation problems in the presence of random censoring on one or both variables. The procedure is applied in the univariate cases to data sets from the literature for which specific methods have been developed and for which solutions are therefore known. Simulations are run in several bivariate cases to establish the small sample characteristics of the estimates under several censoring regimens, and to demonstrate the general applicability of the procedure. Likelihood ratio tests are derived for use under random censorship conditions for both univariate and bivariate normal problems, and,
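To make the idea of iterative maximum likelihood under censoring concrete, here is a minimal sketch for the simplest case only: a univariate normal sample with right-censored observations. It uses a standard EM-style iteration rather than the procedure developed in this work; the function name `censored_normal_em` and its arguments are hypothetical, and censoring is assumed to be independent of the underlying values.

```python
import math
import numpy as np

# Vectorized complementary error function, using only the standard library.
_erfc = np.vectorize(math.erfc)

def censored_normal_em(y, censored, n_iter=500, tol=1e-10):
    """Estimate (mu, sigma) of a normal sample with right-censored values.

    y[i] is the observed value, or the censoring point if censored[i] is True
    (the true value is then only known to exceed y[i]).
    """
    y = np.asarray(y, dtype=float)
    censored = np.asarray(censored, dtype=bool)
    mu, sigma = y.mean(), y.std()  # naive starting values
    for _ in range(n_iter):
        z = (y - mu) / sigma
        pdf = np.exp(-0.5 * z**2) / math.sqrt(2.0 * math.pi)
        tail = np.maximum(0.5 * _erfc(z / math.sqrt(2.0)), 1e-300)
        hazard = pdf / tail
        # E-step: conditional first and second moments given x > y.
        ex = np.where(censored, mu + sigma * hazard, y)
        ex2 = np.where(censored,
                       mu**2 + sigma**2 + sigma * (y + mu) * hazard,
                       y**2)
        # M-step: update parameters from the expected sufficient statistics.
        new_mu = ex.mean()
        new_sigma = math.sqrt(ex2.mean() - new_mu**2)
        done = abs(new_mu - mu) + abs(new_sigma - sigma) < tol
        mu, sigma = new_mu, new_sigma
        if done:
            break
    return mu, sigma
```

Each censored observation contributes its conditional mean and second moment instead of its unknown true value, which is the sense in which the missing information is replaced by expectations at every iteration.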
ANALYSIS OF CATEGORICAL DATA FROM LONGITUDINAL STUDIES OF SUBJECTS WITH POSSIBLY CLUSTERED STRUCTURES
- Institute of Statistics Mimeo Series No. 1844T
, January 1988
Many references in the health sciences dealing with categorical data from studies with repeated measurements and possibly clustering present analyses that do not attempt to capture either of these special aspects of the data. Sometimes there may be analyses taking into consideration the longitudinal feature of the data; but for the most part, little attention has been given to any clustering of the attribute data. Studies which involve a data structure with clustering should be analyzed first by viewing the cluster as the basic unit of analysis. Only if correlation among cluster subunits is found negligible can cluster sub-units be appropriately viewed as the basic analytical units. In this way, care is taken so that estimates of variances are not underestimated (or overestimated). Applications of existing methods of analysis for longitudinal categorical data are
Estimation of SUR Model with Non-nested Missing Observations
, 1996
This paper considers alternative two-step estimators and their small sample properties for the seemingly unrelated regression (SUR) model with non-nested missing observations. A Monte Carlo experiment indicates that alternative estimators have more profound differences in their efficiency, compared to the case of nested missing observations. In particular, the two-step application of the Hartley-Hocking maximum likelihood estimator can realize a significant gain in efficiency. There are substantial losses in efficiency when only the subset of data that has complete observations is used in estimation.

