Nonparametric Mixed Effects Models for Unequally Sampled Noisy Curves
Biometrics, 1998
Cited by 69 (2 self)
We propose a method of analyzing collections of related curves in which the individual curves are modeled as spline functions with random coefficients. The method is applicable when the individual curves are sampled at variable and irregularly spaced points. This produces a low-rank, low-frequency approximation to the covariance structure, which can be estimated naturally by the EM algorithm. Smooth curves for individual trajectories are constructed as BLUP estimates, combining data from that individual and the entire collection. This framework leads naturally to methods for examining the effects of covariates on the shapes of the curves. We use model selection techniques (AIC, BIC, and cross-validation) to select the number of breakpoints for the spline approximation. We believe that the methodology we propose provides a simple, flexible, and computationally efficient means of functional data analysis. We illustrate it with two sets of data. 1 Introduction In recent years there ha...
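As a rough illustration of the selection step mentioned in the abstract above, the sketch below scores candidate spline fits by AIC and BIC and returns the number of breakpoints with the smallest score. The log-likelihoods, parameter counts, sample size, and function names are hypothetical placeholders, not values or code from the paper, and which sample size to use in BIC for mixed-effects models is itself an assumption here.

```python
import math

def aic(log_lik, n_params):
    # Akaike information criterion: smaller is better.
    return -2.0 * log_lik + 2.0 * n_params

def bic(log_lik, n_params, n_obs):
    # Bayesian information criterion: penalty grows with log(n_obs).
    return -2.0 * log_lik + n_params * math.log(n_obs)

def select_breakpoints(fits, n_obs, criterion="aic"):
    """Pick the number of breakpoints minimizing the chosen criterion.

    fits maps a candidate number of breakpoints to a tuple
    (maximized log-likelihood, number of free parameters).
    """
    scores = {}
    for k, (log_lik, n_params) in fits.items():
        if criterion == "aic":
            scores[k] = aic(log_lik, n_params)
        else:
            scores[k] = bic(log_lik, n_params, n_obs)
    best = min(scores, key=scores.get)
    return best, scores

# Hypothetical fitted models (made-up numbers for illustration only).
candidate_fits = {2: (-520.0, 8), 4: (-505.0, 12), 6: (-500.0, 16)}
best_aic, _ = select_breakpoints(candidate_fits, n_obs=200, criterion="aic")
best_bic, _ = select_breakpoints(candidate_fits, n_obs=200, criterion="bic")
```

With these made-up inputs the two criteria disagree: BIC's heavier penalty per parameter favors the smaller model, which is the usual reason for reporting both.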
Continuous representations of time-series gene expression data
J Comput Biol
Cited by 62 (10 self)
We present algorithms for time-series gene expression analysis that permit the principled estimation of unobserved time points, clustering, and dataset alignment. Each expression profile is modeled as a cubic spline (piecewise polynomial) that is estimated from the observed data and every time point influences the overall smooth expression curve. We constrain the spline coefficients of genes in the same class to have similar expression patterns, while also allowing for gene-specific parameters. We show that unobserved time points can be reconstructed using our method with 10–15% less error when compared to previous best methods. Our clustering algorithm operates directly on the continuous representations of gene expression profiles, and we demonstrate that this is particularly effective when applied to nonuniformly sampled data. Our continuous alignment algorithm also avoids difficulties encountered by discrete approaches. In particular, our method allows for control of the number of degrees of freedom of the warp through the specification of parameterized functions, which helps to avoid overfitting. We demonstrate that our algorithm produces stable low-error alignments on real expression data and further show a specific application to yeast knockout data that produces biologically meaningful results. Key words: time series expression data, missing value estimation, clustering, alignment. 1.
Functional-coefficient Regression Models for Nonlinear Time Series
Journal of the American Statistical Association, 1998
Cited by 43 (11 self)
We apply the local linear regression technique for estimation of functional-coefficient regression models for time series data. The models include threshold autoregressive models (Tong 1990) and functional-coefficient autoregressive models (Chen and Tsay 1993) as special cases but with added advantages such as depicting finer structure of the underlying dynamics and better post-sample forecasting performance. We have also proposed a new bootstrap test for the goodness of fit of models and a bandwidth selector based on newly defined cross-validatory estimation for the expected forecasting errors. The proposed methodology is data-analytic and is of appreciable flexibility to analyze complex and multivariate nonlinear structures without suffering from the "curse of dimensionality". The asymptotic properties of the proposed estimators are investigated under the α-mixing condition. Both simulated and real data examples are used for illustration. Key Words: α-mixing; Asymptotic normali...
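To make the local linear idea in the abstract above concrete, here is a minimal sketch for the simplest functional-coefficient model with one regressor, y = a(u) x + noise. Near a target point u0 the coefficient is approximated by a + b (u - u0), and (a, b) are found by kernel-weighted least squares. The Gaussian kernel, the single-regressor restriction, and all names are assumptions made for illustration; this is not the authors' implementation and omits their bandwidth selector and bootstrap test.

```python
import math

def local_linear_coefficient(u, x, y, u0, bandwidth):
    """Local linear estimate of a(u0) in the model y = a(u) * x + noise.

    Solves the 2x2 weighted least-squares problem with regressors
    x_i and x_i * (u_i - u0), weighted by a Gaussian kernel in u.
    """
    s0 = s1 = s2 = t0 = t1 = 0.0
    for ui, xi, yi in zip(u, x, y):
        d = ui - u0
        w = math.exp(-0.5 * (d / bandwidth) ** 2)  # kernel weight
        s0 += w * xi * xi
        s1 += w * xi * xi * d
        s2 += w * xi * xi * d * d
        t0 += w * xi * yi
        t1 += w * xi * yi * d
    det = s0 * s2 - s1 * s1
    if abs(det) < 1e-12:
        raise ValueError("degenerate local design; try a wider bandwidth")
    # Intercept of the local fit is the estimate of a(u0).
    return (s2 * t0 - s1 * t1) / det

# Noise-free toy data with a coefficient that is exactly linear in u.
u_obs = [i / 20 for i in range(21)]
x_obs = [1.0 + 0.1 * (i % 3) for i in range(21)]
y_obs = [(1.0 + 2.0 * ui) * xi for ui, xi in zip(u_obs, x_obs)]
a_hat = local_linear_coefficient(u_obs, x_obs, y_obs, u0=0.4, bandwidth=0.2)
```

Because the toy coefficient a(u) = 1 + 2u is itself linear and the data carry no noise, the local linear fit recovers a(0.4) up to rounding; with real data the bandwidth controls the usual bias-variance trade-off.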
Two-Step Estimation of Functional Linear Models with Applications to Longitudinal Data
Journal of the Royal Statistical Society, Series B, 2000
Cited by 41 (5 self)
Functional linear models are useful in longitudinal data analysis. They include many classical and recently proposed statistical models for longitudinal data and other functional data. Recently, smoothing spline and kernel methods have been proposed for estimating their coefficient functions nonparametrically, but these methods are either intensive in computation or inefficient in performance. To overcome these drawbacks, in this paper, a simple and powerful two-step alternative is proposed. In particular, the implementation of the proposed approach via local polynomial smoothing is discussed. Methods for estimating standard deviations of estimated coefficient functions are also proposed. Some asymptotic results for the local polynomial estimators are established. Two longitudinal data sets, one of which involves time-dependent covariates, are used to demonstrate the proposed approach. Simulation studies show that our two-step approach improves the kernel method proposed in Hoover, et al...
Smoothing spline ANOVA models for large data sets with Bernoulli observations and the randomized GACV
Ann. Statist
Cited by 41 (19 self)
(ranGACV) method for choosing multiple smoothing parameters in penalized likelihood estimates for Bernoulli data. The method is intended for application with penalized likelihood smoothing spline ANOVA models. In addition we propose a class of approximate numerical methods for solving the penalized likelihood variational problem which, in conjunction with the ranGACV method, allows the application of smoothing spline ANOVA models with Bernoulli data to much larger data sets than previously possible. These methods are based on choosing an approximating subset of the natural (representer) basis functions for the variational problem. Simulation studies with synthetic data, including synthetic data mimicking demographic risk factor data sets, are used to examine the properties of the method and to compare the approach with the GRKPACK code of Wang (1997c). Bayesian “confidence intervals” are obtained for the fits and are shown in the simulation studies to have the “across the function” property usually claimed for these confidence intervals. Finally the method is applied
Generalized functional linear models
Ann. Statist, 2005
Cited by 40 (5 self)
We propose a generalized functional linear regression model for a regression situation where the response variable is a scalar and the predictor is a random function. A linear predictor is obtained by forming the scalar product of the predictor function with a smooth parameter function, and the expected value of the response is related to this linear predictor via a link function. If in addition a variance function is specified, this leads to a functional estimating equation which corresponds to maximizing a functional quasi-likelihood. This general approach includes the special cases of the functional linear model, as well as functional Poisson regression and functional binomial regression. The latter leads to procedures for classification and discrimination of stochastic processes and functional data. We also consider the situation where the link and variance functions are unknown and are estimated nonparametrically from the data, using a semiparametric quasi-likelihood procedure. An essential step in our proposal is dimension reduction by approximating the predictor processes with a truncated Karhunen-Loève expansion. We develop asymptotic inference for the proposed class of generalized regression models. In the proposed asymptotic approach, the truncation parameter increases with sample size, and a martingale central limit theorem is applied to establish the resulting increasing dimension asymptotics. We establish asymptotic normality for a properly scaled distance
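The model structure described in the abstract above can be sketched numerically: the linear predictor is the integral of the predictor curve times the parameter function, and the mean response is obtained through an inverse link. The sketch below assumes curves observed on a common grid, uses the trapezoidal rule in place of the truncated Karhunen-Loève expansion the abstract describes, and picks the logistic link as one example (the functional binomial case). All function names are hypothetical.

```python
import math

def trapezoid(values, grid):
    # Trapezoidal approximation of the integral of values over grid.
    total = 0.0
    for i in range(1, len(grid)):
        total += 0.5 * (values[i] + values[i - 1]) * (grid[i] - grid[i - 1])
    return total

def linear_predictor(x_curve, beta_curve, grid, intercept=0.0):
    # eta = intercept + integral of x(t) * beta(t) dt
    products = [xv * bv for xv, bv in zip(x_curve, beta_curve)]
    return intercept + trapezoid(products, grid)

def logistic_mean(eta):
    # Inverse logit link: expected value of a binary response.
    return 1.0 / (1.0 + math.exp(-eta))

# Toy curves on [0, 1]: x(t) = 1 and beta(t) = t, so the integral is 0.5.
grid = [i / 10 for i in range(11)]
x_curve = [1.0] * len(grid)
beta_curve = list(grid)
eta = linear_predictor(x_curve, beta_curve, grid, intercept=-0.5)
mean_response = logistic_mean(eta)
```

In this toy case the intercept cancels the integral, so the linear predictor is zero and the logistic mean is one half.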
Wavelet-based functional mixed models
Journal of the Royal Statistical Society, Series B, 2006
Cited by 37 (10 self)
Summary. Increasingly, scientific studies yield functional data, in which the ideal units of observation are curves and the observed data consist of sets of curves that are sampled on a fine grid. We present new methodology that generalizes the linear mixed model to the functional mixed model framework, with model fitting done by using a Bayesian wavelet-based approach. This method is flexible, allowing functions of arbitrary form and the full range of fixed effects structures and between-curve covariance structures that are available in the mixed model framework. It yields nonparametric estimates of the fixed and random-effects functions as well as the various between-curve and within-curve covariance matrices. The functional fixed effects are adaptively regularized as a result of the nonlinear shrinkage prior that is imposed on the fixed effects’ wavelet coefficients, and the random-effect functions experience a form of adaptive regularization because of the separately estimated variance components for each wavelet coefficient. Because we have posterior samples for all model quantities, we can perform pointwise or joint Bayesian inference or prediction on the quantities of the model. The adaptiveness of the method makes it especially appropriate for modelling irregular functional data that are characterized by numerous local features like peaks.
Nonparametric Function Estimation for Clustered Data When the Predictor is Measured Without/With Error
Journal of the American Statistical Association, 1999
Cited by 32 (6 self)
We consider local polynomial kernel regression with a single covariate for clustered data using estimating equations. We assume that at most m < ∞ observations are available on each cluster. In the case of random regressors, with no measurement error in the predictor, we show that it is generally the best strategy to ignore entirely the correlation structure within each cluster, and instead to pretend that all observations are independent. In the further special case of longitudinal data on individuals with fixed common observation times, we show that equivalent to the pooled data approach is the strategy of fitting separate nonparametric regressions at each observation time and constructing an optimal weighted average. We also consider what happens when the predictor is measured with error. Using the SIMEX approach to correct for measurement error, we construct an asymptotic theory for both the pooled and weighted average estimators. Surprisingly, for the same amount of smoothing, t...
Properties of principal component methods for functional and longitudinal data analysis
Ann. Statist, 2006
Cited by 32 (5 self)
The use of principal component methods to analyze functional data is appropriate in a wide range of different settings. In studies of “functional data analysis,” it has often been assumed that a sample of random functions is observed precisely, in the continuum and without noise. While this has been the traditional setting for functional data analysis, in the context of longitudinal data analysis a random function typically represents a patient, or subject, who is observed at only a small number of randomly distributed points, with nonnegligible measurement error. Nevertheless, essentially the same methods can be used in both these cases, as well as in the vast number of settings that lie between them. How is performance affected by the sampling plan? In this paper we answer that question. We show that if there is a sample of n functions, or subjects, then estimation of eigenvalues is a semiparametric problem, with root-n consistent estimators, even if only a few observations are made of each function,