Results 1–10 of 35
Functional data analysis for sparse longitudinal data
 J. Am. Statist. Assoc.
, 2005
Cited by 59 (19 self)
We propose a nonparametric method to perform functional principal components analysis for the case of sparse longitudinal data. The method aims at irregularly spaced longitudinal data, where the number of repeated measurements available per subject is small. In contrast, classical functional data analysis requires a large number of regularly spaced measurements per subject. We assume that the repeated measurements are randomly located, with a random number of repetitions for each subject, and are determined by an underlying smooth random (subject-specific) trajectory plus measurement errors. Basic elements of our approach are the parsimonious estimation of the covariance structure and mean function of the trajectories, and the estimation of the variance of the measurement errors. The eigenfunction basis is estimated from the data, and functional principal component score estimates are obtained by a conditioning step. This conditional estimation method is conceptually simple and straightforward to implement. A key step is the derivation of asymptotic consistency and distribution results under mild conditions, using tools from functional analysis.
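For one subject, the conditioning step described in this abstract has a closed linear form: the best linear predictor of the scores given the sparse observations. A minimal numpy sketch, assuming the mean function, eigenfunctions, eigenvalues, and error variance have already been estimated by some other means (all names below are illustrative, not the authors' code):

```python
import numpy as np

def pace_scores(y_i, mu_i, phi_i, lam, sigma2):
    """Conditional-expectation estimate of FPC scores for one subject.

    y_i    : (n_i,) sparse, noisy measurements of the trajectory
    mu_i   : (n_i,) estimated mean function at the observation times
    phi_i  : (n_i, K) estimated eigenfunctions at the observation times
    lam    : (K,) estimated eigenvalues of the covariance operator
    sigma2 : estimated measurement-error variance
    """
    # Covariance of the observed vector: Phi Lambda Phi^T + sigma^2 I
    cov_yi = phi_i @ np.diag(lam) @ phi_i.T + sigma2 * np.eye(len(y_i))
    # E[xi_k | Y_i] = lambda_k * phi_k^T * cov_yi^{-1} * (Y_i - mu_i)
    return np.diag(lam) @ phi_i.T @ np.linalg.solve(cov_yi, y_i - mu_i)
```

Solving the linear system rather than forming an explicit inverse keeps the step stable even when a subject contributes only a handful of measurements.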
Functional linear regression analysis for longitudinal data
 Ann. Statist.
, 2005
Cited by 22 (6 self)
We propose nonparametric methods for functional linear regression which are designed for sparse longitudinal data, where both the predictor and response are functions of a covariate such as time. Predictor and response processes have smooth random trajectories, and the data consist of a small number of noisy repeated measurements made at irregular times for a sample of subjects. In longitudinal studies, the number of repeated measurements per subject is often small and may be modeled as a discrete random number and, accordingly, only a finite and asymptotically nonincreasing number of measurements are available for each subject or experimental unit. We propose a functional regression approach for this situation, using functional principal component analysis, where we estimate the functional principal component scores through conditional expectations. This allows the prediction of an unobserved response trajectory from sparse measurements of a predictor trajectory. The resulting technique is flexible ...
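The prediction of an unobserved response trajectory can be sketched in two steps: regress estimated response scores on predictor scores across training subjects, then rebuild the new subject's curve in the response eigenbasis. A simplified numpy sketch (the paper derives these quantities through conditional expectations; every name below is illustrative):

```python
import numpy as np

def predict_response_traj(xi_new, xi_train, zeta_train, mu_y, psi):
    """Predict an unobserved response trajectory from predictor FPC scores.

    xi_new     : (K,) predictor scores of the new subject
    xi_train   : (n, K) predictor scores of the training subjects
    zeta_train : (n, M) response scores of the training subjects
    mu_y       : (T,) estimated mean response curve on a grid
    psi        : (T, M) estimated response eigenfunctions on the same grid
    """
    # One least-squares fit per response component: zeta_m ~ xi
    B, *_ = np.linalg.lstsq(xi_train, zeta_train, rcond=None)  # (K, M)
    zeta_hat = xi_new @ B
    # Reconstruct the curve in the response eigenbasis
    return mu_y + psi @ zeta_hat
```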
New estimation and model selection procedures for semiparametric modeling in longitudinal data analysis
 J. Am. Statist. Assoc.
, 2004
Cited by 21 (8 self)
Semiparametric regression models are very useful for longitudinal data analysis. The complexity of semiparametric models and the structure of longitudinal data pose new challenges to parametric inferences and model selection that frequently arise from longitudinal data analysis. In this article, two new approaches are proposed for estimating the regression coefficients in a semiparametric model. The asymptotic normality of the resulting estimators is established. An innovative class of variable selection procedures is proposed to select significant variables in the semiparametric models. The proposed procedures are distinguished from others in that they simultaneously select significant variables and estimate unknown parameters. Rates of convergence of the resulting estimators are established. With a proper choice of regularization parameters and penalty functions, the proposed variable selection procedures are shown to perform as well as an oracle estimator. A robust standard error formula is derived using a sandwich formula and is empirically tested. Local polynomial regression techniques are used to estimate the baseline function in the semiparametric model.
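The abstract does not name the penalty, but variable selection procedures with the oracle property of this kind are typically built on a nonconvex penalty such as SCAD (Fan and Li). Its univariate thresholding rule, which sets small coefficients exactly to zero while leaving large ones unpenalized, has a closed form; a sketch (the choice a = 3.7 is the conventional default, not taken from this abstract):

```python
import numpy as np

def scad_threshold(z, lam, a=3.7):
    """SCAD thresholding rule: closed-form minimizer of
    0.5*(z - theta)^2 + p_lam(|theta|) for the SCAD penalty."""
    z = np.asarray(z, dtype=float)
    az = np.abs(z)
    soft = np.sign(z) * np.maximum(az - lam, 0.0)          # |z| <= 2*lam
    mid = ((a - 1) * z - np.sign(z) * a * lam) / (a - 2)   # 2*lam < |z| <= a*lam
    return np.where(az <= 2 * lam, soft, np.where(az <= a * lam, mid, z))
```

Small inputs are soft-thresholded to exactly zero (this is what selects variables), intermediate inputs are shrunk less aggressively, and large inputs pass through unchanged, which is what makes oracle-type behavior possible.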
Wavelet-Based Nonparametric Modeling of Hierarchical Functions in Colon Carcinogenesis
 J. Am. Statist. Assoc.
, 2003
Cited by 20 (11 self)
In this article we develop new methods for analyzing the data from an experiment using rodent models to investigate the effect of type of dietary fat on O6-methylguanine-DNA methyltransferase (MGMT), an important biomarker in early colon carcinogenesis. The data consist of observed profiles over a spatial variable contained within a two-stage hierarchy, a structure that we dub hierarchical functional data. We present a new method providing a unified framework for modeling these data, simultaneously yielding estimates and posterior samples for mean, individual, and subsample-level profiles, as well as covariance parameters at the various hierarchical levels. Our method is nonparametric in that it does not require the prespecification of parametric forms for the functions and involves modeling in the wavelet space, which is especially effective for spatially heterogeneous functions as encountered in the MGMT data. Our approach is Bayesian; the only informative hyperparameters in our model are effectively smoothing parameters. Analysis of this dataset yields interesting new insights into how MGMT operates in early colon carcinogenesis, and how this may depend on diet. Our method is general, so it can be applied to other settings where hierarchical functional data are encountered.
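As a toy illustration of what "modeling in the wavelet space" starts from, here is a plain Haar decomposition in numpy, a stand-in for whatever wavelet basis the paper actually uses; it assumes a dyadic-length signal:

```python
import numpy as np

def haar_dwt(x):
    """Full Haar wavelet decomposition of a length-2^J signal.

    Returns the overall (scaled) average followed by detail
    coefficients, coarse to fine. The transform is orthonormal,
    so it preserves the Euclidean norm of the signal.
    """
    x = np.asarray(x, dtype=float)
    details = []
    while len(x) > 1:
        avg = (x[0::2] + x[1::2]) / np.sqrt(2)   # smooth part
        det = (x[0::2] - x[1::2]) / np.sqrt(2)   # detail part
        details.append(det)
        x = avg
    return np.concatenate([x] + details[::-1])
```

Spatially heterogeneous profiles, like the MGMT data described above, concentrate their energy in a few such coefficients, which is why shrinkage in the wavelet domain adapts well to them.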
Functional additive models
 J. Am. Statist. Assoc.
Cited by 11 (5 self)
In commonly used functional regression models, the regression of a scalar or functional response on the functional predictor is assumed to be linear. This means the response is a linear function of the functional principal component scores of the predictor process. We relax the linearity assumption and propose to replace it by an additive structure. This leads to a more widely applicable and much more flexible framework for functional regression models. The proposed functional additive regression models are suitable for both scalar and functional responses. The regularization needed for effective estimation of the regression parameter function is implemented through a projection on the eigenbasis of the covariance operator of the functional components in the model. The utilization of functional principal components in an additive rather than linear way leads to substantial broadening of the scope of functional regression models and emerges as a natural approach, as the uncorrelatedness of the functional principal components is shown to lead to a straightforward implementation of the functional additive model, just based on a sequence of one-dimensional smoothing steps and without need for backfitting. This facilitates the theoretical analysis, and we establish asymptotic ...
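The "sequence of one-dimensional smoothing steps without backfitting" can be sketched with one kernel smoother per component score; a minimal, assumption-laden sketch for a scalar response (the Gaussian kernel and fixed bandwidth are arbitrary choices, not the authors'):

```python
import numpy as np

def fam_fit_predict(xi, y, xi_new, h=0.5):
    """Functional additive model sketch: one 1-D Nadaraya-Watson
    smoother per FPC score, summed -- no backfitting, relying on
    the uncorrelatedness of the scores.

    xi     : (n, K) predictor FPC scores of the training subjects
    y      : (n,) scalar responses
    xi_new : (m, K) scores at which to predict
    """
    y_c = y - y.mean()
    pred = np.full(len(xi_new), y.mean())
    for k in range(xi.shape[1]):
        # Gaussian-kernel weights between new and training scores
        w = np.exp(-0.5 * ((xi_new[:, k, None] - xi[None, :, k]) / h) ** 2)
        # Component function f_k estimated by smoothing centered y on xi_k
        pred += (w @ y_c) / w.sum(axis=1)
    return pred
```

Because each component is fit marginally, adding a component never requires refitting the others, which is the computational point the abstract makes.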
Functional data analysis for sparse auction data
 In Statistical Methods in eCommerce Research
, 2008
Cited by 10 (4 self)
Bid arrivals of eBay auctions often exhibit “bid sniping”, a phenomenon where “snipers” place their bids at the last moments of an auction. This is one reason why bid histories for eBay auctions tend to have sparse data in the middle and denser data both in the beginning and at the end of the auction. Time spacing of the bids is thus irregular and sparse. For nearly identical products that are auctioned repeatedly, one may view the price history of each of these auctions as a realization of an underlying smooth stochastic process, the price process. While the traditional Functional Data Analysis (FDA) approach requires that entire trajectories of the underlying process are observed without noise, this assumption is not satisfied for typical auction data. We provide a review of a recently developed version of functional principal component analysis (Yao et al., 2005), which is geared towards sparse, irregularly observed and noisy data, the principal analysis through conditional expectation (PACE) method. The PACE method borrows and pools information from the sparse data in all auctions. This allows the recovery of the price process even in situations where only a few bids are observed. In a modified approach, we adapt PACE to summarize the bid history for varying current times during an ongoing auction through time-varying principal component scores. These scores then serve as time-varying predictors for the closing price. We study the resulting time-varying predictions using both linear regression and generalized additive modelling, with current scores as predictors. These methods will be illustrated with a case study for 157 Palm M515 PDA auctions from eBay, and the proposed methods are seen to work reasonably well. Other related issues will also be discussed.
Functional Linear Regression That’s Interpretable
Cited by 8 (3 self)
Regression models to relate a scalar Y to a functional predictor X(t) are becoming increasingly common. Work in this area has concentrated on estimating a coefficient function, β(t), with Y related to X(t) through ∫ β(t)X(t) dt. Regions where β(t) ≠ 0 correspond to places where there is a relationship between X(t) and Y. Alternatively, points where β(t) = 0 indicate no relationship. Hence, for interpretation purposes, it is desirable for a regression procedure to be capable of producing estimates of β(t) that are exactly zero over regions with no apparent relationship and have simple structures over the remaining regions. Unfortunately, most fitting procedures result in an estimate for β(t) that is rarely exactly zero and has unnatural wiggles making the curve hard to interpret. In this article we introduce a new approach which uses variable selection ideas, applied to various derivatives of β(t), to produce estimates that are interpretable, flexible, and accurate. We call our method “Functional Linear Regression That’s Interpretable” (FLiRTI) and demonstrate it on simulated and real-world data sets. In addition, non-asymptotic theoretical bounds on the estimation error are presented. The bounds provide strong theoretical motivation for our approach.
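The idea of producing exactly-zero regions by applying variable selection to a derivative of β(t) can be sketched with a first-difference reparameterization and a plain lasso; the paper works with higher-order derivatives and more refined fitting, so this is only the simplest instance, and every name below is illustrative:

```python
import numpy as np

def lasso_cd(W, y, lam, n_iter=200):
    """Plain coordinate-descent lasso:
    min over eta of 0.5*||y - W eta||^2 + lam*||eta||_1."""
    eta = np.zeros(W.shape[1])
    col_sq = (W ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(W.shape[1]):
            # Partial residual excluding coordinate j
            r_j = y - W @ eta + W[:, j] * eta[j]
            z = W[:, j] @ r_j
            # Soft-thresholding update: exact zeros for small signals
            eta[j] = np.sign(z) * max(abs(z) - lam, 0.0) / col_sq[j]
    return eta

def flirti_first_diff(X, y, lam):
    """Sparse-derivative sketch: penalize the first differences of
    beta(t) so the estimate is piecewise constant, with exact zeros
    over inactive regions.

    X : (n, p) functional predictor sampled on a grid of p points
    y : (n,) scalar responses
    """
    p = X.shape[1]
    L = np.tril(np.ones((p, p)))   # beta = L @ eta, eta = first differences
    W = (X @ L) / p                # Riemann-sum approximation of the integral
    return L @ lasso_cd(W, y, lam)
```

Sparsity in the differences eta translates into flat stretches of the estimated beta(t), which is exactly the interpretability property the abstract describes.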
Analysis of Longitudinal Data With Semiparametric Estimation of Covariance Function
, 2005
Cited by 7 (2 self)
Improving efficiency for regression coefficients and predicting trajectories of individuals are two important aspects of the analysis of longitudinal data. Both involve estimation of the covariance function. Yet challenges arise in estimating the covariance function of longitudinal data collected at irregular time points. A class of semiparametric models for the covariance function that imposes a parametric correlation structure while allowing a nonparametric variance function is proposed. A kernel estimator for the nonparametric variance function is developed. Two methods for estimating parameters in the correlation structure—a quasi-likelihood approach and a minimum generalized variance method—are proposed. A semiparametric varying-coefficient partially linear model for longitudinal data is introduced, and an estimation procedure for model coefficients using a profile weighted least squares approach is proposed. Sampling properties of the proposed estimation procedures are studied, and asymptotic normality of the resulting estimators is established. Finite-sample performance of the proposed procedures is assessed by Monte Carlo simulation studies. The proposed methodology is illustrated with an analysis of a real data example. KEY WORDS: Kernel regression; Local linear regression; Profile weighted least squares; Semiparametric varying-coefficient model.
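A kernel estimator of a nonparametric variance function, as described here, amounts to smoothing squared residuals over the observation times; a minimal sketch with a Gaussian kernel (kernel and bandwidth are illustrative choices, not the paper's):

```python
import numpy as np

def kernel_variance(t_obs, resid, t_grid, h=0.2):
    """Kernel estimate of the variance function sigma^2(t): a
    Nadaraya-Watson smooth of squared residuals over time.

    t_obs  : (N,) pooled observation times across subjects
    resid  : (N,) residuals after removing the estimated mean
    t_grid : (T,) grid on which to evaluate the variance function
    """
    # Gaussian-kernel weights between grid points and observation times
    w = np.exp(-0.5 * ((t_grid[:, None] - t_obs[None, :]) / h) ** 2)
    # Weighted average of squared residuals at each grid point
    return (w @ resid ** 2) / w.sum(axis=1)
```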
Inference for covariate adjusted regression via varying coefficient models
, 2005
Cited by 4 (1 self)
We consider covariate adjusted regression (CAR), a regression method for situations where predictors and response are observed after being distorted by a multiplicative factor. The distorting factors are unknown functions of an observable covariate, where one specific distorting function is associated with each predictor or response. The dependence of both response and predictors on the same confounding covariate may alter the underlying regression relation between undistorted but unobserved predictors and response. We consider a class of highly flexible adjustment methods for parameter estimation in the underlying regression model, which is the model of interest. Asymptotic normality of the estimates is obtained by establishing a connection to varying coefficient models. These distribution results combined with proposed consistent estimates of the asymptotic variance are used for the construction of asymptotic confidence intervals for the regression coefficients. The proposed approach is illustrated with data on serum creatinine, and finite sample properties of the proposed procedures are investigated through a simulation study. Key words and phrases. Asymptotic normality, binning, confidence intervals, multiple regression, multiplicative effects, varying coefficient model.
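The binning idea behind this family of adjustment methods can be sketched directly: slice the sample by the observed confounder, fit the distorted regression within each slice, and pool the slice-level coefficients. A simplified single-predictor sketch (the actual CAR estimators handle the identifiability of the distortion functions more carefully; names are illustrative):

```python
import numpy as np

def car_binning(y_t, x_t, u, n_bins=10):
    """Covariate adjusted regression via equidistant binning: fit the
    regression of distorted response on distorted predictor within each
    bin of the confounder U, then average the bin-level slopes,
    weighted by bin counts.

    y_t : (n,) distorted response;  x_t : (n,) distorted predictor
    u   : (n,) observed confounder driving the distortion
    """
    edges = np.linspace(u.min(), u.max(), n_bins + 1)
    idx = np.clip(np.digitize(u, edges) - 1, 0, n_bins - 1)
    slopes, counts = [], []
    for b in range(n_bins):
        m = idx == b
        if m.sum() < 2:          # skip bins too small to fit a line
            continue
        Xb = np.column_stack([np.ones(m.sum()), x_t[m]])
        coef, *_ = np.linalg.lstsq(Xb, y_t[m], rcond=None)
        slopes.append(coef[1])
        counts.append(m.sum())
    return np.average(slopes, weights=counts)  # pooled slope estimate
```

Within a narrow bin the distorting functions are nearly constant, so each bin-level regression sees approximately undistorted variables; pooling across bins recovers the underlying coefficient.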
Covariate adjusted regression
 Biometrika
, 2005
Cited by 3 (1 self)
The method of covariate adjusted regression was recently proposed for situations where both predictors and response in a regression model are not directly observed, but are observed after being contaminated by unknown functions of a common observable confounder in a multiplicative fashion. One example is data collected for a study on diabetes, where the variables of interest, systolic and diastolic blood pressures and glycosylated hemoglobin levels, are known to be influenced by an observable confounder, body mass index. A currently available estimation procedure based on equidistant binning (EB) gives consistent estimators for the regression coefficients adjusted for the confounder. In this paper, we propose two new estimation procedures based on nearest neighbor binning (NB) and local polynomial modeling (LP). Even though the three methods perform similarly in terms of their bias, it is shown through simulation studies that NB has smaller variance compared to EB, and LP yields substantially lower variance relative to the two binning methods for small to moderate sample sizes. The consistency and convergence rates of the proposed LP estimators, which have the smallest MSE, are also established. We illustrate the proposed LP method with the above-mentioned diabetes data, where the goal is to uncover the regression relation between the response, glycosylated hemoglobin levels, and the predictors, systolic and diastolic blood pressures, adjusted for body mass index.