Results 1-3 of 3
Smoothing Spline ANOVA for Exponential Families, with Application to the Wisconsin Epidemiological Study of Diabetic Retinopathy
Ann. Statist., 1995

Cited by 101 (46 self)
Let $y_i$, $i = 1, \dots, n$ be independent observations with the density of $y_i$ of the form $h(y_i, f_i) = \exp[y_i f_i - b(f_i) + c(y_i)]$, where $b$ and $c$ are given functions and $b$ is twice continuously differentiable and bounded away from 0. Let $f_i = f(t(i))$, where $t = (t_1, \dots, t_d) \in T^{(1)} \otimes \cdots \otimes T^{(d)} = T$, the $T^{(\alpha)}$ are measurable spaces of rather general form, and $f$ is an unknown function on $T$ with some assumed `smoothness' properties. Given $\{y_i, t(i), i = 1, \dots, n\}$, it is desired to estimate $f(t)$ for $t$ in some region of interest contained in $T$. We develop the fitting of smoothing spline ANOVA models to this data of the form $f(t) = C + \sum_{\alpha} f_{\alpha}(t_{\alpha}) + \sum_{\alpha < \beta} f_{\alpha\beta}(t_{\alpha}, t_{\beta}) + \cdots$. The components of the decomposition satisfy side conditions which generalize the usual side conditions for parametric ANOVA. The estimate of f is obtained as the minimizer...
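As a concrete instance of the density form in the abstract above, the following minimal sketch takes the Bernoulli (binary response) case, where b(f) = log(1 + e^f) and c(y) = 0. The choice of the Bernoulli family and the function names `b`, `density`, and `mean` are assumptions made for illustration; the abstract itself leaves b and c general.

```python
import math

def b(f):
    """Log-partition function for the Bernoulli case: b(f) = log(1 + e^f)."""
    return math.log1p(math.exp(f))

def density(y, f):
    """h(y, f) = exp[y*f - b(f) + c(y)], with c(y) = 0 for y in {0, 1}."""
    return math.exp(y * f - b(f))

def mean(f):
    """b'(f): the success probability, i.e. the logistic function of f."""
    return 1.0 / (1.0 + math.exp(-f))

f = 0.7  # arbitrary illustrative value of the logit f_i
total = density(0, f) + density(1, f)  # should sum to 1 over y in {0, 1}
```

Here `density(1, f)` coincides with `mean(f)`, which reflects the general exponential-family fact that b'(f) is the expected value of y.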
Approximate Smoothing Spline Methods for Large Data Sets in the Binary Case
Department of Statistics, University of Wisconsin, Madison, WI, 1997

Cited by 12 (4 self)
We consider the use of smoothing splines in generalized additive models with binary responses in the large data set situation. Xiang and Wahba (1996) proposed using the Generalized Approximate Cross Validation (GACV) function as a method to choose (multiple) smoothing parameters in the binary data case and demonstrated through simulation that the GACV method compares well to existing iterative methods, as judged by the Kullback-Leibler distance of the estimate to the true function being fitted. However, the calculation of the GACV function involves solving an n by n linear system, where n is the sample size. As the sample size increases, the calculation becomes numerically unstable and infeasible. To reduce these computational problems we propose a randomized version of the GACV function, which is numerically stable. Furthermore, we use a clustering algorithm to choose a set of basis functions with which to approximate the exact additive smoothing spline estimate, which has a basis function for every data point. Combining these two approaches, we are able to extend smoothing spline methods in the binary response case to much larger data sets without sacrificing much accuracy.
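The abstract does not spell out how the GACV function is randomized. One common ingredient in randomized cross-validation criteria is a randomized trace estimate, which replaces an exact matrix trace with an average of quadratic forms in random probe vectors. The sketch below shows only that generic idea (a Hutchinson-style estimator with random signs); it is not the paper's GACV formula, and the matrix `A` and the function names are hypothetical.

```python
import random

def matvec(A, z):
    """Multiply a dense matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, z)) for row in A]

def randomized_trace(A, n_probes=200, seed=0):
    """Estimate tr(A) as the average of z^T A z over random +/-1 vectors z."""
    rng = random.Random(seed)
    n = len(A)
    total = 0.0
    for _ in range(n_probes):
        z = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        Az = matvec(A, z)
        total += sum(zi * ai for zi, ai in zip(z, Az))
    return total / n_probes

# Small illustrative symmetric matrix (made up for this example).
A = [[2.0, 0.5, 0.0],
     [0.5, 1.0, 0.3],
     [0.0, 0.3, 3.0]]
exact = sum(A[i][i] for i in range(len(A)))
approx = randomized_trace(A)
```

The appeal of this kind of estimator is that it needs only matrix-vector products, so the matrix never has to be formed or inverted explicitly when n is large.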
Model Fitting and Testing for NonGaussian Data with Large Data Sets
1996

Cited by 5 (2 self)
We consider the application of the smoothing spline to the generalized linear model in large data set situations. First we derive a Generalized Approximate Cross Validation function (GACV), which is an approximate leave-out-one cross validation function used to choose smoothing parameters. In order to apply the GACV function to a large data set situation, we propose a corresponding randomized version of it. To reduce the computational intensity of calculating the smoothing spline estimate, we suggest an approximate solution and a clustering method to choose a subset of the basis functions. Combining randomized GACV with this approximate solution, we apply it to binary response data from the Wisconsin Epidemiological Study of Diabetic Retinopathy in order to establish the accuracy of the model when applied to a large data set.
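The abstract does not say which clustering method is used to pick the subset of basis functions. As a rough sketch of the general idea, the code below runs a plain one-dimensional k-means on the covariate values and keeps the data point closest to each cluster center as a representative basis location. The use of k-means, the one-dimensional setting, and the names `kmeans_1d` and `representative_points` are all assumptions for illustration.

```python
import random

def kmeans_1d(points, k, n_iter=50, seed=0):
    """Plain Lloyd-style k-means on scalar data; returns the k cluster centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: abs(p - centers[c]))
            clusters[j].append(p)
        # Keep the old center if a cluster happens to be empty.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers

def representative_points(points, k):
    """Pick the data point nearest each center as a basis-function location."""
    centers = kmeans_1d(points, k)
    reps = {min(points, key=lambda p: abs(p - c)) for c in centers}
    return sorted(reps)

rng = random.Random(1)
t = [rng.random() for _ in range(500)]  # synthetic covariate values
knots = representative_points(t, 10)    # at most 10 of the 500 points
```

With 500 observations and 10 representatives, an approximate fit would use about 10 basis functions rather than one per data point, which is the kind of reduction the abstract describes.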