## Spline adaptation in extended linear models (2002)

Venue: Statistical Science

Citations: 16 (2 self)

### BibTeX

@ARTICLE{Hansen02splineadaptation,

author = {Mark H. Hansen and Charles Kooperberg},

title = {Spline adaptation in extended linear models},

journal = {Statistical Science},

year = {2002},

volume = {17},

pages = {2--51}

}

### Abstract

In many statistical applications, nonparametric modeling can provide insight into the features of a dataset that are not obtainable by other means. One successful approach involves the use of (univariate or multivariate) spline spaces. As a class, these methods have inherited much from classical tools for parametric modeling. For example, stepwise variable selection with spline basis terms is a simple scheme for locating knots (breakpoints) in regions where the data exhibit strong, local features. Similarly, candidate knot configurations (generated by this or some other search technique) are routinely evaluated with traditional selection criteria like AIC or BIC. In short, strategies typically applied in parametric model selection have proved useful in constructing flexible, low-dimensional models for nonparametric problems. Until recently, greedy, stepwise procedures were most frequently suggested in the literature. Research into Bayesian variable selection, however, has given rise to a number of new spline-based methods that primarily rely on some form of Markov chain Monte Carlo to identify promising knot locations. In this paper, we consider various alternatives to greedy, deterministic schemes, and present a Bayesian framework for studying adaptation in the context of an extended linear model (ELM). Our major test cases are Logspline density estimation and (bivariate) Triogram regression models. We selected these because they illustrate a number of computational and methodological issues concerning model adaptation that arise in ELMs.
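The greedy, stepwise scheme that the abstract describes as the traditional baseline can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' Logspline or Triogram procedure: it assumes Gaussian regression with a linear truncated-power spline basis, scores each knot configuration with a penalized criterion of the form -2 log-likelihood + a J (a = 2 for AIC, a = log n for BIC), and uses hypothetical helper names (`design_row`, `solve`, `rss`, `criterion`, `backward_delete`) and made-up toy data.

```python
import math

def design_row(x, knots):
    # Linear-spline truncated power basis: 1, x, and (x - t)_+ for each knot t.
    return [1.0, x] + [max(x - t, 0.0) for t in knots]

def solve(A, b):
    # Gaussian elimination with partial pivoting for a small dense system.
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    beta = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = M[r][n] - sum(M[r][k] * beta[k] for k in range(r + 1, n))
        beta[r] = s / M[r][r]
    return beta

def rss(xs, ys, knots):
    # Residual sum of squares of the least-squares fit (via normal equations).
    X = [design_row(x, knots) for x in xs]
    J = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(J)] for i in range(J)]
    Xty = [sum(row[i] * y for row, y in zip(X, ys)) for i in range(J)]
    beta = solve(XtX, Xty)
    return sum((y - sum(b * v for b, v in zip(beta, row))) ** 2
               for row, y in zip(X, ys))

def criterion(xs, ys, knots, a):
    # Assumption: Gaussian errors, so -2 log-likelihood is n*log(RSS/n) up to a
    # constant. J = len(knots) + 2 is the dimension of the spline space.
    n = len(xs)
    return n * math.log(rss(xs, ys, knots) / n) + a * (len(knots) + 2)

def backward_delete(xs, ys, knots, a):
    # Greedy stepwise deletion: repeatedly drop the knot whose removal gives the
    # lowest criterion, and remember the best configuration seen along the way.
    knots = list(knots)
    best = (criterion(xs, ys, knots, a), knots)
    while knots:
        score, drop = min(
            (criterion(xs, ys, [t for t in knots if t != k], a), k) for k in knots
        )
        knots = [t for t in knots if t != drop]
        if score < best[0]:
            best = (score, knots)
    return best

# Toy data (hypothetical): a kink at x = 0.5 plus a small deterministic wiggle.
xs = [i / 40 for i in range(41)]
ys = [abs(x - 0.5) + 0.01 * math.sin(37 * i) for i, x in enumerate(xs)]
score, selected = backward_delete(xs, ys, [0.25, 0.5, 0.75], a=math.log(len(xs)))
```

Passing `a=2` instead of `a=math.log(len(xs))` switches the penalty from BIC to AIC. The search is deterministic and visits only one nested sequence of knot sets, which is the limitation that motivates the MCMC-based alternatives the abstract goes on to discuss.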

### Citations

4457 | Classification and Regression Trees - Breiman, Friedman, et al. - 1984 |
Citation Context: ...bits excellent spatial adaptation, capturing the full height of spikes without overfitting smoother regions. And finally, among classification procedures, classification and regression trees or CART (Breiman et al. 1984) is a de facto standard, while the more recent PolyMARS models (Kooperberg et al. 1997) have been able to tackle even large problems in speech recognition. Stone et al. (1997) and a forthcoming monog...

2771 | Estimating the dimension of a model - Schwarz - 1978 |
Citation Context: ...e effects of selection bias (Friedman and Silverman, 1989; Friedman, 1991). In Stone et al. (1997) the default value of a in (8) is log n, resulting in a criterion that is commonly referred to as BIC (Schwarz, 1978). Notice that our search for good knot locations based on the log-likelihood (5) has led to a heuristic minimization of a selection criterion like (7) or (8). Several comments about this reduction ar...

2765 | Bagging predictors - Breiman - 1996 |

2422 | A new look at the statistical model identification - Akaike - 1974 |
Citation Context: ...ion criterion like generalized cross validation (GCV) $\mathrm{GCV}_a(t) = \dfrac{\mathrm{RSS}(t)/n}{\left[1 - a\,(J(t) - 1)/n\right]^2}$ (7) or a variant of "A Information Criterion" (AIC) $\mathrm{AIC}_a(t) = -2\,\hat{l}(t) + a\,J(t)$ (8) (Akaike, 1974) where J(t) is the dimension of the spline space. The parameter a in each of these expressions controls the penalty assigned to models with more knots and is introduced to offset the effects of selec...

2330 | The Elements of Statistical Learning - Hastie, Tibshirani, et al. - 2001 |

1288 | A practical guide to splines - Boor - 1978 |

1176 | Bayes factors - Kass, Raftery - 1995 |

935 | Reversible jump Markov chain Monte Carlo computation and Bayesian model determination - Green - 1995 |
Citation Context: ...th and Kohn (1996) for univariate and additive regression models. Similar in spirit are the Bayesian versions of TURBO and CART proposed by Denison et al. (1998a,b), which employ reversible jump MCMC (Green, 1995). In a Bayesian setup, model uncertainty comes from both the structural aspects of the space G - knot placement - as well as from our selection of members g ∈ G - determining coefficients in expression...

655 | Graphical models - Jordan - 2004 |

542 | Nonparametric regression and generalized linear models: a roughness penalty approach - Green, Silverman - 1994 |
Citation Context: ...ss of such priors that given a basis for G and an expansion (3) involves the coefficients $\beta = (\beta_1, \ldots, \beta_J)$. This amounts to a partially improper, normal distribution for $\beta$ (Silverman, 1985; Wahba, 1990; and Green and Silverman, 1994), which we will return to in Section 2. Given a prior for spline functions g, we can generate a sample from the posterior distribution of g using Markov chain Monte Carlo (MCMC). In particular, in Se...

418 | Spline Functions: Basic Theory - Schumaker - 1981 |

361 | Regression shrinkage and selection via the - Tibshirani - 1996 |

315 | The minimum description length principle in coding and modeling - Barron, Rissanen, et al. - 1998 |

258 | Multivariate Adaptive Regression Splines (with discussion) - Friedman - 1991 |
Citation Context: ...nes are at the heart of many popular techniques for nonparametric function estimation. For regression problems, TURBO (Friedman and Silverman, 1989), multivariate adaptive regression splines or MARS (Friedman, 1991) and (Breiman, 1991) have all met with considerable success. In the context of density estimation, the Logspline procedure of Kooperberg and Stone (1991, 1992) exhibits excellent spatial adaptation, ...

236 | Flexible smoothing with B-splines and penalties (with discussion) - Eilers, Marx - 1996 |

211 | Covariance structure of the Gibbs sampler with applications to the comparisons of estimators and augmentation schemes. Biometrika 81:27–40 - Liu, Wong, et al. - 1994 |

197 | Some Aspects of the Spline Smoothing Approach to Non-parametric Regression Curve Fitting (with discussion) - Silverman - 1985 |
Citation Context: ...e on smoothing splines we find a class of such priors that given a basis for G and an expansion (3) involves the coefficients $\beta = (\beta_1, \ldots, \beta_J)$. This amounts to a partially improper, normal distribution for $\beta$ (Silverman, 1985; Wahba, 1990; and Green and Silverman, 1994), which we will return to in Section 2. Given a prior for spline functions g, we can generate a sample from the posterior distribution of g using Markov ch...

178 | Additive regression and other nonparametric models - Stone - 1985 |

158 | The use of polynomial splines and their tensor products in multivariate function estimation (with discussion). The Annals of Statistics - Stone - 1994 |
Citation Context: ...ar models Extended linear models (ELMs) were originally defined as a theoretical tool for understanding the properties of spline-based procedures in a large class of estimation problems (Hansen, 1994; Stone et al. 1997; Huang, 1998, 2001). This class is extremely rich, containing all of the standard generalized linear models as well as density and conditional density estimation, hazard regression, censored regressi...

156 | Model Selection and the Principle of Minimum Description Length - Hansen, Yu - 2001 |

153 | Nonparametric Regression Using Bayesian Variable Selection - Smith, Kohn - 1996 |
Citation Context: ...ted spline space G can be imposed by restricting the placement of $t_1, \ldots, t_K$ through $p(t \mid K)$. While other authors have also considered a discrete set of candidate knot sequences (Denison et al. 1998a; Smith and Kohn, 1996), we could also specify a distribution that treats the elements of t as continuous variables (e.g. Green 1995). In our experiments we have found that for Logspline density estimation the discrete app...

149 | A reference Bayesian test for nested hypotheses and its relationship to the Schwarz criterion - Kass, Wasserman - 1995 |

128 | Calibration and empirical Bayes variable selection - George, Foster - 2000 |

117 | Improper Priors, Spline Smoothing and the Problem of Guarding Against Model Errors in Regression - Wahba - 1978 |

114 | Bayesian Methods for Nonlinear Classification and Regression - Denison, Holmes, et al. - 2002 |

102 | Hinging hyperplanes for regression, classification, and function approximation - Breiman - 1993 |

99 | Variational methods for the solution of problems of equilibrium and vibrations - Courant - 1943 |

95 | Data dependent triangulations for piecewise linear interpolation. IMA - Dyn, Levin, et al. - 1990 |

94 | Hazard regression - Kooperberg, Stone, et al. - 1995 |
Citation Context: ...overfitting smoother regions. And finally, among classification procedures, classification and regression trees or CART (Breiman et al. 1984) is a de facto standard, while the more recent PolyMARS models (Kooperberg et al. 1997) have been able to tackle even large problems in speech recognition. Stone et al. (1997) and a forthcoming monograph by Hansen, Huang, Kooperberg, Stone and Truong are the prime references for the ap...

93 | Flexible parsimonious smoothing and additive modeling (with discussion) - Friedman, Silverman - 1989 |
Citation Context: ...Linear splines; Multivariate splines; Regression. 1. Introduction Polynomial splines are at the heart of many popular techniques for nonparametric function estimation. For regression problems, TURBO (Friedman and Silverman, 1989), multivariate adaptive regression splines or MARS (Friedman, 1991) and (Breiman, 1991) have all met with considerable success. In the context of density estimation, the Logspline procedure of Kooper...

91 | Smoothing spline models for analysis of nested and crossed samples of curves - Brumback, Rice - 1998 |

91 | Markov fields and loglinear interaction models for contingency tables - Darroch, Lauritzen, et al. - 1980 |

89 | The Consistency of Posterior Distributions in Nonparametric Problems - Barron, Schervish, et al. - 1999 |

85 | Generalized partially linear single-index models - Carroll, Fan, et al. - 1997 |

85 | Quantile Smoothing Splines - Koenker, Ng, et al. - 1994 |

82 | On Assessing Prior Distributions and Bayesian Regression Analysis with g-prior Distribution - Zellner - 1986 |

78 | A statistical paradox - Lindley - 1957 |

78 | A comparison of GCV and GML for choosing the smoothing parameter in the generalized spline smoothing problem - Wahba - 1985 |

72 | Locally equiangular triangulations - Sibson - 1978 |

71 | On the estimation of a probability density function by the maximum penalized likelihood method. The Annals of Statistics 10 - Silverman - 1982 |

67 | Hybrid adaptive splines - Luo, Wahba - 1997 |

64 | Bayes factors and choice criteria for linear models - Smith, Spiegelhalter - 1980 |

62 | Bayesian CART Model Search (with discussion) - Chipman, George, et al. - 1998 |

59 | Selecting the number of knots for penalized splines - Ruppert - 2002 |

57 | Inference in generalized additive mixed models by using smoothing splines - Lin, Zhang - 1999 |

51 | Rates of convergence of posterior distributions - Shen, Wasserman - 2001 |

50 | High dimensional integration of smooth functions over cubes - Novak, Ritter - 1996 |

50 | Logspline density estimation for censored data - Kooperberg, Stone - 1992 |

49 | Minimal roughness property of the Delaunay triangulation - Rippa - 1990 |

48 | Logspline density estimation - Stone, Koo - 1986 |