Results 1–10 of 198
A tutorial on support vector regression
2004
"... In this tutorial we give an overview of the basic ideas underlying Support Vector (SV) machines for function estimation. Furthermore, we include a summary of currently used algorithms for training SV machines, covering both the quadratic (or convex) programming part and advanced methods for dealing ..."
Cited by 540 (2 self)
Abstract:
In this tutorial we give an overview of the basic ideas underlying Support Vector (SV) machines for function estimation. Furthermore, we include a summary of currently used algorithms for training SV machines, covering both the quadratic (or convex) programming part and advanced methods for dealing with large datasets. Finally, we mention some modifications and extensions that have been applied to the standard SV algorithm, and discuss the aspect of regularization from an SV perspective.
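The ε-insensitive regression surveyed in this tutorial is easy to try in practice. Below is a minimal sketch using scikit-learn's SVR (the library choice is ours for illustration; the tutorial itself is implementation-agnostic): a noisy sine curve is fitted with an RBF kernel, where C trades flatness against training error and epsilon sets the width of the insensitive tube.

import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 2 * np.pi, 100)).reshape(-1, 1)
y = np.sin(X).ravel() + 0.1 * rng.standard_normal(100)

# C trades flatness against training error; epsilon is the width of
# the insensitive tube within which errors are not penalized.
model = SVR(kernel="rbf", C=10.0, epsilon=0.1, gamma=1.0)
model.fit(X, y)

# Only points on or outside the epsilon-tube become support vectors.
print("support vectors:", len(model.support_), "of", len(X))
print("prediction at pi/2:", model.predict([[np.pi / 2]])[0])
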
Regularization Theory and Neural Networks Architectures
Neural Computation, 1995
"... We had previously shown that regularization principles lead to approximation schemes which are equivalent to networks with one layer of hidden units, called Regularization Networks. In particular, standard smoothness functionals lead to a subclass of regularization networks, the well known Radial Ba ..."
Cited by 332 (31 self)
Abstract:
We had previously shown that regularization principles lead to approximation schemes which are equivalent to networks with one layer of hidden units, called Regularization Networks. In particular, standard smoothness functionals lead to a subclass of regularization networks, the well-known Radial Basis Functions approximation schemes. This paper shows that regularization networks encompass a much broader range of approximation schemes, including many of the popular general additive models and some of the neural networks. In particular, we introduce new classes of smoothness functionals that lead to different classes of basis functions. Additive splines as well as some tensor product splines can be obtained from appropriate classes of smoothness functionals. Furthermore, the same generalization that extends Radial Basis Functions (RBF) to Hyper Basis Functions (HBF) also leads from additive models to ridge approximation models, containing as special cases Breiman's hinge functions, som...
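For a Gaussian smoothness functional, the resulting regularization network has one radial unit per training example, with coefficients obtained from a single regularized linear solve. A minimal numpy sketch under that assumption (kernel width and lambda are arbitrary illustrative choices):

import numpy as np

def rbf_kernel(A, B, width=0.5):
    # Gaussian kernel matrix between row sets A and B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * width ** 2))

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, (50, 1))
y = np.cos(3 * X).ravel() + 0.1 * rng.standard_normal(50)

lam = 1e-2                                  # regularization parameter
K = rbf_kernel(X, X)
c = np.linalg.solve(K + lam * np.eye(len(X)), y)   # one unit per example

Xnew = np.linspace(-1, 1, 5).reshape(-1, 1)
f = rbf_kernel(Xnew, X) @ c                 # network output at new inputs
print(np.round(f, 3))
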
A Theory of Networks for Approximation and Learning
Artificial Intelligence Laboratory, Massachusetts Institute of Technology, 1989
"... Learning an inputoutput mapping from a set of examples, of the type that many neural networks have been constructed to perform, can be regarded as synthesizing an approximation of a multidimensional function, that is solving the problem of hypersurface reconstruction. From this point of view, t ..."
Cited by 208 (24 self)
Abstract:
Learning an input-output mapping from a set of examples, of the type that many neural networks have been constructed to perform, can be regarded as synthesizing an approximation of a multidimensional function, that is, solving the problem of hypersurface reconstruction. From this point of view, this form of learning is closely related to classical approximation techniques, such as generalized splines and regularization theory. This paper considers the problems of an exact representation and, in more detail, of the approximation of linear and nonlinear mappings in terms of simpler functions of fewer variables. Kolmogorov's theorem concerning the representation of functions of several variables in terms of functions of one variable turns out to be almost irrelevant in the context of networks for learning. We develop a theoretical framework for approximation based on regularization techniques that leads to a class of three-layer networks that we call Generalized Radial Basis Functions (GRBF), since they are mathematically related to the well-known Radial Basis Functions, mainly used for strict interpolation tasks. GRBF networks are not only equivalent to generalized splines, but are also closely related to pattern recognition methods such as Parzen windows and potential functions and to several neural network algorithms, such as Kanerva's associative memory, backpropagation and Kohonen's topology preserving map. They also have an interesting interpretation in terms of prototypes that are synthesized and optimally combined during the learning stage. The paper introduces several extensions and applications of the technique and discusses intriguing analogies with neurobiological data.
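Unlike strict RBF interpolation, a GRBF network uses far fewer centers (prototypes) than examples. A minimal numpy sketch in which the centers are simply fixed to a coarse grid rather than learned (a simplification of the paper's scheme, where centers are also optimized):

import numpy as np

rng = np.random.default_rng(2)
X = rng.uniform(0, 1, (200, 1))
y = np.sin(2 * np.pi * X).ravel() + 0.1 * rng.standard_normal(200)

centers = np.linspace(0, 1, 8).reshape(-1, 1)   # 8 prototypes, 200 examples
width = 0.2

# Design matrix of radial activations, one column per center;
# output weights by ordinary least squares.
Phi = np.exp(-((X - centers.T) ** 2) / (2 * width ** 2))
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)

Phi_new = np.exp(-((np.array([[0.25]]) - centers.T) ** 2) / (2 * width ** 2))
print("f(0.25) ~", round((Phi_new @ w)[0], 3))  # truth: sin(pi/2) = 1
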
Polynomial Splines and Their Tensor Products in Extended Linear Modeling
Ann. Statist., 1997
"... ANOVA type models are considered for a regression function or for the logarithm of a probability function, conditional probability function, density function, conditional density function, hazard function, conditional hazard function, or spectral density function. Polynomial splines are used to m ..."
Cited by 158 (16 self)
Abstract:
ANOVA-type models are considered for a regression function or for the logarithm of a probability function, conditional probability function, density function, conditional density function, hazard function, conditional hazard function, or spectral density function. Polynomial splines are used to model the main effects, and their tensor products are used to model any interaction components that are included. In the special context of survival analysis, the baseline hazard function is modeled and nonproportionality is allowed. In general, the theory involves the L2 rate of convergence for the fitted model and its components. The methodology involves least squares and maximum likelihood estimation, stepwise addition of basis functions using Rao statistics, stepwise deletion using Wald statistics, and model selection using BIC, cross-validation or an independent test set. Publicly available software, written in C and interfaced to S/SPLUS, is used to apply this methodology to...
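The main-effects-plus-interaction structure can be illustrated with ordinary least squares on spline bases and their row-wise tensor products. A minimal numpy sketch (truncated-power cubic bases and fixed knots are our illustrative choices, not the paper's adaptive stepwise basis selection):

import numpy as np

def spline_basis(x, knots):
    # Truncated-power cubic basis: 1, x, x^2, x^3, (x - k)^3_+.
    cols = [np.ones_like(x), x, x**2, x**3]
    cols += [np.clip(x - k, 0, None) ** 3 for k in knots]
    return np.column_stack(cols)

rng = np.random.default_rng(3)
x1, x2 = rng.uniform(0, 1, (2, 300))
y = (np.sin(2*np.pi*x1) + x2**2 + 0.5*np.sin(2*np.pi*x1)*x2
     + 0.1*rng.standard_normal(300))

knots = [0.25, 0.5, 0.75]
B1, B2 = spline_basis(x1, knots), spline_basis(x2, knots)

# Row-wise tensor products model the x1:x2 interaction component.
inter = np.einsum('ij,ik->ijk', B1[:, 1:], B2[:, 1:]).reshape(len(y), -1)
design = np.column_stack([B1, B2[:, 1:], inter])

coef, *_ = np.linalg.lstsq(design, y, rcond=None)
print("design dim:", design.shape[1],
      " residual sd:", round(np.std(y - design @ coef), 3))
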
Large Sample Sieve Estimation of Semi-Nonparametric Models
Handbook of Econometrics, 2007
"... Often researchers find parametric models restrictive and sensitive to deviations from the parametric specifications; seminonparametric models are more flexible and robust, but lead to other complications such as introducing infinite dimensional parameter spaces that may not be compact. The method o ..."
Cited by 113 (14 self)
Abstract:
Often researchers find parametric models restrictive and sensitive to deviations from the parametric specifications; semi-nonparametric models are more flexible and robust, but lead to other complications such as introducing infinite dimensional parameter spaces that may not be compact. The method of sieves provides one way to tackle such complexities by optimizing an empirical criterion function over a sequence of approximating parameter spaces, called sieves, which are significantly less complex than the original parameter space. With different choices of criteria and sieves, the method of sieves is very flexible in estimating complicated econometric models. For example, it can simultaneously estimate the parametric and nonparametric components in semi-nonparametric models with or without constraints. It can easily incorporate prior information, often derived from economic theory, such as monotonicity, convexity, additivity, multiplicity, exclusion and nonnegativity. This chapter describes estimation of semi-nonparametric econometric models via the method of sieves. We present some general results on the large sample properties of the sieve estimates, including consistency of the sieve extremum estimates, convergence rates of the sieve M-estimates, pointwise normality of series estimates of regression functions, root-n asymptotic normality and efficiency of sieve estimates of smooth functionals of infinite dimensional parameters. Examples are used to illustrate the general results.
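A concrete sieve is the space of polynomials whose degree grows slowly with the sample size; least squares over that space is a series estimator. A minimal numpy sketch (the n^(1/3) growth rule is illustrative only):

import numpy as np

def sieve_fit(x, y):
    # Sieve dimension grows with the sample size.
    J = max(2, int(len(x) ** (1 / 3)))
    design = np.vander(x, J + 1, increasing=True)   # 1, x, ..., x^J
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

rng = np.random.default_rng(4)
for n in (50, 500, 5000):
    x = rng.uniform(-1, 1, n)
    y = np.exp(x) + 0.2 * rng.standard_normal(n)
    coef = sieve_fit(x, y)
    # Compare the estimate at 0 with the truth exp(0) = 1.
    print(f"n={n:5d}  dim={len(coef):2d}  f_hat(0)={coef[0]:.3f}")
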
Fitting a Bivariate Additive Model by Local Polynomial Regression
1996
"... While the additive model is a popular nonparametric regression method, many of its theoretical properties are not well understood, especially when the backfitting algorithm is used for computation of the the estimators. This article explores those properties when the additive model is fitted by loca ..."
Cited by 56 (11 self)
Abstract:
While the additive model is a popular nonparametric regression method, many of its theoretical properties are not well understood, especially when the backfitting algorithm is used for computation of the estimators. This article explores those properties when the additive model is fitted by local polynomial regression. Sufficient conditions guaranteeing the asymptotic existence of unique estimators for the bivariate additive model are given. Asymptotic approximations to the bias and the variance of a homoskedastic bivariate additive model with local polynomial terms are computed. This model is shown to have the same rate of convergence as that of univariate local polynomial regression. We also investigate the estimation of derivatives of the additive component functions.
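The backfitting algorithm referred to here alternates between smoothing the partial residuals on each covariate until the component estimates stabilize. A minimal numpy sketch with a local-linear smoother and a fixed bandwidth (both choices are ours for illustration):

import numpy as np

def local_linear(x, y, h=0.15):
    # Local-linear fit of y on x, evaluated at each sample point.
    out = np.empty_like(y)
    for i, x0 in enumerate(x):
        sw = np.sqrt(np.exp(-0.5 * ((x - x0) / h) ** 2))
        X = np.column_stack([np.ones_like(x), x - x0])
        beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
        out[i] = beta[0]
    return out

rng = np.random.default_rng(5)
n = 200
x1, x2 = rng.uniform(0, 1, (2, n))
y = np.sin(2 * np.pi * x1) + (x2 - 0.5) ** 2 + 0.1 * rng.standard_normal(n)

f1, f2 = np.zeros(n), np.zeros(n)
for _ in range(20):                        # backfitting iterations
    f1 = local_linear(x1, y - y.mean() - f2)
    f1 -= f1.mean()                        # identifiability: center effects
    f2 = local_linear(x2, y - y.mean() - f1)
    f2 -= f2.mean()
print("residual sd:", round(np.std(y - y.mean() - f1 - f2), 3))
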
Partial Linear Models
2000
"... there has been increasing interest and activity in the general area of partially linear regression smoothing in statistics. Many methods and techniques have been proposed and studied. This monograph hopes to bring an uptodate presentation of the state of the art of partially linear regression tech ..."
Cited by 47 (12 self)
Abstract:
There has been increasing interest and activity in the general area of partially linear regression smoothing in statistics. Many methods and techniques have been proposed and studied. This monograph hopes to bring an up-to-date presentation of the state of the art of partially linear regression techniques. The emphasis of this monograph is on methodologies rather than on the theory, with a particular focus on applications of partially linear regression techniques to various statistical problems. These problems include least squares regression, asymptotically efficient estimation, bootstrap resampling, censored data analysis, linear measurement error models, nonlinear measurement models, and nonlinear and nonparametric time series models. We hope that this monograph will serve as a useful reference for theoretical and applied statisticians and for graduate students and others who are interested in the area of partially linear regression. While advanced mathematical ideas have been valuable in some of the theoretical development, the methodological power of partially linear regression can be demonstrated and discussed without ...
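The basic partially linear model is y = x*beta + g(z) + error. One standard estimation idea (a Speckman/Robinson-type construction, used here purely as an illustration) smooths both y and x on z and estimates beta from the residuals:

import numpy as np

def kernel_smooth(z, v, h=0.1):
    # Nadaraya-Watson fit of v on z at each sample point.
    w = np.exp(-0.5 * ((z[:, None] - z[None, :]) / h) ** 2)
    return (w * v[None, :]).sum(1) / w.sum(1)

rng = np.random.default_rng(6)
n = 400
z = rng.uniform(0, 1, n)
x = z + rng.standard_normal(n)                 # x correlated with z
y = 2.0 * x + np.sin(2 * np.pi * z) + 0.2 * rng.standard_normal(n)

xr = x - kernel_smooth(z, x)                   # partial out g(z)
yr = y - kernel_smooth(z, y)
beta_hat = (xr @ yr) / (xr @ xr)
print("beta_hat ~", round(beta_hat, 3))        # should be close to 2.0
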
Smoothing Spline ANOVA with Component-Wise Bayesian "Confidence Intervals"
Journal of Computational and Graphical Statistics, 1992
"... We study a multivariate smoothing spline estimate of a function of several variables, based on an ANOVA decomposition as sums of main effect functions (of one variable), twofactor interaction functions (of two variables), etc. We derive the Bayesian "confidence intervals" for the componen ..."
Cited by 47 (17 self)
Abstract:
We study a multivariate smoothing spline estimate of a function of several variables, based on an ANOVA decomposition as sums of main effect functions (of one variable), two-factor interaction functions (of two variables), etc. We derive the Bayesian "confidence intervals" for the components of this decomposition and demonstrate that, even with multiple smoothing parameters, they can be efficiently computed using the publicly available code RKPACK, which was originally designed just to compute the estimates. We carry out a small Monte Carlo study to see how closely the actual properties of these componentwise confidence intervals match their nominal confidence levels. Lastly, we analyze some lake acidity data as a function of calcium concentration, latitude, and longitude, using both polynomial and thin plate spline main effects in the same model. KEY WORDS: Bayesian "confidence intervals"; Multivariate function estimation; RKPACK; Smoothing spline ANOVA.
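The Bayesian "confidence intervals" arise from viewing the spline fit as a posterior mean: if f_hat = A(lam) y for a smoother matrix A(lam), their covariance is approximately sigma^2 * A(lam). A one-dimensional numpy sketch with a penalized spline standing in for RKPACK (an assumption; the paper computes component-wise intervals for the full multi-term decomposition):

import numpy as np

rng = np.random.default_rng(7)
n = 150
x = np.sort(rng.uniform(0, 1, n))
y = np.sin(2 * np.pi * x) + 0.2 * rng.standard_normal(n)

knots = np.linspace(0.1, 0.9, 9)
B = np.column_stack([np.ones_like(x), x]
                    + [np.clip(x - k, 0, None) ** 3 for k in knots])
P = np.diag([0.0, 0.0] + [1.0] * len(knots))   # penalize only knot terms
lam = 1e-4

# Smoother matrix A: f_hat = A y, with cov(f_hat) ~ sigma^2 * A.
A = B @ np.linalg.solve(B.T @ B + lam * P, B.T)
f_hat = A @ y
sigma2 = np.sum((y - f_hat) ** 2) / (n - np.trace(A))
half = 1.96 * np.sqrt(sigma2 * np.diag(A))     # pointwise interval half-width
print("avg CI half-width:", round(half.mean(), 3))
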
Projection estimation in multiple regression with application to functional ANOVA models
Ann. Statist., 1998
"... A general theory on rates of convergence of the leastsquares projection estimate in multiple regression is developed. The theory is applied to the functional ANOVA model, where the multivariate regression function is modeled as a specified sum of a constant term, main effects Žfunctions of one vari ..."
Cited by 39 (11 self)
Abstract:
A general theory on rates of convergence of the least-squares projection estimate in multiple regression is developed. The theory is applied to the functional ANOVA model, where the multivariate regression function is modeled as a specified sum of a constant term, main effects (functions of one variable) and selected interaction terms (functions of two or more variables). The least-squares projection is onto an approximating space constructed from arbitrary linear spaces of functions and their tensor products respecting the assumed ANOVA structure of the regression function. The linear spaces that serve as building blocks can be any of the ones commonly used in practice: polynomials, trigonometric polynomials, splines, wavelets and finite elements. The rate of convergence result that is obtained reinforces the intuition that low-order ANOVA modeling can achieve dimension reduction and thus overcome the curse of dimensionality. Moreover, the components of the projection estimate in an appropriately defined ANOVA decomposition provide consistent estimates of the corresponding components of the regression function. When the regression function does not satisfy the assumed ANOVA form, the projection estimate converges to its best approximation of that form.
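In the additive (main-effects-only) case, the projection estimate is a single ordinary-least-squares solve over the span of univariate bases, and the ANOVA components fall out of disjoint blocks of coefficients. A minimal numpy sketch with trigonometric building blocks (one of the basis families the paper allows):

import numpy as np

def trig_basis(x, J=4):
    # Trigonometric basis in one coordinate: cos/sin up to frequency J.
    cols = []
    for j in range(1, J + 1):
        cols += [np.cos(2*np.pi*j*x), np.sin(2*np.pi*j*x)]
    return np.column_stack(cols)

rng = np.random.default_rng(8)
n = 500
x1, x2 = rng.uniform(0, 1, (2, n))
y = np.sin(2*np.pi*x1) + np.cos(2*np.pi*x2) + 0.2*rng.standard_normal(n)

B1, B2 = trig_basis(x1), trig_basis(x2)
design = np.column_stack([np.ones(n), B1, B2])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)

# ANOVA components: constant, f1(x1), f2(x2) from disjoint blocks.
k = B1.shape[1]
f1_at = lambda t: trig_basis(np.atleast_1d(t)) @ coef[1:1 + k]
print("f1(0.25) ~", round(f1_at(0.25)[0], 3))  # truth: sin(pi/2) = 1
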
Piecewise-polynomial regression trees
Statistica Sinica, 1994
"... A nonparametric function 1 estimation method called SUPPORT (“Smoothed and Unsmoothed PiecewisePolynomial Regression Trees”) is described. The estimate is typically made up of several pieces, each piece being obtained by fitting a polynomial regression to the observations in a subregion of the data ..."
Cited by 39 (7 self)
Abstract:
A nonparametric function estimation method called SUPPORT (“Smoothed and Unsmoothed Piecewise-Polynomial Regression Trees”) is described. The estimate is typically made up of several pieces, each piece being obtained by fitting a polynomial regression to the observations in a subregion of the data space. Partitioning is carried out recursively as in a tree-structured method. If the estimate is required to be smooth, the polynomial pieces may be glued together by means of weighted averaging. The smoothed estimate is thus obtained in three steps. In the first step, the regressor space is recursively partitioned until the data in each piece are adequately fitted by a polynomial of a fixed order. Partitioning is guided by analysis of the distributions of residuals and cross-validation estimates of prediction mean square error. In the second step, the data within a neighborhood of each partition are fitted by a polynomial. The final estimate of the regression function is obtained by averaging the polynomial pieces, using smooth weight functions each of which diminishes rapidly to zero outside its associated partition. Estimates of derivatives of the regression function may be ...
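The unsmoothed part of the procedure can be sketched compactly: recursively split the regressor space and fit a low-order polynomial in each piece. The split rule below (median split whenever a linear fit leaves too much residual variance) is a simplification of the paper's residual-analysis and cross-validation criteria:

import numpy as np

def fit_leaf(x, y):
    return np.polyfit(x, y, 1)                 # degree-1 polynomial piece

def build(x, y, min_leaf=20, tol=0.05):
    # Stop when the piece is small or a linear fit is adequate.
    resid = y - np.polyval(fit_leaf(x, y), x)
    if len(x) < 2 * min_leaf or np.var(resid) < tol:
        return ("leaf", fit_leaf(x, y))
    cut = np.median(x)
    left = x <= cut
    return ("split", cut, build(x[left], y[left], min_leaf, tol),
            build(x[~left], y[~left], min_leaf, tol))

def predict(tree, x0):
    if tree[0] == "leaf":
        return np.polyval(tree[1], x0)
    _, cut, lo, hi = tree
    return predict(lo, x0) if x0 <= cut else predict(hi, x0)

rng = np.random.default_rng(9)
x = np.sort(rng.uniform(0, 1, 400))
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(400)
tree = build(x, y)
print("f(0.25) ~", round(predict(tree, 0.25), 3))   # truth: 1.0
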