Generalized Additive Models
, 1995
Cited by 968 (32 self)
This article describes flexible statistical methods that may be used to identify and characterize nonlinear regression effects. These methods are called "generalized additive models". For example, a commonly used statistical model in medical research is the logistic regression model for binary data. Here we relate the mean of the binary response μ = P(y = 1) to the predictors via a linear regression model and the logit link function: log
Additive Logistic Regression: a Statistical View of Boosting
- Annals of Statistics
, 1998
Cited by 896 (20 self)
Boosting (Freund & Schapire 1996, Schapire & Singer 1998) is one of the most important recent developments in classification methodology. The performance of many classification algorithms can often be dramatically improved by sequentially applying them to reweighted versions of the input data, and taking a weighted majority vote of the sequence of classifiers thereby produced. We show that this seemingly mysterious phenomenon can be understood in terms of well known statistical principles, namely additive modeling and maximum likelihood. For the two-class problem, boosting can be viewed as an approximation to additive modeling on the logistic scale using maximum Bernoulli likelihood as a criterion. We develop more direct approximations and show that they exhibit nearly identical results to boosting. Direct multi-class generalizations based on multinomial likelihood are derived that exhibit performance comparable to other recently proposed multi-class generalizations of boosting in most...
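The reweighting and weighted majority vote described in this abstract can be illustrated with a minimal sketch of discrete AdaBoost using decision stumps as the base classifier. This is an illustration under stated assumptions, not the authors' code: the function names (`fit_stump`, `adaboost`, `predict`), the choice of stumps, and the toy data in the usage note are all hypothetical, and labels are assumed to be coded as -1/+1.

```python
import numpy as np

def stump_predict(X, j, t, s):
    # Predict s where feature j is <= t, and -s elsewhere.
    return np.where(X[:, j] <= t, s, -s)

def fit_stump(X, y, w):
    # Exhaustive search for the stump with the smallest weighted error.
    best = (0, 0.0, 1, np.inf)
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for s in (1, -1):
                err = np.sum(w[stump_predict(X, j, t, s) != y])
                if err < best[3]:
                    best = (j, t, s, err)
    return best

def adaboost(X, y, n_rounds=100):
    # y must be coded as -1/+1 (assumption of this sketch).
    n = len(y)
    w = np.full(n, 1.0 / n)          # start from uniform weights
    ensemble = []
    for _ in range(n_rounds):
        j, t, s, err = fit_stump(X, y, w)
        err = np.clip(err, 1e-12, 1 - 1e-12)
        alpha = 0.5 * np.log((1 - err) / err)   # vote weight of this stump
        pred = stump_predict(X, j, t, s)
        w = w * np.exp(-alpha * y * pred)       # upweight misclassified points
        w /= w.sum()
        ensemble.append((alpha, j, t, s))
    return ensemble

def predict(ensemble, X):
    # Weighted majority vote of the sequence of stumps.
    score = sum(a * stump_predict(X, j, t, s) for a, j, t, s in ensemble)
    return np.where(score >= 0, 1, -1)
```

As a usage example, take one-dimensional inputs on [0, 1] labeled +1 inside the interval (0.3, 0.7) and -1 outside. No single stump can separate these classes, but the weighted vote of many stumps can fit the training data.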
Penalized Discriminant Analysis
- Annals of Statistics
, 1995
Cited by 98 (8 self)
Fisher's linear discriminant analysis (LDA) is a popular data-analytic tool for studying the relationship between a set of predictors and a categorical response. In this paper we describe a penalized version of LDA. It is designed for situations in which there are many highly correlated predictors, such as those obtained by discretizing a function, or the greyscale values of the pixels in a series of images. In cases such as these it is natural, efficient, and sometimes essential to impose a spatial smoothness constraint on the coefficients, both for improved prediction performance and interpretability. We cast the classification problem into a regression framework via optimal scoring. Using this, our proposal facilitates the use of any penalized regression technique in the classification setting. The technique is illustrated with examples in speech recognition and handwritten character recognition. AMS 1991 Classifications: Primary 62H30, Secondary 62G07
Local Regression: Automatic Kernel Carpentry
- Statistical Science
, 1993
Cited by 93 (2 self)
A kernel smoother is an intuitive estimate of a regression function or conditional expectation; at each point x_0 the estimate of E(Y | x_0) is a weighted mean of the sample Y_i, with observations close to x_0 receiving the largest weights. Unfortunately this simplicity has flaws. At the boundary of the predictor space, the kernel neighborhood is asymmetric and the estimate may have substantial bias. Bias can be a problem in the interior as well if the predictors are nonuniform or if the regression function has substantial curvature. These problems are particularly severe when the predictors are multidimensional. A variety of kernel modifications have been proposed to provide approximate and asymptotic adjustment for these biases. Such methods generally place substantial restrictions on the regression problems that can be considered; in unfavorable situations, they can perform very poorly. Moreover, the necessary modifications are very difficult to implement in the multidimensional...
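The boundary bias mentioned in this abstract can be seen in a small sketch that compares a kernel-weighted mean with a locally weighted straight-line fit. This is an illustration only: the Gaussian kernel, the function names (`kernel_smoother`, `local_linear`), and the bandwidth value are assumptions made for the example, not details taken from the paper.

```python
import numpy as np

def gaussian_kernel(u):
    # Unnormalized Gaussian weights; the constant cancels in both estimators.
    return np.exp(-0.5 * u**2)

def kernel_smoother(x, y, x0, bandwidth):
    # Weighted mean of the y_i, with weights largest for x_i close to x0.
    w = gaussian_kernel((x - x0) / bandwidth)
    return np.sum(w * y) / np.sum(w)

def local_linear(x, y, x0, bandwidth):
    # Weighted least-squares line centered at x0; the intercept is the fit at x0.
    w = gaussian_kernel((x - x0) / bandwidth)
    B = np.column_stack([np.ones_like(x), x - x0])
    WB = B * w[:, None]
    beta = np.linalg.solve(B.T @ WB, WB.T @ y)
    return beta[0]

# Noise-free linear data on [0, 1]; the true value at the boundary x0 = 0 is 1.
x = np.linspace(0.0, 1.0, 50)
y = 2.0 * x + 1.0
boundary_mean = kernel_smoother(x, y, 0.0, bandwidth=0.2)
boundary_line = local_linear(x, y, 0.0, bandwidth=0.2)
```

At the left boundary every neighbor lies to the right of x_0, so the weighted mean is pulled above the true value, while the local linear fit reproduces a straight line exactly. In the interior of this evenly spaced design the two estimates agree.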
Flexible Discriminant Analysis by Optimal Scoring
- JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
, 1993
Cited by 80 (12 self)
Fisher's linear discriminant analysis is a valuable tool for multigroup classification. With a large number of predictors, one can find a reduced number of discriminant coordinate functions that are "optimal" for separating the groups. With two such functions one can produce a classification map that partitions the reduced space into regions that are identified with group membership, and the decision boundaries are linear. This paper is about richer nonlinear classification schemes. Linear discriminant analysis is equivalent to multi-response linear regression using optimal scorings to represent the groups. We obtain nonparametric versions of discriminant analysis by replacing linear regression by any nonparametric regression method. In this way, any multi-response regression technique (such as MARS or neural networks) can be post-processed to improve its classification performance.
Piecewise-polynomial regression trees
- Statistica Sinica
, 1994
Cited by 23 (4 self)
A nonparametric function estimation method called SUPPORT (“Smoothed and Unsmoothed Piecewise-Polynomial Regression Trees”) is described. The estimate is typically made up of several pieces, each piece being obtained by fitting a polynomial regression to the observations in a subregion of the data space. Partitioning is carried out recursively as in a tree-structured method. If the estimate is required to be smooth, the polynomial pieces may be glued together by means of weighted averaging. The smoothed estimate is thus obtained in three steps. In the first step, the regressor space is recursively partitioned until the data in each piece are adequately fitted by a polynomial of a fixed order. Partitioning is guided by analysis of the distributions of residuals and cross-validation estimates of prediction mean square error. In the second step, the data within a neighborhood of each partition are fitted by a polynomial. The final estimate of the regression function is obtained by averaging the polynomial pieces, using smooth weight functions each of which diminishes rapidly to zero outside its associated partition. Estimates of derivatives of the regression function may be
REACT Scatterplot Smoothers: Superefficiency through Basis Economy
- J. AMER. STATIST. ASSOC
, 1999
Component Selection and Smoothing in Smoothing Spline Analysis of Variance Models
- COSSO. INSTITUTE OF STATISTICS MIMEO SERIES 2556, NCSU
, 2003
Cited by 18 (5 self)
We propose a new method for model selection and model fitting in nonparametric regression models, in the framework of smoothing spline ANOVA. The "COSSO" is a method of regularization with the penalty functional being the sum of component norms, instead of the squared norm employed in the traditional smoothing spline method. The COSSO provides a unified framework for several recent proposals for model selection in linear models and smoothing spline ANOVA models. Theoretical properties, such as the existence and the rate of convergence of the COSSO estimator, are studied. In the special case of a tensor product design with periodic functions, a detailed analysis reveals that the COSSO applies a novel soft thresholding type operation to the function components and selects the correct model structure with probability tending to one. We give
Uncertain Reasoning and Forecasting
- International Journal of Forecasting
, 1995
Cited by 16 (2 self)
We develop a probability forecasting model through a synthesis of Bayesian belief-network models and classical time-series analysis. By casting Bayesian time-series analyses as temporal belief-network problems, we introduce dependency models that capture richer and more realistic models of dynamic dependencies. With richer models and associated computational methods, we can move beyond the rigid classical assumptions of linearity in the relationships among variables and of normality of their probability distributions.
Semilinear high-dimensional model for normalization of microarray data: a theoretical analysis and partial consistency
- J. Amer. Statist. Assoc
, 2005
Cited by 15 (4 self)
Normalization of microarray data is essential for removing experimental biases and revealing meaningful biological results. Motivated by a problem of normalizing microarray data, a semilinear in-slide model (SLIM) has been proposed. To aggregate information from other arrays, SLIM is generalized to account for across-array information, resulting in an even more dynamic semiparametric regression model. This model can be used to normalize microarray data even when there is no replication within an array. We demonstrate that this semiparametric model has a number of interesting features. The parametric component and the nonparametric component that are of primary interest can be consistently estimated, the former having a parametric rate and the latter having a nonparametric rate, whereas the nuisance parameters cannot be consistently estimated. This is an interesting extension of the partial consistency phenomenon, which itself is of theoretical interest. The asymptotic normality for the parametric component and the rate of convergence for the nonparametric component are established. The results are augmented by simulation studies and illustrated by an application to the cDNA microarray analysis of neuroblastoma cells in response to the macrophage migration inhibitory factor.

