Results 1–10 of 53
The variable selection problem
Journal of the American Statistical Association, 2000
Cited by 44 (3 self)
The problem of variable selection is one of the most pervasive model selection problems in statistical applications. Often referred to as the problem of subset selection, it arises when one wants to model the relationship between a variable of interest and a subset of potential explanatory variables or predictors, but there is uncertainty about which subset to use. This vignette reviews some of the key developments which have led to the wide variety of approaches for this problem.
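The subset-selection setup this abstract describes can be made concrete with a toy exhaustive search: fit every subset of candidate predictors by least squares and score each fit with BIC, one of the many criteria the review covers. Everything below (the data, the choice of BIC, the helper functions) is an illustrative sketch, not taken from the paper:

```python
import itertools
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def rss_for_subset(X, y, subset):
    """Least-squares residual sum of squares for intercept + chosen columns."""
    cols = [[1.0] * len(y)] + [[row[j] for row in X] for j in subset]
    p = len(cols)
    XtX = [[sum(a * b for a, b in zip(cols[i], cols[j])) for j in range(p)]
           for i in range(p)]
    Xty = [sum(c * yi for c, yi in zip(cols[i], y)) for i in range(p)]
    beta = solve(XtX, Xty)
    fitted = [sum(beta[i] * cols[i][k] for i in range(p)) for k in range(len(y))]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

def best_subset_bic(X, y):
    """Exhaustively score every predictor subset with BIC; return the winner."""
    n, d = len(y), len(X[0])
    best = None
    for k in range(d + 1):
        for subset in itertools.combinations(range(d), k):
            rss = max(rss_for_subset(X, y, subset), 1e-12)
            bic = n * math.log(rss / n) + (k + 1) * math.log(n)
            if best is None or bic < best[0]:
                best = (bic, subset)
    return best[1]

# Toy data: y depends on predictor 0 only; predictors 1 and 2 are irrelevant.
noise = [0.3, -0.2, 0.1, -0.3, 0.2, -0.1] * 2
X = [[float(i), float(i * 7 % 12), float((i * 5 + 3) % 12)] for i in range(12)]
y = [2.0 * X[i][0] + 1.0 + noise[i] for i in range(12)]
print(best_subset_bic(X, y))
```

The exhaustive scan is only feasible for a handful of predictors, which is exactly why the stepwise, shrinkage, and Bayesian alternatives surveyed in the review exist.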
Smoothing spline ANOVA models for large data sets with Bernoulli observations and the randomized GACV
Ann. Statist.
Cited by 44 (19 self)
(ranGACV) method for choosing multiple smoothing parameters in penalized likelihood estimates for Bernoulli data. The method is intended for application with penalized likelihood smoothing spline ANOVA models. In addition, we propose a class of approximate numerical methods for solving the penalized likelihood variational problem which, in conjunction with the ranGACV method, allows the application of smoothing spline ANOVA models with Bernoulli data to much larger data sets than previously possible. These methods are based on choosing an approximating subset of the natural (representer) basis functions for the variational problem. Simulation studies with synthetic data, including synthetic data mimicking demographic risk factor data sets, are used to examine the properties of the method and to compare the approach with the GRKPACK code of Wang (1997c). Bayesian “confidence intervals” are obtained for the fits and are shown in the simulation studies to have the “across the function” property usually claimed for these confidence intervals. Finally, the method is applied …
Component selection and smoothing in multivariate nonparametric regression
Cited by 39 (0 self)
We propose a new method for model selection and model fitting in multivariate nonparametric regression models, in the framework of smoothing spline ANOVA. The “COSSO” is a method of regularization with the penalty functional being the sum of component norms, instead of the squared norm employed in the traditional smoothing spline method. The COSSO provides a unified framework for several recent proposals for model selection in linear models and smoothing spline ANOVA models. Theoretical properties, such as the existence and the rate of convergence of the COSSO estimator, are studied. In the special case of a tensor product design with periodic functions, a detailed analysis reveals that the COSSO does model selection by applying a novel soft thresholding type operation to the function components. We give an equivalent formulation of the COSSO estimator which leads naturally to an iterative algorithm. We compare the COSSO with MARS, a popular method that builds functional ANOVA models, in simulations and real examples. The COSSO method can be extended to classification problems and we compare its performance with those of a number of machine learning algorithms on real datasets. The COSSO gives very competitive performance in these studies.
Regression approaches for microarray data analysis
 Journal of Computational Biology
Cited by 30 (2 self)
A variety of new procedures have been devised to handle the two-sample comparison (e.g., tumor versus normal tissue) of gene expression values as measured with microarrays. Such new methods are required in part because of some defining characteristics of microarray-based studies: (i) the very large number of genes contributing expression measures, which far exceeds the number of samples (observations) available, and (ii) the fact that, by virtue of pathway/network relationships, the gene expression measures tend to be highly correlated. These concerns are exacerbated in the regression setting, where the objective is to relate gene expression, simultaneously for multiple genes, to some external outcome or phenotype. Correspondingly, several methods have been recently proposed for addressing these issues. We briefly critique some of these methods prior to a detailed evaluation of gene harvesting. This reveals that gene harvesting, without additional constraints, can yield artifactual solutions. Results obtained employing such constraints motivate the use of regularized regression procedures such as the lasso, least angle regression, and support vector machines. Model selection and solution multiplicity issues are also discussed. The methods are evaluated using a microarray-based study of cardiomyopathy in transgenic mice.
A proximal iteration for deconvolving Poisson noisy images using sparse representations
Data mining criteria for tree-based regression and classification
In Proceedings KDD, 2001
Cited by 22 (1 self)
This paper is concerned with the construction of regression and classification trees that are better adapted to data mining applications than conventional trees. To this end, we propose new splitting criteria for growing trees. Conventional splitting criteria attempt to perform well on both sides of a split by attempting a compromise in the quality of fit between the left and the right side. By contrast, we adopt a data mining point of view by proposing criteria that search for interesting subsets of the data, as opposed to modeling all of the data equally well. The new criteria do not split based on a compromise between the left and the right bucket; they effectively pick the more interesting bucket and ignore the other. As expected, the result is often a simpler characterization of interesting subsets of the data. Less expected is that the new criteria often yield whole trees that provide more interpretable data descriptions. Surprisingly, it is a “flaw” that works to their advantage: the new criteria have an increased tendency to accept splits near the boundaries of the predictor ranges. This so-called “end-cut problem” leads to the repeated peeling of small layers of data and results in very unbalanced but highly expressive and interpretable trees.
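The contrast between compromise-based and one-sided splitting can be sketched on a single numeric predictor. The scoring functions below are illustrative stand-ins, not the criteria proposed in the paper: the conventional score is the usual reduction in squared error, while the one-sided score rewards only the more "interesting" (most atypical) bucket.

```python
def sse(ys):
    """Sum of squared deviations from the mean."""
    if not ys:
        return 0.0
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys)

def best_split(x, y, criterion):
    """Scan all split points on a 1-D predictor; return the best split value."""
    pairs = sorted(zip(x, y))
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    best = None
    for i in range(1, len(ys)):                  # left bucket = first i points
        score = criterion(ys[:i], ys[i:], ys)
        if best is None or score > best[0]:
            best = (score, xs[i - 1])            # split just after xs[i-1]
    return best[1]

def two_sided(left, right, all_y):
    """Conventional criterion: total reduction in squared error."""
    return sse(all_y) - sse(left) - sse(right)

def one_sided(left, right, all_y):
    """Illustrative one-sided criterion: score only the bucket whose mean
    deviates most (size-weighted) from the overall mean; ignore the other."""
    m = sum(all_y) / len(all_y)
    dev = lambda b: len(b) * (sum(b) / len(b) - m) ** 2
    return max(dev(left), dev(right))

x = list(range(10))
y = [0.0] * 8 + [10.0, 10.0]    # small extreme group at the right boundary
print(best_split(x, y, two_sided), best_split(x, y, one_sided))
```

On this clean example both criteria isolate the extreme group (split after x = 7); the criteria diverge on noisier data, where the compromise criterion trades off fit on both sides while the one-sided one keeps peeling off boundary layers, as the abstract describes.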
Quantile Regression in Reproducing Kernel Hilbert Spaces
Cited by 14 (1 self)
In this article we consider quantile regression in reproducing kernel Hilbert spaces, which we call kernel quantile regression (KQR). We make three contributions: (1) we propose an efficient algorithm that computes the entire solution path of the KQR, with essentially the same computational cost as fitting one KQR model; (2) we derive a simple formula for the effective dimension of the KQR model, which allows convenient selection of the regularization parameter; and (3) we develop an asymptotic theory for the KQR model.
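KQR minimizes the "check" (pinball) loss plus an RKHS penalty. The loss alone already encodes the quantile: minimizing it over a constant recovers the sample quantile. A minimal stdlib sketch, with a grid search over the data points standing in for the paper's RKHS optimization:

```python
def pinball(residual, tau):
    """Check loss: tau * r for r >= 0, (tau - 1) * r for r < 0."""
    return tau * residual if residual >= 0 else (tau - 1) * residual

def constant_quantile_fit(ys, tau):
    """Constant minimizing the empirical pinball loss: the tau-th sample
    quantile (an optimum is always attained at one of the data points)."""
    return min(ys, key=lambda c: sum(pinball(y - c, tau) for y in ys))

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(constant_quantile_fit(data, 0.5))    # -> 5  (median)
print(constant_quantile_fit(data, 0.25))   # -> 3  (lower quartile)
```

The full KQR replaces the constant with a function in a reproducing kernel Hilbert space and adds a squared-norm penalty; the article's contribution is computing the entire path of such fits over the regularization parameter.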
Microarray gene expression data with linked survival phenotypes: diffuse large B-cell lymphoma revisited
Biostatistics, 2006
Cited by 14 (0 self)
Diffuse large B-cell lymphoma (DLBCL) is an aggressive malignancy of mature B lymphocytes and is the most common type of lymphoma in adults. While treatment advances have been substantial in what was formerly a fatal disease, less than 50% of patients achieve lasting remission. In an effort to predict treatment success and explain disease heterogeneity, clinical features have been employed for prognostic purposes, but have yielded only modest predictive performance. This has spawned a series of high-profile microarray-based gene expression studies of DLBCL, in the hope that molecular-level information could be used to refine prognosis. The intent of this paper is to reevaluate these microarray-based prognostic assessments and extend the statistical methodology that has been used in this context. Methodological challenges arise in using patients’ gene expression profiles to predict survival endpoints on account of the large number of genes and their complex interdependence. We initially focus on the Lymphochip data and analysis of Rosenwald et al. (2002). After describing relationships between the analyses performed and gene harvesting (Hastie et al., 2001), we argue for the utility of penalized approaches, in particular LARS-Lasso (Efron et al., 2004). While these techniques have been extended to the proportional hazards / partial likelihood framework, the resultant algorithms are computationally burdensome. We develop residual-based approximations that eliminate this burden yet perform similarly. Comparisons of predictive accuracy across both methods and studies are effected using time-dependent ROC curves. These indicate that gene expression data, in turn, delivers only modest predictions of post-therapy DLBCL survival. We conclude by outlining possibilities for further work.
When do stepwise algorithms meet subset selection criteria?
, 2007
Cited by 9 (3 self)
Recent results in homotopy and solution paths demonstrate that certain well-designed greedy algorithms, with a range of values of the algorithmic parameter, can provide solution paths to a sequence of convex optimization problems. On the other hand, in regression many existing criteria in subset selection (including Cp, AIC, BIC, MDL, RIC, etc.) involve optimizing an objective function that contains a counting measure. The two optimization problems are formulated as (P1) and (P0) in the present paper. The latter is generally combinatoric and has been proven to be NP-hard. We study the conditions under which the two optimization problems have common solutions. Hence, in these situations a stepwise algorithm can be used to solve the seemingly unsolvable problem. Our main result is motivated by recent work in sparse representation, while two others emerge from different angles: a direct analysis of sufficiency and necessity and a condition on the most correlated covariates. An extreme example connected with least angle regression is of independent interest.
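In the special case of an orthonormal design the two problems decouple coordinate-wise, and the contrast between them becomes explicit: the counting-measure penalty of (P0) hard-thresholds the least-squares coefficients, while the l1 relaxation of (P1) soft-thresholds them. A sketch under that orthonormality assumption (not the general setting the paper studies):

```python
import math

def hard_threshold(z, lam):
    """argmin_b (z - b)^2 + lam * 1{b != 0}: keep z iff z^2 > lam."""
    return z if z * z > lam else 0.0

def soft_threshold(z, lam):
    """argmin_b (z - b)^2 + lam * |b|: shrink |z| toward zero by lam / 2."""
    return math.copysign(max(abs(z) - lam / 2, 0.0), z)

ols = [3.0, 0.5, -2.0, 0.1]   # least-squares coefficients, orthonormal design
lam = 1.0
p0 = [hard_threshold(z, lam) for z in ols]   # subset-selection-style solution
p1 = [soft_threshold(z, lam) for z in ols]   # lasso-style solution
print(p0)  # -> [3.0, 0.0, -2.0, 0.0]
print(p1)  # -> [2.5, 0.0, -1.5, 0.0]
```

Both rules zero out the same small coefficients here, which is the kind of agreement between the (P0) and (P1) solutions whose general conditions the paper characterizes; the soft rule additionally shrinks the survivors.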
Regularization Methods for Additive Models
Lect. Notes Comput. Sci., 2003
Cited by 7 (2 self)
This paper tackles the problem of model complexity in the context of additive models. Several methods have been proposed to estimate smoothing parameters, as well as to perform variable selection. Nevertheless, …