Model selection and estimation in regression with grouped variables
Journal of the Royal Statistical Society, Series B, 2006
Cited by 238 (5 self)
We consider the problem of selecting grouped variables (factors) for accurate prediction in regression. Such a problem arises naturally in many practical situations with the multi-factor ANOVA problem as the most important and well known example. Instead of selecting factors by stepwise backward elimination, we focus on estimation accuracy and consider extensions of the LASSO, the LARS, and the nonnegative garrote for factor selection. The LASSO, the LARS, and the nonnegative garrote are recently proposed regression methods that can be used to select individual variables. We study and propose efficient algorithms for the extensions of these methods for factor selection, and show that these extensions give superior performance to the traditional stepwise backward elimination method in factor selection problems. We study the similarities and the differences among these methods. Simulations and real examples are used to illustrate the methods.
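To make the idea of selecting whole factors concrete, here is a minimal Python sketch of group-wise soft-thresholding, the shrinkage step commonly associated with a group LASSO penalty. It is not taken from the paper: it assumes an orthonormal design so that each group's least-squares coefficients can be shrunk independently, and the function names and the `lam * sqrt(group size)` threshold are illustrative assumptions.

```python
import math


def group_soft_threshold(z, threshold):
    """Shrink the coefficient group z toward zero as a block.

    The whole group is set to zero when its Euclidean norm is at most
    `threshold`; otherwise every entry is scaled by the same factor.
    """
    norm = math.sqrt(sum(v * v for v in z))
    if norm <= threshold:
        return [0.0] * len(z)
    scale = 1.0 - threshold / norm
    return [scale * v for v in z]


def group_lasso_orthonormal(z_groups, lam):
    """Apply group soft-thresholding to each group of least-squares coefficients.

    Assumption (hypothetical setup): the design is orthonormal, and the
    threshold for a group of size p is lam * sqrt(p).
    """
    return [group_soft_threshold(z, lam * math.sqrt(len(z))) for z in z_groups]


if __name__ == "__main__":
    # Two factors: a strong two-column factor and a weak one-column factor.
    print(group_lasso_orthonormal([[3.0, 4.0], [0.1]], lam=1.0))
```

Because the shrinkage acts on the norm of the whole group, a factor is either kept with all of its columns or dropped entirely, which is the behaviour the abstract contrasts with selecting individual variables.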
Mixtures of g-priors for Bayesian variable selection
Journal of the American Statistical Association, 2008
Cited by 14 (4 self)
Zellner’s g-prior remains a popular conventional prior for use in Bayesian variable selection, despite several undesirable consistency issues. In this paper, we study mixtures of g-priors as an alternative to default g-priors that resolve many of the problems with the original formulation, while maintaining the computational tractability that has made the g-prior so popular. We present theoretical properties of the mixture g-priors and provide real and simulated examples to compare the mixture formulation with fixed g-priors, Empirical Bayes approaches and other default procedures.
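For orientation, the standard textbook form of Zellner's g-prior and the shrinkage it induces can be written as follows. This is a sketch under the usual assumptions (zero prior mean, Gaussian linear model with error variance \sigma^2), not a formula quoted from the paper; a mixture of g-priors replaces the fixed g with a prior distribution on g.

```latex
\beta \mid \sigma^2, g \;\sim\; \mathcal{N}\!\left(0,\; g\,\sigma^2 (X^\top X)^{-1}\right),
\qquad
\mathbb{E}[\beta \mid y, g] \;=\; \frac{g}{1+g}\,\hat{\beta}_{\mathrm{OLS}}
```

With a fixed g, the least-squares estimate is shrunk by the constant factor g/(1+g); letting g be random makes the amount of shrinkage depend on the data.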
Efficient empirical Bayes variable selection and estimation in linear models
J. Amer. Statist. Assoc., 2005
Cited by 11 (3 self)
We propose an empirical Bayes method for variable selection and coefficient estimation in linear regression models. The method is based on a particular hierarchical Bayes formulation, and the empirical Bayes estimator is shown to be closely related to the LASSO estimator. Such a connection allows us to take advantage of the recently developed quick LASSO algorithm to compute the empirical Bayes estimate, and provides a new way to select the tuning parameter in the LASSO method. Unlike previous empirical Bayes variable selection methods, which in most practical situations can only be implemented through a greedy stepwise algorithm, our method gives a global solution efficiently. Simulations and real examples show that the proposed method is very competitive in terms of variable selection, estimation accuracy, and computation speed when compared with other variable selection and estimation methods.
LASSO-Patternsearch Algorithm with Application to Ophthalmology and Genomic Data
2008
Cited by 10 (8 self)
The LASSO-Patternsearch algorithm is proposed to efficiently identify patterns of multiple dichotomous risk factors for outcomes of interest in demographic and genomic studies. The patterns considered are those that arise naturally from the log linear expansion of the multivariate Bernoulli density. The method is designed for the case where there is a possibly very large number of candidate patterns but it is believed that only a relatively small number are important. A LASSO is used to greatly reduce the number of candidate patterns, using a novel computational algorithm that can handle an extremely large number of unknowns simultaneously. The patterns surviving the LASSO are further pruned in the framework of (parametric) generalized linear models. A novel tuning procedure based on the GACV for Bernoulli outcomes, modified to act
When do stepwise algorithms meet subset selection criteria
ISyE Statistics Technical Report, 2005, URL = http://www.isye.gatech.edu/statistics/papers
Cited by 7 (3 self)
Recent results in homotopy and solution paths demonstrate that certain well-designed greedy algorithms, with a range of values of the algorithmic parameter, can provide solution paths to a sequence of convex optimization problems. On the other hand, in regression many existing criteria in subset selection (including Cp, AIC, BIC, MDL, RIC, etc.) involve optimizing an objective function that contains a counting measure. The two optimization problems are formulated as (P1) and (P0) in the present paper. The latter is generally combinatorial and has been proven to be NP-hard. We study the conditions under which the two optimization problems have common solutions. Hence, in these situations a stepwise algorithm can be used to solve the seemingly unsolvable problem. Our main result is motivated by recent work in sparse representation, while two others emerge from different angles: a direct analysis of sufficiency and necessity and a condition on the most correlated covariates. An extreme example connected with least angle regression is of independent interest.
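The abstract does not spell out the two problems, so the following is only a plausible reading, assuming a squared-error loss with penalty weights \lambda_0 and \lambda_1 (the paper's exact scaling and notation may differ). The counting measure corresponds to the \ell_0 "norm", and the convex counterpart to the \ell_1 norm:

```latex
(P_0):\quad \min_{\beta}\; \|y - X\beta\|_2^2 + \lambda_0 \|\beta\|_0,
\qquad \|\beta\|_0 = \#\{\, j : \beta_j \neq 0 \,\}

(P_1):\quad \min_{\beta}\; \|y - X\beta\|_2^2 + \lambda_1 \|\beta\|_1,
\qquad \|\beta\|_1 = \sum_j |\beta_j|
```

Under this reading, criteria such as Cp, AIC, BIC, and RIC correspond to particular choices of \lambda_0 in (P_0), while (P_1) is the convex problem whose solution path a LASSO-type stepwise algorithm can trace.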
Cryptanalysis of the Cellular Message Encryption Algorithm By David Wagner Bruce Schneier John Kelsey i
IEEE/ACM Trans. Comput. Biol. Bioinform., 2005
Cited by 6 (0 self)
We construct a gene-to-gene regulatory network from time-series data of expression levels for the whole genome of the yeast Saccharomyces cerevisiae, in a case where the number of measurements is much smaller than the number of genes in the network. This network is analyzed with respect to present biological knowledge of all genes (according to the Gene Ontology database), and we find some of its large-scale properties to be in accordance with known facts about the organism. The linear modeling employed here has been explored several times, but due to lack of any validation beyond investigating individual genes, it has been seriously questioned with respect to its applicability to biological systems. Our results show the adequacy of the approach and make further investigations of the model meaningful. Index Terms: Biology and genetics, time series analysis, network problems, gene network, network inference, Lasso, yeast, validation, outdegree.
Streamwise Feature Selection
Journal of Machine Learning Research, 2006
Cited by 5 (3 self)
In streamwise feature selection, new features are sequentially considered for addition to a predictive model. When the space of potential features is large, streamwise feature selection offers many advantages over traditional feature selection methods, which assume that all features are known in advance. Features can be generated dynamically, focusing the search for new features on promising subspaces, and overfitting can be controlled by dynamically adjusting the threshold for adding features to the model. In contrast to traditional forward feature selection algorithms such as stepwise regression, in which at each step all possible features are evaluated and the best one is selected, streamwise feature selection only evaluates each feature once when it is generated. We describe information-investing and α-investing, two adaptive complexity penalty methods for streamwise feature selection which dynamically adjust the threshold on the error reduction required for adding a new feature. These two methods give false discovery rate style guarantees against overfitting. They differ
Variable Selection and Model Choice in Geoadditive Regression Models
Cited by 3 (1 self)
Model choice and variable selection are issues of major concern in practical regression analyses. We propose a boosting procedure that facilitates both tasks in a class of complex geoadditive regression models comprising spatial effects, nonparametric effects of continuous covariates, interaction surfaces, random effects, and varying coefficient terms. The major modelling components are penalized splines and their bivariate tensor product extensions. All smooth model terms are represented as the sum of a parametric component and a remaining smooth component with one degree of freedom to obtain a fair comparison between all model terms. A generic representation of the geoadditive model allows us to devise a general boosting algorithm that implements automatic model choice and variable selection. We demonstrate the versatility of our approach with two examples: a geoadditive Poisson regression model for species counts in habitat suitability analyses and a geoadditive logit model for the analysis of forest health. Key words: bivariate smoothing, boosting, functional gradient, penalised splines, random effects, space-varying effects
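The way boosting can perform variable selection is easiest to see in a stripped-down setting. The Python sketch below is a generic componentwise L2 boosting loop with simple linear base learners; it is an illustrative simplification, not the paper's procedure, which uses penalized spline base learners and non-Gaussian responses. The function name, the default step size, and the assumption that covariate columns are centered are all hypothetical choices.

```python
def componentwise_l2_boost(columns, y, n_steps=50, step_size=0.1):
    """Componentwise L2 boosting with one linear base learner per covariate.

    columns: list of covariate columns (each a list of floats, ideally centered).
    y: list of responses.
    Returns (intercept, coefs); covariates never picked keep a zero coefficient,
    which is how the procedure selects variables.
    """
    n = len(y)
    intercept = sum(y) / n
    coefs = [0.0] * len(columns)
    residuals = [yi - intercept for yi in y]

    for _ in range(n_steps):
        best_j, best_beta, best_gain = None, 0.0, 0.0
        for j, col in enumerate(columns):
            sxx = sum(x * x for x in col)
            if sxx == 0.0:
                continue  # a constant-zero column cannot explain anything
            sxr = sum(x * r for x, r in zip(col, residuals))
            gain = sxr * sxr / sxx  # drop in residual sum of squares for learner j
            if gain > best_gain:
                best_j, best_beta, best_gain = j, sxr / sxx, gain
        if best_j is None:
            break  # no base learner improves the fit
        # Take only a small step toward the best-fitting base learner.
        coefs[best_j] += step_size * best_beta
        residuals = [
            r - step_size * best_beta * x
            for r, x in zip(residuals, columns[best_j])
        ]
    return intercept, coefs


if __name__ == "__main__":
    x1 = [-1.0, 0.0, 1.0, 2.0, -2.0]
    x2 = [1.0, -1.0, 1.0, -1.0, 0.0]
    y = [2.0 * a + 1.0 for a in x1]  # depends on x1 only
    print(componentwise_l2_boost([x1, x2], y))
```

Stopping the loop early leaves some coefficients at exactly zero, so the number of boosting iterations plays the role of the tuning parameter for model choice.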
Streaming Feature Selection using Alpha-investing
2005
Cited by 1 (1 self)
In Streaming Feature Selection (SFS), new features are sequentially considered for addition to a predictive model. When the space of potential features is large, SFS offers many advantages over traditional feature selection methods, which assume that all features are known in advance. Features can be generated dynamically, focusing the search for new features on promising subspaces, and overfitting can be controlled by dynamically adjusting the threshold for adding features to the model. We describe α-investing, an adaptive complexity penalty method for SFS which dynamically adjusts the threshold on the error reduction required for adding a new feature. α-investing gives false discovery rate-style guarantees against overfitting. It differs from standard penalty methods such as AIC, BIC or RIC, which always drastically over- or under-fit in the limit of infinite numbers of non-predictive features. Empirical results show that SFS is competitive with much more compute-intensive feature selection methods such as stepwise regression, and allows feature selection on problems with over a million potential features.
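The dynamically adjusted threshold can be sketched in a few lines of Python. The abstract does not give the update rule, so the version below rests on assumptions: each feature is tested at level wealth / (2 * i), an accepted feature earns a fixed payout, and a rejected one costs the level that was spent. The function name, the default wealth and payout of 0.5, and the use of precomputed p-values (in practice each p-value would depend on the features already in the model) are hypothetical simplifications.

```python
def alpha_investing(p_values, initial_wealth=0.5, payout=0.5):
    """Select features from a stream of p-values with an adaptive threshold.

    p_values: p-value of each candidate feature, in arrival order
              (assumed precomputed here for simplicity).
    Returns the list of indices of accepted features.
    """
    wealth = initial_wealth
    selected = []
    for i, p in enumerate(p_values, start=1):
        alpha_i = wealth / (2 * i)  # level spent on testing the i-th feature
        if p < alpha_i:
            selected.append(i - 1)
            wealth += payout - alpha_i  # a discovery earns back wealth
        else:
            wealth -= alpha_i  # a failed test only costs the amount bid
    return selected


if __name__ == "__main__":
    print(alpha_investing([0.001, 0.9, 0.2, 0.0001]))
```

Each feature is examined exactly once, and the threshold tightens after a run of rejections and loosens after an acceptance, which is what lets the procedure cope with a very long stream of mostly uninformative features.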

