Regularization paths for generalized linear models via coordinate descent (2009)

by Jerome Friedman, Trevor Hastie, Rob Tibshirani
Citations: 722 (15 self)

Citations

4200 Regression shrinkage and selection via the lasso - Tibshirani - 1996

Citation Context

...dle large problems and can also deal efficiently with sparse features. In comparative timings we find that the new algorithms are considerably faster than competing methods. 1 Introduction The lasso [Tibshirani, 1996] is a popular method for regression that uses an ℓ1 penalty to achieve a sparse solution. In the signal processing literature, the lasso is also known as basis pursuit [Chen et al., 1998]. This idea ...

2712 Atomic decomposition by basis pursuit - Chen, Donoho, et al. - 1998

Citation Context

...on The lasso [Tibshirani, 1996] is a popular method for regression that uses an ℓ1 penalty to achieve a sparse solution. In the signal processing literature, the lasso is also known as basis pursuit [Chen et al., 1998]. This idea has been broadly applied, for example to generalized linear models [Tibshirani, 1996] and Cox’s proportional hazard models for survival data [Tibshirani, 1997]. In recent years, there has...

1774 Molecular classification of cancer: class discovery and class prediction by gene expression monitoring - Golub, Slonim, et al. - 1999

Citation Context

...e sparsity of the solution to (1) (i.e. the number of coefficients equal to zero) increases monotonically from 0 to the sparsity of the lasso solution. Figure 1 shows an example. The dataset is from [Golub et al., 1999b], consisting of 72 observations on 3571 genes measured with DNA microarrays. The observations fall in two classes: we treat this as a regression problem for illustration. The coefficient profiles fr...

1325 Least angle regression - Efron, Hastie, et al. - 2004
1266 Ideal spatial adaptation by wavelet shrinkage - Donoho, Johnstone - 1994

Citation Context

.... If β̃j > 0, then

    ∂R/∂βj |β=β̃ = −(1/N) Σ_{i=1}^N x_ij (y_i − β̃_0 − x_i^T β̃) + λ(1 − α) β̃_j + λα.   (4)

A similar expression exists if β̃j < 0, and β̃j = 0 is treated separately. Simple calculus shows [Donoho and Johnstone, 1994] that the coordinate ...

[Figure: coefficient profiles plotted against step number; plot data not recoverable from the extraction.]
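The coordinate-wise update this context refers to reduces to a soft-thresholding step. A minimal Python sketch (illustrative only, not glmnet itself; the function names and the assumption of standardized predictors are ours):

```python
import numpy as np

def soft_threshold(z, gamma):
    """Soft-thresholding operator: sign(z) * max(|z| - gamma, 0)."""
    return np.sign(z) * np.maximum(np.abs(z) - gamma, 0.0)

def enet_coordinate_descent(X, y, lam, alpha, n_sweeps=200):
    """Naive elastic-net coordinate descent (illustrative sketch).

    Assumes the columns of X are standardized and y is centered, so the
    coordinate update is S(z_j, lam*alpha) / (1 + lam*(1 - alpha)).
    """
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_sweeps):
        for j in range(p):
            # partial residual with feature j removed
            r_j = y - X @ beta + X[:, j] * beta[j]
            z_j = X[:, j] @ r_j / n
            beta[j] = soft_threshold(z_j, lam * alpha) / (1.0 + lam * (1.0 - alpha))
    return beta
```

With alpha = 1 this is the plain lasso update; with alpha = 0 it reduces to ridge shrinkage of the univariate least-squares coefficient.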

1156 Model selection and estimation in regression with grouped variables - Yuan, Lin - 2006

Citation Context

...roportional hazard models for survival data (Tibshirani 1997). In recent years, there has been an enormous amount of research activity devoted to related regularization methods: 1. The grouped lasso (Yuan and Lin 2007; Meier et al. 2008), where variables are included or excluded in groups. 2. The Dantzig selector (Candes and Tao 2007, and discussion), a slightly modified version of the lasso. 3. The elastic net (Z...

1107 R: A Language and Environment for Statistical Computing - R Development Core Team - 2009
972 Regularization and variable selection via the elastic net - Zou, Hastie - 2005

Citation Context

...7; Meier et al. 2008), where variables are included or excluded in groups. 2. The Dantzig selector (Candes and Tao 2007, and discussion), a slightly modified version of the lasso. 3. The elastic net (Zou and Hastie 2005) for correlated variables, which uses a penalty that is part ℓ1, part ℓ2. 4. ℓ1 regularization paths for generalized linear models (Park and Has...
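For reference, the "part ℓ1, part ℓ2" elastic-net penalty mentioned in item 3 has the standard form (with mixing parameter α as used in the paper under discussion):

```latex
P_\alpha(\beta) \;=\; \sum_{j=1}^{p} \left[ \tfrac{1}{2}(1-\alpha)\,\beta_j^{2} \;+\; \alpha\,\lvert\beta_j\rvert \right],
\qquad 0 \le \alpha \le 1,
```

with α = 1 giving the lasso penalty and α = 0 the ridge penalty.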

941 Variable selection via nonconcave penalized likelihood and its oracle properties - Fan, Li - 2001

Citation Context

...part ℓ2. 4. ℓ1 regularization paths for generalized linear models (Park and Hastie 2007a). 5. Methods using non-concave penalties, such as SCAD (Fan and Li 2005) and Friedman’s generalized elastic net (Friedman 2008), enforce more severe variable selection than the lasso. 6. Regularization paths for the support-vector machine (Hastie et al. 2004). 7. The gra...

868 The Dantzig selector: Statistical estimation when p ≫ n - Candès, Tao - 2007
745 An iterative thresholding algorithm for linear inverse problems with a sparsity constraint - Daubechies, Defrise, et al. - 2004
681 The adaptive lasso and its oracle properties - Zou - 2006

Citation Context

...nalized, and always enters the model unrestricted at the first step and remains in the model. Penalty rescaling would also allow, for example, our software to be used to implement the adaptive lasso (Zou 2006). Considerable speedup is obtained by organizing the iterations around the active set of features— those with nonzero coefficients. After a complete cycle through all the variables, we iterate on onl...
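The active-set strategy described in this context can be sketched as follows (an illustrative Python version for the plain lasso; names and convergence details are ours, not glmnet's):

```python
import numpy as np

def soft_threshold(z, gamma):
    return np.sign(z) * np.maximum(np.abs(z) - gamma, 0.0)

def lasso_cd_active_set(X, y, lam, tol=1e-8, max_outer=50):
    """Lasso coordinate descent organized around the active set.

    One complete cycle over all variables, then iterate on the nonzero
    coefficients only until convergence, then re-check with a full cycle.
    Assumes standardized columns of X and centered y.
    """
    n, p = X.shape
    beta = np.zeros(p)

    def sweep(indices):
        max_change = 0.0
        for j in indices:
            r_j = y - X @ beta + X[:, j] * beta[j]   # partial residual
            new = soft_threshold(X[:, j] @ r_j / n, lam)
            max_change = max(max_change, abs(new - beta[j]))
            beta[j] = new
        return max_change

    for _ in range(max_outer):
        sweep(range(p))                        # complete cycle through all variables
        active = np.flatnonzero(beta)          # active set: nonzero coefficients
        while sweep(active) > tol:             # iterate on the active set only
            pass
        if sweep(range(p)) <= tol:             # a full cycle changed nothing: done
            break
    return beta
```

The payoff is that when the solution is sparse, most inner iterations touch only a handful of coordinates rather than all p of them.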

595 Sparse inverse covariance estimation with the graphical lasso - Friedman, Hastie, et al.
559 Newsweeder: Learning to filter netnews - Lang - 1995

Citation Context

...ment classification problem with mostly binary features. The response is binary, and indicates whether the document is an advertisement. Only 1.2% nonzero values in the predictor matrix. • Newsgroup [Lang, 1995]: document classification problem. We used the training set cultured from these data by Koh et al. [2007]. The response is binary, and indicates a subclass of topics; the predictors are binary, and i...

331 Multiclass cancer diagnosis using tumor gene expression signatures - Ramaswamy, Tamayo, et al.
324 Pathwise coordinate optimization - Friedman, Hastie, et al.
298 Convergence of a block coordinate descent method for nondifferentiable minimization - Tseng - 2001

Citation Context

...t (Friedman et al. 2009) implemented in the R programming system (R Development Core Team 2009). We do not revisit the well-established convergence properties of coordinate descent in convex problems (Tseng 2001) in this article. Lasso procedures are frequently used in domains with very large datasets, such as genomics and web analysis. Consequently a focus of our research has been algorithmic efficiency and...

290 An interior-point method for large-scale l1-regularized logistic regression - Koh, Kim, et al. - 2007
276 The group lasso for logistic regression - Meier, Geer, et al. - 2008
242 A new approach to variable selection in least squares problems. - Osborne, Presnell, et al. - 2000
202 The entire regularization path for the support vector machine - Hastie, Rosset, et al. - 2004

Citation Context

..., such as SCAD (Fan and Li 2005) and Friedman’s generalized elastic net (Friedman 2008), enforce more severe variable selection than the lasso. 6. Regularization paths for the support-vector machine (Hastie et al. 2004). 7. The graphical lasso (Friedman et al. 2008) for sparse covariance estimation and undirected graphs. Efron et al. (2004) developed an efficient algorithm for computing the entire regularization pa...

193 Boosting for Tumor Classification with Gene Expression - Dettling, Bühlmann - 2003
190 Large-scale Bayesian logistic regression for text categorization - Genkin, Lewis, et al.

Citation Context

...s outcome z with Pr(z = 1) = p, Pr(z = 0) = 1 − p. We compared the speed of glmnet to the interior point method l1logreg (Koh et al. 2007b,a), Bayesian binary regression (BBR, Madigan and Lewis 2007; Genkin et al. 2007), and the lasso penalized logistic program LPL supplied by Ken Lange (see Wu and Lange 2008). The latter two methods also use a coordinate descent approach. The BBR software automatically performs te...

190 Sparse multinomial logistic regression: Fast algorithms and generalization bounds - Krishnapuram, Carin, et al. - 2005
178 Scalable training of L1-regularized log-linear models - Andrew, Gao
139 Piecewise linear regularized solution paths - Rosset, Zhu - 2007
131 Molecular classification of cancer: class discovery and class prediction by gene expression monitoring - Golub, Slonim, et al. - 1999

Citation Context

...e number of coefficients equal to zero) increases monotonically from 0 to the sparsity of the lasso solution. Figure 1 shows an example that demonstrates the effect of varying α. The dataset is from (Golub et al. 1999), consisting of 72 observations on 3571 genes measured with DNA microarrays. The observations fall in two classes, so we use the penalties in conjunction with the 1 Zou and Hastie (2005) called this ...

131 Regularization and variable selection via the elastic net - Zou, Hastie

Citation Context

...email: hastie@stanford.edu. Sequoia Hall, Stanford University, CA 94305. 2. The Dantzig selector [Candes and Tao, 2007, and discussion], a slightly modified version of the lasso; 3. The elastic net [Zou and Hastie, 2005] for correlated variables, which uses a penalty that is part ℓ1, part ℓ2; 4. ℓ1 regularization paths for generalized linear models [Park and Hastie, 2006]; 5. Regularization paths for the support-vec...

121 L1-regularization path algorithm for generalized linear models - Park, Hastie

Citation Context

...fied version of the lasso; 3. The elastic net [Zou and Hastie, 2005] for correlated variables, which uses a penalty that is part ℓ1, part ℓ2; 4. ℓ1 regularization paths for generalized linear models [Park and Hastie, 2007]; 5. Regularization paths for the support-vector machine [Hastie et al., 2004]. 6. The graphical lasso [Friedman et al., 2008] for sparse covariance estimation and undirected graphs Efron et al. [200...

119 The LASSO Method for Variable Selection in the Cox Model - Tibshirani - 1997

Citation Context

... also known as basis pursuit (Chen et al. 1998). This idea has been broadly applied, for example to generalized linear models (Tibshirani 1996) and Cox’s proportional hazard models for survival data (Tibshirani 1997). In recent years, there has been an enormous amount of research activity devoted to related regularization methods: 1. The grouped lasso (Yuan and Lin 2007; Meier et al. 2008), where variables are i...

109 Coordinate descent algorithms for lasso penalized regression - Wu, Lange
107 A simple and efficient algorithm for gene selection using sparse logistic regression - Shevade, Keerthi - 2003
77 Rob Tibshirani - Bair, Hastie, et al. - 2006
70 Classification of gene microarrays by penalized logistic regression,” - Zhu, Hastie - 2004
62 Learning to remove internet advertisement - Kushmerick - 1999

Citation Context

... cancer classes. • Leukemia [Golub et al., 1999a]: gene-expression data with a binary response indicating type of leukemia—AML vs ALL. We used the preprocessed data of Dettling [2004]. • Internet-Ad [Kushmerick, 1999]: document classification problem with mostly binary features. The response is binary, and indicates whether the document is an advertisement. Only 1.2% nonzero values in the predictor matrix. • News...

51 An L1 Regularization-path Algorithm for Generalized Linear Models - Park, Hastie - 2007

Citation Context

...ied version of the lasso; 3. The elastic net [Zou and Hastie, 2005] for correlated variables, which uses a penalty that is part ℓ1, part ℓ2; 4. ℓ1 regularization paths for generalized linear models [Park and Hastie, 2006]; 5. Regularization paths for the support-vector machine [Hastie et al., 2004]. 6. The graphical lasso [Friedman et al., 2007b] for sparse covariance estimation and undirected graphs Efron et al. [20...

39 Fast sparse regression and classification - Friedman - 2008

Citation Context

...escent 4. ℓ1 regularization paths for generalized linear models (Park and Hastie 2007a). 5. Methods using non-concave penalties, such as SCAD (Fan and Li 2005) and Friedman’s generalized elastic net (Friedman 2008), enforce more severe variable selection than the lasso. 6. Regularization paths for the support-vector machine (Hastie et al. 2004). 7. The graphical lasso (Friedman et al. 2008) for sparse covarian...

34 Sparse inverse covariance estimation with the graphical lasso - Friedman, Hastie, et al. - 2008
31 L1-regularization path algorithm for generalized linear models - Park, Hastie
27 An iterative thresholding algorithm for linear inverse problems with a sparsity constraint - Daubechies, Defrise, et al. - 2004
23 Penalized regressions: the bridge vs. the lasso - Fu - 1998
23 Atomic decomposition by basis pursuit - Chen, Donoho, et al. - 1998
12 glmpath: L1 Regularization Path for Generalized Linear Models and Proportional Hazards Model. R package version 0.91, URL http://cran.r-project.org/src/contrib/Descriptions/glmpath.html - Park, Hastie - 2006
11 Coordinate descent procedures for lasso penalized regression - Wu, Lange - 2007

Citation Context

...erior point method l1logreg (Koh et al. 2007b,a), Bayesian binary regression (BBR, Madigan and Lewis 2007; Genkin et al. 2007), and the lasso penalized logistic program LPL supplied by Ken Lange (see Wu and Lange 2008). The latter two methods also use a coordinate descent approach. The BBR software automatically performs ten-fold cross-validation when given a set of λ values. Hence we report the total time for ten...

8 The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2nd edn - Hastie, Tibshirani, et al. - 2009

Citation Context

... “one-standard-error” rule. The top of each plot is annotated with the size of the models. Alternatively, they can use K-fold cross-validation (Hastie et al. 2009, for example), where the training data is used both for training and testing in an unbiased way. Figure 2 illustrates cross-validation on a simulated dataset. For logistic regression, we sometimes us...

6 Sparse multinomial logistic regression: fast algorithms and generalization bounds - Krishnapuram, Carin, et al. - 2005
5 The group lasso for logistic regression - Meier, van de Geer, et al. - 2008
5 A new approach to variable selection in least squares problems - Osborne, Presnell, et al. - 2000
4 Adaptable, efficient and robust methods for regression and classification via piecewise linear regularized coefficient paths - Rosset, Zhu - 2003
4 elasticnet: Elastic Net Regularization and Variable Selection. R package version - Zou, Hastie - 2005

Citation Context

...not much software available for elastic net. Comparisons of our glmnet code with the R package elasticnet will mimic the comparisons with lars (Hastie and Efron 2007) for the lasso, since elasticnet (Zou and Hastie 2004) is built on the lars package. 5.1. Regression with the lasso We generated Gaussian data with N observations and p predictors, with each pair of predictors Xj, Xj ′ having the same population correla...
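The simulation setup quoted here, with every pair of predictors sharing one population correlation ρ, can be generated with a single common factor. A sketch, with hypothetical names, assuming 0 ≤ ρ < 1:

```python
import numpy as np

def equicorrelated_gaussian(N, p, rho, seed=None):
    """N observations of p predictors with Corr(X_j, X_k) = rho for all j != k.

    Writes each predictor as X_j = sqrt(rho) * z + sqrt(1 - rho) * e_j with a
    shared standard-normal factor z, which yields the equicorrelated structure.
    """
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((N, 1))   # common factor shared by all columns
    e = rng.standard_normal((N, p))   # independent noise per predictor
    return np.sqrt(rho) * z + np.sqrt(1.0 - rho) * e
```

Each column has unit variance, so no further scaling is needed before standardization-sensitive comparisons.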

3 glmnet: Lasso and elastic-net regularized generalized linear models. Version 1. Available: http://www-stat.stanford.edu/~tibs/glmnet-matlab - Friedman, Hastie, Tibshirani - 2009
3 Classification of Expression Arrays by Penalized Logistic Regression - Zhu, Hastie - 2004
2 Prediction Accuracy and Stability of Regression with Optimal Scaling Transformations - van der Kooij - 2007
2 Genome-wide association analysis by penalized logistic regression. - Wu, Chen, et al. - 2009
2 Penalized Regressions: the Bridge vs - Fu - 1998
2 Prediction Accuracy and Stability of Regression with Optimal Scaling Transformations - van der Kooij - 2007
2 Prediction accuracy and stability of regression with optimal scaling transformations - Kooij - 2007
1 Efficient l1 logistic regression - Lee, Lee, et al. - 2006

Citation Context

...o-column matrix of counts, sometimes referred to as grouped data. We discuss this in more detail in Section 4.2. • The Newton algorithm is not guaranteed to converge without stepsize optimization [Lee et al., 2006]. Our code does not implement any checks for divergence; this would slow it down, and when used as recommended we do not feel it is necessary. We have a closed form expression for the starting solut...

1 OWL-QN: Orthant-Wise Limited-Memory Quasi-Newton Optimizer for L1-Regularized Objectives - Andrew, Gao
1 Matrix: R package version 0.999375-30, URL http://CRAN.R-project.org/package=Matrix - unknown authors
1 A MATLAB Implementation of glmnet - Jiang - 2009

Citation Context

...able under general public licence (GPL-2) from the Comprehensive R Archive Network at http://CRAN.R-project.org/package=glmnet. Sparse data inputs are handled by the Matrix package. MATLAB functions (Jiang 2009) are available from http://www-stat.stanford.edu/~tibs/glmnet-matlab/. Acknowledgments We would like to thank Holger Hoefling for helpful discussions, and Hui Jiang for writing the MATLAB interface t...

1 l1logreg: A Solver for L1-Regularized Logistic Regression. R package version 0.1-1. Available from Kwangmoo Koh (deneb1@stanford.edu) - Koh, SJ, et al. - 2007
1 Efficient L1 Logistic Regression - Lee, Lee, Abbeel, Ng - 2006
1 BBR, BMR: Bayesian Logistic Regression. Open-source standalone software, URL http://www.bayesianregression.org - Madigan, Lewis - 2007

Citation Context

...s to generate a two-class outcome z with Pr(z = 1) = p, Pr(z = 0) = 1 − p. We compared the speed of glmnet to the interior point method l1logreg (Koh et al. 2007b,a), Bayesian binary regression (BBR, Madigan and Lewis 2007; Genkin et al. 2007), and the lasso penalized logistic program LPL supplied by Ken Lange (see Wu and Lange 2008). The latter two methods also use a coordinate descent approach. The BBR software autom...

1 The lasso method for variable selection in the Cox model - Tibshirani - 1997