## Group Lasso with Overlap and Graph Lasso

Citations: 116 (14 self)

### BibTeX

```bibtex
@MISC{Jacob_grouplasso,
  author = {Laurent Jacob and Guillaume Obozinski},
  title  = {Group Lasso with Overlap and Graph Lasso},
  year   = {}
}
```

### Abstract

We propose a new penalty function which, when used as regularization for empirical risk minimization procedures, leads to sparse estimators. The support of the sparse vector is typically a union of potentially overlapping groups of covariates defined a priori, or a set of covariates which tend to be connected to each other when a graph of covariates is given. We study theoretical properties of the estimator, and illustrate its behavior on simulated and breast cancer gene expression data.

### Citations

1965 | Regression shrinkage and selection via the Lasso
- Tibshirani
- 1996
Citation Context: ...which is often of primary importance in many applications such as biology or social sciences. A popular example is the penalization of an ℓ2 criterion by the ℓ1 norm of the estimator, known as the lasso (Tibshirani, 1996) or basis pursuit (Chen et al., 1998). Interestingly, the lasso is able to recover the exact support of a sparse model from data generated by this model if the covariates are not too correlated (Zhao...

1755 | Atomic decomposition by basis pursuit
- Chen, Donoho, et al.
- 2001
Citation Context: ...e in many applications such as biology or social sciences. A popular example is the penalization of an ℓ2 criterion by the ℓ1 norm of the estimator, known as the lasso (Tibshirani, 1996) or basis pursuit (Chen et al., 1998). Interestingly, the lasso is able to recover the exact support of a sparse model from data generated by this model if the covariates are not too correlated (Zhao & Yu, 2006; Wainwright, 2006). 1 Thi...

543 | Model selection and estimation in regression with grouped variables
- Yuan, Lin
- 2006
Citation Context: ...ties to enforce the estimation of models with specific sparsity patterns. For example, when the covariates are partitioned into groups, the group lasso leads to the selection of groups of covariates (Yuan & Lin, 2006). The group lasso penalty for a model, also called the ℓ1/ℓ2 penalty, is the sum (i.e., ℓ1 norm) of the ℓ2 norms of the restrictions of the model to the different groups of covariates. It recovers the su...
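The ℓ1/ℓ2 penalty described in this excerpt is straightforward to compute. A minimal sketch (the function name and toy data below are illustrative, not from the paper):

```python
import numpy as np

def group_lasso_penalty(w, groups):
    """l1/l2 penalty of Yuan & Lin (2006): the sum (l1 norm) of the
    l2 norms of w restricted to each group of covariates."""
    return sum(np.linalg.norm(w[list(g)]) for g in groups)

# Toy example: a 6-dimensional model partitioned into three groups.
w = np.array([1.0, -2.0, 0.0, 0.0, 3.0, 4.0])
groups = [[0, 1], [2, 3], [4, 5]]
print(group_lasso_penalty(w, groups))  # sqrt(5) + 0 + 5 ≈ 7.236
```

Because each inner term is an ℓ2 norm, the penalty zeroes out whole groups at once rather than individual coordinates, which is the group-selection effect discussed above.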

351 | Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles. PNAS, 102, 15545–15550
- Subramanian et al.
- 2005

167 | Sharp thresholds for high-dimensional and noisy recovery of sparsity
- Wainwright
- 2006
Citation Context: ...rsuit (Chen et al., 1998). Interestingly, the lasso is able to recover the exact support of a sparse model from data generated by this model if the covariates are not too correlated (Zhao & Yu, 2006; Wainwright, 2006). 1 This work was undertaken while Guillaume Obozinski was affiliated with UC Berkeley, Department of Statistics. Appearing in Proceedings of the 26th International Conference on Machine Learning, M...

161 | Consistency of the group lasso and multiple kernel learning
- Bach
- 2008
Citation Context: ...ariates. It recovers the support of a model if the support is a union of groups and if covariates of different groups are not too correlated. It can be generalized to an infinite-dimensional setting (Bach, 2008). Other variants of the group lasso include joint selection of covariates for multi-task learning (Obozinski et al., 2009) and penalties to enforce hierarchical selection of covariates, e.g., when on...

147 | Asymptotics for lasso-type estimators
- Knight, Fu
- 2000
Citation Context: ...$\min_{w_{J_1} \in \mathbb{R}^{J_1}} \frac{1}{2n} \|Y - X_{J_1} w_{J_1}\|^2 + \lambda_n \Omega^{G_1}_{\mathrm{overlap}}(w_{J_1})$. By standard arguments, we can prove that $w^1$ converges in Euclidean norm to $\bar{w}$ restricted to $J_1$ as $n$ tends to infinity (Fu & Knight, 2000). In the rest of the proof we show how to construct a vector $w \in \mathbb{R}^p$ from $w^1$ which under condition (C2) is with high probability a solution to (6). By adding null components to $w^1$, we obtain a vector ...

147 | The group lasso for logistic regression
- Meier, Geer, et al.
- 2008

136 |
Network-based classification of breast cancer metastasis. Molecular Systems Biology
- Chuang, Lee, et al.
- 2007
Citation Context: ...ts on a graph carrying biological information such as regulation, involvement in the same chain of metabolic reactions, or protein-protein interaction. Similarly to what is done in pathway analysis, Chuang et al. (2007) built a network by compiling several biological networks and performed such graph analysis by identifying discriminant subnetworks in one step and using these subnetworks to learn a classifier in a ...

113 | A gene-expression signature as a predictor of survival in breast cancer
- van de Vijver, He, et al.
- 2002

98 | Structured variable selection with sparsity-inducing norms
- Jenatton, Audibert, et al.
- 2011
Citation Context: ...sparse vectors, whose support is typically the complement of a union of groups. Although this may be relevant for some applications, with appropriately designed families of groups, as considered by Jenatton et al. (2009), we are interested in this paper in penalties which induce the opposite effect: that the support of w be a union of groups. For that purpose, we propose instead the following penalty: $\Omega^{G}_{\mathrm{overlap}}$...
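The definition is truncated in this excerpt; in the published paper, the overlap penalty is defined through a latent decomposition, $\Omega^G_{\mathrm{overlap}}(w) = \min \{ \sum_g \|v^g\|_2 : \sum_g v^g = w,\ \mathrm{supp}(v^g) \subseteq g \}$. A minimal sketch for two overlapping groups, where the minimization reduces to a one-dimensional search over how the shared coordinate is split (the function name and data are illustrative):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def omega_overlap_two_groups(w):
    """Overlap penalty for w in R^3 with groups g1 = {0, 1}, g2 = {1, 2}:
    minimize ||v1||_2 + ||v2||_2 over decompositions v1 + v2 = w with
    supp(v1) in g1 and supp(v2) in g2. Only the shared coordinate w[1]
    can be split (as t and w[1] - t), so the problem is one-dimensional
    and convex in t."""
    f = lambda t: (np.sqrt(w[0] ** 2 + t ** 2)
                   + np.sqrt(w[2] ** 2 + (w[1] - t) ** 2))
    return minimize_scalar(f).fun

w = np.array([1.0, 1.0, 1.0])
print(omega_overlap_two_groups(w))  # ≈ sqrt(5): by symmetry t = 1/2 is optimal
```

Note that this value, √5 ≈ 2.24, is smaller than the 2√2 ≈ 2.83 obtained by simply summing the ℓ2 norms of w over the two full groups: the latent split of the shared coordinate is what makes a union of groups, rather than its complement, the preferred support.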

91 | Grouped and hierarchical model selection through composite absolute penalties
- Zhao, Rocha, et al.
Citation Context: ...enalties to enforce hierarchical selection of covariates, e.g., when one has a hierarchy over the covariates and wants to select covariates only if their ancestors in the hierarchy are also selected (Zhao et al., 2009; Bach, 2009). In this paper we are interested in a more general situation. We assume that either (i) groups of covariates are given, potentially with overlap between the groups, and we wish to estima...

84 |
Joint covariate selection and joint subspace selection for multiple classification problems
- Obozinski, Taskar, et al.
- 2010
Citation Context: ...groups are not too correlated. It can be generalized to an infinite-dimensional setting (Bach, 2008). Other variants of the group lasso include joint selection of covariates for multi-task learning (Obozinski et al., 2009) and penalties to enforce hierarchical selection of covariates, e.g., when one has a hierarchy over the covariates and wants to select covariates only if their ancestors in the hierarchy are also sel...

81 | Exploring large feature spaces with hierarchical multiple kernel learning
- Bach
- 2008
Citation Context: ...hierarchical selection of covariates, e.g., when one has a hierarchy over the covariates and wants to select covariates only if their ancestors in the hierarchy are also selected (Zhao et al., 2009; Bach, 2009). In this paper we are interested in a more general situation. We assume that either (i) groups of covariates are given, potentially with overlap between the groups, and we wish to estimate a model w...

41 | The group-lasso for generalized linear models: uniqueness of solutions and efficient algorithms
- Roth, Fischer
- 2008
Citation Context: ...e directly estimated from $\tilde{X}$ with a classical group lasso for non-overlapping groups. We implemented the approach of Meier et al. (2008) to estimate the group lasso in the expanded space. Note that Roth & Fischer (2008) provide a faster algorithm for the group lasso. When there are many groups with important overlap, however, an alternative implementation without explicit data duplication, e.g., with a variational ...
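The duplication trick described in this excerpt can be sketched end to end: expand the design matrix by duplicating each column once per group containing it, solve an ordinary non-overlapping group lasso in the expanded space, and fold the latent coefficients back as $w = \sum_g v^g$. The proximal-gradient solver below is a stand-in for the solvers of Meier et al. (2008) or Roth & Fischer (2008); all names and parameter choices are illustrative:

```python
import numpy as np

def overlap_group_lasso(X, y, groups, lam, n_iter=3000):
    """Overlap group lasso via explicit covariate duplication.

    Builds the expanded design X_tilde by duplicating each column of X
    once per group containing it, solves a classical (non-overlapping)
    group lasso on X_tilde by proximal gradient descent with block
    soft-thresholding, and folds the latent vector back: w = sum_g v^g."""
    n, p = X.shape
    X_tilde = X[:, [j for g in groups for j in g]]      # expanded design
    starts = np.cumsum([0] + [len(g) for g in groups])  # block boundaries
    v = np.zeros(X_tilde.shape[1])
    step = n / np.linalg.norm(X_tilde, 2) ** 2          # 1/L for the smooth part
    for _ in range(n_iter):
        v -= step * X_tilde.T @ (X_tilde @ v - y) / n   # gradient step
        for k in range(len(groups)):                    # prox of lam * sum_g ||v^g||_2
            s, e = starts[k], starts[k + 1]
            nrm = np.linalg.norm(v[s:e])
            v[s:e] = 0.0 if nrm <= step * lam else (1 - step * lam / nrm) * v[s:e]
    w = np.zeros(p)
    for k, g in enumerate(groups):                      # fold latent blocks back
        w[g] += v[starts[k]:starts[k + 1]]
    return w
```

In the expanded space the duplicated blocks no longer overlap, so the standard block soft-thresholding prox applies; the cost is the data duplication itself, which is why the excerpt mentions variational alternatives when groups overlap heavily.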

29 |
Classification of microarray data using gene networks
- Rapaport, Zinovyev, et al.
Citation Context: ...to be connected to each other in a given biological network could then lead to increased interpretability of the signature and potentially better performance (Rapaport et al., 2007). To reach this goal, we propose and study a new penalty which generalizes the ℓ1/ℓ2 norm to overlapping groups for the first case, and propose to cast the problem of selecting connected covariates i...

The generalized LASSO: a wrapper approach to gene selection for microarray data
- Roth
- 2002
Citation Context: ...few samples in high dimension, such as predicting the class of a tumour from gene expression measurements with microarrays, and simultaneously select a few genes to establish a predictive signature (Roth, 2002). Selecting a few genes that either belong to the same functional groups, where the groups are given a priori and may overlap, or tend to be connected to each other in a given biological network...

3 |
An implicit function theorem: Comment. Journal of Optimization Theory and Applications, 31:285–288
- Kumagai
- 1980
Citation Context: ...$(w_{J_1}, \alpha_{J_1}, \eta_{G_1}) \mapsto \Big( \big[ -w_i + \sum_{g \ni i} \eta_g \alpha_i \big]_{i \in J_1}, \, \big( \|\alpha_g\|_2 - 1 \big)_{g \in G_1} \Big)$, then (7) is equivalent to $F(w_{J_1}, \alpha_{J_1}, \eta_{G_1}) = 0$. We use the implicit function theorem for non-differentiable functions of Kumagai (1980). The theorem states that for a continuous function $F : \mathbb{R}^{|J_1|} \times \mathbb{R}^{|J_1| \times |G_1|} \to \mathbb{R}^{|J_1| \times |G_1|}$ such that $F(w_0, (\alpha_0, \eta_0)) = 0$, if there exist open neighborhoods $U \subset \mathbb{R}^{|J_1|}$ and $U' \subset \mathbb{R}^{|J_1| \times |G_1|}$ of $w_0$ and $(\alpha$...