## Union support recovery in high-dimensional multivariate regression (2008)

Citations: 19 (4 self)

### BibTeX

@MISC{Obozinski08unionsupport,
  author = {Guillaume Obozinski and Martin J. Wainwright and Michael I. Jordan},
  title = {Union support recovery in high-dimensional multivariate regression},
  year = {2008}
}

### Citations

4010 | Convex Optimization
- Boyd, Vandenberghe
- 2004
Citation Context: ...(3) in which ‖β‖0 is replaced with the ℓ1 norm ‖β‖1. This relaxation, often referred to as the Lasso (Tibshirani, 1996), is a quadratic program, and can be solved efficiently by various methods (e.g., Boyd and Vandenberghe, 2004; Osborne et al., 2000; Efron et al., 2004). A variety of theoretical results are now in place for the Lasso, both in the traditional setting where the sample size n tends to infinity with the proble...

2022 | Regression shrinkage and selection via the Lasso
- Tibshirani
- 1996
Citation Context: ...in β and where λn > 0 is a regularization parameter. Unfortunately, this optimization problem is computationally intractable, a fact which has led various authors to consider the convex relaxation (Tibshirani, 1996; Chen et al., 1998) arg min_{β ∈ R^p} { (1/n)‖y − Xβ‖2² + λn‖β‖1 }, (3) in which ‖β‖0 is replaced with the ℓ1 norm ‖β‖1. This relaxation, often referred to as the Lasso (Tibshirani, 1996), is a quadratic...
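The relaxed program (3) can be illustrated numerically. The following is a minimal sketch (not code from any cited paper) that minimizes (1/n)‖y − Xβ‖2² + λn‖β‖1 by proximal gradient descent (ISTA) with soft-thresholding; all dimensions, data, and parameter values are made up for the example:

```python
import numpy as np

def lasso_ista(X, y, lam, n_iter=500):
    """Minimize (1/n)||y - X b||_2^2 + lam * ||b||_1 by proximal gradient (ISTA)."""
    n, p = X.shape
    L = 2.0 * np.linalg.norm(X, 2) ** 2 / n      # Lipschitz constant of the smooth term
    b = np.zeros(p)
    for _ in range(n_iter):
        grad = 2.0 * X.T @ (X @ b - y) / n       # gradient of the quadratic loss
        z = b - grad / L
        b = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft-threshold step
    return b

# Synthetic high-dimensional example: sparse beta* with s = 5 nonzeros, p > n.
rng = np.random.default_rng(0)
n, p, s = 100, 200, 5
X = rng.standard_normal((n, p))
beta_star = np.zeros(p)
beta_star[:s] = 1.0
y = X @ beta_star + 0.1 * rng.standard_normal(n)

beta_hat = lasso_ista(X, y, lam=0.1)
support = np.flatnonzero(beta_hat)
```

With a well-conditioned Gaussian design and λn above the noise level, the estimated support typically matches the true one, which is the support-recovery question the surrounding excerpts discuss.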

1315 | An Introduction to Multivariate Statistical Analysis, Second Edition
- Anderson
- 1984
Citation Context: ...define the K × K random matrix M*_n := (λn²/n)(Z*_S)^T(Σ̂_SS)^{−1} Z*_S + (1/n²) W^T(I_n − Π_S)W and note that (using standard results on Wishart matrices (Anderson, 1984)) E[M*_n] = (λn²/(n − s − 1))(Z*_S)^T(Σ_SS)^{−1} Z*_S + (σ²(n − s)/n²) I_K. (46) To bound M_n in spectral norm, we use the triangle inequality: ‖M_n‖2 ≤ ‖M_n − M*_n‖2 + ‖M*_n − E[M*_n...

554 | Model selection and estimation in regression with grouped variables - Yuan, Lin - 2006

407 | High-dimensional graphs and variable selection with the Lasso. Annals of Statistics, 34
- Meinshausen, Bühlmann
- 2006
Citation Context: ...problem size p fixed (Knight and Fu, 2000), as well as under high-dimensional scaling, in which p and n tend to infinity simultaneously, thereby allowing p to be comparable to or even larger than n (e.g., Meinshausen and Bühlmann, 2006; Wainwright, 2006; Zhao and Yu, 2006). In many applications, it is natural to impose sparsity constraints on the regression vector β*, and a variety of such constraints have been considered. For ex...

379 | Uncertainty principles and ideal atomic decomposition
- Donoho, Huo
- 2001
Citation Context: ...research in the last decade. There is now a substantial body of work based on ℓ1-regularization, dating back to the seminal work of Tibshirani (1996) and Donoho and collaborators (Chen et al., 1998; Donoho and Huo, 2001). The bulk of this work has focused on the standard problem of linear regression, in which one makes observations of the form y = Xβ* + w, (1) where y ∈ R^n is a real-valued vector of observations, ...

295 | Multiple kernel learning, conic duality and the SMO algorithm
- Bach, Lanckriet, et al.
- 2004
Citation Context: ...set of covariates that are “relevant” across regressions (Obozinski et al., 2007; Argyriou et al., 2006; Turlach et al., 2005; Zhang et al., 2008). Based on such motivations, a recent line of research (Bach et al., 2004; Tropp, 2006; Yuan and Lin, 2006; Zhao et al., 2007; Obozinski et al., 2007; Ravikumar et al., 2008) has studied the use of block-regularization schemes, in which the ℓ1 norm is composed with some ...
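The ℓ1/ℓq block norm described in this excerpt, a sum of ℓq norms over groups of coefficients (here, rows of a coefficient matrix), can be computed directly; a small illustration with made-up data:

```python
import numpy as np

def l1_lq_norm(B, q):
    """l1/lq block norm: sum over rows (groups) of B of the lq norm of each row."""
    return float(np.sum(np.linalg.norm(B, ord=q, axis=1)))

B = np.array([[3.0, 4.0],    # active group, l2 norm 5
              [0.0, 0.0],    # inactive group
              [1.0, 0.0]])   # active group, l2 norm 1
print(l1_lq_norm(B, 2))      # 5 + 0 + 1 = 6.0  (group-Lasso penalty, q = 2)
print(l1_lq_norm(B, 1))      # 7 + 0 + 1 = 8.0  (elementwise l1 penalty, q = 1)
```

Setting q = 1 collapses the block norm to the ordinary ℓ1 penalty, which is exactly the q = 1 versus q > 1 distinction the paper studies.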

172 | A new approach to variable selection in least squares problems
- Osborne, Presnell, et al.
- 2000
Citation Context: ...with the ℓ1 norm ‖β‖1. This relaxation, often referred to as the Lasso (Tibshirani, 1996), is a quadratic program, and can be solved efficiently by various methods (e.g., Boyd and Vandenberghe, 2004; Osborne et al., 2000; Efron et al., 2004). A variety of theoretical results are now in place for the Lasso, both in the traditional setting where the sample size n tends to infinity with the problem size p fixed (Knight...

169 | Sharp thresholds for high-dimensional and noisy recovery of sparsity. Available at arXiv:math.ST/0605740
- Wainwright
- 2006
Citation Context: ...(Knight and Fu, 2000), as well as under high-dimensional scaling, in which p and n tend to infinity simultaneously, thereby allowing p to be comparable to or even larger than n (e.g., Meinshausen and Bühlmann, 2006; Wainwright, 2006; Zhao and Yu, 2006). In many applications, it is natural to impose sparsity constraints on the regression vector β*, and a variety of such constraints have been considered. For example, one can con...

162 | Consistency of the group Lasso and multiple kernel learning
- Bach
- 2008
Citation Context: ...the same qualitative results and the same convergence rates for q = 1 as for q > 1. Our focus, which is motivated by the empirical observation that the group Lasso can outperform the ordinary Lasso (Bach, 2008; Yuan and Lin, 2006; Zhao et al., 2007; Obozinski et al., 2007), is precisely the distinction between q = 1 and q > 1 (specifically q = 2). The distinction between q = 1 and q = 2 is also significant...

154 | Asymptotics for lasso-type estimators
- Knight, Fu
- 2000
Citation Context: ...2000; Efron et al., 2004). A variety of theoretical results are now in place for the Lasso, both in the traditional setting where the sample size n tends to infinity with the problem size p fixed (Knight and Fu, 2000), as well as under high-dimensional scaling, in which p and n tend to infinity simultaneously, thereby allowing p to be comparable to or even larger than n (e.g., Meinshausen and Bühlmann, 2006; Wain...

149 | The Group Lasso for Logistic Regression - Meier, van de Geer, et al. - 2008

145 | Multi-task feature learning
- Argyriou, Evgeniou, et al.
- 2006
Citation Context: ...multiple regressions can be related by a (partially) shared sparsity pattern, such as when there are an underlying set of covariates that are “relevant” across regressions (Obozinski et al., 2007; Argyriou et al., 2006; Turlach et al., 2005; Zhang et al., 2008). Based on such motivations, a recent line of research (Bach et al., 2004; Tropp, 2006; Yuan and Lin, 2006; Zhao et al., 2007; Obozinski et al., 2007; Ravi...

128 | Basis pursuit
- Chen
- 1995
Citation Context: ...statistical learning research in the last decade. There is now a substantial body of work based on ℓ1-regularization, dating back to the seminal work of Tibshirani (1996) and Donoho and collaborators (Chen et al., 1998; Donoho and Huo, 2001). The bulk of this work has focused on the standard problem of linear regression, in which one makes observations of the form y = Xβ* + w, (1) where y ∈ R^n is a real-valued ve...

123 | Lasso-type recovery of sparse representations for high-dimensional data - Meinshausen, Yu - 2009

95 | Adaptive estimation of a quadratic functional by model selection - Laurent, Massart - 2000

92 | Grouped and hierarchical model selection through composite absolute penalties
- Zhao, Rocha, et al.
- 2006
Citation Context: ...coefficients may be required to be zero or non-zero in a blockwise manner; for example, one might wish to include a particular covariate and all powers of that covariate as a group (Yuan and Lin, 2006; Zhao et al., 2007). Another example arises when we consider variable selection in the setting of multivariate regression: multiple regressions can be related by a (partially) shared sparsity pattern, such as when ther...

81 | Sparse additive models
- Ravikumar, Lafferty, et al.
- 2009
Citation Context: ...2006; Turlach et al., 2005; Zhang et al., 2008). Based on such motivations, a recent line of research (Bach et al., 2004; Tropp, 2006; Yuan and Lin, 2006; Zhao et al., 2007; Obozinski et al., 2007; Ravikumar et al., 2008) has studied the use of block-regularization schemes, in which the ℓ1 norm is composed with some other ℓq norm (q > 1), thereby obtaining the ℓ1/ℓq norm defined as a sum of ℓq norms over groups of re...
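Algorithms for the ℓ1/ℓ2-regularized problems mentioned in these excerpts typically rely on the block soft-thresholding operator, the proximal map of the group norm, which shrinks each group toward zero as a unit. A generic sketch (an illustration of the standard operator, not code from any cited paper):

```python
import numpy as np

def block_soft_threshold(B, tau):
    """Prox of tau * sum_j ||B_j||_2: scale each row of B toward zero as a block."""
    norms = np.linalg.norm(B, axis=1, keepdims=True)
    # Rows with norm <= tau are set exactly to zero; others are shrunk radially.
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return scale * B

B = np.array([[3.0, 4.0],    # norm 5 > tau: shrunk by factor 1 - 1/5 = 0.8
              [0.3, 0.4]])   # norm 0.5 <= tau: zeroed out as a whole group
print(block_soft_threshold(B, tau=1.0))
```

Zeroing whole rows at once is what produces the shared (union) sparsity pattern across the multiple regressions.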

77 | Simultaneous variable selection
- Turlach, Venables, et al.
- 2005
Citation Context: ...can be related by a (partially) shared sparsity pattern, such as when there are an underlying set of covariates that are “relevant” across regressions (Obozinski et al., 2007; Argyriou et al., 2006; Turlach et al., 2005; Zhang et al., 2008). Based on such motivations, a recent line of research (Bach et al., 2004; Tropp, 2006; Yuan and Lin, 2006; Zhao et al., 2007; Obozinski et al., 2007; Ravikumar et al., 2008) ha...

22 | Metric Characterization of Random Variables and Random Processes - Kozachenko

15 | Variable selection for multicategory SVM via sup-norm regularization
- Zhang, Liu, et al.
- 2008
Citation Context: ...(partially) shared sparsity pattern, such as when there are an underlying set of covariates that are “relevant” across regressions (Obozinski et al., 2007; Argyriou et al., 2006; Turlach et al., 2005; Zhang et al., 2008). Based on such motivations, a recent line of research (Bach et al., 2004; Tropp, 2006; Yuan and Lin, 2006; Zhao et al., 2007; Obozinski et al., 2007; Ravikumar et al., 2008) has studied the use of...

11 | Joint covariate selection for grouped classification
- Obozinski, Taskar, et al.
- 2007
Citation Context: ...of multivariate regression: multiple regressions can be related by a (partially) shared sparsity pattern, such as when there are an underlying set of covariates that are “relevant” across regressions (Obozinski et al., 2007; Argyriou et al., 2006; Turlach et al., 2005; Zhang et al., 2008). Based on such motivations, a recent line of research (Bach et al., 2004; Tropp, 2006; Yuan and Lin, 2006; Zhao et al., 2007; Obozi...

9 | Nonlinear programming - Bertsekas - 1995

8 | On the ℓ1–ℓq regularized regression - Liu, Zhang - 2008

2 | Just relax: Convex programming methods for identifying sparse signals in noise - Tropp - 2006

1 | Atomic decomposition by basis pursuit - Chen, Donoho, et al. - 2001

1 | Least angle regression. Annals of Statistics, 32(2):407–499
- Efron, Hastie, et al.
- 2004
Citation Context: ...This relaxation, often referred to as the Lasso (Tibshirani, 1996), is a quadratic program, and can be solved efficiently by various methods (e.g., Boyd and Vandenberghe, 2004; Osborne et al., 2000; Efron et al., 2004). A variety of theoretical results are now in place for the Lasso, both in the traditional setting where the sample size n tends to infinity with the problem size p fixed (Knight and Fu, 2000), as w...

1 | Local operator theory, random matrices and Banach spaces
- Davidson, Szarek
- 2001
Citation Context: ...s^{1/r} ‖A‖ℓ∞/ℓp. (42b) Appendix C: Some concentration inequalities for random matrices. In this appendix, we state some known concentration inequalities for the extreme eigenvalues of Gaussian random matrices (Davidson and Szarek, 2001). Although these results hold more generally, our interest here is on scalings (n, s) such that s/n → 0. Lemma 7. Let U ∈ R^{n×s} be a random matrix from the standard Gaussian ensemble (i.e., Uij ∼ N(0...
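The Davidson–Szarek-type bounds invoked here say that for an n × s standard Gaussian matrix U with s/n small, the extreme singular values of U/√n concentrate near 1 ± √(s/n). A quick numerical check (illustrative only; the dimensions are made up):

```python
import numpy as np

rng = np.random.default_rng(1)
n, s = 4000, 40                      # regime with s/n -> 0, as in Lemma 7
U = rng.standard_normal((n, s))      # standard Gaussian ensemble, Uij ~ N(0, 1)
sv = np.linalg.svd(U / np.sqrt(n), compute_uv=False)
r = np.sqrt(s / n)                   # sqrt(s/n) = 0.1
print(sv.max(), sv.min())            # close to 1 + r and 1 - r respectively
```

Such two-sided control of the sample covariance spectrum is what allows quantities like M*_n above to be bounded in spectral norm.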