Results 1 - 10 of 132
The Dantzig selector: statistical estimation when p is much larger than n
2005
"... In many important statistical applications, the number of variables or parameters p is much larger than the number of observations n. Suppose then that we have observations y = Ax + z, where x ∈ R p is a parameter vector of interest, A is a data matrix with possibly far fewer rows than columns, n ≪ ..."
Abstract
-
Cited by 879 (14 self)
- Add to MetaCart
(Show Context)
In many important statistical applications, the number of variables or parameters p is much larger than the number of observations n. Suppose then that we have observations y = Ax + z, where x ∈ R^p is a parameter vector of interest, A is a data matrix with possibly far fewer rows than columns, n ≪ p, and the z_i's are i.i.d. N(0, σ²). Is it possible to estimate x reliably based on the noisy data y? To estimate x, we introduce a new estimator, called the Dantzig selector, defined as the solution to the ℓ1-regularization problem min_{x̃ ∈ R^p} ‖x̃‖_ℓ1 subject to ‖A^T r‖_ℓ∞ ≤ (1 + t⁻¹)·√(2 log p)·σ, where r is the residual vector y − Ax̃ and t is a positive scalar. We show that if A obeys a uniform uncertainty principle (with unit-normed columns) and if the true parameter vector x is sufficiently sparse (which here roughly guarantees that the model is identifiable), then with very large probability ‖x̂ − x‖²_ℓ2 ≤ C²·2 log p·(σ² + Σ_i min(x_i², σ²)). Our results are nonasymptotic and we give values for the constant C. In short, our estimator achieves a loss within a logarithmic factor of the ideal mean squared error one would achieve with an oracle supplying perfect information about which coordinates are nonzero and which are above the noise level. In multivariate regression, and from a model selection viewpoint, our result says that it is possible to nearly select the best subset of variables by solving a very simple convex program, which can in fact easily be recast as a convenient linear program (LP).
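The abstract notes that the Dantzig selector can be recast as a linear program. Below is a minimal sketch of that recast, splitting x into positive and negative parts and solving the resulting LP with scipy.optimize.linprog; the function name, the default t = 1, and the synthetic data are illustrative assumptions, not the authors' reference implementation.

import numpy as np
from scipy.optimize import linprog

def dantzig_selector(A, y, sigma, t=1.0):
    # Sketch of the LP recast: min ||x||_1 s.t. ||A^T (y - A x)||_inf <= lam,
    # with lam = (1 + 1/t) * sqrt(2 log p) * sigma and x = u - v, u, v >= 0.
    n, p = A.shape
    lam = (1.0 + 1.0 / t) * np.sqrt(2.0 * np.log(p)) * sigma
    G = A.T @ A
    Aty = A.T @ y
    c = np.ones(2 * p)                           # objective: sum(u) + sum(v) = ||x||_1
    A_ub = np.block([[G, -G], [-G, G]])          #  A^T A x <= A^T y + lam
    b_ub = np.concatenate([Aty + lam, lam - Aty])  # -A^T A x <= lam - A^T y
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(0, None), method="highs")
    u, v = res.x[:p], res.x[p:]
    return u - v

# Illustrative use on synthetic data with a sparse x and roughly unit-normed columns.
rng = np.random.default_rng(0)
n, p, sigma = 40, 80, 0.1
A = rng.standard_normal((n, p)) / np.sqrt(n)
x_true = np.zeros(p)
x_true[:3] = [2.0, -1.5, 1.0]
y = A @ x_true + sigma * rng.standard_normal(n)
x_hat = dantzig_selector(A, y, sigma)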
A unified framework for high-dimensional analysis of M-estimators with decomposable regularizers
"... ..."
Necessary and sufficient conditions on sparsity pattern recovery
2009
"... The paper considers the problem of detecting the sparsity pattern of a k-sparse vector in R n from m random noisy measurements. A new necessary condition on the number of measurements for asymptotically reliable detection with maximum likelihood (ML) estimation and Gaussian measurement matrices is ..."
Abstract
-
Cited by 106 (12 self)
- Add to MetaCart
(Show Context)
The paper considers the problem of detecting the sparsity pattern of a k-sparse vector in R^n from m random noisy measurements. A new necessary condition on the number of measurements for asymptotically reliable detection with maximum likelihood (ML) estimation and Gaussian measurement matrices is derived. This necessary condition for ML detection is compared against a sufficient condition for simple maximum correlation (MC) or thresholding algorithms. The analysis shows that the gap between thresholding and ML can be described by a simple expression in terms of the total signal-to-noise ratio (SNR), with the gap growing with increasing SNR. Thresholding is also compared against the more sophisticated lasso and orthogonal matching pursuit (OMP) methods. At high SNRs, it is shown that the advantage of lasso and OMP over thresholding is described by the range of powers of the nonzero components of the unknown signals. Specifically, the key benefit of lasso and OMP over thresholding is their ability to detect signals with relatively small components.
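For concreteness, a minimal sketch of two of the support estimators being compared, simple maximum-correlation thresholding and OMP, is given below; the function names and the convention of returning an index set are illustrative assumptions.

import numpy as np

def thresholding_support(A, y, k):
    # Maximum-correlation (thresholding) detector: correlate each column of A
    # with y and keep the k indices with the largest magnitude.
    scores = np.abs(A.T @ y)
    return set(np.argsort(scores)[-k:].tolist())

def omp_support(A, y, k):
    # Orthogonal matching pursuit: greedily add the column most correlated
    # with the current residual, then refit by least squares.
    support, r = [], y.copy()
    for _ in range(k):
        j = int(np.argmax(np.abs(A.T @ r)))
        support.append(j)
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        r = y - A[:, support] @ coef
    return set(support)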
Asymptotic analysis of MAP estimation via the replica method and applications to compressed sensing
2009
"... The replica method is a non-rigorous but widely-accepted technique from statistical physics used in the asymptotic analysis of large, random, nonlinear problems. This paper applies the replica method to non-Gaussian maximum a posteriori (MAP) estimation. It is shown that with random linear measureme ..."
Abstract
-
Cited by 77 (9 self)
- Add to MetaCart
(Show Context)
The replica method is a non-rigorous but widely accepted technique from statistical physics used in the asymptotic analysis of large, random, nonlinear problems. This paper applies the replica method to non-Gaussian maximum a posteriori (MAP) estimation. It is shown that with random linear measurements and Gaussian noise, the asymptotic behavior of the MAP estimate of an n-dimensional vector "decouples" into n scalar MAP estimators. The result is a counterpart to Guo and Verdú's replica analysis of minimum mean-squared error estimation. The replica MAP analysis can be readily applied to many estimators used in compressed sensing, including basis pursuit, lasso, linear estimation with thresholding, and zero-norm-regularized estimation. In the case of lasso estimation the scalar estimator reduces to a soft-thresholding operator, and for zero-norm-regularized estimation it reduces to a hard-thresholding operator. Among other benefits, the replica method provides a computationally tractable way to exactly compute various performance metrics, including mean-squared error and sparsity pattern recovery probability.
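The scalar estimators the abstract refers to are the usual soft- and hard-thresholding maps; a minimal sketch follows, with the threshold level t left as a free parameter since the abstract does not fix it.

import numpy as np

def soft_threshold(z, t):
    # Decoupled scalar estimator for lasso: shrink toward zero by t.
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def hard_threshold(z, t):
    # Decoupled scalar estimator for zero-norm-regularized estimation:
    # keep z unchanged if |z| > t, otherwise set it to zero.
    return np.where(np.abs(z) > t, z, 0.0)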
On the Asymptotic Properties of the Group Lasso Estimator in Least Squares Problems
"... We derive conditions guaranteeing estimation and model selection consistency, oracle properties and persistence for the group-lasso estimator and model selector proposed by Yuan and Lin (2006) for least squares problems when the covariates have a natural grouping structure. We study both the case of ..."
Abstract
-
Cited by 58 (0 self)
- Add to MetaCart
We derive conditions guaranteeing estimation and model selection consistency, oracle properties and persistence for the group-lasso estimator and model selector proposed by Yuan and Lin (2006) for least squares problems when the covariates have a natural grouping structure. We study both the case of a fixed-dimensional parameter space with increasing sample size and the case when the model complexity changes with the sample size.
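As a point of reference, the group-lasso penalty of Yuan and Lin (2006) and its per-group proximal map (block soft-thresholding) can be sketched as follows; representing the grouping structure as a list of index arrays is an illustrative assumption.

import numpy as np

def group_lasso_penalty(beta, groups, lam):
    # Penalty lam * sum_g ||beta_g||_2 over a partition of the coefficients.
    return lam * sum(np.linalg.norm(beta[g]) for g in groups)

def block_soft_threshold(z, t):
    # Proximal map of t * ||.||_2 on one group: shrink the whole block,
    # setting it to zero when its norm is below t.
    nrm = np.linalg.norm(z)
    return np.zeros_like(z) if nrm <= t else (1.0 - t / nrm) * z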
Information-Theoretically Optimal Compressed Sensing via Spatial Coupling and Approximate Message Passing
2011
"... We study the compressed sensing reconstruction problem for a broad class of random, banddiagonal sensing matrices. This construction is inspired by the idea of spatial coupling in coding theory. As demonstrated heuristically and numerically by Krzakala et al. [KMS+ 11], message passing algorithms ca ..."
Abstract
-
Cited by 51 (5 self)
- Add to MetaCart
We study the compressed sensing reconstruction problem for a broad class of random, band-diagonal sensing matrices. This construction is inspired by the idea of spatial coupling in coding theory. As demonstrated heuristically and numerically by Krzakala et al. [KMS+ 11], message passing algorithms can effectively solve the reconstruction problem for spatially coupled measurements with undersampling rates close to the fraction of non-zero coordinates. We use an approximate message passing (AMP) algorithm and analyze it through the state evolution method. We give a rigorous proof that this approach is successful as soon as the undersampling rate δ exceeds the (upper) Rényi information dimension of the signal, d(p_X). More precisely, for a sequence of signals of diverging dimension n whose empirical distribution converges to p_X, reconstruction is successful with high probability from d(p_X)·n + o(n) measurements taken according to a band-diagonal matrix. For sparse signals, i.e. sequences of dimension n with k(n) non-zero entries, this implies reconstruction from k(n) + o(n) measurements. For 'discrete' signals, i.e. signals whose coordinates take values in a fixed finite set, this implies reconstruction from o(n) measurements. The result …
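For orientation, a minimal sketch of a plain (non-spatially-coupled) AMP iteration with a soft-thresholding denoiser is shown below; the threshold rule and iteration count are heuristic assumptions, not the state-evolution tuning or the band-diagonal construction analyzed in the paper.

import numpy as np

def amp_soft_threshold(A, y, lam=1.0, iters=30):
    # AMP iteration: x <- eta(x + A^T z; tau), z <- y - A x + Onsager term,
    # where eta is soft-thresholding and the Onsager term is
    # z_prev * (number of nonzeros in the new x) / n.
    n, p = A.shape
    x = np.zeros(p)
    z = y.copy()
    for _ in range(iters):
        pseudo = x + A.T @ z
        tau = lam * np.sqrt(np.mean(z ** 2))           # heuristic threshold level
        x = np.sign(pseudo) * np.maximum(np.abs(pseudo) - tau, 0.0)
        z = y - A @ x + z * (np.count_nonzero(x) / n)  # Onsager correction
    return x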
Information theoretic bounds for compressed sensing
IEEE Trans. Inf. Theory, 2010
"... In this paper we derive information theoretic performance bounds to sensing and reconstruction of sparse phenomena from noisy projections. We consider two settings: output noise models where the noise enters after the projection and input noise models where the noise enters before the projection. We ..."
Abstract
-
Cited by 44 (6 self)
- Add to MetaCart
(Show Context)
In this paper we derive information theoretic performance bounds for sensing and reconstruction of sparse phenomena from noisy projections. We consider two settings: output noise models, where the noise enters after the projection, and input noise models, where the noise enters before the projection. We consider two types of distortion for reconstruction: support errors and mean-squared errors. Our goal is to relate the number of measurements, m, and the SNR to the signal sparsity k, the distortion level d, and the signal dimension n. We consider support errors in a worst-case setting. We employ different variations of Fano's inequality to derive necessary conditions on the number of measurements and SNR required for exact reconstruction. To derive sufficient conditions, we develop new insights on maximum-likelihood analysis based on a novel superposition property. In particular, this property implies that small support errors are the dominant error events. Consequently, our ML analysis does not suffer the conservatism of the union bound and leads to a tighter analysis of maximum likelihood. These results provide order-wise tight bounds. For output noise models we show that, asymptotically, an SNR of Θ(log(n)) together with Θ(k log(n/k)) measurements is necessary and sufficient for exact support recovery. Furthermore, if a small fraction of support errors …
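The order-wise conditions quoted at the end of the abstract can be evaluated numerically; the helpers below are purely illustrative, with the unspecified constants set to 1.

import numpy as np

def sufficient_measurements(n, k, c=1.0):
    # Order-wise measurement count Theta(k log(n/k)); c is an unspecified constant.
    return int(np.ceil(c * k * np.log(n / k)))

def sufficient_snr(n, c=1.0):
    # Order-wise SNR scaling Theta(log n); c is an unspecified constant.
    return c * np.log(n)

# e.g. n = 10000, k = 50 gives on the order of a few hundred measurements.
print(sufficient_measurements(10_000, 50), sufficient_snr(10_000))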
Honest variable selection in linear and logistic regression models via ℓ1 and ℓ1 + ℓ2 penalization
Electronic Journal of Statistics
"... This paper investigates correct variable selection in finite samples via ℓ1 and ℓ1+ℓ2 type penalization schemes. The asymptotic consistency of variable selection immediately follows from this analysis. We focus on logistic and linear regression models. The following questions are central to our pape ..."
Abstract
-
Cited by 39 (3 self)
- Add to MetaCart
(Show Context)
This paper investigates correct variable selection in finite samples via ℓ1 and ℓ1 + ℓ2 type penalization schemes. The asymptotic consistency of variable selection follows immediately from this analysis. We focus on logistic and linear regression models. The following questions are central to our paper: given …
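A minimal sketch of ℓ1- and ℓ1 + ℓ2-penalized logistic regression fits of the kind studied in the abstract, here using scikit-learn estimators rather than the paper's own procedure; the synthetic data, regularization strengths, and l1_ratio are illustrative assumptions.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, p = 200, 50
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:5] = 1.5                                            # sparse true coefficients
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-X @ beta))).astype(int)

# l1-penalized logistic regression
l1_fit = LogisticRegression(penalty="l1", solver="liblinear", C=0.5).fit(X, y)

# l1 + l2 (elastic-net style) penalized logistic regression
en_fit = LogisticRegression(penalty="elasticnet", solver="saga",
                            l1_ratio=0.5, C=0.5, max_iter=5000).fit(X, y)

selected_l1 = np.flatnonzero(l1_fit.coef_.ravel())        # estimated support
selected_en = np.flatnonzero(en_fit.coef_.ravel())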
Why Gabor frames? Two fundamental measures of coherence and their role in model selection
J. Commun. Netw, 2010
"... ar ..."
(Show Context)