Information-theoretic limits on sparsity recovery in the high-dimensional and noisy setting (2009)

by M J Wainwright
Venue: IEEE Trans. Inf. Theory

Results 1 - 10 of 132

The Dantzig selector: statistical estimation when p is much larger than n

by Emmanuel Candes, Terence Tao, 2005
"... In many important statistical applications, the number of variables or parameters p is much larger than the number of observations n. Suppose then that we have observations y = Ax + z, where x ∈ R p is a parameter vector of interest, A is a data matrix with possibly far fewer rows than columns, n ≪ ..."
Abstract - Cited by 879 (14 self)
In many important statistical applications, the number of variables or parameters p is much larger than the number of observations n. Suppose then that we have observations y = Ax + z, where x ∈ R^p is a parameter vector of interest, A is a data matrix with possibly far fewer rows than columns, n ≪ p, and the z_i's are i.i.d. N(0, σ²). Is it possible to estimate x reliably based on the noisy data y? To estimate x, we introduce a new estimator, called the Dantzig selector, which is the solution to the ℓ1-regularization problem min_{x̃ ∈ R^p} ‖x̃‖_{ℓ1} subject to ‖A^T r‖_{ℓ∞} ≤ (1 + t⁻¹) √(2 log p) · σ, where r is the residual vector y − Ax̃ and t is a positive scalar. We show that if A obeys a uniform uncertainty principle (with unit-normed columns) and if the true parameter vector x is sufficiently sparse (which here roughly guarantees that the model is identifiable), then with very large probability ‖x̂ − x‖²_{ℓ2} ≤ C² · 2 log p · (σ² + Σ_i min(x_i², σ²)). Our results are nonasymptotic and we give values for the constant C. In short, our estimator achieves a loss within a logarithmic factor of the ideal mean squared error one would achieve with an oracle which would supply perfect information about which coordinates are nonzero and which were above the noise level. In multivariate regression and from a model selection viewpoint, our result says that it is possible nearly to select the best subset of variables by solving a very simple convex program, which in fact can easily be recast as a convenient linear program (LP).
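The convex program above drops straight into an off-the-shelf solver. A minimal sketch, not from the paper, assuming numpy arrays and the cvxpy library; lambda_ plays the role of the threshold (1 + t⁻¹)·√(2 log p)·σ.

import cvxpy as cp
import numpy as np

def dantzig_selector(A, y, lambda_):
    # min ||x||_1  subject to  ||A^T (y - A x)||_inf <= lambda_
    p = A.shape[1]
    x = cp.Variable(p)
    constraint = cp.norm_inf(A.T @ (y - A @ x)) <= lambda_
    cp.Problem(cp.Minimize(cp.norm1(x)), [constraint]).solve()
    return x.value

As the abstract notes, the same program can be recast as a linear program, for example by introducing a bound variable u with −u ≤ x̃ ≤ u and minimizing the sum of the entries of u.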

Citation Context

...le selection strategy—hence, the name. There is of course a huge literature on model selection, and many procedures motivated by a wide array of criteria have been proposed over the years—among which [1, 7, 26, 31, 36]. By and large, the most commonly discussed approach—the "canonical selection procedure" according to [26]—is defined as arg min_{β̃ ∈ R^p} ‖y − Xβ̃‖²_{ℓ2} + Λ · σ² · ‖β̃‖_{ℓ0}, where ‖β̃‖_{ℓ0} := |{i : β̃_i ≠ 0}| (1.15) ...

A unified framework for high-dimensional analysis of M-estimators with decomposable regularizers

by Sahand Negahban, Pradeep Ravikumar, Martin J. Wainwright, Bin Yu
"... ..."
Abstract - Cited by 218 (32 self)
Abstract not found

Necessary and sufficient conditions on sparsity pattern recovery

by Alyson K. Fletcher, Sundeep Rangan, Vivek K Goyal, 2009
"... The paper considers the problem of detecting the sparsity pattern of a k-sparse vector in R n from m random noisy measurements. A new necessary condition on the number of measurements for asymptotically reliable detection with maximum likelihood (ML) estimation and Gaussian measurement matrices is ..."
Abstract - Cited by 106 (12 self)
The paper considers the problem of detecting the sparsity pattern of a k-sparse vector in R^n from m random noisy measurements. A new necessary condition on the number of measurements for asymptotically reliable detection with maximum likelihood (ML) estimation and Gaussian measurement matrices is derived. This necessary condition for ML detection is compared against a sufficient condition for simple maximum correlation (MC) or thresholding algorithms. The analysis shows that the gap between thresholding and ML can be described by a simple expression in terms of the total signal-to-noise ratio (SNR), with the gap growing with increasing SNR. Thresholding is also compared against the more sophisticated lasso and orthogonal matching pursuit (OMP) methods. At high SNRs, it is shown that the gap between lasso and OMP over thresholding is described by the range of powers of the nonzero component values of the unknown signals. Specifically, the key benefit of lasso and OMP over thresholding is the ability of lasso and OMP to detect signals with relatively small components.
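The thresholding (maximum correlation) detector compared against in the abstract is essentially one line of code. A hedged sketch, assuming the support estimate is simply the k columns of A most correlated with y (the paper's exact normalization may differ):

import numpy as np

def mc_support_estimate(A, y, k):
    # score each column by |<a_j, y>| and keep the indices of the k largest
    scores = np.abs(A.T @ y)
    return np.sort(np.argsort(scores)[-k:])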

Citation Context

...l algorithms that can close this gap is an open research area. Previous necessary conditions had been based on information-theoretic capacity arguments in [19], [20] and a use of Fano's inequality in [21]. More recent publications with necessary conditions include [22]–[25]. As described in Section III, our new necessary condition is stronger than the previous results in certain important regimes. In ...

Asymptotic analysis of MAP estimation via the replica method and applications to compressed sensing

by Sundeep Rangan, Alyson K. Fletcher, Vivek K Goyal, 2009
"... The replica method is a non-rigorous but widely-accepted technique from statistical physics used in the asymptotic analysis of large, random, nonlinear problems. This paper applies the replica method to non-Gaussian maximum a posteriori (MAP) estimation. It is shown that with random linear measureme ..."
Abstract - Cited by 77 (9 self)
The replica method is a non-rigorous but widely-accepted technique from statistical physics used in the asymptotic analysis of large, random, nonlinear problems. This paper applies the replica method to non-Gaussian maximum a posteriori (MAP) estimation. It is shown that with random linear measurements and Gaussian noise, the asymptotic behavior of the MAP estimate of an n-dimensional vector "decouples" as n scalar MAP estimators. The result is a counterpart to Guo and Verdú's replica analysis of minimum mean-squared error estimation. The replica MAP analysis can be readily applied to many estimators used in compressed sensing, including basis pursuit, lasso, linear estimation with thresholding, and zero norm-regularized estimation. In the case of lasso estimation the scalar estimator reduces to a soft-thresholding operator, and for zero norm-regularized estimation it reduces to a hard threshold. Among other benefits, the replica method provides a computationally tractable method for exactly computing various performance metrics including mean-squared error and sparsity pattern recovery probability.
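The scalar estimators mentioned at the end of the abstract are the familiar soft- and hard-thresholding maps; a minimal sketch (the threshold t here is a placeholder, not the replica-predicted effective noise level):

import numpy as np

def soft_threshold(z, t):
    # scalar lasso-type estimator in the decoupled model
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def hard_threshold(z, t):
    # scalar estimator for zero norm-regularized estimation
    return np.where(np.abs(z) > t, z, 0.0)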

Citation Context

... a subset of features with strong linear influence on some observed data. Several works have attempted to find conditions under which the support of a sparse vector can be fully detected [44], [56], [65] or partially detected [66]–[68]. Unfortunately, with the exception of [44], the only available results are bounds that are not tight. One of the uses of the RS PMAP decoupling property is to exactly ...

On the Asymptotic Properties of The Group Lasso Estimator in Least Squares Problems

by Yuval Nardi, Alessandro Rinaldo
"... We derive conditions guaranteeing estimation and model selection consistency, oracle properties and persistence for the group-lasso estimator and model selector proposed by Yuan and Lin (2006) for least squares problems when the covariates have a natural grouping structure. We study both the case of ..."
Abstract - Cited by 58 (0 self)
We derive conditions guaranteeing estimation and model selection consistency, oracle properties and persistence for the group-lasso estimator and model selector proposed by Yuan and Lin (2006) for least squares problems when the covariates have a natural grouping structure. We study both the case of a fixed-dimensional parameter space with increasing sample size and the case when the model complexity changes with the sample size.
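For concreteness, the group-lasso criterion of Yuan and Lin penalizes the sum of Euclidean norms of coefficient blocks. A sketch of the objective, assuming groups is a list of index arrays and lam is a user-chosen penalty level (illustrative only, not the estimator studied in the paper):

import numpy as np

def group_lasso_objective(beta, X, y, groups, lam):
    # least-squares loss plus lam * sum_g ||beta_g||_2
    loss = 0.5 * np.sum((y - X @ beta) ** 2)
    penalty = sum(np.linalg.norm(beta[g]) for g in groups)
    return loss + lam * penalty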

Information-theoretic limits on sparse signal recovery: Dense . . .

by Wei Wang, Martin J. Wainwright, Kannan Ramchandran, 2008
"... ..."
Abstract - Cited by 56 (4 self)
Abstract not found

Citation Context

...conditions for exact support recovery, applicable to a general class of dense measurement matrices (including non-Gaussian ensembles). In conjunction with the sufficient conditions from previous work [29], this analysis provides a sharp characterization of necessary and ... 1 For example, ℓ1-recovery methods based on linear programming have complexity O(p³) in the signal dimension p. 2 Note, however, that ...

Information-Theoretically Optimal Compressed Sensing via Spatial Coupling and Approximate Message Passing

by David L. Donoho, Adel Javanmard, Andrea Montanari, 2011
"... We study the compressed sensing reconstruction problem for a broad class of random, banddiagonal sensing matrices. This construction is inspired by the idea of spatial coupling in coding theory. As demonstrated heuristically and numerically by Krzakala et al. [KMS+ 11], message passing algorithms ca ..."
Abstract - Cited by 51 (5 self)
We study the compressed sensing reconstruction problem for a broad class of random, band-diagonal sensing matrices. This construction is inspired by the idea of spatial coupling in coding theory. As demonstrated heuristically and numerically by Krzakala et al. [KMS+11], message passing algorithms can effectively solve the reconstruction problem for spatially coupled measurements with undersampling rates close to the fraction of non-zero coordinates. We use an approximate message passing (AMP) algorithm and analyze it through the state evolution method. We give a rigorous proof that this approach is successful as soon as the undersampling rate δ exceeds the (upper) Rényi information dimension of the signal, d(pX). More precisely, for a sequence of signals of diverging dimension n whose empirical distribution converges to pX, reconstruction is with high probability successful from d(pX)·n + o(n) measurements taken according to a band-diagonal matrix. For sparse signals, i.e. sequences of dimension n and k(n) non-zero entries, this implies reconstruction from k(n) + o(n) measurements. For 'discrete' signals, i.e. signals whose coordinates take a fixed finite set of values, this implies reconstruction from o(n) measurements. The result
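A toy version of the AMP iteration with a soft-thresholding denoiser (plain i.i.d. Gaussian measurements, not the spatially coupled band-diagonal construction the paper analyzes; the fixed threshold is a placeholder for a tuned schedule):

import numpy as np

def soft(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def amp(A, y, theta=1.0, iters=30):
    n, p = A.shape
    delta = n / p                      # undersampling rate
    x, z = np.zeros(p), y.copy()
    for _ in range(iters):
        pseudo = x + A.T @ z           # effective (decoupled) observation
        x = soft(pseudo, theta)        # componentwise denoising
        onsager = np.mean(np.abs(pseudo) > theta) / delta
        z = y - A @ x + z * onsager    # residual with Onsager correction
    return x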

Information theoretic bounds for compressed sensing

by Shuchin Aeron, Venkatesh Saligrama, Manqi Zhao - IEEE Trans. Inf. Theory, 2010
"... In this paper we derive information theoretic performance bounds to sensing and reconstruction of sparse phenomena from noisy projections. We consider two settings: output noise models where the noise enters after the projection and input noise models where the noise enters before the projection. We ..."
Abstract - Cited by 44 (6 self)
In this paper we derive information theoretic performance bounds to sensing and reconstruction of sparse phenomena from noisy projections. We consider two settings: output noise models where the noise enters after the projection and input noise models where the noise enters before the projection. We consider two types of distortion for reconstruction: support errors and mean-squared errors. Our goal is to relate the number of measurements, m, and SNR, to signal sparsity, k, distortion level, d, and signal dimension, n. We consider support errors in a worst-case setting. We employ different variations of Fano’s inequality to derive necessary conditions on the number of measurements and SNR required for exact reconstruction. To derive sufficient conditions we develop new insights on max-likelihood analysis based on a novel superposition property. In particular this property implies that small support errors are the dominant error events. Consequently, our ML analysis does not suffer the conservatism of the union bound and leads to a tighter analysis of max-likelihood. These results provide order-wise tight bounds. For output noise models we show that asymptotically an SNR of Θ(log(n)) together with Θ(k log(n/k)) measurements is necessary and sufficient for exact support recovery. Furthermore, if a small fraction of support errors
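The Fano-type reasoning behind such necessary conditions can be summarized schematically (constants suppressed; this is a generic capacity argument, not the paper's exact statement): with C(n, k) candidate supports and m Gaussian-noise measurements carrying at most (m/2)·log(1 + SNR) bits in total, Fano's inequality gives P_err ≥ 1 − [(m/2)·log(1 + SNR) + 1] / log C(n, k), so driving the error probability to zero requires m ≳ log C(n, k) / log(1 + SNR) ≍ k log(n/k) / log(1 + SNR).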

Citation Context

...oximate Support Recovery In this part we further restrict the signal X to be bounded away from zero by a constant β > 0 on its support. This is a standard assumption employed by other researchers (see [5, 6, 7, 8]) since it is impossible to identify the support of a signal X from noisy measurements with arbitrarily small non-zero components. We derive necessary and sufficient conditions for exact and approxima...

Honest variable selection in linear and logistic regression models via ℓ1 and ℓ1 + ℓ2 penalization

by Florentina Bunea - Electronic Journal of Statistics
"... This paper investigates correct variable selection in finite samples via ℓ1 and ℓ1+ℓ2 type penalization schemes. The asymptotic consistency of variable selection immediately follows from this analysis. We focus on logistic and linear regression models. The following questions are central to our pape ..."
Abstract - Cited by 39 (3 self)
This paper investigates correct variable selection in finite samples via ℓ1 and ℓ1+ℓ2 type penalization schemes. The asymptotic consistency of variable selection immediately follows from this analysis. We focus on logistic and linear regression models. The following questions are central to our paper: given

Citation Context

...e asymptotic consistency in linear regression models with Gaussian design is presented in [24]. There the coefficient set I∗ is assumed to have been selected uniformly at random from {1, ..., M}, and one studies asymptotically the average error probability, where one averages over all possib...

Why Gabor frames? Two fundamental measures of coherence and their role in model selection

by Waheed U. Bajwa, Robert Calderbank, Sina Jafarpour - J. Commun. Netw., 2010
"... ar ..."
Abstract - Cited by 37 (14 self)
Abstract not found

Citation Context

... ≳ max{1, 1/(SNR·MAR)} k log p. On the other hand, one of the best known results for model selection using the maximum likelihood algorithm requires that n ≳ max{k log(p−k)/(SNR·MAR), k log(p/k)} [40] (also see [23], [41]). This establishes that OST (and its variants) performs near-optimally for Gaussian design matrices provided (i) the SNR in the measurement system is not too high or (ii) the ene...
