Results 1 -
7 of
7
Sparse solution of underdetermined linear equations by stagewise orthogonal matching pursuit
, 2006
"... Finding the sparsest solution to underdetermined systems of linear equations y = Φx is NP-hard in general. We show here that for systems with ‘typical’/‘random ’ Φ, a good approximation to the sparsest solution is obtained by applying a fixed number of standard operations from linear algebra. Our pr ..."
Abstract
-
Cited by 116 (15 self)
- Add to MetaCart
Finding the sparsest solution to underdetermined systems of linear equations y = Φx is NP-hard in general. We show here that for systems with ‘typical’/‘random ’ Φ, a good approximation to the sparsest solution is obtained by applying a fixed number of standard operations from linear algebra. Our proposal, Stagewise Orthogonal Matching Pursuit (StOMP), successively transforms the signal into a negligible residual. Starting with initial residual r0 = y, at the s-th stage it forms the ‘matched filter ’ Φ T rs−1, identifies all coordinates with amplitudes exceeding a specially-chosen threshold, solves a least-squares problem using the selected coordinates, and subtracts the leastsquares fit, producing a new residual. After a fixed number of stages (e.g. 10), it stops. In contrast to Orthogonal Matching Pursuit (OMP), many coefficients can enter the model at each stage in StOMP while only one enters per stage in OMP; and StOMP takes a fixed number of stages (e.g. 10), while OMP can take many (e.g. n). StOMP runs much faster than competing proposals for sparse solutions, such as ℓ1 minimization and OMP, and so is attractive for solving large-scale problems. We use phase diagrams to compare algorithm performance. The problem of recovering a k-sparse vector x0 from (y, Φ) where Φ is random n × N and y = Φx0 is represented by a point (n/N, k/n)
Adapting to unknown sparsity by controlling the false discovery rate
, 2000
"... We attempt to recover a high-dimensional vector observed in white noise, where the vector is known to be sparse, but the degree of sparsity is unknown. We consider three different ways of defining sparsity of a vector: using the fraction of nonzero terms; imposing power-law decay bounds on the order ..."
Abstract
-
Cited by 70 (11 self)
- Add to MetaCart
We attempt to recover a high-dimensional vector observed in white noise, where the vector is known to be sparse, but the degree of sparsity is unknown. We consider three different ways of defining sparsity of a vector: using the fraction of nonzero terms; imposing power-law decay bounds on the ordered entries; and controlling the ℓp norm for p small. We obtain a procedure which is asymptotically minimax for ℓr loss, simultaneously throughout a range of such sparsity classes. The optimal procedure is a data-adaptive thresholding scheme, driven by control of the False Discovery Rate (FDR). FDR control is a recent innovation in simultaneous testing, in which one seeks to ensure that at most a certain fraction of the rejected null hypotheses will correspond to false rejections. In our treatment, the FDR control parameter q also plays a controlling role in asymptotic minimaxity. Our results say that letting q = qn → 0 with problem size n is sufficient for asymptotic minimaxity, while keeping fixed q>1/2prevents asymptotic minimaxity. To our knowledge, this relation between ideas in simultaneous inference and asymptotic decision theory is new. Our work provides a new perspective on a class of model selection rules which has been introduced recently by several authors. These new rules impose complexity penalization of the form 2·log ( potential model size / actual model size). We exhibit a close connection with FDR-controlling procedures having q tending to 0; this connection strongly supports a conjecture of simultaneous asymptotic minimaxity for such model selection rules.
Needles and straw in haystacks: Empirical Bayes estimates of possibly sparse sequences
- Ann. Statist
, 2002
"... An empirical Bayes approach to the estimation of possibly sparse sequences observed in Gaussian white noise is set out and investigated. The prior considered is a mixture of an atom of probability at zero and a heavy-tailed density, with the mixing weight chosen by marginal maximum likelihood, in ..."
Abstract
-
Cited by 33 (3 self)
- Add to MetaCart
An empirical Bayes approach to the estimation of possibly sparse sequences observed in Gaussian white noise is set out and investigated. The prior considered is a mixture of an atom of probability at zero and a heavy-tailed density, with the mixing weight chosen by marginal maximum likelihood, in the hope of adapting between sparse and dense sequences. If estimation is then carried out using the posterior median, this is a random thresholding procedure. Other thresholding rules using the same threshold can also be used. Probability bounds on the threshold chosen by the marginal maximum likelihood approach lead to overall bounds on the risk of the method over the class of signal sequences of length n with normalized ` p norm bounded by , for > 0 and 0 < p 2: Estimation error is measured by mean q loss, for 0 < q 2: For all p and q in (0; 2], the method achieves the optimal estimation rate as n ! 1 and ! 0 at various rates, and in this sense adapts automatically to the sparseness or otherwise of the underlying signal. In addition the risk is uniformly bounded over all signals. If the posterior mean is used as the estimator, the results still hold for q > 1: Simulations show excellent performance. Computationally, the method is tractable and essentially of O(n) complexity, and software is available. The extension to a modi ed thresholding method relevant to the wavelet estimation of derivatives of functions is also considered.
General empirical Bayes wavelet methods and exactly adaptive minimax estimation
-
, 2005
"... In many statistical problems, stochastic signals can be represented as a sequence of noisy wavelet coefficients. In this paper, we develop general empirical Bayes methods for the estimation of true signal. Our estimators approximate certain oracle separable rules and achieve adaptation to ideal risk ..."
Abstract
-
Cited by 10 (1 self)
- Add to MetaCart
In many statistical problems, stochastic signals can be represented as a sequence of noisy wavelet coefficients. In this paper, we develop general empirical Bayes methods for the estimation of true signal. Our estimators approximate certain oracle separable rules and achieve adaptation to ideal risks and exact minimax risks in broad collections of classes of signals. In particular, our estimators are uniformly adaptive to the minimum risk of separable estimators and the exact minimax risks simultaneously in Besov balls of all smoothness and shape indices, and they are uniformly superefficient in convergence rates in all compact sets in Besov spaces with a finite secondary shape parameter. Furthermore, in classes nested between Besov balls of the same smoothness index, our estimators dominate threshold and James–Stein estimators within an infinitesimal fraction of the minimax risks. More general block empirical Bayes estimators are developed. Both white noise with drift and nonparametric regression are considered.
Asymptotic minimaxity of false discovery rate thresholding for sparse exponential data
- Ann. Statist
, 2006
"... Control of the False Discovery Rate (FDR) is an important development in multiple hypothesis testing, allowing the user to limit the fraction of rejected null hypotheses which correspond to false rejections (i.e. false discoveries). The FDR principle also can be used in multiparameter estimation pro ..."
Abstract
-
Cited by 7 (3 self)
- Add to MetaCart
Control of the False Discovery Rate (FDR) is an important development in multiple hypothesis testing, allowing the user to limit the fraction of rejected null hypotheses which correspond to false rejections (i.e. false discoveries). The FDR principle also can be used in multiparameter estimation problems to set thresholds for separating signal from noise when the signal is sparse. Success has been proven when the noise is Gaussian; see [3]. In this paper, we consider the application of FDR thresholding to a non-Gaussian setting, in hopes of learning whether the good asymptotic properties of FDR thresholding as an estimation tool hold more broadly than just at the standard Gaussian model. We consider a vector Xi, i = 1,..., n, whose coordinates are independent exponential with individual means µi. The vector µ is thought to be sparse, with most coordinates 1 and a small fraction significantly larger than 1. This models a situation where most coordinates are simply ‘noise’, but a small fraction of the coordinates contain ‘signal’. We develop an estimation theory working with log(µi) as the estimand, and use the percoordinate mean-squared error in recovering log(µi) to measure risk. We consider minimax
Properties and Refinements of The Fused Lasso
"... We consider estimating an unknown signal, which is both blocky and sparse, corrupted by additive noise. We study three interrelated least squares procedures and their asymptotic properties. The first procedure is the fused lasso, put forward by Friedman et al. (2007), which we modify into a differen ..."
Abstract
-
Cited by 4 (0 self)
- Add to MetaCart
We consider estimating an unknown signal, which is both blocky and sparse, corrupted by additive noise. We study three interrelated least squares procedures and their asymptotic properties. The first procedure is the fused lasso, put forward by Friedman et al. (2007), which we modify into a different estimator, called the fused adaptive lasso, with better properties. The other two estimators we discuss solve least squares problems on sieves, one constraining the maximal ℓ1 norm and the maximal total variation seminorm, the other restricting the number of blocks and the number of of nonzero coordinates of the signal. We derive conditions for the recovery of the true block partition and the true sparsity patterns by the fused lasso and the fused adaptive lasso, and convergence rates for the sieve estimators, explicitly in terms of the constraining parameters. 1
GENERAL MAXIMUM LIKELIHOOD EMPIRICAL BAYES ESTIMATION OF NORMAL MEANS
, 908
"... We propose a general maximum likelihood empirical Bayes (GM-LEB) method for the estimation of a mean vector based on observations with i.i.d. normal errors. We prove that under mild moment conditions on the unknown means, the average mean squared error (MSE) of the GMLEB is within an infinitesimal f ..."
Abstract
- Add to MetaCart
We propose a general maximum likelihood empirical Bayes (GM-LEB) method for the estimation of a mean vector based on observations with i.i.d. normal errors. We prove that under mild moment conditions on the unknown means, the average mean squared error (MSE) of the GMLEB is within an infinitesimal fraction of the minimum average MSE among all separable estimators which use a single deterministic estimating function on individual observations, provided that the risk is of greater order than (log n) 5 /n. We also prove that the GMLEB is uniformly approximately minimax in regular and weak ℓp balls when the order of the length-normalized norm of the unknown means is between (log n) κ1 /n

