Enhancing Sparsity by Reweighted ℓ1 Minimization
, 2007
Cited by 16 (1 self)
It is now well understood that (1) it is possible to reconstruct sparse signals exactly from what appear to be highly incomplete sets of linear measurements and (2) that this can be done by constrained ℓ1 minimization. In this paper, we study a novel method for sparse signal recovery that in many situations outperforms ℓ1 minimization in the sense that substantially fewer measurements are needed for exact recovery. The algorithm consists of solving a sequence of weighted ℓ1-minimization problems where the weights used for the next iteration are computed from the value of the current solution. We present a series of experiments demonstrating the remarkable performance and broad applicability of this algorithm in the areas of sparse signal recovery, statistical estimation, error correction and image processing. Interestingly, superior gains are also achieved when our method is applied to recover signals with assumed near-sparsity in overcomplete representations—not by reweighting the ℓ1 norm of the coefficient sequence as is common, but by reweighting the ℓ1 norm of the transformed object. An immediate consequence is the possibility of highly efficient data acquisition protocols by improving on a technique known as compressed sensing.
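The reweighting loop described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it swaps the constrained ℓ1 problem for an unconstrained, Lasso-type weighted problem solved by coordinate descent, and the function names, the penalty level `lam`, the stabilizer `eps`, and the iteration counts are all hypothetical choices made for the example. Each outer pass recomputes the weights from the current solution as 1 / (|x_i| + eps), so that large coefficients are penalized less in the next pass.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t; values with |z| <= t become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def weighted_lasso(A, y, weights, lam, sweeps=200):
    """Coordinate descent for 0.5*||y - A x||^2 + lam * sum_i weights[i]*|x_i|.

    A is a list of rows; this unconstrained form is an assumption of the sketch.
    """
    m, n = len(A), len(A[0])
    x = [0.0] * n
    residual = list(y)  # residual = y - A x, with x = 0 initially
    col_sq = [sum(A[i][j] ** 2 for i in range(m)) for j in range(n)]
    for _ in range(sweeps):
        for j in range(n):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes x_j.
            rho = sum(A[i][j] * residual[i] for i in range(m)) + col_sq[j] * x[j]
            new_xj = soft_threshold(rho, lam * weights[j]) / col_sq[j]
            delta = new_xj - x[j]
            if delta != 0.0:
                for i in range(m):
                    residual[i] -= A[i][j] * delta
                x[j] = new_xj
    return x


def reweighted_l1(A, y, lam=1.0, eps=0.1, outer_iters=5):
    """Solve a sequence of weighted problems, updating weights from the solution."""
    n = len(A[0])
    weights = [1.0] * n  # the first pass is an ordinary (unweighted) l1 problem
    x = [0.0] * n
    for _ in range(outer_iters):
        x = weighted_lasso(A, y, weights, lam)
        # Small entries get large weights and are pushed harder toward zero.
        weights = [1.0 / (abs(xi) + eps) for xi in x]
    return x
```

A call such as `reweighted_l1(A, y)` takes the measurement matrix as a list of rows and the measurements as a list. Coordinate descent is used here only because it needs no external solver; any weighted ℓ1 solver could take its place in the inner step.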
Iterative thresholding algorithms
- in Preprint, 2007. [Online]. Available: http://www.dsp.ece.rice.edu/cs
Cited by 14 (4 self)
This article provides a variational formulation for hard and firm thresholding. A related functional can be used to regularize inverse problems by sparsity constraints. We show that a damped hard or firm thresholded Landweber iteration converges to its minimizer. This provides an alternative to an algorithm recently studied by the authors. We prove stability of minimizers with respect to the parameters of the functional and its regularization properties by means of Γ-convergence. All investigations are done in the general setting of vector-valued (multi-channel) data.
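For orientation, a plain hard-thresholded Landweber iteration can be sketched as below. This is an assumption-laden illustration rather than the article's algorithm: the specific damping scheme, the firm thresholding variant, and the vector-valued setting are not reproduced, and the names `threshold` and `step` are hypothetical parameters. The step size is assumed small enough for the gradient step on the least-squares term to be stable.

```python
def hard_threshold(z, t):
    """Keep z unchanged if its magnitude exceeds t, otherwise set it to 0."""
    return z if abs(z) > t else 0.0


def thresholded_landweber(A, y, threshold, step, iters=100):
    """Iterate x <- H_t(x + step * A^T (y - A x)) starting from x = 0.

    A is a list of rows; `step` and `threshold` are illustrative parameters.
    """
    m, n = len(A), len(A[0])
    x = [0.0] * n
    for _ in range(iters):
        # Residual of the current iterate.
        residual = [y[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(m)]
        # Landweber (gradient) step on the least-squares term.
        update = [step * sum(A[i][j] * residual[i] for i in range(m)) for j in range(n)]
        # Hard thresholding enforces sparsity of the iterate.
        x = [hard_threshold(x[j] + update[j], threshold) for j in range(n)]
    return x
```

Replacing `hard_threshold` with a soft or firm thresholding function changes only the last line of the loop, which is why this family of methods is usually presented as a single iteration with interchangeable shrinkage rules.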
On the Asymptotic Properties of The Group Lasso Estimator in Least Squares Problems
Cited by 14 (0 self)
We derive conditions guaranteeing estimation and model selection consistency, oracle properties and persistence for the group-lasso estimator and model selector proposed by Yuan and Lin (2006) for least squares problems when the covariates have a natural grouping structure. We study both the case of a fixed-dimensional parameter space with increasing sample size and the case when the model complexity changes with the sample size.
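As background for the grouping structure mentioned above, the sketch below shows block soft-thresholding, the operation commonly used when computing group-lasso estimates. It is not taken from the paper, which concerns asymptotic theory. The function name, the `groups` index lists, and the scaling of each group's threshold by the square root of its size are assumptions made for illustration.

```python
import math


def group_soft_threshold(beta, groups, lam):
    """Shrink each coefficient group toward zero by its Euclidean norm.

    groups is a list of index lists; a group whose norm is at most
    lam * sqrt(group size) is set to zero as a whole.
    """
    out = list(beta)
    for idx in groups:
        norm = math.sqrt(sum(beta[i] ** 2 for i in idx))
        t = lam * math.sqrt(len(idx))  # assumed group-size weighting
        scale = 0.0 if norm <= t else 1.0 - t / norm
        for i in idx:
            out[i] = scale * beta[i]
    return out
```

Because the whole group is scaled by one factor, variables enter or leave the model group by group, which is the behavior that model selection consistency results for the group lasso refer to.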
Giannakis, “RLS-weighted Lasso for adaptive estimation of sparse signals”
- IEEE Trans. on Signal Proc
Cited by 14 (4 self)
The batch least-absolute shrinkage and selection operator (Lasso) has well-documented merits for estimating sparse signals of interest emerging in various applications, where observations adhere to parsimonious linear regression models. To cope with the linearly growing complexity and memory requirements that batch Lasso estimators face when processing observations sequentially, the present paper develops a recursive Lasso algorithm that can also track slowly varying sparse signals of interest. Performance analysis reveals that recursive Lasso can consistently estimate either the sparse signal’s support or its nonzero entries, but not both. This motivates the development of a weighted version of the recursive Lasso scheme with weights obtained from the recursive least-squares (RLS) algorithm. The resultant RLS-weighted Lasso algorithm provably estimates sparse signals consistently. Simulated tests compare competing alternatives and corroborate the performance of the novel algorithms in estimating time-invariant signals and tracking slowly varying signals under sparsity constraints. Index Terms: Lasso, Variable Selection, Sparsity, Tracking.
On the conditions used to prove oracle results for the Lasso
- Electron. J. Stat
Cited by 10 (0 self)
Oracle inequalities and variable selection properties for the Lasso in linear models have been established under a variety of different assumptions on the design matrix. We show in this paper how the different conditions and concepts relate to each other. The restricted eigenvalue condition [2] or the slightly weaker compatibility condition [18] is sufficient for oracle results. We argue that both of these conditions allow for a fairly general class of design matrices. Hence, optimality of the Lasso for prediction and estimation holds in more general situations than would appear from coherence [5, 4] or restricted isometry [10] assumptions.
High-Dimensional Non-Linear Variable Selection through Hierarchical Kernel Learning
, 2009
Cited by 9 (3 self)
We consider the problem of high-dimensional non-linear variable selection for supervised learning. Our approach is based on performing linear selection among exponentially many appropriately defined positive definite kernels that characterize non-linear interactions between the original variables. To select efficiently from these many kernels, we use the natural hierarchical structure of the problem to extend the multiple kernel learning framework to kernels that can be embedded in a directed acyclic graph; we show that it is then possible to perform kernel selection through a graph-adapted sparsity-inducing norm, in polynomial time in the number of selected kernels. Moreover, we study the consistency of variable selection in high-dimensional settings, showing that under certain assumptions, our regularization framework allows a number of irrelevant variables which is exponential in the number of observations. Our simulations on synthetic datasets and datasets from the UCI repository show state-of-the-art predictive performance for non-linear regression problems.
GAUSSIAN MODEL SELECTION WITH AN UNKNOWN VARIANCE
- SUBMITTED TO THE ANNALS OF STATISTICS
, 2007
Cited by 8 (5 self)
Let Y be a Gaussian vector whose components are independent with a common unknown variance. We consider the problem of estimating the mean µ of Y by model selection. More precisely, we start with a collection S = {S_m, m ∈ M} of linear subspaces of R^n and associate with each of these the least-squares estimator of µ on S_m. Then, we use a data-driven penalized criterion in order to select one estimator among these. Our first objective is to analyze the performance of estimators associated with classical criteria such as FPE, AIC, BIC and AMDL. Our second objective is to propose better penalties that are versatile enough to take into account both the complexity of the collection S and the sample size. Then we apply those to solve various statistical problems such as variable selection, change-point detection and signal estimation, among others. Our results are based on a non-asymptotic risk bound with respect to the Euclidean loss for the selected estimator. Some analogous results are also established for the Kullback loss.
Distributed Spectrum Sensing for Cognitive Radio Networks by Exploiting Sparsity
Cited by 8 (1 self)
A cooperative approach to the sensing task of wireless cognitive radio (CR) networks is introduced based on a basis expansion model of the power spectral density (PSD) map in space and frequency. Joint estimation of the model parameters enables identification of the (un)used frequency bands at arbitrary locations, and thus facilitates spatial frequency reuse. The novel scheme capitalizes on two forms of sparsity: the first one introduced by the narrow-band nature of transmit-PSDs relative to the broad swaths of usable spectrum; and the second one emerging from sparsely located active radios in the operational space. An estimator of the model coefficients is developed based on the Lasso algorithm to exploit these forms of sparsity and reveal the unknown positions of transmitting CRs. The resultant scheme can be implemented via distributed online iterations, which solve quadratic programs locally (one per radio), and are adaptive to changes in the system. Simulations corroborate that exploiting sparsity in CR sensing reduces spatial and frequency spectrum leakage by 15 dB relative to least-squares (LS) alternatives. Index Terms: Cognitive radios, compressive sampling, cooperative systems, distributed estimation, parallel network processing, sensing, sparse models, spectral analysis.
“Pre-conditioning” for feature selection and regression in high-dimensional problems
- Ann. Statist
, 2008
Cited by 7 (2 self)
We consider regression problems where the number of predictors greatly exceeds the number of observations. We propose a method for variable selection that first estimates the regression function, yielding a “preconditioned” response variable. The primary method used for this initial regression is supervised principal components. Then we apply a standard procedure such as forward stepwise selection or the LASSO to the preconditioned response variable. In a number of simulated and real data examples, this two-step procedure outperforms forward stepwise selection or the usual LASSO (applied directly to the raw outcome). We also show that under a certain Gaussian latent variable model, application of the LASSO to the preconditioned response variable is consistent as the number of predictors and observations increases. Moreover, when the observational noise is rather large, the suggested procedure can give a more accurate estimate than LASSO. We illustrate our method on some real problems, including survival analysis with microarray data.
Variable inclusion and shrinkage algorithms
- Journal of the American Statistical Association
, 2008
Cited by 7 (4 self)
The Lasso is a popular and computationally efficient procedure for automatically performing both variable selection and coefficient shrinkage on linear regression models. One limitation of the Lasso is that the same tuning parameter is used for both variable selection and shrinkage. As a result, it typically ends up selecting a model with too many variables to prevent over-shrinkage of the regression coefficients. We suggest an improved class of methods called “Variable Inclusion and Shrinkage Algorithms” (VISA). Our approach is capable of selecting sparse models while avoiding over-shrinkage problems, and it uses a path algorithm, so it is also computationally efficient. We show through extensive simulations that VISA significantly outperforms the Lasso and also provides improvements over more recent procedures, such as the Dantzig selector, Relaxed Lasso and Adaptive Lasso. In addition, we provide theoretical justification for VISA in terms of non-asymptotic bounds on the estimation error that suggest it should exhibit good performance even for large numbers of predictors. Finally, we extend the VISA methodology, path algorithm, and theoretical bounds to the Generalized Linear Models framework.

