Coordinate descent algorithms for lasso penalized regression
 Ann. Appl. Stat
, 2008
Cited by 58 (0 self)
Imposition of a lasso penalty shrinks parameter estimates toward zero and performs continuous model selection. Lasso penalized regression is capable of handling linear regression problems where the number of predictors far exceeds the number of cases. This paper tests two exceptionally fast algorithms for estimating regression coefficients with a lasso penalty. The previously known ℓ2 algorithm is based on cyclic coordinate descent. Our new ℓ1 algorithm is based on greedy coordinate descent and Edgeworth’s algorithm for ordinary ℓ1 regression. Each algorithm relies on a tuning constant that can be chosen by cross-validation. In some regression problems it is natural to group parameters and penalize parameters group by group rather than separately. If the group penalty is proportional to the Euclidean norm of the parameters of the group, then it is possible to majorize the norm and reduce parameter estimation to ℓ2 regression with a lasso penalty. Thus, the existing algorithm can be extended to novel settings. Each of the algorithms discussed is tested via either simulated or real data or both. The Appendix proves that a greedy form of the ℓ2 algorithm converges to the minimum value of the objective function.
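To make the cyclic coordinate descent idea in the abstract above concrete, here is a minimal sketch for the ℓ2 (least-squares) loss with a lasso penalty. It is not the authors' implementation: the function names, the omission of an intercept, the unscaled objective 0.5·||y − Xb||² + lam·||b||₁, and the stopping rule are all assumptions made for illustration.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values in [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cyclic_cd(X, y, lam, n_sweeps=200, tol=1e-10):
    """Minimize 0.5 * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent.

    X is a list of rows, y a list of responses. No intercept is fitted
    (an assumption of this sketch).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # residual = y - X @ beta, kept up to date
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_sweeps):
        max_change = 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column cannot reduce the loss
            # Inner product of column j with the partial residual
            # (the residual with coordinate j's contribution added back).
            rho = sum(X[i][j] * residual[i] for i in range(n)) + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_bj
            max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return beta
```

Each one-dimensional update has a closed form (a soft-thresholding step), which is what makes the sweeps cheap. The tuning constant `lam` would typically be chosen by cross-validation, as the abstract notes.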
Monte-Carlo SURE: A black-box optimization of regularization parameters for general denoising algorithms
 IEEE TRANSACTIONS ON IMAGE PROCESSING
, 2008
Cited by 28 (4 self)
We consider the problem of optimizing the parameters of a given denoising algorithm for restoration of a signal corrupted by white Gaussian noise. To achieve this, we propose to minimize Stein’s unbiased risk estimate (SURE) which provides a means of assessing the true mean-squared error (MSE) purely from the measured data without need for any knowledge about the noise-free signal. Specifically, we present a novel Monte-Carlo technique which enables the user to calculate SURE for an arbitrary denoising algorithm characterized by some specific parameter setting. Our method is a black-box approach which solely uses the response of the denoising operator to additional input noise and does not ask for any information about its functional form. This, therefore, permits the use of SURE for optimization of a wide variety of denoising algorithms. We justify our claims by presenting experimental results for SURE-based optimization of a series of popular image-denoising algorithms such as total-variation denoising, wavelet soft-thresholding, and Wiener filtering/smoothing splines. In the process, we also compare the performance of these methods. We demonstrate numerically that SURE computed using the new approach accurately predicts the true MSE for all the considered algorithms. We also show that SURE uncovers the optimal values of the parameters in all cases.
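The black-box idea in the abstract above can be sketched in a few lines: probe the denoiser with a small random perturbation and use the response to estimate its divergence, which is the only term of SURE that depends on the denoiser's functional form. This is an illustrative sketch, not the authors' code. The function name, the ±1 probe vector (any zero-mean, unit-variance i.i.d. probe is assumed to work), the step `eps`, and the per-sample normalization are assumptions.

```python
import random


def monte_carlo_sure(denoise, y, sigma, eps=1e-3, seed=0):
    """Estimate the per-sample MSE of denoise(y) without the clean signal.

    denoise: any function mapping a list of floats to a list of the same length.
    sigma:   standard deviation of the white Gaussian noise in y.
    """
    n = len(y)
    rng = random.Random(seed)
    # Probe vector with zero-mean, unit-variance i.i.d. entries
    # (+/-1 here; this choice is an assumption of the sketch).
    b = [rng.choice((-1.0, 1.0)) for _ in range(n)]
    fy = denoise(y)
    fy_probe = denoise([yi + eps * bi for yi, bi in zip(y, b)])
    # Finite-difference estimate of the divergence of the denoiser at y.
    divergence = sum(bi * (p - q) for bi, p, q in zip(b, fy_probe, fy)) / eps
    data_term = sum((yi - fi) ** 2 for yi, fi in zip(y, fy)) / n
    return data_term - sigma ** 2 + 2.0 * sigma ** 2 * divergence / n
```

To tune a parameter, one would evaluate `monte_carlo_sure` for each candidate setting of the denoiser and keep the setting with the smallest estimate. The denoiser is only ever called, never inspected.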
Convergent incremental optimization transfer algorithms: Application to tomography
 IEEE Trans. Med. Imag., Submitted
Cited by 24 (10 self)
Abstract—No convergent ordered subsets (OS) type image reconstruction algorithms for transmission tomography have been proposed to date. In contrast, in emission tomography, there are two known families of convergent OS algorithms: methods that use relaxation parameters (Ahn and Fessler, 2003), and methods based on the incremental expectation maximization (EM) approach (Hsiao et al., 2002). This paper generalizes the incremental EM approach by introducing a general framework that we call “incremental optimization transfer.” Like incremental EM methods, the proposed algorithms accelerate convergence speeds and ensure global convergence (to a stationary point) under mild regularity conditions without requiring inconvenient relaxation parameters. The general optimization transfer framework enables the use of a very broad family of non-EM surrogate functions. In particular, this paper provides the first convergent OS-type algorithm for transmission tomography. The general approach is applicable to both monoenergetic and polyenergetic transmission scans as well as to other image reconstruction problems. We propose a particular incremental optimization transfer method for (nonconcave) penalized-likelihood (PL) transmission image reconstruction by using separable paraboloidal surrogates (SPS). Results show that the new “transmission incremental optimization transfer (TRIOT)” algorithm is faster than nonincremental ordinary SPS and even OS-SPS yet is convergent.
A fast thresholded Landweber algorithm for wavelet-regularized multidimensional deconvolution
 IEEE Trans. Image Process
, 2008
Cited by 24 (6 self)
Abstract—We present a fast variational deconvolution algorithm that minimizes a quadratic data term subject to a regularization on the ℓ1-norm of the wavelet coefficients of the solution. Previously available methods have essentially consisted in alternating between a Landweber iteration and a wavelet-domain soft-thresholding operation. While having the advantage of simplicity, they are known to converge slowly. By expressing the cost functional in a Shannon wavelet basis, we are able to decompose the problem into a series of subband-dependent minimizations. In particular, this allows for larger (subband-dependent) step sizes and threshold levels than the previous method. This improves the convergence properties of the algorithm significantly. We demonstrate a speedup of one order of magnitude in practical situations. This makes wavelet-regularized deconvolution more widely accessible, even for applications with a strong limitation on computational complexity. We present promising results in 3D deconvolution microscopy, where the size of typical data sets does not permit more than a few tens of iterations. Index Terms—Deconvolution, fast, fluorescence microscopy, iterative, nonlinear, sparsity, 3D, thresholding, wavelets,
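For orientation, the baseline that the abstract above improves on (alternating a Landweber step with soft-thresholding) can be sketched as follows. This is the plain single-step-size iteration, not the paper's subband-dependent variant. As a simplifying assumption, the unknown `x` is treated as already being the vector of sparse coefficients, so no wavelet transform appears. The function names and the fixed iteration count are hypothetical.

```python
def shrink(z, t):
    """Soft-thresholding: move z toward zero by t, clamping at zero."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def thresholded_landweber(A, y, lam, step, n_iter=500):
    """Minimize 0.5 * ||y - A x||^2 + lam * ||x||_1.

    Each iteration is x <- shrink(x + step * A^T (y - A x), step * lam).
    `step` should not exceed 1 / (largest eigenvalue of A^T A).
    """
    m, n = len(A), len(A[0])
    x = [0.0] * n
    for _ in range(n_iter):
        # Landweber (gradient) step on the quadratic data term.
        residual = [y[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(m)]
        moved = [x[j] + step * sum(A[i][j] * residual[i] for i in range(m))
                 for j in range(n)]
        # Soft-thresholding step enforcing sparsity.
        x = [shrink(v, step * lam) for v in moved]
    return x
```

A single global `step` is limited by the largest eigenvalue of `A^T A`, which is one reason the plain iteration converges slowly on ill-conditioned problems. The abstract's subband-dependent step sizes and thresholds address exactly this restriction.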
On the convergence of concave-convex procedure
 In NIPS Workshop on Optimization for Machine Learning
, 2009
Cited by 21 (0 self)
The concave-convex procedure (CCCP) is a majorization-minimization algorithm that solves d.c. (difference of convex functions) programs as a sequence of convex programs. In machine learning, CCCP is extensively used in many learning algorithms like sparse support vector machines (SVMs), transductive SVMs, sparse principal component analysis, etc. Though widely used in many applications, the convergence behavior of CCCP has not gotten a lot of specific attention. Yuille and Rangarajan analyzed its convergence in their original paper; however, we believe the analysis is not complete. Although the convergence of CCCP can be derived from the convergence of the d.c. algorithm (DCA), its proof is more specialized and technical than actually required for the specific case of CCCP. In this paper, we follow a different reasoning and show how Zangwill’s global convergence theory of iterative algorithms provides a natural framework to prove the convergence of CCCP, allowing a more elegant and simple proof. This underlines Zangwill’s theory as a powerful and general framework to deal with the convergence issues of iterative algorithms, after also being used to prove the convergence of algorithms like expectation-maximization, generalized alternating minimization, etc. In this paper, we provide a rigorous analysis of the convergence of CCCP by addressing these questions: (i) When does CCCP find a local minimum or a stationary point of the d.c. program under consideration? (ii) When does the sequence generated by CCCP converge? We also present an open problem on the issue of local convergence of CCCP.
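A toy example may help fix ideas about what CCCP does. The sketch below applies it to the one-dimensional d.c. function f(x) = x⁴ − 2x², split as u(x) = x⁴ minus v(x) = 2x². This test function, the split, and the function name are chosen here purely for illustration and do not come from the paper.

```python
import math


def cccp_quartic(x0, n_iter=60):
    """Run CCCP on f(x) = x**4 - 2*x**2 with u(x) = x**4 and v(x) = 2*x**2.

    Returns the final iterate and the objective value at every iterate.
    """
    def f(x):
        return x ** 4 - 2.0 * x ** 2

    x = x0
    history = [f(x)]
    for _ in range(n_iter):
        # Linearize the concave part -v at the current iterate: v'(x_k) = 4 x_k.
        slope = 4.0 * x
        # Convex subproblem: minimize u(x) - slope * x, i.e. solve 4 x**3 = slope.
        x = math.copysign(abs(slope / 4.0) ** (1.0 / 3.0), slope)
        history.append(f(x))
    return x, history
```

Starting from `x0 = 0.5` the iterates climb toward the minimizer at 1 while the objective values in `history` never increase, which is the majorization-minimization property. Starting from `x0 = 0.0` the iteration stays at 0, a stationary point that is a local maximum of f. This illustrates why question (i) in the abstract distinguishes local minima from stationary points.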
Bayesian Hyperspectral Image Segmentation with Discriminative Class Learning
Cited by 17 (9 self)
Abstract. This paper presents a new Bayesian approach to hyperspectral image segmentation that boosts the performance of the discriminative classifiers. This is achieved by combining class densities based on discriminative classifiers with a Multi-Level Logistic Markov-Gibbs prior. This density favors neighbouring labels of the same class. The adopted discriminative classifier is the Fast Sparse Multinomial Regression. The discrete optimization problem one is led to is solved efficiently via graph cut tools. The effectiveness of the proposed method is evaluated, with simulated and real AVIRIS images, in two directions: 1) to improve the classification performance and 2) to decrease the size of the training sets.
A Fast Multilevel Algorithm for Wavelet-Regularized Image Restoration
 IEEE Trans. Image Processing
Cited by 15 (6 self)
Abstract—We present a multilevel extension of the popular “thresholded Landweber” algorithm for wavelet-regularized image restoration that yields an order of magnitude speed improvement over the standard fixed-scale implementation. The method is generic and targeted towards large-scale linear inverse problems, such as 3D deconvolution microscopy. The algorithm is derived within the framework of bound optimization. The key idea is to successively update the coefficients in the various wavelet channels using fixed, subband-adapted iteration parameters (step sizes and threshold levels). The optimization problem is solved efficiently via a proper chaining of basic iteration modules. The higher level description of the algorithm is similar to that of a multigrid solver for PDEs, but there is one fundamental difference: the latter iterates through a sequence of multiresolution versions of the original problem, while, in our case, we cycle through the wavelet subspaces corresponding to the difference between successive approximations. This strategy is motivated by the special structure of the problem and the preconditioning properties of the wavelet representation. We establish that the solution of the restoration problem corresponds to a fixed point of our multilevel optimizer. We also provide experimental evidence that the improvement in convergence rate is essentially determined by the (unconstrained) linear part of the algorithm, irrespective of the type of wavelet. Finally, we illustrate the technique with some image deconvolution examples, including some real 3D fluorescence microscopy data. Index Terms—Bound optimization, confocal, convergence acceleration, deconvolution, fast, fluorescence, inverse problems, regularization, majorize-minimize, microscopy, multigrid, multilevel, multiresolution, multiscale, nonlinear, optimization transfer, preconditioning, reconstruction, restoration, sparsity,
A generative model for brain tumor segmentation in multimodal images
 IN: PROC MICCAI, LNCS 6362
, 2010
Cited by 14 (8 self)
We introduce a generative probabilistic model for segmentation of tumors in multidimensional images. The model allows for different tumor boundaries in each channel, reflecting differences in tumor appearance across modalities. We augment a probabilistic atlas of healthy tissue priors with a latent atlas of the lesion and derive the estimation algorithm to extract tumor boundaries and the latent atlas from the image data. We present experiments on 25 glioma patient data sets, demonstrating significant improvement over the traditional multivariate tumor segmentation.
Efficient Minimization Method for a Generalized Total Variation Functional
, 2009
Cited by 14 (4 self)
Replacing the ℓ² data fidelity term of the standard Total Variation (TV) functional with an ℓ¹ data fidelity term has been found to offer a number of theoretical and practical benefits. Efficient algorithms for minimizing this ℓ¹-TV functional have only recently begun to be developed, the fastest of which exploit graph representations, and are restricted to the denoising problem. We describe an alternative approach that minimizes a generalized TV functional, including both ℓ²-TV and ℓ¹-TV as special cases, and is capable of solving more general inverse problems than denoising (e.g. deconvolution). This algorithm is competitive with the graph-based methods in the denoising case, and is the fastest algorithm of which we are aware for general inverse problems involving a nontrivial forward linear operator.
Hyperspectral Image Segmentation Using a New Bayesian Approach with Active Learning
Cited by 12 (10 self)
This paper introduces a new supervised Bayesian approach to hyperspectral image segmentation with active learning, which consists of two main steps: (a) learning, for each class label, the posterior probability distributions using a multinomial logistic regression model; (b) segmenting the hyperspectral image based on the posterior probability distribution learned in step (a) and on a multilevel logistic prior which encodes the spatial information. The multinomial logistic regressors are learned by using the recently introduced logistic regression via splitting and augmented Lagrangian (LORSAL) algorithm. The maximum a posteriori segmentation is efficiently computed by the α-Expansion min-cut-based integer optimization algorithm. Aiming at reducing the costs of acquiring large training sets, active learning is performed using a mutual-information-based criterion. The state-of-the-art performance of the proposed approach is illustrated using both simulated and real hyperspectral data sets in a number of experimental comparisons with recently introduced hyperspectral image classification methods. Index Terms: Hyperspectral image segmentation, sparse multinomial logistic regression, ill-posed problems, graph cuts, integer optimization, mutual information, active learning.