Results 1–10 of 256
A fast iterative shrinkage-thresholding algorithm with application to . . .
, 2009
Abstract

Cited by 1055 (8 self)
We consider the class of Iterative Shrinkage-Thresholding Algorithms (ISTA) for solving linear inverse problems arising in signal/image processing. This class of methods is attractive due to its simplicity; however, it is also known to converge quite slowly. In this paper we present a Fast Iterative Shrinkage-Thresholding Algorithm (FISTA) which preserves the computational simplicity of ISTA, but with a global rate of convergence which is proven to be significantly better, both theoretically and practically. Initial promising numerical results for wavelet-based image deblurring demonstrate the capabilities of FISTA.
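As a concrete illustration of the iteration this abstract describes, here is a minimal NumPy sketch of FISTA applied to the lasso problem min ½‖Ax − b‖² + λ‖x‖₁: the proximal step is soft-thresholding, and the extrapolation sequence t_k provides the accelerated rate. The problem sizes and the value of λ are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t*||.||_1: componentwise shrinkage toward zero.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def fista(A, b, lam, n_iter=200):
    # FISTA for min_x 0.5*||Ax - b||^2 + lam*||x||_1.
    L = np.linalg.norm(A, 2) ** 2              # Lipschitz constant of the smooth gradient
    x = np.zeros(A.shape[1])
    y, t = x.copy(), 1.0
    for _ in range(n_iter):
        x_new = soft_threshold(y - A.T @ (A @ y - b) / L, lam / L)
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = x_new + ((t - 1.0) / t_new) * (x_new - x)   # extrapolation ("momentum") step
        x, t = x_new, t_new
    return x

# Illustrative synthetic problem (shapes and lam are assumptions, not from the paper).
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 20))
x_true = np.zeros(20)
x_true[:3] = [2.0, -1.5, 1.0]
b = A @ x_true
x_hat = fista(A, b, lam=0.1)
```

Dropping the extrapolation step (setting y = x_new) recovers plain ISTA, which converges at the slower O(1/k) rate the abstract alludes to.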
Smooth minimization of non-smooth functions
 Math. Programming
, 2005
Abstract

Cited by 522 (1 self)
In this paper we propose a new approach for constructing efficient schemes for nonsmooth convex optimization. It is based on a special smoothing technique, which can be applied to functions with an explicit max-structure. Our approach can be considered as an alternative to black-box minimization. From the viewpoint of efficiency estimates, we manage to improve the traditional bounds on the number of iterations of the gradient schemes from O(1/ε²) to O(1/ε), keeping basically the complexity of each iteration unchanged.
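A small worked instance of the max-structure smoothing referred to above: |x| = max over |u| ≤ 1 of u·x, and subtracting a quadratic prox term (μ/2)u² inside the maximization yields the Huber function, a smooth approximation with gradient Lipschitz constant 1/μ and uniform error at most μ/2. The sketch below uses plain gradient descent on the smoothed function as an assumption-laden illustration; it is not the accelerated scheme of the paper.

```python
import numpy as np

def huber(x, mu):
    # Nesterov-style smoothing of |x| = max_{|u|<=1} u*x with prox term (mu/2)*u^2:
    # the maximizer gives f_mu(x) = x^2/(2*mu) for |x| <= mu, |x| - mu/2 otherwise.
    return np.where(np.abs(x) <= mu, x ** 2 / (2 * mu), np.abs(x) - mu / 2)

def huber_grad(x, mu):
    # Gradient is the maximizing u, i.e. x/mu clipped to [-1, 1]; Lipschitz constant 1/mu.
    return np.clip(x / mu, -1.0, 1.0)

# Gradient descent on the smoothed absolute value; approximation error is <= mu/2.
mu = 0.01
x = 3.0
for _ in range(2000):
    x -= mu * huber_grad(x, mu)    # step size 1/L with L = 1/mu
```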
An interior-point method for large-scale ℓ1-regularized logistic regression
 Journal of Machine Learning Research
, 2007
Abstract

Cited by 284 (8 self)
Logistic regression with ℓ1 regularization has been proposed as a promising method for feature selection in classification problems. In this paper we describe an efficient interior-point method for solving large-scale ℓ1-regularized logistic regression problems. Small problems with up to a thousand or so features and examples can be solved in seconds on a PC; medium-sized problems, with tens of thousands of features and examples, can be solved in tens of seconds (assuming some sparsity in the data). A variation on the basic method, which uses a preconditioned conjugate gradient method to compute the search step, can solve very large problems, with a million features and examples (e.g., the 20 Newsgroups data set), in a few minutes on a PC. Using warm-start techniques, a good approximation of the entire regularization path can be computed much more efficiently than by solving a family of problems independently.
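For orientation, here is what the ℓ1-regularized logistic regression objective looks like in code. This sketch uses a plain proximal-gradient loop, not the paper's interior-point method, and the data shapes and value of λ are made-up assumptions.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t*||.||_1.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def l1_logreg(X, y, lam, n_iter=500):
    # Proximal gradient for min_w (1/m) * sum_i log(1 + exp(-y_i * x_i @ w)) + lam*||w||_1.
    # (A simple baseline solver, not the interior-point method of the paper.)
    m, n = X.shape
    step = 4.0 * m / np.linalg.norm(X, 2) ** 2   # 1/L; logistic loss gradient is (||X||^2/(4m))-Lipschitz
    w = np.zeros(n)
    for _ in range(n_iter):
        z = y * (X @ w)
        grad = -(X.T @ (y / (1.0 + np.exp(z)))) / m
        w = soft_threshold(w - step * grad, step * lam)
    return w

# Illustrative separable data with two informative features (an assumption).
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 10))
w_true = np.zeros(10)
w_true[:2] = [3.0, -3.0]
y = np.sign(X @ w_true)
w_hat = l1_logreg(X, y, lam=0.05)
```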
Fast gradient-based algorithms for constrained total variation image denoising and deblurring problems
 IEEE Transactions on Image Processing
, 2009
Abstract

Cited by 166 (2 self)
This paper studies gradient-based schemes for image denoising and deblurring problems based on the discretized total variation (TV) minimization model with constraints. We derive a fast algorithm for the constrained TV-based image deblurring problem. To achieve this task we combine an acceleration of the well-known dual approach to the denoising problem with a novel monotone version of a fast iterative shrinkage/thresholding algorithm (FISTA) we have recently introduced. The resulting gradient-based algorithm combines remarkable simplicity with a proven global rate of convergence which is significantly better than currently known gradient projection-based methods. Our results are applicable to both the anisotropic and isotropic discretized TV functionals. Initial numerical results demonstrate the viability and efficiency of the proposed algorithms on image deblurring problems with box constraints.
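The two discretized TV functionals distinguished in this abstract can be made concrete in a few lines. The sketch below only evaluates the functionals on an image using forward differences (boundary handling by simple truncation is an assumption); it does not implement the paper's deblurring algorithm.

```python
import numpy as np

def tv_anisotropic(u):
    # Anisotropic TV: sum of absolute forward differences in each direction.
    dx = np.abs(np.diff(u, axis=0)).sum()
    dy = np.abs(np.diff(u, axis=1)).sum()
    return dx + dy

def tv_isotropic(u):
    # Isotropic TV: sum over pixels of the Euclidean norm of the discrete gradient
    # (differences truncated to a common shape at the boundary).
    dx = np.diff(u, axis=0)[:, :-1]
    dy = np.diff(u, axis=1)[:-1, :]
    return np.sqrt(dx ** 2 + dy ** 2).sum()

u = np.zeros((4, 4))
u[:2, :] = 1.0    # a simple two-level "image" with one horizontal edge
```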
Solving monotone inclusions via compositions of nonexpansive averaged operators
 Optimization
, 2004
Abstract

Cited by 145 (31 self)
A unified fixed point theoretic framework is proposed to investigate the asymptotic behavior of algorithms for finding solutions to monotone inclusion problems. The basic iterative scheme under consideration involves nonstationary compositions of perturbed averaged nonexpansive operators. The analysis covers proximal methods for common zero problems as well as various splitting methods for finding a zero of the sum of monotone operators.
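A minimal instance of this fixed-point framework: projections onto convex sets are firmly nonexpansive, hence averaged, and iterating their composition converges to a fixed point, i.e. a point in the intersection when it is nonempty. The two balls below are an illustrative assumption; the paper's setting (nonstationary, perturbed compositions and general monotone inclusions) is far broader.

```python
import numpy as np

def project_ball(x, c, r):
    # Euclidean projection onto the ball B(c, r): firmly nonexpansive, hence averaged.
    d = x - c
    n = np.linalg.norm(d)
    return x if n <= r else c + r * d / n

# The composition of two averaged operators is averaged; its fixed points here
# are exactly the points in the intersection of the two balls.
c1, c2 = np.array([0.0, 0.0]), np.array([3.0, 0.0])
x = np.array([10.0, 5.0])
for _ in range(200):
    x = project_ball(project_ball(x, c1, 2.0), c2, 2.0)   # x_{k+1} = T2(T1(x_k))
```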
Toward the Optimal Preconditioned Eigensolver: Locally Optimal Block Preconditioned Conjugate Gradient Method
 SIAM J. Sci. Comput
, 2001
Abstract

Cited by 141 (19 self)
We describe new algorithms of the locally optimal block preconditioned conjugate gradient (LOBPCG) method for symmetric eigenvalue problems, based on a local optimization of a three-term recurrence, and suggest several other new methods. To be able to compare numerically different methods in the class, with different preconditioners, we propose a common system of model tests, using random preconditioners and initial guesses. As the "ideal" control algorithm, we advocate the standard preconditioned conjugate gradient method for finding an eigenvector as an element of the null space of the corresponding homogeneous system of linear equations, under the assumption that the eigenvalue is known. We recommend that every new preconditioned eigensolver be compared with this "ideal" algorithm on our model test problems in terms of the speed of convergence, the cost of every iteration, and memory requirements. We provide such a comparison for our LOBPCG method. Numerical results establish that our algorithm is practically as efficient as the "ideal" algorithm when the same preconditioner is used in both methods. We also show numerically that the LOBPCG method provides approximations to the first eigenpairs of about the same quality as those obtained by the much more expensive global optimization method on the same generalized block Krylov subspace. We propose a new version of block Davidson's method as a generalization of the LOBPCG method. Finally, direct numerical comparisons with the Jacobi-Davidson method show that our method is more robust and converges almost twice as fast.
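LOBPCG is available in SciPy as scipy.sparse.linalg.lobpcg. The sketch below finds the three smallest eigenpairs of a diagonal test matrix with known eigenvalues (an assumed toy problem; no preconditioner is supplied via the M argument, which is where one would plug in the preconditioners the abstract discusses).

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import lobpcg

# Diagonal test matrix with known, well-separated eigenvalues 1, 2, ..., 100.
n = 100
A = diags(np.arange(1.0, n + 1.0))

rng = np.random.default_rng(0)
X = rng.standard_normal((n, 3))    # block of 3 random initial guesses
# M=... would supply a preconditioner; omitted in this toy example.
vals, vecs = lobpcg(A, X, largest=False, tol=1e-9, maxiter=500)
```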
Incremental Subgradient Methods For Nondifferentiable Optimization
, 2001
Abstract

Cited by 118 (10 self)
We consider a class of subgradient methods for minimizing a convex function that consists of the sum of a large number of component functions. This type of minimization arises in a dual context from Lagrangian relaxation of the coupling constraints of large-scale separable problems. The idea is to perform the subgradient iteration incrementally, by sequentially taking steps along the subgradients of the component functions, with intermediate adjustment of the variables after processing each component function. This incremental approach has been very successful in solving large differentiable least squares problems, such as those arising in the training of neural networks, and it has resulted in a much better practical rate of convergence than the steepest descent method. In this paper, we establish the convergence properties of a number of variants of incremental subgradient methods, including some that are stochastic. Based on the analysis and computational experiments, the methods appear very promising and effective for important classes of large problems. A particularly interesting discovery is that by randomizing the order of selection of component functions for iteration, the convergence rate is substantially improved.
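The incremental idea can be shown on a tiny instance: minimizing f(x) = Σ_i |x − a_i| (whose minimizer is the median of the a_i) by stepping along one component's subgradient at a time. The component functions and the 1/k step-size schedule are illustrative assumptions, not the paper's experimental setup.

```python
import numpy as np

def incremental_subgradient(a, n_passes=200):
    # Minimize f(x) = sum_i |x - a_i| by cycling through the components and
    # taking a step along each component's subgradient, sign(x - a_i),
    # with a diminishing step size 1/k.
    x = 0.0
    k = 0
    for _ in range(n_passes):
        for ai in a:              # the abstract notes randomizing this order helps
            k += 1
            g = np.sign(x - ai)   # subgradient of |x - a_i| (0 at the kink)
            x -= g / k
    return x

a = np.array([1.0, 2.0, 2.5, 3.0, 10.0])   # minimizer of f is the median, 2.5
x_star = incremental_subgradient(a)
```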
Distributed stochastic subgradient projection algorithms for convex optimization
 Journal of Optimization Theory and Applications
, 2010
Abstract

Cited by 82 (0 self)
We consider a distributed multi-agent network system where the goal is to minimize a sum of convex objective functions of the agents subject to a common convex constraint set. Each agent maintains an iterate sequence and communicates the iterates to its neighbors. Then, each agent combines weighted averages of the received iterates with its own iterate, and adjusts the iterate by using subgradient information (known with stochastic errors) of its own function and by projecting onto the constraint set. The goal of this paper is to explore the effects of stochastic subgradient errors on the convergence of the algorithm. We first consider the behavior of the algorithm in mean, and then the convergence with probability 1 and in mean square. We consider general stochastic errors that have uniformly bounded second moments and obtain bounds on the limiting performance of the algorithm in mean for diminishing and nondiminishing stepsizes. When the means of the errors diminish, we prove that there is mean consensus between the agents and mean convergence to the optimum function value for diminishing stepsizes. When the mean errors diminish sufficiently fast, we strengthen the results to consensus and convergence of the iterates to an optimal solution with probability 1 and in mean square.
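A toy instantiation of the scheme described here: four agents with quadratic objectives, a doubly stochastic mixing matrix on a ring, additive zero-mean gradient noise, diminishing steps 1/k, and projection onto a common interval. All numbers are illustrative assumptions; the constrained optimum of the summed objective is the interval's upper endpoint, 3.0.

```python
import numpy as np

rng = np.random.default_rng(2)
targets = np.array([0.0, 2.0, 4.0, 10.0])   # agent i minimizes f_i(x) = 0.5*(x - targets[i])**2
W = np.array([[0.50, 0.25, 0.00, 0.25],     # doubly stochastic mixing weights (ring topology)
              [0.25, 0.50, 0.25, 0.00],
              [0.00, 0.25, 0.50, 0.25],
              [0.25, 0.00, 0.25, 0.50]])
lo, hi = 0.0, 3.0                           # common convex constraint set [0, 3]

x = rng.standard_normal(4)                  # one scalar iterate per agent
for k in range(1, 5001):
    mixed = W @ x                                               # combine neighbors' iterates
    grad = (mixed - targets) + 0.05 * rng.standard_normal(4)    # stochastic (sub)gradient
    x = np.clip(mixed - grad / k, lo, hi)                       # step, then project onto the set
```

The unconstrained minimizer of the sum is the mean of the targets (4.0), so projection onto [0, 3] drives all agents to consensus at the boundary point 3.0.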
Successive Overrelaxation for Support Vector Machines
 IEEE Transactions on Neural Networks
, 1998
Abstract

Cited by 79 (14 self)
Successive overrelaxation (SOR) for symmetric linear complementarity problems and quadratic programs [11, 12, 9] is used to train a support vector machine (SVM) [20, 3] for discriminating between the elements of two massive datasets, each with millions of points. Because SOR handles one point at a time, similar to Platt's sequential minimal optimization (SMO) algorithm [18], which handles two constraints at a time, it can process very large datasets that need not reside in memory. The algorithm converges linearly to a solution. Encouraging numerical results are presented on datasets with up to 10 million points. Such massive discrimination problems cannot be processed by conventional linear or quadratic programming methods, and to our knowledge have not been solved by other methods.
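For reference, here is the classical SOR iteration for a linear system Ax = b, the setting the method was originally developed for; the paper's SVM training variant works on the dual quadratic program / linear complementarity problem instead. The matrix and the relaxation parameter ω are illustrative (for symmetric positive definite A, any 0 < ω < 2 converges).

```python
import numpy as np

def sor(A, b, omega=1.5, n_iter=200):
    # Classical successive overrelaxation for A x = b: sweep through the
    # components, overrelaxing each Gauss-Seidel update by a factor omega.
    n = len(b)
    x = np.zeros(n)
    for _ in range(n_iter):
        for i in range(n):
            sigma = A[i] @ x - A[i, i] * x[i]    # contribution of the other components
            x[i] = (1 - omega) * x[i] + omega * (b[i] - sigma) / A[i, i]
    return x

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 4.0, 1.0],
              [0.0, 1.0, 4.0]])   # symmetric positive definite test matrix
b = np.array([1.0, 2.0, 3.0])
x = sor(A, b)
```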