## Linear convergence of iterative soft-thresholding

Venue: J. Fourier Anal. Appl.

Citations: 35 (9 self)

### BibTeX

```bibtex
@ARTICLE{Bredies_linearconvergence,
  author  = {Kristian Bredies and Dirk A. Lorenz},
  title   = {Linear convergence of iterative soft-thresholding},
  journal = {J. Fourier Anal. Appl.},
  year    = {},
  pages   = {813--837}
}
```

### Abstract

In this article a unified approach to iterative soft-thresholding algorithms for the solution of linear operator equations in infinite-dimensional Hilbert spaces is presented. We formulate the algorithm in the framework of generalized gradient methods and present a new convergence analysis. As the main result we show that the algorithm converges with linear rate as soon as the underlying operator satisfies the so-called finite basis injectivity property or the minimizer possesses a so-called strict sparsity pattern. Moreover, it is shown that the constants can be calculated explicitly in special cases (i.e., for compact operators). Furthermore, the techniques can also be used to establish linear convergence for related methods such as the iterative thresholding algorithm for joint sparsity and the accelerated gradient projection method.
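The algorithm analyzed in the paper can be sketched in a few lines. Below is a minimal NumPy sketch of iterative soft-thresholding for the finite-dimensional case, assuming a matrix `K`, data `f`, and a uniform regularization weight `alpha` (the paper works in ℓ² with componentwise weights αk); the function names and the step-size choice are illustrative, not taken from the paper.

```python
import numpy as np

def soft_threshold(x, t):
    """Componentwise soft-thresholding S_t(x) = sign(x) * max(|x| - t, 0)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def ista(K, f, alpha, s=None, n_iter=500):
    """Iterative soft-thresholding u <- S_{s*alpha}(u - s * K^T (K u - f)).

    The step size s must satisfy 0 < s < 2/L, where L = ||K^T K|| is the
    Lipschitz constant of the gradient of the smooth part ||K u - f||^2 / 2.
    """
    if s is None:
        L = np.linalg.norm(K.T @ K, 2)  # spectral norm of K^T K
        s = 1.0 / L
    u = np.zeros(K.shape[1])
    for _ in range(n_iter):
        u = soft_threshold(u - s * K.T @ (K @ u - f), s * alpha)
    return u
```

With `K` the identity the minimizer of (1.1) is simply the soft-thresholded data, which makes the sketch easy to sanity-check.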

### Citations

808 | Least angle regression
- Efron, Hastie, et al.
- 2004

Citation Context: ...of the above (non-smooth) minimization problem is not straightforward. There is a vast amount of literature dealing with efficient computational algorithms for equivalent formulations of the problem [8, 12, 14, 16, 21, 22, 27, 33], both in the infinite-dimensional setting as well as for finitely many dimensions, but mostly for the finite-dimensional case. An often-used, simple but apparently slow algorithm is the iterative soft...

467 | Convex analysis and variational problems
- Ekeland, Temam
- 1976

Citation Context: ...From the minimization problem min_{v∈H} ‖v − u + sF′(u)‖²/2 + sΦ(v) it immediately follows that the subdifferential inclusion u − sF′(u) − v ∈ s∂Φ(v) is satisfied; see [13, 29] for an introduction to convex analysis and subdifferential calculus. This can be rewritten to 〈u − sF′(u) − v, w − v〉 ≤ s(Φ(w) − Φ(v)) for all w ∈ H, while rearranging and dividing by s proves...

458 | An iterative thresholding algorithm for linear inverse problems with a sparsity constraint
- Daubechies, Defrise, et al.
- 2004

Citation Context: ...Instead of considering the linear equation, a regularized problem is posed for which the solution is stable with respect to noise. A common approach is to regularize by minimizing a Tikhonov functional [7, 15, 28]. A special class of these regularizations has been of recent interest, namely of the type min_{u∈ℓ²} ‖Ku − f‖²/2 + ∑_{k=1}^∞ αk|uk| (1.1)...

318 | Stable recovery of sparse overcomplete representations in the presence of noise
- Donoho, Elad, et al.

Citation Context: ...∑_k αk|〈u, ψk〉| can be rephrased as (1.1) with K = AB. Indeed, solutions of this type of problem admit only finitely many non-zero coefficients and often coincide with the sparsest solution possible [10, 18, 20]. Unfortunately, the numerical solution of the above (non-smooth) minimization problem is not straightforward. There is a vast amount of literature dealing with efficient computational algorithms for...

317 | Gradient projection for sparse reconstruction: application to compressed sensing and other inverse problems
- Figueiredo, Nowak, et al.
- 2007

Citation Context: ...of the above (non-smooth) minimization problem is not straightforward. There is a vast amount of literature dealing with efficient computational algorithms for equivalent formulations of the problem [8, 12, 14, 16, 21, 22, 27, 33], both in the infinite-dimensional setting as well as for finitely many dimensions, but mostly for the finite-dimensional case. An often-used, simple but apparently slow algorithm is the iterative soft...

294 | Signal recovery by proximal forward-backward splitting, Multiscale Modeling and Simulation
- Combettes, Wajs
- 2005

Citation Context: ...estimate for the distance to a minimizer to evaluate the fidelity of the outcome of the computations. The convergence proofs in the infinite-dimensional case presented in [7], and for generalizations in [5], however, do not imply a priori estimates and do not inherently give any rate of convergence, although, in many cases, linear convergence can be deduced quite easily from the fact that iterative thre...

172 | A new approach to variable selection in least squares problems
- Osborne, Presnell, et al.
- 2000

Citation Context: ...of the above (non-smooth) minimization problem is not straightforward. There is a vast amount of literature dealing with efficient computational algorithms for equivalent formulations of the problem [8, 12, 14, 16, 21, 22, 27, 33], both in the infinite-dimensional setting as well as for finitely many dimensions, but mostly for the finite-dimensional case. An often-used, simple but apparently slow algorithm is the iterative soft...

135 | Global uniqueness for a two-dimensional inverse boundary value problem
- Nachman
- 1996

Citation Context: ...examples are the Radon transform [25], solution operators for partial differential equations, e.g. in heat conduction problems [6] or inverse boundary value problems like electrical impedance tomography [26]. The combination with a synthesis operator B for an orthonormal basis does not influence the injectivity. Moreover, the restriction to orthonormal bases can be r...

123 | Monotone operators in Banach spaces and nonlinear partial differential equations
- Showalter
- 1996

Citation Context: ...u* which satisfies the optimality condition w* = −K*(Ku* − f) ∈ ∂Φ(u*). As one knows from convex analysis, this can also be formulated pointwise, and Asplund's characterization of ∂|·| (see [31], Proposition II.8.6) leads to |w*_k|* ≤ αk if u*_k = 0, and |w*_k|* = αk with w*_k · u*_k = αk|u*_k| if u*_k ≠ 0, where w*_k · u*_k denotes the usual inner product of w*_k and u*_k in R^N. Now, on...
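For a finite-dimensional problem with scalar components and a uniform weight, this pointwise optimality condition is straightforward to verify numerically. The sketch below assumes a matrix `K`, data `f`, and a single weight `alpha`; the function name is illustrative, not from the paper.

```python
import numpy as np

def check_l1_optimality(K, f, alpha, u, tol=1e-8):
    """Verify the pointwise subdifferential condition for
    min ||K u - f||^2 / 2 + alpha * ||u||_1:
    with w = -K^T (K u - f), require |w_k| <= alpha where u_k = 0
    and w_k = alpha * sign(u_k) on the support of u."""
    w = -K.T @ (K @ u - f)
    zero = np.abs(u) <= tol
    on_zero = np.all(np.abs(w[zero]) <= alpha + tol)
    on_support = np.all(np.abs(w[~zero] - alpha * np.sign(u[~zero])) <= tol)
    return bool(on_zero and on_support)
```

Such a check is exactly the kind of a posteriori certificate that, as the snippet notes, lets one evaluate a candidate minimizer without knowing the true solution.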

90 | Distributed compressed sensing
- Baron, Wakin, et al.
- 2005

Citation Context: ...method leads to a special case of the so-called proximal forward-backward splitting method, which amounts to the iteration u^{n+1} = u^n + tn(J_{sn}(u^n − sn(F′(u^n) + b^n)) + a^n − u^n) where tn ∈ [0,1] and {a^n}, {b^n} are absolutely summable sequences in H. In [5], it is shown that this method converges strongly to a minimizer under appropriate conditions. There exist, however, no general statem...
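Setting the error sequences a^n, b^n to zero and taking soft-thresholding as the proximal map J_s, one relaxed step of this splitting can be sketched as follows; the names and the uniform weight `alpha` are illustrative assumptions, not the paper's notation.

```python
import numpy as np

def soft_threshold(x, t):
    """S_t(x) = sign(x) * max(|x| - t, 0); the proximal map of t * ||.||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def relaxed_step(u, K, f, alpha, s, t):
    """One relaxed forward-backward step with zero error terms:
    u <- u + t * (S_{s*alpha}(u - s * K^T (K u - f)) - u), with t in [0, 1].
    t = 1 recovers the plain iterative soft-thresholding step."""
    v = soft_threshold(u - s * K.T @ (K @ u - f), s * alpha)
    return u + t * (v - u)
```

Repeating the step drives the iterate toward a fixed point of the unrelaxed map, which is exactly a minimizer of the functional.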

83 | Recovery of exact sparse representations in the presence of noise
- Fuchs
- 2004

Citation Context: ...∑_k αk|〈u, ψk〉| can be rephrased as (1.1) with K = AB. Indeed, solutions of this type of problem admit only finitely many non-zero coefficients and often coincide with the sparsest solution possible [10, 18, 20]. Unfortunately, the numerical solution of the above (non-smooth) minimization problem is not straightforward. There is a vast amount of literature dealing with efficient computational algorithms for...

64 | Accelerated projected gradient method for linear inverse problems with sparsity constraints (ArXiv e-prints, no. 0706.4297; http://adsabs.harvard.edu/abs/2007arXiv0706.4297D)
- Daubechies, Fornasier, et al.
- 2007

Citation Context: ...(u^{n+1} − u^n)‖² ≤ 2(1 − δ)‖u^{n+1} − u^n‖² (4.5) is sufficient for the above, since one has the estimate (3.3). Together with the boundedness 0 < s ≤ sn, this is exactly the step-size 'Condition (B)' in [8]. Hence, as can be easily seen, the choice gives sufficient descent in order to apply Proposition 2. Consequently, linear convergence remains valid for such an 'accelerated' iterative soft-thresholdin...

61 | Regularization of inverse problems, volume 375 of Mathematics and its Applications
- Engl, Hanke, et al.
- 1996

Citation Context: ...Instead of considering the linear equation, a regularized problem is posed for which the solution is stable with respect to noise. A common approach is to regularize by minimizing a Tikhonov functional [7, 15, 28]. A special class of these regularizations has been of recent interest, namely of the type min_{u∈ℓ²} ‖Ku − f‖²/2 + ∑_{k=1}^∞ αk|uk| (1.1)...

61 | Highly sparse representations from dictionaries are unique and independent of the sparseness measure
- Gribonval, Nielsen
- 2006

Citation Context: ...∑_k αk|〈u, ψk〉| can be rephrased as (1.1) with K = AB. Indeed, solutions of this type of problem admit only finitely many non-zero coefficients and often coincide with the sparsest solution possible [10, 18, 20]. Unfortunately, the numerical solution of the above (non-smooth) minimization problem is not straightforward. There is a vast amount of literature dealing with efficient computational algorithms for...

57 | Coordinate and subspace optimization methods for linear least squares with non-quadratic regularization
- Elad, Matalon, et al.

53 | Convex Programming in Hilbert Space
- Goldstein
- 1964

Citation Context: ...direction of steepest descent, i.e. the negative gradient. In constrained optimization, the gradient is often projected back to the feasible set, yielding the well-known gradient projection method [11, 19, 23]. In the following, a step of generalization is introduced: the method is extended to deal with sums of smooth and non-smooth functionals, and covers in particular constrained smooth minimiza...

43 | A method for large-scale ℓ1-regularized least squares problems with applications in signal processing and statistics
- Kim, Koh, et al.
- 2008

37 | Global and asymptotic convergence rate estimates for a class of projected gradient processes
- Dunn
- 1981

Citation Context: ...direction of steepest descent, i.e. the negative gradient. In constrained optimization, the gradient is often projected back to the feasible set, yielding the well-known gradient projection method [11, 19, 23]. In the following, a step of generalization is introduced: the method is extended to deal with sums of smooth and non-smooth functionals, and covers in particular constrained smooth minimiza...

30 | Constrained minimization problems
- Levitin, Polyak
- 1966

Citation Context: ...direction of steepest descent, i.e. the negative gradient. In constrained optimization, the gradient is often projected back to the feasible set, yielding the well-known gradient projection method [11, 19, 23]. In the following, a step of generalization is introduced: the method is extended to deal with sums of smooth and non-smooth functionals, and covers in particular constrained smooth minimiza...

28 | An outline of adaptive wavelet Galerkin methods for Tikhonov regularization of inverse parabolic problems
- Dahlke, Maass
- 2002

Citation Context: ...property is natural, since the operators A are often injective. Prominent examples are the Radon transform [25], solution operators for partial differential equations, e.g. in heat conduction problems [6] or inverse boundary value problems like electrical impedance tomography [26]. The combination with a synthesis operator B for an orthonormal basis does not influence the injectivity. Moreover, the restriction to orthonormal bases can be r...

26 | A generalized conditional gradient method and its connection to an iterative shrinkage method (accepted for publication in Computational Optimization and Applications)
- Bredies, Lorenz, et al.
- 2005

Citation Context: ...smooth minimization problems. The gain is that the iteration (1.2) fits into this generalized framework. Similar to the generalization performed in [4], its main idea is to replace the constraint by a general proper, convex and lower semi-continuous functional Φ which leads, for the gradient projection method, to the successive application of the as...

24 | Convergence rates and source conditions for Tikhonov regularization with sparsity constraints
- Lorenz
- 2008

Citation Context: ...FBI property. This property also plays a role in the performance analysis of Newton methods applied to minimization problems with sparsity constraints [21] and error estimates for ℓ¹-regularization [24]. As we have moreover seen, linear convergence can also be obtained whenever we have convergence to a solution with strict sparsity pattern. This result is closely connected with the fact that (1.1), con...

24 | Characteristic inequalities of uniformly convex and uniformly smooth Banach spaces
- Xu, Roach
- 1991

Citation Context: ...of a norm of a 2-convex Banach space X, i.e. Φ(u) = ‖u‖^p_X with p ∈ ]1,2], which is moreover continuously embedded in H, one can show that ‖v − u*‖²_X ≤ C₁R(v) holds on each bounded set of X, see [34]. Consequently, with j_p = ∂(1/p)‖·‖^p_X denoting the duality mapping with gauge t ↦ t^{p−1}, ‖v − u*‖² ≤ C₂‖v − u*‖²_X ≤ C₁C₂(‖v‖^p_X − ‖u*‖^p_X − p〈j_p(u*), v − u*〉) = cR(v), observing that...

23 | Iterated hard shrinkage for minimization problems with sparsity constraints
- Bredies, Lorenz

Citation Context: ...cases, linear convergence can be deduced quite easily from the fact that iterative thresholding converges strongly and from the special structure of the algorithm. To the best knowledge of the authors, [3] contains the first results about the convergence of iterative algorithms for linear inverse problems with sparsity constraints in infinite dimensions for which the convergence rate is inherent in the...

23 | Regularization of ill-posed problems in Banach spaces: convergence rates (Inverse Problems 21)
- Resmerita
- 2005

Citation Context: ...Instead of considering the linear equation, a regularized problem is posed for which the solution is stable with respect to noise. A common approach is to regularize by minimizing a Tikhonov functional [7, 15, 28]. A special class of these regularizations has been of recent interest, namely of the type min_{u∈ℓ²} ‖Ku − f‖²/2 + ∑_{k=1}^∞ αk|uk| (1.1)...

20 | A semismooth Newton method for Tikhonov functionals with sparsity constraints
- Griesse, Lorenz

19 | Bregman monotone optimization algorithms
- Bauschke, Borwein, Combettes

Citation Context: ...v − u*〉 + Φ(v) − Φ(u*). (3.11) Note that if the subgradient of Φ in u* is unique, R is the Bregman distance of Φ in u*, a notion which is extensively used in the analysis of descent algorithms [2, 30]. Moreover, we make use of the remainder of the Taylor expansion of F, T(v) = F(v) − F(u*) − 〈F′(u*), v − u*〉. (3.12) Remark 5 (On the Bregman distance). In many cases the Bregman-like distanc...

18 | Nonlinear iterative methods for linear ill-posed problems in Banach spaces
- Schöpfer, Louis, et al.

Citation Context: ...v − u*〉 + Φ(v) − Φ(u*). (3.11) Note that if the subgradient of Φ in u* is unique, R is the Bregman distance of Φ in u*, a notion which is extensively used in the analysis of descent algorithms [2, 30]. Moreover, we make use of the remainder of the Taylor expansion of F, T(v) = F(v) − F(u*) − 〈F′(u*), v − u*〉. (3.12) Remark 5 (On the Bregman distance). In many cases the Bregman-like distanc...

14 | On algorithms for solving least squares problems under an L1 penalty or an L1 constraint
- Turlach
- 2005

9 | Variational Analysis
- R. Tyrrell Rockafellar, Roger J-B. Wets
- 1998

Citation Context: ...From the minimization problem min_{v∈H} ‖v − u + sF′(u)‖²/2 + sΦ(v) it immediately follows that the subdifferential inclusion u − sF′(u) − v ∈ s∂Φ(v) is satisfied; see [13, 29] for an introduction to convex analysis and subdifferential calculus. This can be rewritten to 〈u − sF′(u) − v, w − v〉 ≤ s(Φ(w) − Φ(v)) for all w ∈ H, while rearranging and dividing by s proves...

9 | An iterative algorithm for nonlinear inverse problems with joint sparsity constraints in vector valued regimes and an application to color image inpainting (Inverse Problems 23(5))
- Teschke, Ramlau

Citation Context: ...ing the broad range of applications. 5.1 Joint sparsity constraints. First, we consider the situation of so-called joint sparsity for vector-valued problems, see [1, 17, 32]. The problems considered are set in the Hilbert space (ℓ²)^N for some N ≥ 1, which is interpreted such that for u ∈ (ℓ²)^N the k-th component u_k is a vector in R^N. Given a linear and continuous opera...

8 | The interior Radon transform
- Maass
- 1992

Citation Context: ...ith the FBI property). In the context of inverse problems with sparsity constraints, the FBI property is natural, since the operators A are often injective. Prominent examples are the Radon transform [25], solution operators for partial differential equations, e.g. in heat conduction problems [6] or inverse boundary value problems like electrical impedance tomography [26]. The combination with a synth...

5 | Approximate Methods in Optimization Problems (Number 32)
- Demyanov, Rubinov
- 1970

Citation Context: ...closed and convex constraint, yields the classical gradient projection method which is known to converge provided that certain assumptions are fulfilled and a suitable step-size rule has been chosen [9, 11]. In the following, we assume that F is differentiable, F′ is Lipschitz continuous with constant L and usually choose the step-sizes such that 0 < s ≤ s_n ≤ s̄ < 2/L. (2.3) Note that from the trivial c...
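The step-size rule in this snippet only needs the Lipschitz constant L of F′; for the quadratic fidelity F(u) = ‖Ku − f‖²/2 one has L = ‖K‖². A small sketch (function name and `safety` factor are illustrative) that returns a valid upper bound s̄ < 2/L:

```python
import numpy as np

def max_step_size(K, safety=0.9):
    """Return s_bar = safety * 2 / L, where L = ||K||_2^2 is the Lipschitz
    constant of the gradient u -> K^T (K u - f). Any fixed step size with
    0 < s <= s_bar then satisfies the rule 0 < s <= s_n <= s_bar < 2/L."""
    L = np.linalg.norm(K, 2) ** 2  # squared spectral norm of K
    return safety * 2.0 / L
```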

3 | Recovery algorithms for vector-valued data with joint sparsity constraints
- Fornasier, Rauhut

Citation Context: ...ing the broad range of applications. 5.1 Joint sparsity constraints. First, we consider the situation of so-called joint sparsity for vector-valued problems, see [1, 17, 32]. The problems considered are set in the Hilbert space (ℓ²)^N for some N ≥ 1, which is interpreted such that for u ∈ (ℓ²)^N the k-th component u_k is a vector in R^N. Given a linear and continuous opera...