## A randomized Kaczmarz algorithm with exponential convergence

Citations: 25 (1 self)

### BibTeX

@MISC{Strohmer_arandomized,
    author = {Thomas Strohmer and Roman Vershynin},
    title = {A randomized Kaczmarz algorithm with exponential convergence},
    year = {}
}

### Abstract

The Kaczmarz method for solving linear systems of equations is an iterative algorithm that has found many applications ranging from computer tomography to digital signal processing. Despite the popularity of this method, useful theoretical estimates for its rate of convergence are still scarce. We introduce a randomized version of the Kaczmarz method for consistent, overdetermined linear systems and we prove that it converges with expected exponential rate. Furthermore, this is the first solver whose rate does not depend on the number of equations in the system. The solver does not even need to know the whole system, but only a small random part of it. It thus outperforms all previously known methods on general extremely overdetermined systems. Even for moderately overdetermined systems, numerical simulations as well as theoretical analysis reveal that our algorithm can converge faster than the celebrated conjugate gradient algorithm. Furthermore, our theory and numerical simulations confirm a prediction of Feichtinger et al. in the context of reconstructing bandlimited functions from nonuniform sampling.

∗ T.S. was supported by NSF DMS grant 0511461. R.V. was supported by the Alfred P.
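The randomized iteration described in the abstract can be illustrated with a minimal sketch. It assumes a real-valued, consistent system stored as plain Python lists, samples each row with probability proportional to its squared Euclidean norm (the sampling rule quoted in the citation contexts further down), and projects the current iterate onto the hyperplane of the sampled equation. The function name `randomized_kaczmarz`, the parameters `iterations` and `seed`, the zero starting vector, and the small test system are illustrative assumptions, not taken from the paper.

```python
import random


def randomized_kaczmarz(A, b, iterations=2000, seed=0):
    """Approximate the solution of a consistent real system A x = b.

    A is a list of m rows (each a list of n floats), b a list of m floats.
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    n = len(A[0])
    # Squared row norms serve both as sampling weights and as normalizers.
    sq_norms = [sum(v * v for v in row) for row in A]
    x = [0.0] * n  # the starting point is arbitrary; zero is an assumption
    for _ in range(iterations):
        # Pick row i with probability proportional to its squared norm.
        i = rng.choices(range(len(A)), weights=sq_norms, k=1)[0]
        row = A[i]
        residual = b[i] - sum(a * xj for a, xj in zip(row, x))
        step = residual / sq_norms[i]
        # Orthogonal projection onto the hyperplane <a_i, x> = b_i.
        x = [xj + step * a for xj, a in zip(x, row)]
    return x


# Small consistent overdetermined system with exact solution (1, 2).
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, -1.0]]
x_true = [1.0, 2.0]
b = [sum(a * t for a, t in zip(row, x_true)) for row in A]
x_hat = randomized_kaczmarz(A, b)
```

Each step touches a single row of the system, which is why the method never needs the whole matrix at once. A complex-valued or inconsistent system would need a different treatment than this sketch provides.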

### Citations

462 |
The mathematics of computerized tomography
- Natterer
- 2001
Citation Context ...etermined systems is Kaczmarz’s method [26], which is a form of alternating projection method. This method is also known under the name Algebraic Reconstruction Technique (ART) in computer tomography [22, 28], and in fact, it was implemented in the very first medical scanner [25]. It can also be considered as a special case of the POCS (Projection onto Convex Sets) method, which is a prominent tool in sig... |

299 |
Eigenvalues and Condition Numbers of Random Matrices
- Edelman
- 1989
Citation Context ...$\kappa(A) := \|A\|_F \|A^{-1}\|_2$. One easily checks that $1 \le \kappa(A)/\sqrt{n} \le k(A)$. (3) Estimates on the condition numbers of some typical (i.e. random or Toeplitz-type) matrices are known from a large body of literature, see [1, 5, 8, 9, 10, 30, 31, 35, 36] and the references therein. [Section 2: Randomized Kaczmarz algorithm and its rate of convergence] It has been observed in numerical simulations [28, 3, 24] that the convergence rate of the Kaczmarz method ca... |

182 | Fast Monte-Carlo Algorithms for Finding Low-Rank Approximations
- Frieze, Kannan, et al.
- 1998
Citation Context ...d, which chooses each row of A with probability proportional to its relevance – more precisely, proportional to the square of its Euclidean norm. This method of sampling from a matrix was proposed in [13] in the context of computing a low-rank approximation of A, see also [29] for subsequent work and references. Our algorithm thus takes the following form: Algorithm 1 (Random Kaczmarz algorithm). Let ... |

115 |
Image Reconstructions from Projections
- Herman
- 1980
Citation Context ...etermined systems is Kaczmarz’s method [26], which is a form of alternating projection method. This method is also known under the name Algebraic Reconstruction Technique (ART) in computer tomography [22, 28], and in fact, it was implemented in the very first medical scanner [25]. It can also be considered as a special case of the POCS (Projection onto Convex Sets) method, which is a prominent tool in sig... |

101 |
Conjugate Gradient Type Methods for Ill-Posed Problems
- Hanke
- 1995
Citation Context ... recent years conjugate gradient (CG) type methods have emerged as the leading iterative algorithms for solving large linear systems of equations, since they often exhibit remarkably fast convergence [16, 19]. How does Algorithm 1 compare to the CG algorithms? The rate of convergence of CGLS applied to Ax = b is bounded by [16] $\|x_k - x\|_{A^*A} \le 2\,\|x_0 - x\|_{A^*A} \left( \frac{k(A) - 1}{k(A) + 1} \right)^k$, (19) Least squares ... |

97 |
A limit theorem for the norm of random matrices
- Geman
- 1980
Citation Context ...complex transpose here. ... are well studied, when the aspect ratio y := n/m < 1 is fixed and the size n of the matrix grows to infinity. Then the following almost sure convergence was proved by Geman [15] and Silverstein [33] respectively: $\|A\|_2/\sqrt{m} \to 1 + \sqrt{y}$; $1/(\|A^{-1}\|_2 \sqrt{m}) \to 1 - \sqrt{y}$. Hence $k(A) \to \frac{1 + \sqrt{y}}{1 - \sqrt{y}}$. (20) Also, since $\|A\|_F/\sqrt{mn} \to 1$, we have $\kappa(A)/\sqrt{n} \to \frac{1}{1 - \sqrt{y}}$. (21) For estimates that h... |

96 |
Angenäherte Auflösung von Systemen linearer Gleichungen, Bulletin de l’Académie Polonaise des Sciences et Lettres, A35
- Kaczmarz
- 1937
Citation Context ...y a consistent linear system of equations Ax = b, (1) where A is a full rank m × n matrix with m ≥ n, and $b \in \mathbb{C}^m$. One of the most popular solvers for such overdetermined systems is Kaczmarz’s method [26], which is a form of alternating projection method. This method is also known under the name Algebraic Reconstruction Technique (ART) in computer tomography [22, 28], and in fact, it was implemented i... |

82 | Efficient numerical methods in non-uniform sampling theory
- Feichtinger, Gröchenig, et al.
- 1995
Citation Context ...ling The reconstruction of a bandlimited function f from its nonuniformly spaced sampling values $\{f(t_k)\}$ is a classical problem in Fourier analysis, with a wide range of applications [2]. We refer to [11, 12] for various efficient numerical algorithms. Staying with the topic of this paper, we focus on the Kaczmarz method, also known as POCS (Projection Onto Convex Sets) method in signal processing [37]. A... |

76 |
The rate of convergence of conjugate gradients
- van der Sluis, van der Vorst
- 1986
Citation Context ...s applied to the nonuniform sampling problem described in the main text. where $\|y\|_{A^*A} := \sqrt{\langle Ay, Ay \rangle}$. It is known that the CG method may converge faster when the singular values of A are clustered [34]. For instance, take a matrix whose singular values, all but one, are equal to one, while the remaining singular value is very small, say $10^{-8}$. While this matrix is far from being well-conditioned, ... |

68 | Sampling from Large Matrices: An Approach through Geometric Functional Analysis
- Rudelson, Vershynin
Citation Context ...ance – more precisely, proportional to the square of its Euclidean norm. This method of sampling from a matrix was proposed in [13] in the context of computing a low-rank approximation of A, see also [29] for subsequent work and references. Our algorithm thus takes the following form: Algorithm 1 (Random Kaczmarz algorithm). Let Ax = b be a linear system of equations as in (1) and let x0 be arbitrary ... |

65 | The strong limits of random matrix spectra for sample matrices of independent elements - Wachter - 1978 |

62 |
The probability that a numerical analysis problem is difficult
- Demmel
- 1988
Citation Context ...ant M such that the inequality $\|Ax\|_2 \ge \frac{1}{M}\|x\|_2$ holds for all vectors x. The usual condition number of A is $k(A) := \|A\|_2 \|A^{-1}\|_2$. A related version is the scaled condition number introduced by Demmel [5]: $\kappa(A) := \|A\|_F \|A^{-1}\|_2$. One easily checks that $1 \le \kappa(A)/\sqrt{n} \le k(A)$. (3) Estimates on the condition numbers of some typical (i.e. random or Toeplitz-type) matrices are known from a large body of liter... |

62 |
Theory and practice of irregular sampling
- Feichtinger, Gröchenig
- 1994
Citation Context ...ling The reconstruction of a bandlimited function f from its nonuniformly spaced sampling values $\{f(t_k)\}$ is a classical problem in Fourier analysis, with a wide range of applications [2]. We refer to [11, 12] for various efficient numerical algorithms. Staying with the topic of this paper, we focus on the Kaczmarz method, also known as POCS (Projection Onto Convex Sets) method in signal processing [37]. A... |

54 |
Computerized transverse axial scanning (tomography): Part 1:Description of system
- Hounsfield
- 1973
Citation Context ...projection method. This method is also known under the name Algebraic Reconstruction Technique (ART) in computer tomography [22, 28], and in fact, it was implemented in the very first medical scanner [25]. It can also be considered as a special case of the POCS (Projection onto Convex Sets) method, which is a prominent tool in signal and image processing [32, 3]. We denote the rows of A by $a_1^*$, . . ... |

43 |
Algebraic reconstruction techniques can be made computationally efficient
- Herman, Meyer
- 1993
Citation Context ... been observed several times in the literature that using the rows of A in Kaczmarz’s method in random order, rather than in their given order, can greatly improve the rate of convergence, see e.g. [28, 3, 24]. While this randomized Kaczmarz method is thus quite appealing for applications, no guarantees of its rate of convergence have been known. In this paper, we propose the first randomized Kaczmarz meth... |

41 |
Smoothed analysis of algorithms
- Spielman, Teng
- 2002
Citation Context ...$\kappa(A) := \|A\|_F \|A^{-1}\|_2$. One easily checks that $1 \le \kappa(A)/\sqrt{n} \le k(A)$. (3) Estimates on the condition numbers of some typical (i.e. random or Toeplitz-type) matrices are known from a large body of literature, see [1, 5, 8, 9, 10, 30, 31, 35, 36] and the references therein. [Section 2: Randomized Kaczmarz algorithm and its rate of convergence] It has been observed in numerical simulations [28, 3, 24] that the convergence rate of the Kaczmarz method ca... |

35 |
The smallest eigenvalue of a large-dimensional Wishart matrix
- Silverstein
- 1985
Citation Context ...re. ... are well studied, when the aspect ratio y := n/m < 1 is fixed and the size n of the matrix grows to infinity. Then the following almost sure convergence was proved by Geman [15] and Silverstein [33] respectively: $\|A\|_2/\sqrt{m} \to 1 + \sqrt{y}$; $1/(\|A^{-1}\|_2 \sqrt{m}) \to 1 - \sqrt{y}$. Hence $k(A) \to \frac{1 + \sqrt{y}}{1 - \sqrt{y}}$. (20) Also, since $\|A\|_F/\sqrt{mn} \to 1$, we have $\kappa(A)/\sqrt{n} \to \frac{1}{1 - \sqrt{y}}$. (21) For estimates that hold for each finite ... |

31 | Random sampling of multivariate trigonometric polynomials
- Bass, Gröchenig
Citation Context ...$\kappa(A) := \|A\|_F \|A^{-1}\|_2$. One easily checks that $1 \le \kappa(A)/\sqrt{n} \le k(A)$. (3) Estimates on the condition numbers of some typical (i.e. random or Toeplitz-type) matrices are known from a large body of literature, see [1, 5, 8, 9, 10, 30, 31, 35, 36] and the references therein. [Section 2: Randomized Kaczmarz algorithm and its rate of convergence] It has been observed in numerical simulations [28, 3, 24] that the convergence rate of the Kaczmarz method ca... |

31 | Inverse Littlewood-Offord theorems and the condition number of random discrete matrices
- Tao, Vu

27 | Spectral analysis of networks with random topologies - Grenander, Silverstein - 1977 |

25 |
Strong underrelaxation in Kaczmarz’s method for inconsistent systems
- Censor, Eggermont, et al.
- 1983
Citation Context ...actice are inconsistent due to noise that contaminates the right hand side. In this case it has been shown that convergence to the least squares solution can be obtained with (strong under)relaxation [4, 20]. We refer to [20, 21] for suggestions for the choice of the relaxation parameter as well as further in-depth analysis for this case. While our theoretical analysis presented in this paper assumes con... |

23 |
The rate of convergence for the method of alternating projections
- Deutsch, Hundal
- 1997
Citation Context .... Known estimates for the rate of convergence are based on quantities of the matrix A that are hard to compute and difficult to compare with convergence estimates of other iterative methods (see e.g. [6, 7, 14] and the references therein). What numerical analysts would like to have is estimates of the convergence rate in terms of a condition number of A. No such estimates have been known prior to this work.... |

18 |
Modern Sampling Theory
- Benedetto, Ferreira
- 2001
Citation Context ...m nonuniform sampling The reconstruction of a bandlimited function f from its nonuniformly spaced sampling values $\{f(t_k)\}$ is a classical problem in Fourier analysis, with a wide range of applications [2]. We refer to [11, 12] for various efficient numerical algorithms. Staying with the topic of this paper, we focus on the Kaczmarz method, also known as POCS (Projection Onto Convex Sets) method in sig... |

18 | On the distribution of a scaled condition number
- Edelman
- 1992

15 | Tails of condition number distributions
- Edelman, Sutton

15 |
Relaxation methods for image reconstruction
- Herman, Lent, et al.
- 1978
Citation Context ...case the iteration rule becomes $x_{k+1} = x_k + \lambda_{k,i} \frac{b_i - \langle a_i, x_k \rangle}{\|a_i\|_2^2}\, a_i$, (24) where the $\lambda_{k,i}$, $i = 1, \ldots, m$, are relaxation parameters. For consistent systems the relaxation parameters must satisfy [23] $0 < \liminf_{k \to \infty} \lambda_{k,i} \le \limsup_{k \to \infty} \lambda_{k,i} < 2$ (25) to ensure convergence. We have observed in our numerical simulations that for instance for Gaussian matrices a good choice for the relaxation parameter is ... |

15 | Error analysis in regular and irregular sampling theory
- Feichtinger, Gröchenig
- 1993
Citation Context ...ling The reconstruction of a bandlimited function f from its nonuniformly spaced sampling values $\{f(t_k)\}$ is a classical problem in Fourier analysis, with a wide range of applications [2]. We refer to [12, 13] for various efficient numerical algorithms. Staying with the topic of this paper, we focus on the Kaczmarz method, also known as POCS (Projection Onto Convex Sets) method in signal processing [40]. A... |

12 | Eigenvalue distribution in some ensembles of random matrices - Marchenko, Pastur - 1967 |

11 |
Iterative and one-step reconstruction from nonuniform samples by convex projections
- Yeh, Stark
- 1990
Citation Context ...11, 12] for various efficient numerical algorithms. Staying with the topic of this paper, we focus on the Kaczmarz method, also known as POCS (Projection Onto Convex Sets) method in signal processing [37]. As efficient finite-dimensional model, appropriate for a numerical treatment of the nonuniform sampling problem, we consider trigonometric polynomials [18]. In this model the problem can be formulat... |

10 |
Applications of convex projection theory to image recovery in tomography and related areas
- Sezan, Stark
- 1987
Citation Context ...lemented in the very first medical scanner [25]. It can also be considered as a special case of the POCS (Projection onto Convex Sets) method, which is a prominent tool in signal and image processing [32, 3]. We denote the rows of A by $a_1^*, \ldots, a_m^*$ and let $b = (b_1, \ldots, b_m)^T$. The classical scheme of Kaczmarz’s method sweeps through the rows of A in a cyclic manner, projecting in each substep th... |

8 |
On the acceleration of Kaczmarz’s method for inconsistent linear systems
- Hanke, Niethammer
- 1990
Citation Context ...actice are inconsistent due to noise that contaminates the right hand side. In this case it has been shown that convergence to the least squares solution can be obtained with (strong under)relaxation [4, 20]. We refer to [20, 21] for suggestions for the choice of the relaxation parameter as well as further in-depth analysis for this case. While our theoretical analysis presented in this paper assumes con... |

7 | Irregular sampling, Toeplitz matrices, and the approximation of entire functions of exponential type
- Gröchenig
- 1999
Citation Context ...o Convex Sets) method in signal processing [37]. As efficient finite-dimensional model, appropriate for a numerical treatment of the nonuniform sampling problem, we consider trigonometric polynomials [18]. In this model the problem can be formulated as follows: Let $f(t) = \sum_{l=-r}^{r} x_l e^{2\pi i l t}$, where $x = \{x_l\}_{l=-r}^{r} \in \mathbb{C}^{2r+1}$. Assume we are given the nonuniformly spaced nodes $\{t_k\}_{k=1}^{m}$ and the samplin... |

6 |
New variants of the POCS method using affine subspaces of finite codimension, with applications to irregular sampling
- Cenker, Feichtinger, et al.
- 1992
Citation Context ...lemented in the very first medical scanner [25]. It can also be considered as a special case of the POCS (Projection onto Convex Sets) method, which is a prominent tool in signal and image processing [32, 3]. We denote the rows of A by $a_1^*, \ldots, a_m^*$ and let $b = (b_1, \ldots, b_m)^T$. The classical scheme of Kaczmarz’s method sweeps through the rows of A in a cyclic manner, projecting in each substep th... |

5 |
On the rate of convergence of the alternating projection method in finite dimensional spaces
- Galántai
Citation Context .... Known estimates for the rate of convergence are based on quantities of the matrix A that are hard to compute and difficult to compare with convergence estimates of other iterative methods (see e.g. [6, 7, 14] and the references therein). What numerical analysts would like to have is estimates of the convergence rate in terms of a condition number of A. No such estimates have been known prior to this work.... |

1 |
Reconstruction algorithms in irregular sampling
- Gröchenig
- 1992
Citation Context ...tates that this rate depends only on the scaled condition number κ(A), which is bounded by $k(A)\sqrt{n}$ by (3). The condition number k(A) for the trigonometric system (18) has been estimated by Gröchenig [17]. For instance we have the following Theorem 4 (Gröchenig). If the distance of every sampling point $t_j$ to its neighbor on the unit torus is at most $\delta < \frac{1}{2r}$, then $k(A) \le \frac{1 + 2\delta r}{1 - 2\delta r}$. In particular, ... |

1 |
On the use of small relaxation parameters in Kaczmarz’s method
- Hanke, Niethammer
- 1990
Citation Context ...nt due to noise that contaminates the right hand side. In this case it has been shown that convergence to the least squares solution can be obtained with (strong under)relaxation [4, 20]. We refer to [20, 21] for suggestions for the choice of the relaxation parameter as well as further in-depth analysis for this case. While our theoretical analysis presented in this paper assumes consistency of the system... |

1 | The fundamentals of computerized tomography - York - 1980 |

1 |
Invertibility of random matrices. I. The smallest singular value is of order $n^{-1/2}$
- Rudelson, Vershynin

1 |
Invertibility of random matrices. II. The Littlewood-Offord theory
- Rudelson, Vershynin

1 | Numerical Analysis. Brooks/Cole Publ.Co., eighth edition - Burden, Faires - 2006 |