## Improved approximation algorithms for large matrices via random projections

### Download From

- IEEE

### Download Links

- [www.ilab.sztaki.hu]
- DBLP

### Other Repositories/Bibliography

Venue: Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer Science

Citations: 98 (3 self)

### BibTeX

@INPROCEEDINGS{Sarlos_improvedapproximation,
  author    = {Tamás Sarlós},
  title     = {Improved approximation algorithms for large matrices via random projections},
  booktitle = {Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer Science},
  year      = {2006},
  pages     = {143--152}
}

### Abstract

Recently several results have appeared showing significant reductions in the time needed for matrix multiplication, singular value decomposition (SVD), and linear (ℓ2) regression, all based on data-dependent random sampling. Our key idea is that low-dimensional embeddings can be used to eliminate the data dependence and provide more versatile, linear-time, pass-efficient matrix computations. Our main contributions are summarized as follows.

- Independent of the recent results of Har-Peled and of Deshpande and Vempala, one of the first, and to the best of our knowledge the most efficient, relative-error (1 + ɛ)‖A − A_k‖_F approximation algorithms for the singular value decomposition of an m × n matrix A with M non-zero entries; it requires 2 passes over the data and runs in time O((Mk/ɛ + (n + m)k²/ɛ²) log(1/δ)).
- The first o(nd²)-time (1 + ɛ) relative-error approximation algorithm for n × d linear (ℓ2) regression.
- A matrix multiplication algorithm that easily applies to implicitly given matrices.
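The abstract's central idea, replacing data-dependent sampling with an oblivious low-dimensional embedding, can be illustrated with a toy two-pass SVD sketch in numpy. This is a simplified illustration, not the paper's exact algorithm; the matrix sizes, the sketch size r = 40, and the ±1 projection are ad hoc choices:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, k = 200, 100, 5

# Rank-k signal plus small noise as a test matrix.
A = rng.standard_normal((m, k)) @ rng.standard_normal((k, n))
A += 0.01 * rng.standard_normal((m, n))

# Oblivious +/-1 embedding: no knowledge of A is needed to draw it.
r = 40
S = rng.choice([-1.0, 1.0], size=(r, m)) / np.sqrt(r)

# Pass 1: sketch the row space of A.
SA = S @ A
_, _, Vt = np.linalg.svd(SA, full_matrices=False)
V = Vt[:k].T                       # top-k right singular directions of the sketch

# Pass 2: project A onto the sketched subspace.
A_approx = (A @ V) @ V.T

sv = np.linalg.svd(A, compute_uv=False)
opt = np.sqrt((sv[k:] ** 2).sum())             # ||A - A_k||_F, the best possible
err = np.linalg.norm(A - A_approx, "fro")
assert err <= 2 * opt
```

On this well-separated input the sketched subspace captures the signal almost exactly, so the Frobenius error stays within a small factor of the optimal rank-k error.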

### Citations

2969 | Authoritative sources in a hyperlinked environment
- Kleinberg
- 1998
Citation Context: ...ications of low-rank matrix approximation by SVD include recommendation systems [25], information retrieval via Latent Semantic Indexing [13, 48], Kleinberg’s celebrated HITS algorithm for web search [43, 2], clustering [22, 47], and learning mixtures of distributions [42, 4] just to name a few. Classification can be solved by regularized regression [29] and text database querying by matrix-vector produc...

829 | Matrix multiplication via arithmetic progressions
- Coppersmith, Winograd
- 1987
Citation Context: ...do not compute the final result itself, but reduce the problem to the product of two smaller (or sparser) matrices. If needed the latter can be more easily multiplied with the preferred exact method [36, 17, 16]. Returning to SVD, the best preliminary result with respect to the Frobenius norm was derived by Deshpande and Vempala [21] independently of our work, and shows that if we sample O(k² log k + k/ɛ) r...

760 | Approximate nearest neighbors: towards removing the curse of dimensionality
- Indyk, Motwani
- 1998
Citation Context: ...1 U T. For further linear algebra we refer the reader to [36]. Random projections. Johnson-Lindenstrauss’s seminal paper [41] was followed by several variants and proofs of low-distortion embeddings [33, 40, 19]. Throughout this paper we will make extensive use of three flavors of ℓ2 → ℓ2 embeddings (Theorems 2 & 3, and Lemma 5); we list their properties now. Definition 1 A random matrix R ∈ R^{k×n} forms a Joh...

713 | The space complexity of approximating the frequency moments
- Alon, Matias, et al.
- 1999
Citation Context: ...t 1 − δ for all u, v ∈ V it holds that ⟨u, v⟩ − ɛ‖u‖₂‖v‖₂ ≤ ⟨Su, Sv⟩ ≤ ⟨u, v⟩ + ɛ‖u‖₂‖v‖₂. The last ingredient of our proofs was presented by Alon, Matias, and Szegedy in their seminal paper [7]. Lemma 5 (Tug-of-war sketch, [7, 6]) Let 0 < ɛ ≤ 1 and S = ɛR ∈ R^{ɛ⁻²×n} be a random matrix such that the rows of R are independent and each row consists of a vector of four-wise independent zero-mean {−1...
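The tug-of-war sketch of Lemma 5 can be tried out numerically. As a simplification, fully independent ±1 entries stand in for the four-wise independent rows the lemma requires, and the sketch size r = 400 (roughly ɛ⁻² for ɛ = 0.05) is an ad hoc choice:

```python
import numpy as np

rng = np.random.default_rng(7)
m, n = 30, 500
C = rng.standard_normal((m, n))

# Sketch matrix: r rows of +/-1 signs (fully independent here, standing
# in for the 4-wise independent rows of the lemma).
r = 400
Q = rng.choice([-1.0, 1.0], size=(r, n))

# ||C Q^T||_F^2 / r is an unbiased estimator of ||C||_F^2.
est = np.linalg.norm(C @ Q.T, "fro") ** 2 / r
true = np.linalg.norm(C, "fro") ** 2
assert abs(est / true - 1.0) < 0.2
```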

570 | Using linear algebra for intelligent information retrieval
- Berry, Dumais
- 1994
Citation Context: ...despread use of these tools in data mining [10]. Prominent applications of low-rank matrix approximation by SVD include recommendation systems [25], information retrieval via Latent Semantic Indexing [13, 48], Kleinberg’s celebrated HITS algorithm for web search [43, 2], clustering [22, 47], and learning mixtures of distributions [42, 4] just to name a few. Classification can be solved by regularized regr...

421 | Extensions of Lipschitz mappings into a Hilbert space, Conference in modern analysis and probability
- Johnson, Lindenstrauss
- 1984
Citation Context: ...in certain applications [14, 50]. Our key techniques to improve previous algorithms for singular value decomposition, ℓ2 regression and matrix multiplication are Johnson-Lindenstrauss type embeddings [41]. Ironically, one of the first approximate singular value decomposition algorithms [48] was also embedding-based. Our central result is a relative-error SVD algorithm (Theorem 14). Extending the work ...

275 | Finding frequent items in data streams
- Charikar, Chen, et al.
- 2004
Citation Context: ...advice for implementing the sampling procedure any faster than solving the original problem. Low distortion embeddings also called “sketches” are known to outperform sampling in certain applications [14, 50]. Our key techniques to improve previous algorithms for singular value decomposition, ℓ2 regression and matrix multiplication are Johnson-Lindenstrauss type embeddings [41]. Ironically, one of the fir...

269 | Latent semantic indexing: A probabilistic analysis
- Papadimitriou, Raghavan, et al.
- 1997
Citation Context: ...despread use of these tools in data mining [10]. Prominent applications of low-rank matrix approximation by SVD include recommendation systems [25], information retrieval via Latent Semantic Indexing [13, 48], Kleinberg’s celebrated HITS algorithm for web search [43, 2], clustering [22, 47], and learning mixtures of distributions [42, 4] just to name a few. Classification can be solved by regularized regr...

264 | Stable distributions, pseudorandom generators, embeddings and data stream computation
- Indyk
- 2000
Citation Context: ...pace generated by the sample achieves relative-error (1 + ɛ)‖A − A_k‖_F with probability at least 3/4. That algorithm runs in time O(M(k² log k + k/ɛ) + (m + n)(k² log k + k/ɛ)²), where M denotes the number of non-zeroes of A. While improving the running time, we also reduce the number of passes to 2. We remark that the earlier work of Ar et al. [52, 39] contains essentially the same result. Historic...

181 | Fast Monte-Carlo Algorithms for Finding Low Rank Approximations
- Frieze, Kannan, et al.
- 1998
Citation Context: ...based on the Lánczos or power method require Ω(log m) passes [44, 37]. Recently a large number of results appeared that prove bounds for non-uniform sampling to speed up approximate matrix operations [35, 48, 3, 23, 28, 24, 26, 49, 20]. These results provide error guarantees that depend on the Frobenius norm of the input matrices and hence may incur a large additive term. An exception among sampling based techniques is the sequel o...

163 | Computing on data streams
- Henzinger, Raghavan, et al.
- 1999
Citation Context: ...ent applications. Even for sparse data it is often the case that the input far exceeds the main memory and hence we generally restrict ourselves to the pass efficient “streaming” model of computation [38]. Here access to the input is limited to a constant number of sequential scans and RAM usage depends sublinearly on input size. Also note that sparse iterative SVD methods [36] alone are not suitable ...

155 | Database-friendly random projections: Johnson-Lindenstrauss with binary coins
- Achlioptas
Citation Context: ...ntries are independent standard normal random variables. If k = Ω(ɛ⁻² log d log(1/δ)) then S is a JLT(ɛ, δ, d). For practical applications the N(0, 1) entries can be replaced by random ±1 variables [1, 9]. Recently Ailon and Chazelle showed [5] that a significantly sparser embedding matrix R suffices if inputs are preconditioned with a randomized Fast Fourier Transform and obtained a JLT(ɛ, 2/3, d) wh...

150 | Spectral analysis of data
- Azar, Fiat, et al.
Citation Context: ...jects ASTOR and MOLINGV. as singular value decomposition (SVD), linear ℓ2 regression and the computation of matrix products. Our motivation comes from the widespread use of these tools in data mining [10]. Prominent applications of low-rank matrix approximation by SVD include recommendation systems [25], information retrieval via Latent Semantic Indexing [13, 48], Kleinberg’s celebrated HITS algorithm...

144 | Fast Monte Carlo algorithms for matrices III: Computing a compressed approximate matrix decomposition
- Drineas, Kannan, et al.
Citation Context: ...based on the Lánczos or power method require Ω(log m) passes [40, 34]. Recently a large number of results appeared that prove bounds for non-uniform sampling to speed up approximate matrix operations [32, 43, 3, 21, 26, 22, 24, 44, 18]. These results provide error guarantees that depend on the Frobenius norm of the input matrices and hence may incur a large additive term. An exception among sampling based techniques is the sequel o...

114 | Software reliability via run-time result-checking
- Blum, Wasserman
- 1997
Citation Context: ...pace generated by the sample achieves relative-error (1 + ɛ)‖A − A_k‖_F with probability at least 3/4. That algorithm runs in time O(M(k² log k + k/ɛ) + (m + n)(k² log k + k/ɛ)²), where M denotes the number of non-zeroes of A. While improving the running time, we also reduce the number of passes to 2. We remark that the earlier work of Ar et al. [52, 39] contains essentially the same result. Historic...

110 | Tracking join and self-join sizes in limited storage
- Alon, Gibbons, et al.
- 1999
Citation Context: ...that ⟨u, v⟩ − ɛ‖u‖₂‖v‖₂ ≤ ⟨Su, Sv⟩ ≤ ⟨u, v⟩ + ɛ‖u‖₂‖v‖₂. The last ingredient of our proofs was presented by Alon, Matias, and Szegedy in their seminal paper [7]. Lemma 5 (Tug-of-war sketch, [7, 6]) Let 0 < ɛ ≤ 1 and S = ɛR ∈ R^{ɛ⁻²×n} be a random matrix such that the rows of R are independent and each row consists of a vector of four-wise independent zero-mean {−1, +1} random variables. Then for any...

103 | Approximate nearest neighbors and the fast Johnson–Lindenstrauss transform
- Ailon, Chazelle
- 2006
Citation Context: ...mber of reduced dimensions for sketches for example to O(d log d/ɛ) that matches the enhanced bound of [30] for sampling. Plugging in the fast Johnson-Lindenstrauss transform of Ailon and Chazelle [5] allows us to obtain an O(nd log n) time algorithm for ɛ down to ω(d log d (d + log² n)/(n log n)). As the simplest applications of our technique we derive algorithms for approximating matrix products...

102 | The Johnson-Lindenstrauss lemma and the sphericity of some graphs
- Frankl, Maehara
- 1988
Citation Context: ...1 U T. For further linear algebra we refer the reader to [36]. Random projections. Johnson-Lindenstrauss’s seminal paper [41] was followed by several variants and proofs of low-distortion embeddings [33, 40, 19]. Throughout this paper we will make extensive use of three flavors of ℓ2 → ℓ2 embeddings (Theorems 2 & 3, and Lemma 5); we list their properties now. Definition 1 A random matrix R ∈ R^{k×n} forms a Joh...

97 | An algorithmic theory of learning: Robust concepts and random projection
- Arriaga, Vempala
- 1999
Citation Context: ...element subset V ⊂ Rⁿ, where k = Ω(ɛ⁻² log d · f(δ)), with probability at least 1 − δ for all v ∈ V it holds that (1 − ɛ)‖v‖₂² ≤ ‖Rv‖₂² ≤ (1 + ɛ)‖v‖₂². Theorem 2 (The Johnson-Lindenstrauss Lemma [19, 9]) Let 0 < ɛ, δ < 1 and S = (1/√k)R ∈ R^{k×n} be a matrix such that the Rij ∼ N(0, 1) entries are independent standard normal random variables. If k = Ω(ɛ⁻² log d log(1/δ)) then S is a JLT(ɛ, δ, d). For pra...
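The Gaussian construction of the Johnson-Lindenstrauss Lemma quoted above is easy to check empirically: a k × n matrix of N(0, 1/k) entries preserves the squared norms of a fixed point set up to small distortion. The dimensions below are ad hoc choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)
n, d, k = 1000, 20, 400                       # ambient dim, #points, target dim

V = rng.standard_normal((d, n))               # d fixed points in R^n
S = rng.standard_normal((k, n)) / np.sqrt(k)  # JL matrix, entries N(0, 1/k)

orig_sq = np.linalg.norm(V, axis=1) ** 2
proj_sq = np.linalg.norm(V @ S.T, axis=1) ** 2
eps = np.abs(proj_sq / orig_sq - 1.0).max()   # worst-case distortion over the set
assert eps < 0.3
```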

89 | Spectral Partitioning of random graphs
- McSherry
Citation Context: ...matrix approximation by SVD include recommendation systems [25], information retrieval via Latent Semantic Indexing [13, 48], Kleinberg’s celebrated HITS algorithm for web search [43, 2], clustering [22, 47], and learning mixtures of distributions [42, 4] just to name a few. Classification can be solved by regularized regression [29] and text database querying by matrix-vector products [15]. While polyno...

79 | Clustering Large Graphs via the Singular Value Decomposition
- Drineas, Frieze, et al.
Citation Context: ...matrix approximation by SVD include recommendation systems [25], information retrieval via Latent Semantic Indexing [13, 48], Kleinberg’s celebrated HITS algorithm for web search [43, 2], clustering [22, 47], and learning mixtures of distributions [42, 4] just to name a few. Classification can be solved by regularized regression [29] and text database querying by matrix-vector products [15]. While polyno...

68 | Sampling from Large Matrices: An Approach through Geometric Functional Analysis
- Rudelson, Vershynin
Citation Context: ...oof of inequalities (2) and (3) works unchanged for any matrix S such that |1 − σᵢ²(SU)| = o(1) and UᵀSᵀSw ≈ Uᵀw. Thus combining the above with Rudelson’s and Vershynin’s proof of Theorem 1.1 in [49] for bounding the singular values and Lemma 8 in appendix A.2 of [23] for bounding the norm of the approximate matrix product we have the following claim for sampling ℓ2 regression. Claim 13 Let r > 0...

63 | Matrix approximation and projective clustering via adaptive sampling. Manuscript
- Rademacher, Vempala, et al.
- 2004
Citation Context: ...based on the Lánczos or power method require Ω(log m) passes [44, 37]. Recently a large number of results appeared that prove bounds for non-uniform sampling to speed up approximate matrix operations [35, 48, 3, 23, 28, 24, 26, 49, 20]. These results provide error guarantees that depend on the Frobenius norm of the input matrices and hence may incur a large additive term. An exception among sampling based techniques is the sequel o...

62 | An elementary proof of a theorem of Johnson and Lindenstrauss. Random Structures and Algorithms
- Dasgupta, Gupta
Citation Context: ...1 U T. For further linear algebra we refer the reader to [36]. Random projections. Johnson-Lindenstrauss’s seminal paper [41] was followed by several variants and proofs of low-distortion embeddings [33, 40, 19]. Throughout this paper we will make extensive use of three flavors of ℓ2 → ℓ2 embeddings (Theorems 2 & 3, and Lemma 5); we list their properties now. Definition 1 A random matrix R ∈ R^{k×n} forms a Joh...

62 | Fast Monte Carlo algorithms for matrices I: Approximating matrix multiplication. Manuscript. Available via http://cs-www.cs.yale.edu/homes/mmahoney
- Drineas, Kannan, et al.
Citation Context: ...based on the Lánczos or power method require Ω(log m) passes [44, 37]. Recently a large number of results appeared that prove bounds for non-uniform sampling to speed up approximate matrix operations [35, 48, 3, 23, 28, 24, 26, 49, 20]. These results provide error guarantees that depend on the Frobenius norm of the input matrices and hence may incur a large additive term. An exception among sampling based techniques is the sequel o...

54 | On spectral learning of mixtures of distributions
- Achlioptas, McSherry
- 2005
Citation Context: ...on systems [25], information retrieval via Latent Semantic Indexing [13, 48], Kleinberg’s celebrated HITS algorithm for web search [43, 2], clustering [22, 47], and learning mixtures of distributions [42, 4] just to name a few. Classification can be solved by regularized regression [29] and text database querying by matrix-vector products [15]. While polynomial, all the three matrix operations mentioned ...

48 | Group-theoretic algorithms for matrix multiplication, preprint
- Cohn, Kleinberg, et al.
Citation Context: ...do not compute the final result itself, but reduce the problem to the product of two smaller (or sparser) matrices. If needed the latter can be more easily multiplied with the preferred exact method [36, 17, 16]. Returning to SVD, the best preliminary result with respect to the Frobenius norm was derived by Deshpande and Vempala [21] independently of our work, and shows that if we sample O(k² log k + k/ɛ) r...

48 | Estimating the largest eigenvalue by the power and Lanczos algorithms with a random start
- Kuczynski, Wozniakowski
- 1992
Citation Context: ...r convergence speed is unknown a priori and thus generally they require too many passes over the input. Similarly, approximate SVD schemes based on the Lánczos or power method require Ω(log m) passes [44, 37]. Recently a large number of results appeared that prove bounds for non-uniform sampling to speed up approximate matrix operations [35, 48, 3, 23, 28, 24, 26, 49, 20]. These results provide error guar...

47 | Competitive recommendation systems
- Drineas, Kerenidis, et al.
Citation Context: ...tion of matrix products. Our motivation comes from the widespread use of these tools in data mining [10]. Prominent applications of low-rank matrix approximation by SVD include recommendation systems [25], information retrieval via Latent Semantic Indexing [13, 48], Kleinberg’s celebrated HITS algorithm for web search [43, 2], clustering [22, 47], and learning mixtures of distributions [42, 4] just to...

44 | The spectral method for general mixture models
- Kannan, Salmasian, et al.
Citation Context: ...on systems [25], information retrieval via Latent Semantic Indexing [13, 48], Kleinberg’s celebrated HITS algorithm for web search [43, 2], clustering [22, 47], and learning mixtures of distributions [42, 4] just to name a few. Classification can be solved by regularized regression [29] and text database querying by matrix-vector products [15]. While polynomial, all the three matrix operations mentioned ...

34 | Approximating matrix multiplication for pattern recognition tasks
- Cohen, Lewis
- 1999
Citation Context: ...lustering [22, 47], and learning mixtures of distributions [42, 4] just to name a few. Classification can be solved by regularized regression [29] and text database querying by matrix-vector products [15]. While polynomial, all the three matrix operations mentioned above are computationally intensive when performed exactly. For example dense SVD methods require O(m²n) time and O(mn) space on an m × ...

34 | Spectral techniques applied to sparse random graphs
- Feige, Ofek
Citation Context: ...is easy to see that given a k dimensional subspace V, embedding it into O(k² log(k)/ɛ²) dimensions preserves the length of all vectors from V. However, it follows from a lemma of Feige and Ofek [32] based on putting a grid on the unit sphere that mere O(k/ɛ²) dimensions are sufficient. We remark that the same lemma and grid construction also appeared in [12, 51]; [45] contains a weaker form. E...

33 | Adaptive sampling and fast low-rank matrix approximation
- Deshpande, Vempala
- 2006
Citation Context: ...eparately by showing that the adaptive sampling theorem of Deshpande et al. [20] holds with tug-of-war projections as well and then apply Theorem 14 with ɛ = 1 only. Deshpande and Vempala also proved [21] that for any matrix A, there exists a subset R of O(k log k + k/ɛ) rows of A such that ‖A − Π_{R,k}(A)‖_F ≤ (1 + ɛ)‖A − A_k‖_F and their approximate SVD method indeed finds an O(k² log k + k/ɛ) element...

31 | Web search via hub synthesis
- Achlioptas, Fiat, et al.
- 2001
Citation Context: ...ications of low-rank matrix approximation by SVD include recommendation systems [25], information retrieval via Latent Semantic Indexing [13, 48], Kleinberg’s celebrated HITS algorithm for web search [43, 2], clustering [22, 47], and learning mixtures of distributions [42, 4] just to name a few. Classification can be solved by regularized regression [29] and text database querying by matrix-vector produc...

30 | Dimensionality reductions that preserve volumes and distance to affine spaces, and their algorithmic applications
- Magen
- 2002
Citation Context: ...a lemma of Feige and Ofek [32] based on putting a grid on the unit sphere that mere O(k/ɛ²) dimensions are sufficient. We remark that the same lemma and grid construction also appeared in [12, 51]; [45] contains a weaker form. Even though the dimension of the target subspace is significantly higher than k, the embedding will still turn out to be useful as it can be constructed without knowing the su...

28 | The Johnson-Lindenstrauss lemma meets compressed sensing
- Baraniuk, Davenport, et al.
- 2006
Citation Context: ...lows from a lemma of Feige and Ofek [32] based on putting a grid on the unit sphere that mere O(k/ɛ²) dimensions are sufficient. We remark that the same lemma and grid construction also appeared in [12, 51]; [45] contains a weaker form. Even though the dimension of the target subspace is significantly higher than k, the embedding will still turn out to be useful as it can be constructed without knowing ...

26 | A fast random sampling algorithm for sparsifying matrices
- Arora, Hazan, et al.
- 2006
Citation Context: ...esulted in more practical algorithms and refined analysis both for SVD [28, 24, 49, 20] and approximate matrix products [23]. Another line of research is based on random sparsification and quantization [3, 23, 8]. Although at first it may seem contradictory, approximate matrix product algorithms do not compute the final result itself, but reduce the problem to the product of two smaller (or sparser) matrices...

25 | A randomized algorithm for a tensor-based generalization of the singular value decomposition. Linear Algebra and its Applications
- Drineas, Mahoney
Citation Context: ...14 twice and sampling according to the row lengths of V Π_{SA,k}(A) in time O((M(k log k + k/ɛ) + (n + m)(k log k + k/ɛ)²) log(1/δ)) and 4 passes altogether. Lastly, by a result of Drineas and Mahoney [27] Theorem 14 also yields improved low-rank approximation of higher order tensors in the “unfolding” model. 5. Conclusion We conclude with two open problems. Does there exist a fast, pass efficient algo...

24 | Sampling algorithms for ℓ2 regression and applications
- Drineas, Mahoney, et al.
Citation Context: ...berg’s celebrated HITS algorithm for web search [43, 2], clustering [22, 47], and learning mixtures of distributions [42, 4] just to name a few. Classification can be solved by regularized regression [29] and text database querying by matrix-vector products [15]. While polynomial, all the three matrix operations mentioned above are computationally intensive when performed exactly. For example dense SV...

19 | Fast Computation of Low Rank Approximations
- Achlioptas, McSherry
- 2001

19 | To randomize or not to randomize: space optimal summaries for hyperlink analysis
- Sarlós, Benczúr, et al.
- 2006
Citation Context: ...advice for implementing the sampling procedure any faster than solving the original problem. Low distortion embeddings also called “sketches” are known to outperform sampling in certain applications [14, 50]. Our key techniques to improve previous algorithms for singular value decomposition, ℓ2 regression and matrix multiplication are Johnson-Lindenstrauss type embeddings [41]. Ironically, one of the fir...

18 | Subspace sampling and relative-error matrix approximation: Column-based methods
- Drineas, Mahoney, et al.
- 2006
Citation Context: ...m, i.e. given an n-by-d, n > d, matrix A of reals and a d dimensional real vector b we wish to obtain x_opt = A⁺b minimizing ‖Ax − b‖₂. Recall that the preliminary results proven by Drineas et al. [29, 30] show that if we sample r′ = poly(ɛ⁻¹, d) rows from A and b with the sampling probabilities satisfying certain criteria, then with high probability the optimum solution of the r′-by-d downsampled...
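For contrast with the row-sampling approach quoted above, the embedding route, solving the sketched problem min_x ‖S(Ax − b)‖₂ for an oblivious random S, can be sketched as follows (the sketch size r = 500 and the problem sizes are ad hoc choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
n, d = 5000, 10
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

# Exact least squares for reference.
x_opt, *_ = np.linalg.lstsq(A, b, rcond=None)
opt = np.linalg.norm(A @ x_opt - b)

# Sketch-and-solve: compress the n equations to r with a +/-1 embedding.
r = 500
S = rng.choice([-1.0, 1.0], size=(r, n)) / np.sqrt(r)
x_sk, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
approx = np.linalg.norm(A @ x_sk - b)

assert approx <= 1.1 * opt        # near-optimal residual from the sketch alone
```

The sketched solve touches only an r × d system, yet its residual on the full problem stays within a small relative factor of the true optimum.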

18 | Probabilistic machines can use less running time
- Freivalds
- 1977
Citation Context: ...s. The ℓ2 regression and SVD results are based on precisely these properties. En route we also use embeddings to estimate the Frobenius norm of implicitly formed matrices akin to Freivalds’ technique [34] (Lemma 8). This estimate then can be used as a black box tool to boost the probability of correctness. The rest of the paper is organized as follows. After describing related results and basic fact...
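Freivalds' classical check, which the norm-estimation trick above is likened to, verifies a claimed product C = AB using only matrix-vector multiplications. A minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
C = A @ B
C_bad = C.copy()
C_bad[3, 7] += 1.0                # plant a single wrong entry

def freivalds(A, B, C, trials=10, rng=rng):
    """Probabilistically verify A @ B == C in O(n^2) work per trial."""
    for _ in range(trials):
        x = rng.choice([-1.0, 1.0], size=A.shape[1])
        if not np.allclose(A @ (B @ x), C @ x):
            return False
    return True

assert freivalds(A, B, C)          # a correct product always passes
assert not freivalds(A, B, C_bad)  # the planted error is caught
```

Each trial costs three matrix-vector products instead of one n³-type multiplication, and a wrong entry is detected with probability at least 1/2 per trial.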

18 | Low-rank matrix approximation in linear time, manuscript
- Har-Peled
- 2006
Citation Context: ...r convergence speed is unknown a priori and thus generally they require too many passes over the input. Similarly, approximate SVD schemes based on the Lánczos or power method require Ω(log m) passes [44, 37]. Recently a large number of results appeared that prove bounds for non-uniform sampling to speed up approximate matrix operations [35, 48, 3, 23, 28, 24, 26, 49, 20]. These results provide error guar...

15 | Sampling lower bounds via information theory
- Bar-Yossef
- 2003
Citation Context: ...ood enough approximation for matrix products if the sampling probabilities are proportional to the column and row lengths of the matrices in question [23]. In fact uniform sampling is insufficient as [11] shows. In [29, 30, 31] these results are then applied to products arising from the singular value decomposition of the input. However, as noted, it is unknown whether the required nonuniform sampling...

15 | A randomized algorithm for the approximation of matrices
- Martinsson, Rokhlin, et al.
- 2011
Citation Context: ...ction 2. Based on these in Section 3 we give our new linear (ℓ2) regression results. These results are used finally in Section 4 in our SVD algorithm. 1.1. Comparison with previous results Except for [48, 46], to the best of our knowledge, all prior work on speeding up matrix operations is based on sampling. Cohen and Lewis set up random walks to approximate non-negative matrix products [15]. In their gro...

9 | Variable Latent Semantic Indexing
- Dasgupta, Kumar, et al.
Citation Context: ...st) the best ASᵢᵀSᵢB. Similarly to Freivalds’ technique for checking matrix products our norm estimation method requires a few extra matrix-vector products only [34] and was motivated by Lemma 4 of [18]. Lemma 8 Let C be an m × n matrix, 0 < λ < 1, and Q a λ⁻² × n tug-of-war random matrix as in Lemma 5. Define X = ‖CQᵀ‖²_F. Then E(X) = ‖C‖²_F and Var(X) ≤ 2λ²‖C‖⁴_F. PROOF: We proceed simil...

9 | Spaces with large distance to ℓⁿ∞ and random matrices
- Szarek
- 1990
Citation Context: ...lows from a lemma of Feige and Ofek [32] based on putting a grid on the unit sphere that mere O(k/ɛ²) dimensions are sufficient. We remark that the same lemma and grid construction also appeared in [12, 51]; [45] contains a weaker form. Even though the dimension of the target subspace is significantly higher than k, the embedding will still turn out to be useful as it can be constructed without knowing ...

8 | Approximating a Gram matrix for improved kernel-based learning
- Drineas, Mahoney
