Results 21–30 of 132
Sampling Methods for the Nyström Method
 JOURNAL OF MACHINE LEARNING RESEARCH
Abstract

Cited by 26 (2 self)
The Nyström method is an efficient technique to generate low-rank matrix approximations and is used in several large-scale learning applications. A key aspect of this method is the procedure according to which columns are sampled from the original matrix. In this work, we explore the efficacy of a variety of fixed and adaptive sampling schemes. We also propose a family of ensemble-based sampling algorithms for the Nyström method. We report results of extensive experiments that provide a detailed comparison of various fixed and adaptive sampling techniques, and demonstrate the performance improvement associated with the ensemble Nyström method when used in conjunction with either fixed or adaptive sampling schemes. Corroborating these empirical findings, we present a theoretical analysis of the Nyström method, providing novel error bounds guaranteeing a better convergence rate of the ensemble Nyström method in comparison to the standard Nyström method.
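The column-sampling step this abstract centers on is easy to make concrete. Below is a minimal sketch of the standard Nyström approximation with uniform (fixed) column sampling, one of the schemes the paper compares; function and variable names are my own, and the toy rank-5 kernel is illustrative only.

```python
import numpy as np

def nystrom(K, idx):
    """Nystrom approximation of a PSD matrix K from the columns in idx."""
    C = K[:, idx]                        # sampled columns of K
    W = K[np.ix_(idx, idx)]              # intersection (core) block
    return C @ np.linalg.pinv(W) @ C.T   # rank <= len(idx) approximation of K

# Toy example with uniform sampling.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
K = X @ X.T                              # exactly rank-5 PSD "kernel" matrix
idx = rng.choice(200, size=20, replace=False)
err = np.linalg.norm(K - nystrom(K, idx), "fro")
```

Because this K has rank 5 and 20 columns are sampled, the approximation is numerically exact here; on full-rank kernels the error depends on which columns are chosen, which is precisely the paper's subject.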
A Fast and Efficient Algorithm for Low-rank Approximation of a Matrix
Abstract

Cited by 26 (2 self)
The low-rank matrix approximation problem involves finding a rank-k version of an m × n matrix A, labeled Ak, such that Ak is as "close" as possible to the best SVD approximation of A at the same rank level. Previous approaches approximate matrix A by adaptively and non-uniformly sampling some columns (or rows) of A, hoping that this subset of columns contains enough information about A. The submatrix is then used for the approximation process. However, these approaches are often computationally intensive due to the complexity of the adaptive sampling. In this paper, we propose a fast and efficient algorithm which first preprocesses matrix A in order to spread out the information (energy) of every column (or row) of A, then randomly selects some of its columns (or rows). Finally, a rank-k approximation is generated from the row space of these selected sets. The preprocessing step is performed by uniformly randomizing the signs of the entries of A and transforming all columns of A by an orthonormal matrix F with an existing fast implementation (e.g., Hadamard, FFT, DCT). Our main contribution is summarized as follows. 1) We show that by uniformly selecting at random d rows of the preprocessed matrix with d = O 1
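The preprocess-then-sample pipeline described above can be sketched as follows. This is a rough illustration under my own naming; for self-containedness it uses a dense random orthogonal matrix in place of the fast Hadamard/FFT/DCT transforms the paper intends, which changes the cost but not the idea.

```python
import numpy as np

def preprocess_and_sample(A, k, d, rng):
    """Randomize signs, mix with an orthonormal transform, uniformly
    sample d rows, then build a rank-k approximation of A from them."""
    m, n = A.shape
    signs = rng.choice([-1.0, 1.0], size=m)
    Q, _ = np.linalg.qr(rng.standard_normal((m, m)))  # stand-in for Hadamard/FFT/DCT
    B = Q @ (signs[:, None] * A)       # preprocessed matrix: energy spread out
    rows = rng.choice(m, size=d, replace=False)
    S = B[rows, :]                     # uniformly sampled rows
    _, _, Vt = np.linalg.svd(S, full_matrices=False)
    P = A @ Vt.T @ Vt                  # project A onto the sampled row space
    U, s, Wt = np.linalg.svd(P, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Wt[:k] # truncate to rank k

rng = np.random.default_rng(1)
A = rng.standard_normal((100, 8)) @ rng.standard_normal((8, 60))  # rank-8 matrix
err = np.linalg.norm(A - preprocess_and_sample(A, k=8, d=20, rng=rng), "fro")
```

On this exactly rank-8 input, 20 sampled rows of the mixed matrix almost surely span the row space, so the recovery is numerically exact.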
Simple and deterministic matrix sketching
 CoRR
Abstract

Cited by 23 (2 self)
We adapt a well-known streaming algorithm for approximating item frequencies to the matrix sketching setting. The algorithm receives the rows of a large matrix A ∈ R^{n×m} one after the other in a streaming fashion. For ℓ = ⌈1/ε⌉ it maintains a sketch matrix B ∈ R^{ℓ×m} such that for any unit vector x, ‖Ax‖² ≥ ‖Bx‖² ≥ ‖Ax‖² − ε‖A‖²_F. Sketch updates per row in A require amortized O(mℓ) operations. This gives the first algorithm whose error guarantee decreases in proportion to 1/ℓ using O(mℓ) space; prior algorithms produce bounds proportional to 1/√ℓ. Our experiments corroborate that the faster convergence rate is observed in practice. The presented algorithm also stands out in that it is deterministic, simple to implement, and elementary to prove. Streaming aspects aside, the algorithm can be used to compute a (1 + ε′)-approximation to the best rank-k approximation of any matrix A ∈ R^{n×m}. This requires O(mnℓ′) operations and O(mℓ′) space, where ℓ′ =
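The deterministic sketch this abstract describes is short enough to spell out. The variant below shrinks the sketch after every row for clarity (my simplification; the paper amortizes the SVDs to reach the stated update cost), but it satisfies the same guarantee ‖Ax‖² ≥ ‖Bx‖² ≥ ‖Ax‖² − ‖A‖²_F / ℓ.

```python
import numpy as np

def frequent_directions(A, ell):
    """Stream the rows of A into an ell x m sketch B."""
    _, m = A.shape
    B = np.zeros((ell, m))
    for row in A:
        B[-1] = row                        # last sketch row is always zero here
        _, s, Vt = np.linalg.svd(B, full_matrices=False)
        delta = s[-1] ** 2                 # smallest squared singular value
        s = np.sqrt(np.maximum(s ** 2 - delta, 0.0))
        B = s[:, None] * Vt                # shrink; zeroes the last row again
    return B

rng = np.random.default_rng(2)
A = rng.standard_normal((50, 10))
B = frequent_directions(A, ell=5)
x = rng.standard_normal(10)
x /= np.linalg.norm(x)
gap = np.linalg.norm(A @ x) ** 2 - np.linalg.norm(B @ x) ** 2
bound = np.linalg.norm(A, "fro") ** 2 / 5  # eps * ||A||_F^2 with ell = 1/eps
```

For any unit x the gap stays in [0, bound], which is the guarantee quoted above with ℓ = 5.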
Compressed sensing and robust recovery of low rank matrices
 in Proc. 40th Asilomar Conf. Signals, Systems and Computers
, 2008
Abstract

Cited by 22 (3 self)
In this paper, we focus on compressed sensing and recovery schemes for low-rank matrices, asking under what conditions a low-rank matrix can be sensed and recovered from incomplete, inaccurate, and noisy observations. We consider three schemes: one based on a certain Restricted Isometry Property and two based on directly sensing the row and column space of the matrix. We study their properties in terms of exact recovery in the ideal case, and robustness for approximately low-rank matrices and for noisy measurements.
Recovery of low-rank plus compressed sparse matrices with application to unveiling traffic anomalies
 IEEE TRANS. INFO. THEORY
, 2013
Abstract

Cited by 21 (5 self)
Given the noiseless superposition of a low-rank matrix plus the product of a known fat compression matrix times a sparse matrix, the goal of this paper is to establish deterministic conditions under which exact recovery of the low-rank and sparse components becomes possible. This fundamental identifiability issue arises with traffic anomaly detection in backbone networks, and subsumes compressed sensing as well as the timely low-rank plus sparse matrix recovery tasks encountered in matrix decomposition problems. Leveraging the ability of the ℓ1 and nuclear norms to recover sparse and low-rank matrices, a convex program is formulated to estimate the unknowns. Analysis and simulations confirm that the said convex program can recover the unknowns for sufficiently low-rank and sparse enough components, along with a compression matrix possessing an isometry property when restricted to operate on sparse vectors. When the low-rank, sparse, and compression matrices are drawn from certain random ensembles, it is established that exact recovery is possible with high probability. First-order algorithms are developed to solve the nonsmooth convex optimization problem with provable iteration complexity guarantees. Insightful tests with synthetic and real network data corroborate the effectiveness of the novel approach in unveiling traffic anomalies across flows and time, and its ability to outperform existing alternatives.
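The convex program sketched in this abstract has, in the usual notation for this problem class (my reconstruction; the symbols are not necessarily the paper's), the form:

```latex
\min_{X,\,A} \; \|X\|_{*} + \lambda \|A\|_{1}
\quad \text{subject to} \quad Y = X + R A ,
```

where Y is the observed superposition, the nuclear norm ‖X‖_* promotes a low-rank X, the ℓ1 norm ‖A‖_1 promotes a sparse A, R is the known fat compression matrix, and λ > 0 trades off the two regularizers.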
Spectral norm of products of random and deterministic matrices
Abstract

Cited by 21 (5 self)
We study the spectral norm of matrices M that can be factored as M = BA, where A is a random matrix with independent mean-zero entries and B is a fixed matrix. Under the (4 + ε)-th moment assumption on the entries of A, we show that the spectral norm of such an m×n matrix M is bounded by √m + √n, which is sharp. In other words, with regard to the spectral norm, products of random and deterministic matrices behave similarly to random matrices with independent entries. This result, along with the previous work of M. Rudelson and the author, implies that the smallest singular value of a random m × n matrix with i.i.d. mean-zero entries and bounded (4 + ε)-th moment is bounded below by √m − √(n − 1) with high probability.
Column Subset Selection, Matrix Factorization, and Eigenvalue Optimization
, 2008
Abstract

Cited by 20 (1 self)
Given a fixed matrix, the problem of column subset selection requests a column submatrix that has favorable spectral properties. Most research from the algorithms and numerical linear algebra communities focuses on a variant called rank-revealing QR, which seeks a well-conditioned collection of columns that spans the (numerical) range of the matrix. The functional analysis literature contains another strand of work on column selection whose algorithmic implications have not been explored. In particular, a celebrated result of Bourgain and Tzafriri demonstrates that each matrix with normalized columns contains a large column submatrix that is exceptionally well conditioned. Unfortunately, standard proofs of this result cannot be regarded as algorithmic. This paper presents
Dimension reduction by random hyperplane tessellations
 Discrete & Computational Geometry
, 2011
Abstract

Cited by 19 (3 self)
Given a subset K of the unit Euclidean sphere, we estimate the minimal number m = m(K) of hyperplanes that generate a uniform tessellation of K, in the sense that the fraction of the hyperplanes separating any pair x, y ∈ K is nearly proportional to the Euclidean distance between x and y. Random hyperplanes prove to be almost ideal for this problem; they achieve the almost optimal bound m = O(w(K)²), where w(K) is the Gaussian mean width of K. Using the map that sends x ∈ K to its sign vector with respect to the hyperplanes, we conclude that every bounded subset K of R^n embeds into the Hamming cube {−1, 1}^m with small distortion in the Gromov–Hausdorff metric. Since for many sets K one has m = m(K) ≪ n, this yields a new discrete mechanism of dimension reduction for sets in Euclidean spaces.
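The sign-vector map in this abstract is simple to simulate. The sketch below (names mine) draws random hyperplanes through the origin and checks that the fraction separating two points on the sphere tracks their angular, and hence locally their Euclidean, distance.

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.standard_normal(3); x /= np.linalg.norm(x)   # two points on the unit sphere
y = rng.standard_normal(3); y /= np.linalg.norm(y)

m = 20000                              # number of random hyperplanes
G = rng.standard_normal((m, 3))        # rows = hyperplane normals
sx = np.sign(G @ x)                    # sign vector of x in {-1, +1}^m
sy = np.sign(G @ y)

frac = np.mean(sx != sy)               # fraction of separating hyperplanes
angle = np.arccos(np.clip(x @ y, -1.0, 1.0))
# A random hyperplane through the origin separates x and y with
# probability angle / pi, so frac should concentrate near angle / pi.
```

This is the empirical face of the tessellation property: Hamming distance between the sign vectors, normalized by m, approximates the geodesic distance on the sphere.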
Who Moderates the Moderators? Crowdsourcing Abuse Detection in User-generated Content
Abstract

Cited by 19 (0 self)
A large fraction of user-generated content on the Web, such as posts or comments on popular online forums, consists of abuse or spam. Due to the volume of contributions on popular sites, a few trusted moderators cannot identify all such abusive content, so viewer ratings of contributions must be used for moderation. But not all viewers who rate content are trustworthy and accurate. What is a principled approach to assigning trust and aggregating user ratings, in order to accurately identify abusive content? In this paper, we introduce a framework to address the problem of moderating online content using crowdsourced ratings. Our framework encompasses users who are untrustworthy or inaccurate to an unknown extent; that is, both the content and the raters are of unknown quality. With no knowledge whatsoever about the raters, it is impossible to do better than a random estimate. We present efficient algorithms to accurately detect abuse that only require knowledge of the identity of a single 'good' agent, who rates contributions accurately more than half the time. We prove that our algorithm can infer the quality of contributions with error that rapidly converges to zero as the number of observations increases; we also numerically demonstrate that the algorithm has very high accuracy with far fewer observations. Finally, we analyze the robustness of our algorithms to manipulation by adversarial or strategic raters, an important issue in moderating online content, and quantify how the performance of the algorithm degrades with the number of manipulating agents.
MATRIX CONCENTRATION INEQUALITIES VIA THE METHOD OF EXCHANGEABLE PAIRS
 SUBMITTED TO THE ANNALS OF PROBABILITY
, 2013
Abstract

Cited by 18 (4 self)
This paper derives exponential concentration inequalities and polynomial moment inequalities for the spectral norm of a random matrix. The analysis requires a matrix extension of the scalar concentration theory developed by Sourav Chatterjee using Stein's method of exchangeable pairs. When applied to a sum of independent random matrices, this approach yields matrix generalizations of the classical inequalities due to Hoeffding, Bernstein, Khintchine, and Rosenthal. The same technique delivers bounds for sums of dependent random matrices and more general matrix-valued functions of dependent random variables. This paper is based on two independent manuscripts from mid-2011 that both applied the method of exchangeable pairs to establish matrix concentration inequalities. One manuscript is by Mackey and Jordan; the other is by Chen, Farrell, and Tropp. The authors have combined this research into a single unified presentation, with equal contributions from both groups.