Results 1-3 of 3
The data stream space complexity of cascaded norms
In FOCS, 2009
Cited by 11 (7 self)
Abstract: We consider the problem of estimating cascaded aggregates over a matrix presented as a sequence of updates in a data stream. A cascaded aggregate P ◦ Q is defined by evaluating aggregate Q repeatedly over each row of the matrix, and then evaluating aggregate P over the resulting vector of values. This problem was introduced by Cormode and Muthukrishnan, PODS, 2005 [CM]. We analyze the space complexity of estimating cascaded norms on an n × d matrix to within a small relative error. Let L_p denote the pth norm, where p is a nonnegative integer. We abbreviate the cascaded norm L_k ◦ L_p by L_{k,p}. (1) For any constant k ≥ p ≥ 2, we obtain a 1-pass Õ(n^{1−2/k} d^{1−2/p})-space algorithm for estimating L_{k,p}. This is optimal up to polylogarithmic factors and resolves an open question of [CM] regarding the space complexity of L_{4,2}. We also obtain 1-pass space-optimal algorithms for estimating L_{∞,k} and L_{k,∞}. (2) We prove a space lower bound of Ω(n^{1−1/k}) on estimating L_{k,0} and L_{k,1}, resolving an open question due to Indyk, IITK Data Streams Workshop (Problem 8), 2006. We also resolve two more questions of [CM] concerning L_{k,2} estimation and block heavy hitter problems. Ganguly, Bansal and Dube (FAW, 2008) claimed an Õ(1)-space algorithm for estimating L_{k,p} for any k, p ∈ [0,2]. Our lower bounds show this claim is incorrect.
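To make the definition of a cascaded norm concrete, the following is a minimal offline sketch that computes L_{k,p} exactly on a small in-memory matrix. It is not the streaming algorithm from the paper and uses space linear in the matrix. The function names `lp_norm` and `cascaded_norm` are hypothetical, and the conventions chosen here (L_0 counts nonzero entries, L_∞ is the largest absolute entry, other norms include the 1/p root) are assumptions that may differ from the paper's normalization.

```python
import math

def lp_norm(values, p):
    """Return the L_p norm of a sequence of numbers.

    Assumed conventions: p == 0 counts nonzero entries,
    p == math.inf returns the largest absolute value.
    """
    vals = [abs(v) for v in values]
    if p == 0:
        return float(sum(1 for v in vals if v != 0))
    if p == math.inf:
        return float(max(vals, default=0.0))
    return sum(v ** p for v in vals) ** (1.0 / p)

def cascaded_norm(matrix, k, p):
    """Apply L_p to every row, then L_k to the vector of row values."""
    row_values = [lp_norm(row, p) for row in matrix]
    return lp_norm(row_values, k)

if __name__ == "__main__":
    A = [[3, 4], [0, 0], [6, 8]]
    print(cascaded_norm(A, 1, 2))         # sum of the row Euclidean norms
    print(cascaded_norm(A, math.inf, 2))  # largest row Euclidean norm
    print(cascaded_norm(A, 1, 0))         # total number of nonzero entries
```

For this matrix the row Euclidean norms are 5, 0, and 10, so L_{1,2} sums them and L_{∞,2} picks the largest. With k = p = 2 the cascaded norm coincides with the Frobenius norm of the matrix.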
Fast Approximation of Matrix Coherence and Statistical Leverage
Cited by 8 (1 self)
The statistical leverage scores of a matrix A are the squared row norms of the matrix containing its (top) left singular vectors, and the coherence is the largest leverage score. These quantities are of interest in recently popular problems such as matrix completion and Nyström-based low-rank matrix approximation, as well as in large-scale statistical data analysis applications more generally; moreover, they are of interest since they define the key structural nonuniformity that must be dealt with in developing fast randomized matrix algorithms. Our main result is a randomized algorithm that takes as input an arbitrary n × d matrix A, with n ≫ d, and that returns as output relative-error approximations to all n of the statistical leverage scores. The proposed algorithm runs (under assumptions on the precise values of n and d) in O(nd log n) time, as opposed to the O(nd^2) time required by the naïve algorithm that involves computing an orthogonal basis for the range of A. Our analysis may be viewed in terms of computing a relative-error approximation to an underconstrained least-squares approximation problem, or, relatedly, it may be viewed as an application of Johnson-Lindenstrauss-type ideas. Several practically important extensions of our basic result are also described, including the approximation of so-called cross-leverage scores, the extension of these ideas to matrices with n ≈ d, and the extension to streaming environments.
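As an illustration of the quantities defined in this abstract, here is a minimal sketch of the naïve O(nd^2) approach it mentions: build an orthonormal basis for the range of A and take squared row norms. It is not the fast randomized algorithm proposed in the paper. The function names `leverage_scores` and `coherence` are hypothetical, and the use of modified Gram-Schmidt with a fixed rank tolerance of 1e-12 is an assumption made to keep the example dependency-free.

```python
import math

def leverage_scores(A):
    """Squared row norms of an orthonormal basis for the column space of A.

    A is a list of n rows, each a list of d floats.
    Uses modified Gram-Schmidt (an assumption of this sketch).
    """
    n, d = len(A), len(A[0])
    columns = [[A[i][j] for i in range(n)] for j in range(d)]
    basis = []  # orthonormal vectors of length n
    for col in columns:
        v = list(col)
        for q in basis:
            proj = sum(qi * vi for qi, vi in zip(q, v))
            v = [vi - proj * qi for qi, vi in zip(q, v)]
        norm = math.sqrt(sum(vi * vi for vi in v))
        if norm > 1e-12:  # skip numerically dependent columns
            basis.append([vi / norm for vi in v])
    # Leverage score of row i: sum of squares of the i-th basis coordinates.
    return [sum(q[i] ** 2 for q in basis) for i in range(n)]

def coherence(A):
    """Largest leverage score of A."""
    return max(leverage_scores(A))

if __name__ == "__main__":
    A = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
    print(leverage_scores(A))
    print(coherence([[1.0], [1.0], [1.0], [1.0]]))
```

The leverage scores always lie in [0, 1] and sum to the rank of A, which gives a quick sanity check on any implementation.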
Streaming Algorithms via Precision Sampling
Cited by 4 (0 self)
... (STOC 2005) has inspired several recent advances in data-stream algorithms. We show that a number of these results follow easily from the application of a single probabilistic method called Precision Sampling. Using this method, we obtain simple data-stream algorithms that maintain a randomized sketch of an input vector x = (x_1, x_2, ..., x_n), which is useful for the following applications:
• Estimating the F_k-moment of x, for k > 2.
• Estimating the ℓ_p-norm of x, for p ∈ [1, 2], with small update time.
• Estimating cascaded norms ℓ_p(ℓ_q) for all p, q > 0.
• ℓ_1 sampling, where the goal is to produce an element i with probability (approximately) x_i/‖x‖_1. It extends to similarly defined ℓ_p-sampling, for p ∈ [1, 2].
For all these applications the algorithm is essentially the same: scale the vector x entrywise by a well-chosen random vector, and run a heavy-hitter estimation algorithm on the resulting vector. Our sketch is a linear function of x, thereby allowing general updates to the vector x. Precision Sampling itself addresses the problem of estimating a sum Σ_{i=1}^n a_i from weak estimates of each real a_i ∈ [0, 1]. More precisely, the estimator first chooses a desired precision u_i ∈ (0, 1] for each i ∈ [n], and then it receives an estimate of every a_i within additive u_i. Its goal is to provide a good approximation to Σ a_i while keeping a tab on the "approximation cost" Σ_i (1/u_i). Here we refine previous work (Andoni, Krauthgamer, and Onak, FOCS 2010) which shows that as long as Σ a_i = Ω(1), a good multiplicative approximation can be achieved using total precision of only O(n log n).
Keywords: streaming, sampling, moments, cascaded norms
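The "scale entrywise, then find the heavy hitter" recipe described above can be illustrated for ℓ_1 sampling with a small offline sketch. This is not the paper's algorithm: it assumes independent Exp(1) scaling variables (a swapped-in choice for which the largest scaled entry is known to land on index i with probability exactly |x_i|/‖x‖_1), and it replaces the streaming heavy-hitter sketch with an exact maximum over a vector held in memory. The function name `l1_sample` is hypothetical.

```python
import math
import random

def l1_sample(x, rng):
    """Return an index i with probability |x_i| / ||x||_1.

    Assumptions of this sketch: Exp(1) scaling variables and an exact
    argmax in place of a streaming heavy-hitter structure.
    Returns None if every entry of x is zero.
    """
    best_index, best_value = None, -1.0
    for i, xi in enumerate(x):
        if xi == 0:
            continue  # zero entries can never be the heavy hitter
        e = rng.expovariate(1.0)
        # Scale entry i by 1/e; guard against the (very unlikely) e == 0.
        scaled = abs(xi) / e if e > 0 else math.inf
        if scaled > best_value:
            best_index, best_value = i, scaled
    return best_index

if __name__ == "__main__":
    rng = random.Random(0)
    x = [1.0, -2.0, 0.0, 7.0]
    trials = 20000
    counts = [0] * len(x)
    for _ in range(trials):
        counts[l1_sample(x, rng)] += 1
    # Empirical frequencies should be close to |x_i| / ||x||_1.
    print([c / trials for c in counts])
```

The exactness follows because |x_i|/e_i is largest precisely when e_i/|x_i| is smallest, and e_i/|x_i| is exponential with rate |x_i|; the minimum of independent exponentials falls on index i with probability proportional to its rate.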