## On the exact space complexity of sketching and streaming small norms (2010)

Venue: SODA

Citations: 18 (10 self)

### BibTeX

@INPROCEEDINGS{Kane10onthe,

author = {Daniel M. Kane and Jelani Nelson and David P. Woodruff},

title = {On the exact space complexity of sketching and streaming small norms},

booktitle = {SODA},

year = {2010}

}

### Abstract

We settle the 1-pass space complexity of (1 ± ε)-approximating the Lp norm, for real p with 1 ≤ p ≤ 2, of a length-n vector updated in a length-m stream with updates to its coordinates. We assume the updates are integers in the range [−M, M]. In particular, we show the space required is $\Theta(\varepsilon^{-2}\log(mM) + \log\log n)$ bits. Our result also holds for 0 < p < 1; although Lp is not a norm in this case, it remains a well-defined function. Our upper bound improves upon previous algorithms of [Indyk, JACM ’06] and [Li, SODA ’08]. This improvement comes from showing an improved derandomization of the Lp sketch of Indyk by using k-wise independence for small k, as opposed to using the heavy hammer of a generic pseudorandom generator against space-bounded computation such as Nisan’s PRG. Our lower bound improves upon previous work of [Alon-Matias-Szegedy, JCSS ’99] and [Woodruff, SODA ’04], and is based on showing a direct sum property for the 1-way communication of the gap-Hamming problem.
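
To make the "Lp sketch of Indyk" mentioned in the abstract concrete, here is a minimal Python sketch for the special case p = 1, where the p-stable distribution is the standard Cauchy distribution. It is an illustration under stated assumptions, not the paper's algorithm: the class name `L1Sketch`, the helper `cauchy_entry`, the row count, and the string-seeded generator are all hypothetical choices, and the entries are drawn from Python's ordinary pseudo-random generator rather than from the k-wise independent family the paper analyzes.

```python
import math
import random
import statistics


def cauchy_entry(seed: int, row: int, coord: int) -> float:
    """Pseudo-random standard Cauchy value for sketch entry (row, coord).

    The value is re-derived from a string seed on every call, so the
    sketch matrix is never stored. This is a stand-in for the paper's
    k-wise independent construction, which is NOT reproduced here.
    """
    u = random.Random(f"{seed}:{row}:{coord}").random()
    return math.tan(math.pi * (u - 0.5))  # inverse-CDF sampling of Cauchy


class L1Sketch:
    """Linear sketch y = A x with Cauchy entries, updated one (i, v) at a time."""

    def __init__(self, rows: int, seed: int = 0) -> None:
        self.seed = seed
        self.y = [0.0] * rows

    def update(self, coord: int, delta: int) -> None:
        # Adding delta to x[coord] adds delta * A[j][coord] to every y[j].
        for j in range(len(self.y)):
            self.y[j] += delta * cauchy_entry(self.seed, j, coord)

    def estimate(self) -> float:
        # Each y[j] is distributed as ||x||_1 times a standard Cauchy, and
        # the median of |Cauchy| is 1, so the median of |y[j]| estimates ||x||_1.
        return statistics.median(abs(v) for v in self.y)


if __name__ == "__main__":
    sketch = L1Sketch(rows=600, seed=1)
    for i, v in [(0, 5), (3, -2), (7, 4), (3, -1), (0, -5), (9, 10)]:
        sketch.update(i, v)
    print("estimated L1 norm:", sketch.estimate())  # true value is 17
```

The number of rows controls accuracy: the relative error of the median estimator shrinks roughly like one over the square root of the row count, which is where the ε⁻² factor in the space bound comes from.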

### Citations

714 | The space complexity of approximating the frequency moments
- Alon, Matias, et al.
- 1999
Citation Context ...eaming model and has become popular in the theory community, dating back to the works of Munro and Paterson [34] and Flajolet and Martin [15], and resurging with the work of Alon, Matias, and Szegedy [2]. For a survey of results, see the book by Muthukrishnan [35], or notes from Indyk’s course [24]. A fundamental problem in this area is that of norm estimation [2]. Formally, we have a vector x = (x1,...

404 | Data streams: Algorithms and applications
- Muthukrishnan
- 2003
Citation Context ... dating back to the works of Munro and Paterson [34] and Flajolet and Martin [15], and resurging with the work of Alon, Matias, and Szegedy [2]. For a survey of results, see the book by Muthukrishnan [35], or notes from Indyk’s course [24]. A fundamental problem in this area is that of norm estimation [2]. Formally, we have a vector $x = (x_1, \ldots, x_n)$ initialized as $x = \vec{0}$, and a stream of m updates...

266 | Stable distributions, pseudorandom generators, embeddings and data stream computation
- Indyk
- 2000
Citation Context ...fficiently approximated in a data stream. In particular, [5, 8] show that polynomial space in n, m is required for p > 2, whereas space polylogarithmic in these parameters is achievable for 0 < p ≤ 2 [2, 23]. In this work, we focus on this feasible regime for p and consider the following question: what exactly ... [Footnote 1] Note that for constant p bounded away from 0, which is the focus of our work, (1±ε)-approxi...

191 | Pseudorandom generator for space bounded computation
- Nisan
- 1990
Citation Context ...er bound for Lp-estimation, 0 < p < 2. In particular, we give an improved derandomization of Indyk’s algorithm [23] to use k-wise independence for small k as opposed to Nisan’s pseudorandom generator [36] against space-bounded computation. Our improved derandomization allows for an implementation using $O(\varepsilon^{-2}\log(mM))$ bits of space. An algorithm achieving this bound was previously known only for p = 2...
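
The excerpt above contrasts k-wise independence with a generic pseudorandom generator. As a hedged illustration of what a k-wise independent family looks like, the following sketch uses the textbook construction: a uniformly random polynomial of degree k − 1 over a prime field. The function names `poly_hash` and `make_kwise_hash` are hypothetical, and this is not claimed to be the exact family used in the paper.

```python
import random
from itertools import product


def poly_hash(coeffs, prime):
    """Return h(x) = sum(coeffs[d] * x**d) mod prime, evaluated by Horner's rule."""
    def h(x: int) -> int:
        acc = 0
        for c in reversed(coeffs):
            acc = (acc * x + c) % prime
        return acc
    return h


def make_kwise_hash(k: int, prime: int, rng: random.Random):
    """Draw one member of the family: k uniform coefficients in [0, prime).

    Over the random choice of coefficients, the values of h at any k
    distinct points of the field are independent and uniform.
    """
    coeffs = [rng.randrange(prime) for _ in range(k)]
    return poly_hash(coeffs, prime)


if __name__ == "__main__":
    # Exhaustive check of pairwise (k = 2) independence over the field Z_5:
    # across all 25 polynomials, each value pair (h(1), h(3)) occurs once.
    counts = {}
    for coeffs in product(range(5), repeat=2):
        h = poly_hash(list(coeffs), 5)
        counts[(h(1), h(3))] = counts.get((h(1), h(3)), 0) + 1
    print(sorted(set(counts.values())), len(counts))
```

Describing one member of this family takes only k field elements, which is why a small k keeps the seed short compared with the seed of a general-purpose generator.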

179 | A method for simulating stable random variables - Chambers, Mallows, et al. - 1976

162 | Information statistics approach to data stream and communication complexity
- Bar-Yossef, Jayram, et al.
- 2004
Citation Context ...ment of x. A large body of work has been done in this area, see, e.g., the references in [24, 35]. It is known that not all Lp norms can be efficiently approximated in a data stream. In particular, [5, 8] show that polynomial space in n, m is required for p > 2, whereas space polylogarithmic in these parameters is achievable for 0 < p ≤ 2 [2, 23]. In this work, we focus on this feasible regime for p...

149 | One-Dimensional Stable Distributions
- Zolotarev
- 1986
Citation Context ...space $O(\varepsilon^{-2}\log(mM))$ and output (1 ± ε)||x||p with probability at least 7/8. To understand the first step of Figure 1, we recall the definition of a p-stable distribution. Definition 2.1. (Zolotarev [41]) For 0 < p < 2, there exists a probability distribution $D_p$ called the p-stable distribution with $E[e^{itZ}] = e^{-|t|^p}$ for $Z \sim D_p$. For any n and vector $x \in \mathbb{R}^n$, if $Z_1, \ldots, Z_n \sim D_p$ are independent, the...
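
The definition quoted above fixes the p-stable distribution through its characteristic function. A short sketch, assuming the standard Chambers-Mallows-Stuck formula for the symmetric case (the method cited elsewhere in this list), shows how such variables can be simulated and checks the characteristic function empirically at t = 1. The function names `sample_p_stable` and `empirical_cf` and the sample sizes are illustrative assumptions.

```python
import math
import random


def sample_p_stable(p: float, rng: random.Random) -> float:
    """Draw one symmetric p-stable variable Z with E[exp(itZ)] = exp(-|t|**p).

    Chambers-Mallows-Stuck formula for 0 < p <= 2: theta is uniform on
    (-pi/2, pi/2) and w is an independent Exp(1) variable. For p = 1 this
    reduces to tan(theta), a standard Cauchy variable.
    """
    theta = rng.uniform(-math.pi / 2, math.pi / 2)
    w = rng.expovariate(1.0)
    left = math.sin(p * theta) / math.cos(theta) ** (1.0 / p)
    right = (math.cos((1.0 - p) * theta) / w) ** ((1.0 - p) / p)
    return left * right


def empirical_cf(p: float, t: float, samples: int, seed: int) -> float:
    """Monte Carlo estimate of E[cos(tZ)], the real part of E[exp(itZ)]."""
    rng = random.Random(seed)
    total = sum(math.cos(t * sample_p_stable(p, rng)) for _ in range(samples))
    return total / samples


if __name__ == "__main__":
    for p in (0.5, 1.0, 1.5, 2.0):
        print(p, empirical_cf(p, 1.0, 20000, seed=7), math.exp(-1.0))
```

At t = 1 the target value exp(−|t|^p) equals exp(−1) for every p, so the printed estimates should all sit near the same number.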

132 | Sketch-based change detection: Methods, evaluation, and applications
- Krishnamurthy, Sen, et al.
- 2003
Citation Context ...to the L1 norm) [14], cascaded norm estimation of a matrix [26], and network traffic monitoring [13]. L2 estimation is useful for database query optimization [1] and network traffic anomaly detection [28]. Both L1 and L2 estimation subroutines are used in approximate histogram maintenance [19]. Norm estimation for fractional p was shown useful for mining tabular data in [11] (p = 0.5 and p = 0.25 were...

124 | Selection and sorting with limited storage
- Munro, Paterson
- 1980
Citation Context ...and so algorithms must be both approximate and probabilistic. This model is known as the streaming model and has become popular in the theory community, dating back to the works of Munro and Paterson [34] and Flajolet and Martin [15], and resurging with the work of Alon, Matias, and Szegedy [2]. For a survey of results, see the book by Muthukrishnan [35], or notes from Indyk’s course [24]. A fundament...

110 | Tracking join and self-join sizes in limited storage
- Alon, Gibbons, et al.
- 1999
Citation Context ...k approximation of a matrix (with respect to the L1 norm) [14], cascaded norm estimation of a matrix [26], and network traffic monitoring [13]. L2 estimation is useful for database query optimization [1] and network traffic anomaly detection [28]. Both L1 and L2 estimation subroutines are used in approximate histogram maintenance [19]. Norm estimation for fractional p was shown useful for mining tabu...

107 | Fast, small-space algorithms for approximate histogram maintenance
- Gilbert, Guha, et al.
- 2002
Citation Context ...oring [13]. L2 estimation is useful for database query optimization [1] and network traffic anomaly detection [28]. Both L1 and L2 estimation subroutines are used in approximate histogram maintenance [19]. Norm estimation for fractional p was shown useful for mining tabular data in [11] (p = 0.5 and p = 0.25 were specifically suggested), and Lp estimation for fractional p near 1 is used as a subroutin...

92 | An approximate L1 difference algorithm for massive data streams
- Feigenbaum, Kannan, et al.
Citation Context ...proximation [22], approximate linear regression and best rank-k approximation of a matrix (with respect to the L1 norm) [14], cascaded norm estimation of a matrix [26], and network traffic monitoring [13]. L2 estimation is useful for database query optimization [1] and network traffic anomaly detection [28]. Both L1 and L2 estimation subroutines are used in approximate histogram maintenance [19]. Norm...

91 | Stable distributions: Models for Heavy Tailed Data - Nolan - 2002

73 | Near-optimal lower bounds on the multi-party communication complexity of set disjointness
- Chakrabarti, Khot, et al.
- 2003
Citation Context ...ment of x. A large body of work has been done in this area, see, e.g., the references in [24, 35]. It is known that not all Lp norms can be efficiently approximated in a data stream. In particular, [5, 8] show that polynomial space in n, m is required for p > 2, whereas space polylogarithmic in these parameters is achievable for 0 < p ≤ 2 [2, 23]. In this work, we focus on this feasible regime for p...

70 | Probabilistic counting
- Flajolet, Martin
- 1983
Citation Context ...h approximate and probabilistic. This model is known as the streaming model and has become popular in the theory community, dating back to the works of Munro and Paterson [34] and Flajolet and Martin [15], and resurging with the work of Alon, Matias, and Szegedy [2]. For a survey of results, see the book by Muthukrishnan [35], or notes from Indyk’s course [24]. A fundamental problem in this area is th...

56 | Numerical linear algebra in the streaming model
- Clarkson, Woodruff
- 2009
Citation Context ...strict turnstile model (where no $x_i$ can ever be negative). A discussion is in Section 3. Variants of our techniques were also useful for obtaining tight bounds for linear algebra problems in a stream [10] and in compressed sensing [3]. 1.2 Notation. For integer z > 0, [z] denotes the set {1, . . . , z}. All our space bounds are measured in bits. The variables n, m, M denote vector length, stream length...

56 | Optimal space lower bounds for all frequency moments
- Woodruff
- 2004
Citation Context ... logarithmic dependence on n), showing that trivial solutions are already nearly optimal for such small ε. The previous lower bound was $\Omega(\min\{N, \varepsilon^{-2} + \log N\})$, and is the result of a sequence of work [2, 6, 38]. See [25, 39] for simpler proofs. Given our lower bound and algorithm above, and the L2-estimation algorithm of Alon, Matias, and Szegedy [2], the space complexity of Lp-estimation is now resolved fo...

53 | A near-optimal algorithm for computing the entropy of a stream - Chakrabarti, Cormode, et al. - 2007

44 | Algorithms for dynamic geometric problems over data streams
- Indyk
- 2004
Citation Context ...on for 0 < p ≤ 2? We remark that streaming approximations to Lp in this range are interesting for several reasons. L1 estimation is used as a subroutine for dynamic earthmover distance approximation [22], approximate linear regression and best rank-k approximation of a matrix (with respect to the L1 norm) [14], cascaded norm estimation of a matrix [26], and network traffic monitoring [13]. L2 estimat...

40 | Storing a sparse table with O(1) worst case access time - Fredman, Komlós, Szemerédi - 1984

40 | The best constants in the Khintchine inequality - Haagerup - 1982

33 | Lower bounds for sparse recovery
- Ba, Indyk, et al.
- 2010
Citation Context ...o $x_i$ can ever be negative). A discussion is in Section 3. Variants of our techniques were also useful for obtaining tight bounds for linear algebra problems in a stream [10] and in compressed sensing [3]. 1.2 Notation. For integer z > 0, [z] denotes the set {1, . . . , z}. All our space bounds are measured in bits. The variables n, m, M denote vector length, stream length, and the maximum absolute val...

20 | Fast mining of massive tabular data via approximate distance computations
- Cormode, Indyk, et al.
Citation Context ... traffic anomaly detection [28]. Both L1 and L2 estimation subroutines are used in approximate histogram maintenance [19]. Norm estimation for fractional p was shown useful for mining tabular data in [11] (p = 0.5 and p = 0.25 were specifically suggested), and Lp estimation for fractional p near 1 is used as a subroutine for estimating empirical entropy, which in turn is again useful for network traff...

20 | Bounded Independence Fools Halfspaces
- Diakonikolas, Gopalan, et al.
- 2009
Citation Context ...e remark here that our techniques seem possibly applicable to other derandomization questions. For example, using our techniques one can give an alternative proof of one of the key components used in [12] to show that bounded independence fools halfspaces. In particular, for $a \in \mathbb{R}^n$ with $\|a\|_2 = 1$ and $\theta \in \mathbb{R}$, consider the function $f_{a,\theta}(x) = \mathrm{sgn}(\langle a, x\rangle - \theta)$ where $x \in \{-1, 1\}^n$. The work of [12] showed ...

16 | Unimodality of infinitely divisible distribution functions of class
- Yamazato
- 1978
Citation Context ...pactness, $\mu_p$ takes on some minimum value $\eta_p$ in the interval [−2, 2]. Furthermore, $\eta_p > 0$ (strict inequality) since $\mu_p > 0$ everywhere (this follows since it is known that $\mu_p$ is unimodal with mode zero [40], and is non-zero for large |x| by Lemma 2.1). Then (2.9) $E\left[I_{[-1+\varepsilon,\,1-\varepsilon]}\left(z_i/\|x\|_p\right)\right] \le \tfrac{1}{2} - \eta_p\varepsilon = \tfrac{1}{2} - \Theta(\varepsilon)$ and (2.10) $E\left[I_{[-1-\varepsilon,\,1+\varepsilon]}\left(z_i/\|x\|_p\right)\right] \ge \tfrac{1}{2} + \eta_p\varepsilon = \tfrac{1}{2} + \Theta(\varepsilon)$. Now if we let Z = 1 ...

13 | The sketching complexity of pattern matching
- Bar-Yossef, Jayram, et al.
- 2004
Citation Context ...on m). We give Alice a string $x \in \{0, 1\}^t$, and Bob both an index $i \in [t]$ together with $x_{i+1}, \ldots, x_t$. This problem requires Ω(t) bits of communication if Alice sends only a single message to Bob [4, 32]. Alice splits x into $b = \varepsilon^2 t$ equal-sized blocks $X_0, \ldots, X_{b-1}$. In the j-th block she uses the $\varepsilon^{-2}$ bits in that block to create a stream $S_{X_j}$ that is similar to what she would have created in the in...

13 | On estimating frequency moments of data streams
- Cormode, Ganguly
- 2007
Citation Context .../ε)/log log(1/ε)) suffices, thus yielding an algorithm which also has optimal space, but with improved update time. Other work on Lp estimation for 0 < p ≤ 2 includes the work of Ganguly and Cormode [18], which requires a suboptimal $O(\varepsilon^{-(2+p)}\log^{O(1)}(mM))$ bits of space, but at the benefit of requiring $\log^{O(1)}(mM)$ update time independent of ε. We remark here that our techniques seem possibly appli...

12 | Coresets and sketches for high dimensional subspace approximation problems
- Feldman, Monemizadeh, et al.
Citation Context ...easons. L1 estimation is used as a subroutine for dynamic earthmover distance approximation [22], approximate linear regression and best rank-k approximation of a matrix (with respect to the L1 norm) [14], cascaded norm estimation of a matrix [26], and network traffic monitoring [13]. L2 estimation is useful for database query optimization [1] and network traffic anomaly detection [28]. Both L1 and L2...

12 | Sketching and streaming entropy via approximation theory
- Harvey, Nelson, Onak
- 2008
Citation Context ...e specifically suggested), and Lp estimation for fractional p near 1 is used as a subroutine for estimating empirical entropy, which in turn is again useful for network traffic anomaly detection (see [21] and the references therein). Also, Lp estimation for all 0 < p ≤ 2 is used as a subroutine for weighted sampling in turnstile streams [33]. 1.1 Contributions. We resolve the space complexity of Lp-est...

12 | Improving compressed counting - Li

11 | The data stream space complexity of cascaded norms
- Jayram, Woodruff
- 2009
Citation Context ...ne for dynamic earthmover distance approximation [22], approximate linear regression and best rank-k approximation of a matrix (with respect to the L1 norm) [14], cascaded norm estimation of a matrix [26], and network traffic monitoring [13]. L2 estimation is useful for database query optimization [1] and network traffic anomaly detection [28]. Both L1 and L2 estimation subroutines are used in approxi...

11 | Efficient and Private Distance Approximation in the Communication and Streaming Models
- Woodruff
- 2007
Citation Context ...endence on n), showing that trivial solutions are already nearly optimal for such small ε. The previous lower bound was $\Omega(\min\{N, \varepsilon^{-2} + \log N\})$, and is the result of a sequence of work [2, 6, 38]. See [25, 39] for simpler proofs. Given our lower bound and algorithm above, and the L2-estimation algorithm of Alon, Matias, and Szegedy [2], the space complexity of Lp-estimation is now resolved for all 0 < p ≤ ...

10 | Estimators and tail bounds for dimension reduction in lp (0 < p ≤ 2) using stable random projections
- Li
- 2008
Citation Context ...og(mM)) bits of space. An algorithm achieving this bound was previously known only for p = 2 [2]. In the case of 0 < p < 2, the previously most space-efficient algorithms are due to Indyk [23] and Li [30], both requiring $O(\varepsilon^{-2}\log(mM)\log(N))$ space with N = min{n, m}. A more prudent analysis of the seed length Nisan’s generator requires to fool Indyk’s algorithm can give a space bound of $O(\varepsilon^{-2}\log(m$...

9 | A multi-round communication lower bound for gap hamming and some consequences
- Brody, Chakrabarti
- 2009
Citation Context ... logarithmic dependence on n), showing that trivial solutions are already nearly optimal for such small ε. The previous lower bound was $\Omega(\min\{N, \varepsilon^{-2} + \log N\})$, and is the result of a sequence of work [2, 6, 38]. See [25, 39] for simpler proofs. Given our lower bound and algorithm above, and the L2-estimation algorithm of Alon, Matias, and Szegedy [2], the space complexity of Lp-estimation is now resolved fo...

8 | The one-way communication complexity of gap hamming distance
- Jayram, Kumar, et al.
Citation Context ...endence on n), showing that trivial solutions are already nearly optimal for such small ε. The previous lower bound was $\Omega(\min\{N, \varepsilon^{-2} + \log N\})$, and is the result of a sequence of work [2, 6, 38]. See [25, 39] for simpler proofs. Given our lower bound and algorithm above, and the L2-estimation algorithm of Alon, Matias, and Szegedy [2], the space complexity of Lp-estimation is now resolved for all 0 < p ≤ ...

6 | One-dimensional Stable Distributions. Vol. 65 of Translations of Mathematical Monographs
- Zolotarev
- 1986
Citation Context ...space $O(\varepsilon^{-2}\log(mM))$ and output (1 ± ε)||x||p with probability at least 7/8. To understand the first step of Figure 1, we recall the definition of a p-stable distribution. Definition 2.1. (Zolotarev [41]) For 0 < p < 2, there exists a probability distribution $D_p$ called the p-stable distribution with $E[e^{itZ}] = e^{-|t|^p}$ for $Z \sim D_p$. For any n and vector $x \in \mathbb{R}^n$, if $Z_1, \ldots, Z_n \sim D_p$ are independent, the...

5 | On data structures and asymmetric communication complexity
- Miltersen, Nisan, et al.
- 1998
Citation Context ...on m). We give Alice a string $x \in \{0, 1\}^t$, and Bob both an index $i \in [t]$ together with $x_{i+1}, \ldots, x_t$. This problem requires Ω(t) bits of communication if Alice sends only a single message to Bob [4, 32]. Alice splits x into $b = \varepsilon^2 t$ equal-sized blocks $X_0, \ldots, X_{b-1}$. In the j-th block she uses the $\varepsilon^{-2}$ bits in that block to create a stream $S_{X_j}$ that is similar to what she would have created in the in...

4 | Sketching, streaming and sublinear-space algorithms. Graduate course notes available at http://stellar.mit.edu/S/course/6/fa07/6.895
- Indyk
- 2007
Citation Context ...and Paterson [34] and Flajolet and Martin [15], and resurging with the work of Alon, Matias, and Szegedy [2]. For a survey of results, see the book by Muthukrishnan [35], or notes from Indyk’s course [24]. A fundamental problem in this area is that of norm estimation [2]. Formally, we have a vector $x = (x_1, \ldots, x_n)$ initialized as $x = \vec{0}$, and a stream of m updates, where an update (i, v) ∈ [n] × {−...
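
The update model in the excerpt above can be illustrated with an exact baseline. The sketch below assumes the usual turnstile convention that an update (i, v) means "add v to coordinate i" (the excerpt is cut off before it says so), and the function name `exact_lp_norm` is hypothetical. It stores the vector explicitly, which is precisely the linear-space cost that small-space sketches are designed to avoid.

```python
from collections import defaultdict
from typing import Iterable, Tuple


def exact_lp_norm(updates: Iterable[Tuple[int, int]], p: float) -> float:
    """Replay a stream of (i, v) updates and return ||x||_p exactly.

    Assumption: each update adds v to x[i], starting from the zero vector.
    Memory grows with the number of distinct coordinates touched.
    """
    x = defaultdict(int)
    for i, v in updates:
        x[i] += v
    # For 0 < p < 1 this is not a norm, but the formula is still well defined.
    return sum(abs(c) ** p for c in x.values()) ** (1.0 / p)


if __name__ == "__main__":
    stream = [(1, 3), (2, -1), (2, -3), (5, 7), (5, -7)]
    print(exact_lp_norm(stream, 1.0), exact_lp_norm(stream, 2.0))
```

Comparing a sketch's output against this baseline on small inputs is a simple way to sanity-check an approximate estimator.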

2 | Saddle-point integration of C∞ “bump” functions. Manuscript. Available at http://math.mit.edu/~stevenj/bump-saddle.pdf - Johnson - 2007

1 | Single pass relative-error lp sampling with applications
- Monemizadeh, Woodruff
- 2010
Citation Context ...n is again useful for network traffic anomaly detection (see [21] and the references therein). Also, Lp estimation for all 0 < p ≤ 2 is used as a subroutine for weighted sampling in turnstile streams [33]. 1.1 Contributions. We resolve the space complexity of Lp-estimation for 0 < p ≤ 2 up to constant factors. In particular, the space complexity is $\Theta(\varepsilon^{-2}\log(mM) + \log\log n)$ bits. For p strictly less t...