Results 1-10 of 11
A Derandomized Sparse Johnson-Lindenstrauss Transform
Abstract

Cited by 15 (4 self)
Recent work of [Dasgupta-Kumar-Sarlós, STOC 2010] gave a sparse Johnson-Lindenstrauss transform and left as a main open question whether their construction could be efficiently derandomized. We answer their question affirmatively by giving an alternative proof of their result requiring only bounded independence hash functions. Furthermore, the sparsity bound obtained in our proof is improved. The main ingredient in our proof is a spectral moment bound for quadratic forms that was recently used in [Diakonikolas-Kane-Nelson, FOCS 2010].
Sparser Johnson-Lindenstrauss Transforms
Abstract

Cited by 11 (5 self)
We give two different constructions for dimensionality reduction in ℓ2 via linear mappings that are sparse: only an O(ε)-fraction of entries in each column of our embedding matrices are non-zero to achieve distortion 1+ε with high probability, while still achieving the asymptotically optimal number of rows. These are the first constructions to provide subconstant sparsity for all values of parameters. Both constructions are also very simple: a vector can be embedded in two for loops. Such distributions can be used to speed up applications where ℓ2 dimensionality reduction is used.
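The "two for loops" embedding mentioned above can be sketched concretely. The construction below is only an illustration: the function name, and the choice of scattering each input coordinate into s uniformly random rows with independent random signs, are assumptions for exposition, not the paper's exact distributions (which fix k and s precisely).

```python
import math
import random

def sparse_jl_embed(x, k, s, rng):
    """Map x in R^d to R^k with a sparse sign matrix applied on the fly:
    each input coordinate is scattered into s distinct random rows with
    independent random signs, then the result is scaled by 1/sqrt(s)."""
    y = [0.0] * k
    for j, xj in enumerate(x):                # outer loop: input coordinates
        for r in rng.sample(range(k), s):     # inner loop: s target rows
            y[r] += rng.choice((-1.0, 1.0)) * xj
    return [v / math.sqrt(s) for v in y]
```

For a 1-sparse unit input the output has exactly s non-zero entries of magnitude 1/sqrt(s), so its ℓ2 norm is preserved exactly; for general inputs the norm is preserved up to 1 ± ε with high probability for suitable k and s.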
Almost optimal explicit Johnson-Lindenstrauss transformations
In Proceedings of the 15th International Workshop on Randomization and Computation (RANDOM), 2011
Abstract

Cited by 4 (2 self)
The Johnson-Lindenstrauss lemma is a fundamental result in probability with several applications in the design and analysis of algorithms. Constructions of linear embeddings satisfying the Johnson-Lindenstrauss property necessarily involve randomness, and much attention has been given to obtaining explicit constructions minimizing the number of random bits used. In this work we give explicit constructions with an almost optimal use of randomness: for 0 < ε, δ < 1/2, we obtain explicit generators G: {0,1}^r → R^{s×d} for s = O(log(1/δ)/ε^2) such that for all d-dimensional vectors w of Euclidean norm 1, ...
Sketching and Streaming High-Dimensional Vectors
, 2011
Abstract

Cited by 1 (0 self)
A sketch of a dataset is a small-space data structure supporting some pre-specified set of queries (and possibly updates) while consuming space substantially sublinear in the space required to actually store all the data. Furthermore, it is often desirable, or required by the application, that the sketch itself be computable by a small-space algorithm given just one pass over the data, a so-called streaming algorithm. Sketching and streaming have found numerous applications in network traffic monitoring, data mining, trend detection, sensor networks, and databases. In this thesis, I describe several new contributions in the area of sketching and streaming algorithms.
• The first space-optimal streaming algorithm for the distinct elements problem. Our algorithm also achieves O(1) update and reporting times.
• A streaming algorithm for Hamming norm estimation in the turnstile model which achieves the best known space complexity.
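The one-pass, small-space pattern described above can be illustrated with a classical example. The snippet below is a textbook bottom-k (KMV) distinct-elements sketch, not the space-optimal algorithm this thesis contributes; SHA-256 serves as a deterministic stand-in for a random hash function.

```python
import hashlib
import heapq

def kmv_distinct_estimate(stream, k=64):
    """Bottom-k (KMV) sketch: keep the k smallest hash values seen in one
    pass. If the k-th smallest, as a fraction u of the hash range, then
    the number of distinct elements is estimated as (k - 1) / u."""
    heap = []    # max-heap (via negation) holding the k smallest hashes
    kept = set() # hash values currently retained, for duplicate skipping
    for item in stream:
        h = int.from_bytes(
            hashlib.sha256(str(item).encode()).digest()[:8], "big")
        if h in kept:
            continue
        if len(heap) < k:
            heapq.heappush(heap, -h)
            kept.add(h)
        elif h < -heap[0]:                     # smaller than current max
            kept.discard(-heapq.heappushpop(heap, -h))
            kept.add(h)
    if len(heap) < k:
        return len(heap)       # fewer than k distinct items: exact count
    u = -heap[0] / 2**64       # k-th smallest hash as a fraction of range
    return (k - 1) / u
```

The sketch uses O(k) space regardless of stream length and is mergeable: taking the k smallest hashes of the union of two sketches gives the sketch of the combined stream.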
Sparser Johnson-Lindenstrauss Transforms
Abstract
We give two different Johnson-Lindenstrauss distributions, each with column sparsity s = Θ(ε^{-1} log(1/δ)) and embedding into optimal dimension k = O(ε^{-2} log(1/δ)) to achieve distortion 1 ± ε with probability 1 − δ. That is, only an O(ε)-fraction of entries are non-zero in each embedding matrix in the supports of our distributions. These are the first distributions to provide o(k) sparsity for all values of ε, δ. Previously the best known construction obtained s = Θ̃(ε^{-1} log^2(1/δ)) [Dasgupta-Kumar-Sarlós, STOC 2010]. In addition, one of our distributions can be sampled from a seed of O(log(1/δ) log d) uniform random bits. Some applications that use Johnson-Lindenstrauss embeddings as a black box, such as those in approximate numerical linear algebra ([Sarlós, FOCS 2006], [Clarkson-Woodruff, STOC 2009]), require exponentially small δ. Our linear dependence on log(1/δ) in the sparsity is thus crucial in these applications to obtain speedup.
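Since both bounds in the abstract share the log(1/δ) factor, the ratio s/k is Θ(ε) regardless of δ. A toy computation, with all hidden asymptotic constants set to 1 (an assumption purely for illustration), makes the "O(ε)-fraction of entries are non-zero" claim concrete:

```python
import math

def jl_params(eps, delta):
    # Hidden asymptotic constants are set to 1 here (an assumption);
    # only the ratio s/k is meaningful, and it equals eps exactly.
    L = math.log(1.0 / delta)
    k = L / eps ** 2   # target dimension k = O(eps^-2 log(1/delta))
    s = L / eps        # column sparsity s = Theta(eps^-1 log(1/delta))
    return k, s

k, s = jl_params(0.1, 1e-9)
# s / k equals eps: each column is an eps-fraction non-zero
```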
Beating the Direct Sum Theorem in Communication Complexity with Implications for Sketching
Abstract
A direct sum theorem for two parties and a function f states that the communication cost of solving k copies of f simultaneously with error probability 1/3 is at least k · R_{1/3}(f), where R_{1/3}(f) is the communication required to solve a single copy of f with error probability 1/3. We improve this for a natural family of functions f, showing that the one-way communication required to solve k copies of f simultaneously with probability 2/3 is Ω(k · R_{1/k}(f)). Since R_{1/k}(f) may be as large as Ω(R_{1/3}(f) · log k), we asymptotically beat the direct sum bound for such functions, showing that the trivial upper bound of solving each of the k copies of f with probability 1 − O(1/k) and taking a union bound is optimal! In order to achieve this, our direct sum involves a novel measure of information cost which allows a protocol to abort with constant probability, and otherwise must be correct with very high probability. Moreover, for the functions considered, we show strong lower bounds on the communication cost of protocols with these relaxed guarantees; indeed, our lower bounds match those for protocols that are not allowed to abort. In the distributed and streaming models, where one wants to be correct not only on a single query, but simultaneously on a sequence of n queries, we obtain optimal lower bounds on the communication or space complexity. Lower bounds obtained from our direct sum result show that a number of techniques in the sketching literature are optimal, including the following:
• (JL transform) Lower bound of Ω(ε^{-2} log(n/δ)) on the dimension of (oblivious) Johnson-Lindenstrauss transforms.
• (ℓ_p-estimation) Lower bound of Ω(nε^{-2} log(n/δ)(log d + log M)) for the size of encodings of n vectors in [±M]^d that allow ℓ1- or ℓ2-estimation.
• (Matrix sketching) Lower bound of Ω(ε^{-2} log(n/δ)) on the dimension of a matrix sketch S satisfying the entrywise guarantee |(AS S^T B)_{i,j} − (AB)_{i,j}| ≤ ε‖A_i‖_2‖B_j‖_2.
• (Database joins) Lower bound of Ω(nε^{-2} log(n/δ) log M) for sketching frequency vectors of n tables in a database, each with M records, in order to allow join size estimation.
Sparser Johnson-Lindenstrauss Transforms
Abstract
We give two different and simple constructions for dimensionality reduction in ℓ2 via linear mappings that are sparse: only an O(ε)-fraction of entries in each column of our embedding matrices are non-zero to achieve distortion 1 + ε with high probability, while still achieving the asymptotically optimal number of rows. These are the first constructions to provide subconstant sparsity for all values of parameters, improving upon previous works of Achlioptas (JCSS 2003) and Dasgupta, Kumar, and Sarlós (STOC 2010). Such distributions can be used to speed up applications where ℓ2 dimensionality reduction is used.
Sparser Johnson-Lindenstrauss Transforms
Abstract
We give two different Johnson-Lindenstrauss distributions, each with column sparsity s = Θ(ε^{-1} log(1/δ)) and embedding into optimal dimension k = O(ε^{-2} log(1/δ)) to achieve distortion 1 ± ε with probability 1 − δ. That is, only an O(ε)-fraction of entries are non-zero in each embedding matrix in the supports of our distributions. These are the first distributions to provide o(k) sparsity for all values of ε, δ. Previously the best known construction obtained s = Θ̃(ε^{-1} log^2(1/δ)) [Dasgupta-Kumar-Sarlós, STOC 2010]. One of our distributions can be sampled using O(log(1/δ) log d) random bits. Some applications that use Johnson-Lindenstrauss embeddings as a black box, such as those in approximate numerical linear algebra ([Sarlós, FOCS 2006], [Clarkson-Woodruff, STOC 2009]), require exponentially small δ. Our linear dependence on log(1/δ) in the sparsity is thus crucial in these applications to obtain speedup.