Quasirandom Rumor Spreading
In Proc. of SODA’08, 2008
Cited by 24 (10 self)
We propose and analyse a quasirandom analogue to the classical push model for disseminating information in networks (“randomized rumor spreading”). In the classical model, in each round each informed node chooses a neighbor at random and informs it. Results of Frieze and Grimmett (Discrete Appl. Math. 1985) show that this simple protocol succeeds in spreading a rumor from one node of a complete graph to all others within O(log n) rounds. For the network being a hypercube or a random graph G(n, p) with p ≥ (1+ε)(log n)/n, also O(log n) rounds suffice (Feige, Peleg, Raghavan, and Upfal, Random Struct. Algorithms 1990). In the quasirandom model, we assume that each node has a (cyclic) list of its neighbors. Once informed, it starts at a random position of the list, but from then on informs its neighbors in the order of the list. Surprisingly, irrespective of the orders of the lists, the above-mentioned bounds still hold. In addition, we also show an O(log n) bound for sparsely connected random graphs G(n, p) with p = (log n + f(n))/n, where f(n) → ∞ and f(n) = O(log log n). Here, the classical model needs Θ(log^2 n) rounds. Hence the quasirandom model achieves similar or better broadcasting times with a greatly reduced use of random bits.
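The two protocols described in this abstract can be compared with a small simulation. The following is a minimal sketch, not the authors' code: the function names, the use of a complete graph, and the choice to let newly informed nodes start pushing in the round after they learn the rumor are all assumptions made for illustration.

```python
import random

def complete_graph(n):
    """Adjacency lists of the complete graph on nodes 0..n-1."""
    return [[u for u in range(n) if u != v] for v in range(n)]

def rounds_to_inform_all(adj, start, rng, quasirandom=False):
    """Number of synchronous push rounds until every node is informed.

    Classical model: each informed node picks a uniformly random neighbor.
    Quasirandom model: each node picks a random starting position in its
    (cyclic) neighbor list once, then walks the list in order.
    Assumes the graph is connected.
    """
    n = len(adj)
    informed = {start}
    pointer = {}  # next list position per node (quasirandom model only)
    if quasirandom:
        pointer[start] = rng.randrange(len(adj[start]))
    rounds = 0
    while len(informed) < n:
        rounds += 1
        newly = set()
        for v in informed:
            if quasirandom:
                target = adj[v][pointer[v]]
                pointer[v] = (pointer[v] + 1) % len(adj[v])
            else:
                target = rng.choice(adj[v])
            if target not in informed:
                newly.add(target)
        if quasirandom:
            for u in newly:
                pointer[u] = rng.randrange(len(adj[u]))
        informed |= newly
    return rounds

if __name__ == "__main__":
    adj = complete_graph(1024)
    print("classical:  ", rounds_to_inform_all(adj, 0, random.Random(1)))
    print("quasirandom:", rounds_to_inform_all(adj, 0, random.Random(1), quasirandom=True))
```

In the quasirandom run the only randomness a node uses is its single starting position, which is the "greatly reduced use of random bits" the abstract refers to.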
Rumour spreading and graph conductance
In Proceedings of the 21st ACM-SIAM Symposium on Discrete Algorithms (SODA), 2010
Cited by 21 (2 self)
We show that if a connected graph with n nodes has conductance φ then rumour spreading, also known as randomized broadcast, successfully broadcasts a message within O(log^4 n / φ^6) many steps, with high probability, using the PUSH-PULL strategy. An interesting feature of our approach is that it draws a connection between rumour spreading and the spectral sparsification procedure of Spielman and Teng [23].
Differentially Private Data Cubes: Optimizing Noise Sources and Consistency
Cited by 15 (2 self)
Data cubes play an essential role in data analysis and decision support. In a data cube, data from a fact table is aggregated on subsets of the table’s dimensions, forming a collection of smaller tables called cuboids. When the fact table includes sensitive data such as salary or diagnosis, publishing even a subset of its cuboids may compromise individuals’ privacy. In this paper, we address this problem using differential privacy (DP), which provides provable privacy guarantees for individuals by adding noise to query answers. We choose an initial subset of cuboids to compute directly from the fact table, injecting DP noise as usual; and then compute the remaining cuboids from the initial set. Given a fixed privacy guarantee, we show that it is NP-hard to choose the initial set of cuboids so that the maximal noise over all published cuboids is minimized, or so that the number of cuboids with noise below a given threshold (precise cuboids) is maximized. We provide an efficient procedure with running time polynomial in the number of cuboids to select the initial set of cuboids, such that the maximal noise in all published cuboids will be within a factor (ln|L| + 1)^2 of the optimal, where |L| is the number of cuboids to be published, or the number of precise cuboids will be within a factor (1 − 1/e) of the optimal. We also show how to enforce consistency in the published cuboids while simultaneously improving their utility (reducing error). In an empirical evaluation on real and synthetic data, we report the amounts of error of different publishing algorithms, and show that our approaches outperform baselines significantly.
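The basic publishing pattern in this abstract (add noise to one cuboid computed from the fact table, then derive a coarser cuboid from it) can be sketched as follows. This is an illustration under stated assumptions, not the paper's selection procedure: it assumes count cuboids where one individual contributes one row (so Laplace noise of scale 1/epsilon per cell is used for a single cuboid), and the function names and the toy `dept`/`sex` table are hypothetical.

```python
import random
from collections import Counter
from itertools import product

def laplace(scale, rng):
    """Laplace(0, scale) sample, drawn as the difference of two i.i.d. exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_cuboid(rows, dims, domains, epsilon, rng):
    """Count rows grouped on `dims`, adding Laplace(1/epsilon) noise to every cell.

    Noise is added to all cells of the domain, including empty ones.
    Assumption: each individual contributes exactly one row.
    """
    counts = Counter(tuple(row[d] for d in dims) for row in rows)
    cells = product(*(domains[d] for d in dims))
    return {cell: counts[cell] + laplace(1.0 / epsilon, rng) for cell in cells}

def roll_up(cuboid, keep):
    """Derive a coarser cuboid by summing out every position not listed in `keep`."""
    out = {}
    for cell, value in cuboid.items():
        key = tuple(cell[i] for i in keep)
        out[key] = out.get(key, 0.0) + value
    return out

if __name__ == "__main__":
    rows = [{"dept": "A", "sex": "F"}, {"dept": "A", "sex": "M"},
            {"dept": "B", "sex": "F"}, {"dept": "A", "sex": "F"}]
    domains = {"dept": ["A", "B"], "sex": ["F", "M"]}
    base = noisy_cuboid(rows, ("dept", "sex"), domains, epsilon=1.0, rng=random.Random(0))
    print(roll_up(base, keep=[0]))  # noisy counts per department
```

A rolled-up cell accumulates the noise of every base cell summed into it, which is why the choice of the initial cuboids affects the maximal noise in the published ones.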
Improved approximation of linear threshold functions
In Proc. 24th Annual IEEE Conference on Computational Complexity (CCC), 2009
Cited by 8 (4 self)
We prove two main results on how arbitrary linear threshold functions f(x) = sign(w · x − θ) over the n-dimensional Boolean hypercube can be approximated by simple threshold functions. Our first result shows that every n-variable threshold function f is ε-close to a threshold function depending only on Inf(f)^2 · poly(1/ε) many variables, where Inf(f) denotes the total influence or average sensitivity of f. This is an exponential sharpening of Friedgut’s well-known theorem [Fri98], which states that every Boolean function f is ε-close to a function depending only on 2^{O(Inf(f)/ε)} many variables, for the case of threshold functions. We complement this upper bound by showing that Ω(Inf(f)^2 + 1/ε^2) many variables are required for ε-approximating threshold functions. Our second result is a proof that every n-variable threshold function is ε-close to a threshold function with integer weights at most poly(n) · 2^{Õ(1/ε^{2/3})}. This is an improvement, in the dependence on the error parameter ε, on an earlier result of [Ser07] which gave a poly(n) · 2^{Õ(1/ε^2)} bound. Our improvement is obtained via a new proof technique that uses strong anti-concentration bounds from probability theory. The new technique also gives a simple and modular proof of the original [Ser07] result, and extends to give low-weight approximators for threshold functions under a range of probability distributions other than the uniform distribution.
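To make the quantities in this abstract concrete, the sketch below evaluates a linear threshold function and computes its total influence Inf(f) by brute force. It is only an illustration of the definitions: the function names are hypothetical, inputs are taken from {-1, +1}^n, sign(0) is assumed to be +1, and the enumeration over all 2^n inputs is feasible only for small n.

```python
from itertools import product

def ltf(w, theta):
    """Return f(x) = sign(w . x - theta) on {-1, +1}^n, taking sign(0) as +1."""
    def f(x):
        return 1 if sum(wi * xi for wi, xi in zip(w, x)) - theta >= 0 else -1
    return f

def total_influence(f, n):
    """Sum over coordinates i of Pr_x[f(x) != f(x with coordinate i flipped)]."""
    flips = 0
    for x in product((-1, 1), repeat=n):
        fx = f(x)
        for i in range(n):
            y = x[:i] + (-x[i],) + x[i + 1:]
            flips += fx != f(y)
    return flips / 2 ** n

if __name__ == "__main__":
    majority3 = ltf((1, 1, 1), 0)
    dictator = ltf((1, 0, 0), 0)
    print(total_influence(majority3, 3))  # 1.5
    print(total_influence(dictator, 3))   # 1.0
```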
Backyard Cuckoo Hashing: Constant Worst-Case Operations with a Succinct Representation
2010
Cited by 7 (3 self)
The performance of a dynamic dictionary is measured mainly by its update time, lookup time, and space consumption. In terms of update time and lookup time there are known constructions that guarantee constant-time operations in the worst case with high probability, and in terms of space consumption there are known constructions that use essentially optimal space. In this paper we settle two fundamental open problems:
• We construct the first dynamic dictionary that enjoys the best of both worlds: we present a two-level variant of cuckoo hashing that stores n elements using (1+ϵ)n memory words, and guarantees constant-time operations in the worst case with high probability. Specifically, for any ϵ = Ω((log log n / log n)^{1/2}) and for any sequence of polynomially many operations, with high probability over the randomness of the initialization phase, all operations are performed in constant time which is independent of ϵ. The construction is based on augmenting cuckoo hashing with a “backyard” that handles a large fraction of the elements, together with a de-amortized perfect hashing scheme for eliminating the dependency on ϵ.
Directed Spanners via Flow-Based Linear Programs
2011
Cited by 7 (2 self)
We examine directed spanners through flow-based linear programming relaxations. We design an Õ(n^{2/3})-approximation algorithm for the directed k-spanner problem that works for all k ≥ 1, which is the first sublinear approximation for arbitrary edge-lengths. Even in the more restricted setting of unit edge-lengths, our algorithm improves over the previous Õ(n^{1−1/k}) approximation [BGJ+09] when k ≥ 4. For the special case of k = 3 we design a different algorithm achieving an Õ(√n)-approximation, improving the previous Õ(n^{2/3}) [EP05, BGJ+09] (independently of our work, an Õ(n^{1−1/⌈k/2⌉})-approximation was recently devised [BRR10]). Both of our algorithms easily extend to the fault-tolerant setting, which has recently attracted attention but not from an approximation viewpoint. We also prove a nearly matching integrality gap of Ω̃(n^{1/3−ε}) for every constant ε > 0. A virtue of all our algorithms is that they are relatively simple. Technically, we introduce a new yet natural flow-based relaxation, and show how to approximately solve it even when its size is not polynomial. The main challenge is to design a rounding scheme that “coordinates” the choices of flow-paths between the many demand pairs while using few edges overall. We achieve this, roughly speaking, by randomization at the level of vertices.
Fault-tolerant spanners: Better and simpler
In PODC, 2011
Cited by 7 (2 self)
A natural requirement for many distributed structures is fault-tolerance: after some failures in the underlying network, whatever remains from the structure should still be effective for whatever remains from the network. In this paper we examine spanners of general graphs that are tolerant to vertex failures, and significantly improve their dependence on the number of faults r for all stretch bounds. For stretch k ≥ 3 we design a simple transformation that converts every k-spanner construction with at most f(n) edges into an r-fault-tolerant k-spanner construction with at most O(r^3 log n) · f(2n/r) edges. Applying this to standard greedy spanner constructions gives r-fault-tolerant k-spanners with Õ(r^2 n^{1+2/(k+1)}) edges. The previous construction by Chechik, Langberg, Peleg, and Roditty [CLPR09] depends similarly on n but exponentially on r (approximately like k^r). For the case of k = 2 and unit edge-lengths, an O(r log n)-approximation is known from recent work of Dinitz and Krauthgamer [DK11], in which several spanner results are obtained using a common approach of rounding a natural flow-based linear programming relaxation. Here we use a different (stronger) LP relaxation and improve the approximation ratio to O(log n), which is, notably, independent of the number of faults r. We further strengthen this bound in terms of the maximum degree by using the Lovász Local Lemma. Finally, we show that most of our constructions are inherently local by designing equivalent distributed algorithms in the LOCAL model of distributed computation.
Fast Set Intersection in Memory
Cited by 6 (1 self)
Set intersection is a fundamental operation in information retrieval and database systems. This paper introduces linear space data structures to represent sets such that their intersection can be computed in a worst-case efficient way. In general, given k (preprocessed) sets, with n elements in total, we will show how to compute their intersection in expected time O(n/√w + kr), where r is the intersection size and w is the number of bits in a machine-word. In addition, we introduce a very simple version of this algorithm that has weaker asymptotic guarantees but performs even better in practice; both algorithms outperform the state-of-the-art techniques for both synthetic and real data sets and workloads.
Mistake bounds for maximum entropy discrimination
In Advances in Neural Information Processing Systems, 2004
Cited by 6 (0 self)
We establish a mistake bound for an ensemble method for classification based on maximizing the entropy of voting weights subject to margin constraints. The bound is the same as a general bound proved for the Weighted Majority Algorithm, and similar to bounds for other variants of Winnow. We prove a more refined bound that leads to a nearly optimal algorithm for learning disjunctions, again, based on the maximum entropy principle. We describe a simplification of the online maximum entropy method in which, after each iteration, the margin constraints are replaced with a single linear inequality. The simplified algorithm, which takes a similar form to Winnow, achieves the same mistake bounds.
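As background for the comparison with Winnow, the following sketch shows a classic Winnow-style learner for monotone disjunctions and counts its mistakes. It is not the maximum entropy method of this paper: it is the elimination variant usually called Winnow1 (threshold n, weights doubled on a missed positive and zeroed on a false positive), and the function name and the pass-until-clean training loop are assumptions made for illustration.

```python
from itertools import product

def winnow1(examples, n, max_passes=100):
    """Run Winnow1 over (x, label) pairs until one full pass has no mistakes.

    x is a 0/1 tuple of length n and label is 0 or 1.
    Returns (weights, total number of mistakes).
    """
    theta = n
    w = [1.0] * n
    mistakes = 0
    for _ in range(max_passes):
        clean = True
        for x, label in examples:
            active_weight = sum(wi for wi, xi in zip(w, x) if xi)
            pred = 1 if active_weight >= theta else 0
            if pred == label:
                continue
            clean = False
            mistakes += 1
            for i in range(n):
                if x[i]:
                    # Promote on a missed positive, eliminate on a false positive.
                    w[i] = w[i] * 2 if label == 1 else 0.0
        if clean:
            break
    return w, mistakes

if __name__ == "__main__":
    n = 8
    # Target concept: x0 OR x3, presented over all 2^n inputs.
    examples = [(x, int(x[0] or x[3])) for x in product((0, 1), repeat=n)]
    weights, mistakes = winnow1(examples, n)
    print(weights, mistakes)
```

For a disjunction of k out of n variables, the number of mistakes grows with k log n rather than with n, which is the style of bound the abstract compares against.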
Routing in undirected graphs with constant congestion
CoRR
Cited by 5 (2 self)
Given an undirected graph G = (V, E), a collection (s_1, t_1), ..., (s_k, t_k) of k demand pairs, and an integer c, the goal in the Edge Disjoint Paths with Congestion problem is to connect the maximum possible number of the demand pairs by paths, so that the maximum load on any edge (called edge congestion) does not exceed c. We show an efficient randomized algorithm to route Ω(OPT / poly log k) demand pairs with congestion at most 14, where OPT is the maximum number of pairs that can be simultaneously routed on edge-disjoint paths. The best previous algorithm that routed Ω(OPT / poly log n) pairs required congestion poly(log log n), and for the setting where the maximum allowed congestion is bounded by a constant c, the best previous algorithms could only guarantee the routing of OPT/n^{O(1/c)} pairs. We also introduce a new type of vertex sparsifiers that we call integral flow sparsifiers, that approximately preserve both fractional and integral routings, and show an algorithm to construct such sparsifiers.