Results 1 - 7 of 7
An Improved Construction for Counting Bloom Filters
- 14th Annual European Symposium on Algorithms, LNCS 4168, 2006
Cited by 14 (3 self)
A counting Bloom filter (CBF) generalizes a Bloom filter data structure so as to allow membership queries on a set that can be changing dynamically via insertions and deletions. As with a Bloom filter, a CBF obtains space savings by allowing false positives. We provide a simple hashing-based alternative based on d-left hashing called a d-left CBF (dlCBF). The dlCBF offers the same functionality as a CBF, but uses less space, generally saving a factor of two or more. We describe the construction of dlCBFs, provide an analysis, and demonstrate their effectiveness experimentally.
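For orientation, the baseline structure that this abstract improves on can be sketched as follows. This is a minimal illustration of a plain counting Bloom filter, not the d-left construction from the paper; the class name, the default sizes, and the use of salted SHA-256 as the hash family are all assumptions made for the example.

```python
import hashlib


class CountingBloomFilter:
    """Plain counting Bloom filter: m counters and k hash functions (illustrative only)."""

    def __init__(self, num_counters=1024, num_hashes=4):
        self.m = num_counters
        self.k = num_hashes
        self.counters = [0] * num_counters

    def _positions(self, item):
        # Assumption: derive k counter indices by salting SHA-256 with the hash index.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def insert(self, item):
        for pos in self._positions(item):
            self.counters[pos] += 1

    def delete(self, item):
        # Only meaningful for items that were inserted earlier.
        for pos in self._positions(item):
            if self.counters[pos] > 0:
                self.counters[pos] -= 1

    def __contains__(self, item):
        # May report false positives, never false negatives for inserted items.
        return all(self.counters[pos] > 0 for pos in self._positions(item))
```

Replacing the single bits of a Bloom filter with counters is what makes deletion possible; the space cost of those counters is the overhead the dlCBF is reported to reduce.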
Tight Bounds for Parallel Randomized Load Balancing
- Computing Research Repository, 1992
Cited by 2 (0 self)
We explore the fundamental limits of distributed balls-into-bins algorithms, i.e., algorithms where balls act in parallel, as separate agents. This problem was introduced by Adler et al., who showed that non-adaptive and symmetric algorithms cannot reliably perform better than a maximum bin load of Θ(log log n / log log log n) within the same number of rounds. We present an adaptive symmetric algorithm that achieves a bin load of two in log* n + O(1) communication rounds using O(n) messages in total. Moreover, larger bin loads can be traded in for smaller time complexities. We prove a matching lower bound of (1−o(1)) log* n on the time complexity of symmetric algorithms that guarantee small bin loads at an asymptotically optimal message complexity of O(n). The essential preconditions of the proof are (i) a limit of O(n) on the total number of messages sent by the algorithm and (ii) anonymity of bins, i.e., the port numberings of balls are not globally consistent. In order to show that our technique yields indeed tight bounds, we provide for each assumption an algorithm violating it, in turn achieving a constant maximum bin load in constant time. As an application, we consider the following problem. Given a fully connected graph of n nodes, where each node needs to send and receive up to n messages, and in each round each node may send one message over each link, deliver all messages as quickly as possible to their destinations. We give a simple and robust algorithm of time complexity O(log* n) for this task and provide a generalization to the case where all nodes initially hold arbitrary sets of messages. Completing the picture, we give a less practical, but asymptotically optimal algorithm terminating within O(1) rounds. All these bounds hold with high probability.
Balls and Bins with Structure: Balanced Allocations on Hypergraphs
Cited by 1 (0 self)
In the standard balls-and-bins model of balanced allocations, m balls are placed sequentially into n bins. Each ball chooses d uniform-random bins and is placed in the least loaded bin. It is well known that when d = log^{Θ(1)} n, after placing m = n balls, the maximum load (number of balls in a bin) is Θ(1) w.h.p. In this paper we show that as long as d = Ω(log n), independent random choices are not necessary to achieve a constant load balance: these choices may be structured in a very general way. Specifically, we allow each ball i to have an associated random set of bins B_i. We require that |B_i| = Ω(log n) and that bins are included in B_i with approximately the same probability; but the distributions of the B_i are otherwise arbitrary, so that there may be correlations in the choice of bins. We show that this model captures structure important to two applications, nearby server selection and load balance in distributed hash tables.
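The standard model described at the start of this abstract can be simulated in a few lines. The sketch below covers only the classical process with independent uniform choices, not the structured choice sets B_i studied in the paper; the function name `allocate`, its parameters, and the with-replacement sampling are assumptions made for illustration.

```python
import random


def allocate(n_balls, n_bins, d, seed=0):
    """Place balls one at a time, each into the least loaded of d random bins.

    Returns the list of final bin loads. Choices are drawn independently and
    uniformly with replacement (an assumption of this sketch).
    """
    rng = random.Random(seed)
    loads = [0] * n_bins
    for _ in range(n_balls):
        choices = [rng.randrange(n_bins) for _ in range(d)]
        # Ties are broken in favour of the first sampled bin with minimum load.
        best = min(choices, key=loads.__getitem__)
        loads[best] += 1
    return loads
```

Calling `max(allocate(n, n, d))` for a few values of d gives a quick empirical view of how the maximum load shrinks as the number of choices grows.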
Balanced Allocations: Balls-into-Bins Revisited and Chains-into-Bins
- 2008
Cited by 1 (0 self)
The study of balls-into-bins games or occupancy problems has a long history since these processes can be used to translate realistic problems into mathematical ones in a natural way. In general, the goal of a balls-into-bins game is to allocate a set of independent objects (tasks, jobs, balls) to a set of resources (servers, bins, urns) and, thereby, to minimize the maximum load. In this paper we show two results. First, we analyse the maximum load for the chains-into-bins problem where we have n bins and the balls are connected in n/ℓ chains of length ℓ. In this process, the balls of one chain have to be allocated to ℓ consecutive bins. We allow each chain d bin choices made independently and uniformly at random (i.u.r.). The chain is allocated using the rule that the maximum load of any bin receiving a ball of that chain is minimized. We show that, for d ≥ 2, the maximum load is (ln ln(n/ℓ))/ln d + O(1) with probability 1 − O(1/ln ln(n/ℓ)). This shows that the maximum load is decreasing with increasing chain length. Secondly, we analyse for which number of random choices d and which number of balls m < n, the maximum load of an off-line assignment can be upper bounded by one. This holds, for example, for m < 0.97677 · n and d = 4.
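The chains-into-bins allocation rule from this abstract can be sketched as a small simulation. Several details are assumptions of this example and are not stated in the abstract: each random choice is read as the starting bin of the chain, the ℓ consecutive bins wrap around modulo n, and ties go to the first sampled start. The function name `chains_into_bins` is hypothetical.

```python
import random


def chains_into_bins(n_bins, chain_len, d, seed=0):
    """Allocate n_bins // chain_len chains, each covering chain_len consecutive bins.

    For every chain, d starting bins are sampled uniformly at random and the
    start whose window has the smallest maximum load is chosen.
    Assumes chain_len <= n_bins and wrap-around indexing (both assumptions).
    """
    rng = random.Random(seed)
    loads = [0] * n_bins

    def window_max(start):
        # Largest current load among the bins this chain would occupy.
        return max(loads[(start + j) % n_bins] for j in range(chain_len))

    for _ in range(n_bins // chain_len):
        starts = [rng.randrange(n_bins) for _ in range(d)]
        best = min(starts, key=window_max)
        for j in range(chain_len):
            loads[(best + j) % n_bins] += 1
    return loads
```

Comparing `max(chains_into_bins(n, l, 2))` across chain lengths is one way to explore the stated trend that longer chains lead to a smaller maximum load.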
STATEMENT OF RESEARCH INTERESTS
The information age has witnessed a proliferation of personal data available in digital form that can be stored and analyzed. On the one hand, this has led to the development of data mining tools that aim to infer useful trends from this data. But, on the other hand, easy access to personal data poses a threat to individual privacy. My main research interest lies in the area of designing algorithms for protecting the privacy of individuals in such large data sets while still allowing users to mine useful trends and statistics. PRIVACY PROTECTION: My work on privacy has spanned different fields such as algorithms, data mining, cryptography, and statistics. I have focused on the algorithmic problems mainly for three different privacy settings. The first two settings below are interactive, where the user queries the database through a privacy mechanism and the next setting is non-interactive, where the data is sanitized and then released. (a) Online Query Auditing: Consider a data set consisting of private information about individuals. Given a sequence of queries posed about the data, the online query auditing problem is to determine when queries should be denied to prevent privacy breaches. A related problem is the offline auditing problem where one is given a sequence of queries and all of their true answers and the goal is to determine if a privacy breach has already occurred. My research uncovers the fundamental issue that solutions to the offline auditing problem cannot be directly used to solve the online auditing problem since query denials may leak information.
Adaptation to Robot Failures and Shape Change in Decentralized Construction
Our prior work [1] presented a decentralized algorithm for coordinating the construction of a truss structure out of multiple components. In this paper, we discuss adaptation in decentralized construction. We partition construction into two tasks, tool delivery and assembly. Each task is performed by a networked team of specialized robots. We analyze the performance of the algorithms using the balls into bins problem, and show their adaptation to failure of robots, dynamic constraints, multiple types of elements and reconfiguration. The algorithms can be used for general types of source elements.
Tight Bounds for Randomized Load Balancing on Arbitrary Network Topologies
, 1201
We consider the problem of balancing load items (tokens) on networks. Starting with an arbitrary load distribution, we allow in each round nodes to exchange tokens with their neighbors. The goal is to achieve a distribution where all nodes have nearly the same number of tokens. For the continuous case where tokens are arbitrarily divisible, most load balancing schemes correspond to Markov chains whose convergence is rather well-understood in terms of their spectral gap. However, since for many applications load items cannot be divided arbitrarily, we focus on the discrete case where the load is composed of indivisible tokens. Unfortunately, this discretization entails a non-linear behavior due to its rounding errors, which makes the analysis much harder than in the continuous case. Therefore, it has been a major open problem to understand the limitations of discrete load balancing and its relation to the continuous case. We investigate several randomized protocols for different communication models in the discrete case. Our results demonstrate that there is almost no deviation between the discrete

