Results 1  10
of
28
Network Applications of Bloom Filters: A Survey
 Internet Mathematics
, 2002
"... Abstract. ABloomfilter is a simple spaceefficient randomized data structure for representing a set in order to support membership queries. Bloom filters allow false positives but the space savings often outweigh this drawback when the probability of an error is controlled. Bloom filters have been u ..."
Abstract

Cited by 491 (16 self)
 Add to MetaCart
(Show Context)
Abstract. ABloomfilter is a simple spaceefficient randomized data structure for representing a set in order to support membership queries. Bloom filters allow false positives but the space savings often outweigh this drawback when the probability of an error is controlled. Bloom filters have been used in database applications since the 1970s, but only in recent years have they become popular in the networking literature. The aim of this paper is to survey the ways in which Bloom filters have been used and modified in a variety of network problems, with the aim of providing a unified mathematical and practical framework for understanding them and stimulating their use in future applications. 1.
Measurement, Modeling, and Analysis of a PeertoPeer FileSharing Workload
, 2003
"... Peertopeer (P2P) file sharing accounts for an astonishing volume of current Internet tra#c. This paper probes deeply into modern P2P file sharing systems and the forces that drive them. By doing so, we seek to increase our understanding of P2P file sharing workloads and their implications for futu ..."
Abstract

Cited by 474 (7 self)
 Add to MetaCart
(Show Context)
Peertopeer (P2P) file sharing accounts for an astonishing volume of current Internet tra#c. This paper probes deeply into modern P2P file sharing systems and the forces that drive them. By doing so, we seek to increase our understanding of P2P file sharing workloads and their implications for future multimedia workloads. Our research uses a threetiered approach. First, we analyze a 200day trace of over 20 terabytes of Kazaa P2P tra#c collected at the University of Washington. Second, we develop a model of multimedia workloads that lets us isolate, vary, and explore the impact of key system parameters. Our model, which we parameterize with statistics from our trace, lets us confirm various hypotheses about filesharing behavior observed in the trace. Third, we explore the potential impact of localityawareness in Kazaa.
The Bloomier filter: An efficient data structure for static support lookup tables
 in Proc. Symposium on Discrete Algorithms
, 2004
"... “Oh boy, here is another David Nelson” ..."
(Show Context)
Controlling the Cost of Reliability in PeertoPeer Overlays
, 2003
"... Structured peertopeer overlay networks provide a useful substrate for building distributed applications but there are general concerns over the cost of maintaining these overlays. The current approach is to configure the overlays statically and conservatively to achieve the desired reliability eve ..."
Abstract

Cited by 91 (5 self)
 Add to MetaCart
(Show Context)
Structured peertopeer overlay networks provide a useful substrate for building distributed applications but there are general concerns over the cost of maintaining these overlays. The current approach is to configure the overlays statically and conservatively to achieve the desired reliability even under uncommon adverse conditions. This results in high cost in the common case, or poor reliability in worse than expected conditions. We analyze the cost of overlay maintenance in realistic dynamic environments and design novel techniques to reduce this cost by adapting to the operating conditions. With our techniques, the concerns over the overlay maintenance cost are no longer warranted. Simulations using real traces show that they enable high reliability and performance even in very adverse conditions with low maintenance cost.
Theory and network applications of dynamic bloom filters
 In Proceedings of the 25th Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM
, 2006
"... Abstract — A bloom filter is a simple, spaceefficient, randomized data structure for concisely representing a static data set, in order to support approximate membership queries. It has great potential for distributed applications where systems need to share information about what resources they ha ..."
Abstract

Cited by 38 (6 self)
 Add to MetaCart
(Show Context)
Abstract — A bloom filter is a simple, spaceefficient, randomized data structure for concisely representing a static data set, in order to support approximate membership queries. It has great potential for distributed applications where systems need to share information about what resources they have. The space efficiency is achieved at the cost of a small probability of false positive in membership queries. However, for many applications the space savings and short locating time consistently outweigh this drawback. In this paper, we introduce dynamic bloom filters (DBF) to support concise representation and approximate membership queries of dynamic sets, and study the false positive probability and union algebra operations. We prove that DBF can control the false positive probability at a low level by adjusting the number of standard bloom filters used according to the actual size of current dynamic set. The space complexity is also acceptable if the actual size of dynamic set does not deviate too much from the predefined threshold. Furthermore, we present multidimension dynamic bloom filters (MDDBF) to support concise representation and approximate membership queries of dynamic sets in multiple attribute dimensions, and study the false positive probability and union algebra operations through mathematic analysis and experimentation. We also explore the optimization approach and three network applications of bloom filters, namely bloom joins, informed search, and global index implementation. Our simulation shows that informed search based on bloom filters can obtain higher recall and success rate of query than the blind search protocol.
The Dynamic Bloom Filters
 In Proc. IEEE Infocom
, 2006
"... Abstract—A Bloom filter is an effective, spaceefficient data structure for concisely representing a set and supporting approximate membership queries. Traditionally, the Bloom filter and its variants just focus on how to represent a static set and decrease the false positive probability to a suffic ..."
Abstract

Cited by 25 (3 self)
 Add to MetaCart
(Show Context)
Abstract—A Bloom filter is an effective, spaceefficient data structure for concisely representing a set and supporting approximate membership queries. Traditionally, the Bloom filter and its variants just focus on how to represent a static set and decrease the false positive probability to a sufficiently low level. By investigating mainstream applications based on the Bloom filter, we reveal that dynamic data sets are more common and important than static sets. However, existing variants of the Bloom filter cannot support dynamic data sets well. To address this issue, we propose dynamic Bloom filters to represent dynamic sets as well as static sets and design necessary item insertion, membership query, item deletion, and filter union algorithms. The dynamic Bloom filter can control the false positive probability at a low level by expanding its capacity as the set cardinality increases. Through comprehensive mathematical analysis, we show that the dynamic Bloom filter uses less expected memory than the Bloom filter when representing dynamic sets with an upper bound on set cardinality, and also that the dynamic Bloom filter is more stable than the Bloom filter due to infrequent reconstruction when addressing dynamic sets without an upper bound on set cardinality. Moreover, the analysis results hold in standalone applications as well as distributed applications. Index Terms—Bloom filters, dynamic Bloom filters, information representation.
Bloomier filters: A second look
 Algorithms  ESA 2008, 16th Annual European Symposium
"... Abstract. A Bloom filter is a space efficient structure for storing static sets, where the space efficiency is gained at the expense of a small probability of falsepositives. A Bloomier filter generalizes a Bloom filter to compactly store a function with a static support. In this article we give a ..."
Abstract

Cited by 12 (0 self)
 Add to MetaCart
Abstract. A Bloom filter is a space efficient structure for storing static sets, where the space efficiency is gained at the expense of a small probability of falsepositives. A Bloomier filter generalizes a Bloom filter to compactly store a function with a static support. In this article we give a simple construction of a Bloomier filter. The construction is linear in space and requires constant time to evaluate. The creation of our Bloomier filter takes linear time which is faster than the existing construction. We show how one can improve the space utilization further at the cost of increasing the time for creating the data structure. 1
Scooped, Again
 in Second International Workshop on PeertoPeer Systems (IPTPS 2003), ser. Lecture Notes in Computer Science
, 2003
"... The PeertoPeer (p2p) and Grid infrastructure communities are tackling an overlapping set of problems. In addressing these problems, p2p solutions are usually motivated by elegance or research interest. Grid researchers, under pressure from thousands of scientists with real file sharing and computa ..."
Abstract

Cited by 9 (0 self)
 Add to MetaCart
(Show Context)
The PeertoPeer (p2p) and Grid infrastructure communities are tackling an overlapping set of problems. In addressing these problems, p2p solutions are usually motivated by elegance or research interest. Grid researchers, under pressure from thousands of scientists with real file sharing and computational needs, are pooling their solutions from a wide range of sources in an attempt to meet user demand. Driven by this need to solve large scientific problems quickly, the Grid is being constructed with the tools at hand: FTP or RPC for data transfer, centralization for scheduling and authentication, and an assumption of correct, obediant nodes. If history is any guide, the World Wide Web depicts viscerally that systems that address user needs can have enormous staying power and affect future research. The Grid infrastructure is a great customer waiting for future p2p products. By no means should we make them our only customers, but we should at least put them on the list. If p2p research does not at least address the Grid, it may eventually be sidelined by defacto distributed algorithms that are less elegant but were used to solve Grid problems. In essense, we'll have been scooped, again.
Query protocols for highly resilient peertopeer networks
 In Proc. of ISCA PDCS’06
, 2006
"... Abstract — The decentralized and ad hoc nature of peertopeer (P2P) networks means that both the structure of the network, and the content stored within it are highly variable. Realworld studies indicate that only a small number of peers remain persistent over significant time periods, and that the ..."
Abstract

Cited by 9 (0 self)
 Add to MetaCart
(Show Context)
Abstract — The decentralized and ad hoc nature of peertopeer (P2P) networks means that both the structure of the network, and the content stored within it are highly variable. Realworld studies indicate that only a small number of peers remain persistent over significant time periods, and that the perceived importance of objects stored in the network, measured in terms of access or update frequency, may not follow a uniform distribution. In this paper, we present WARP, a P2P system that exploits these distinctions as an integral part of its design. WARP employs a novel faulttolerant mechanism to manage the dynamic nature of node arrivals and departures by allowing multiple physical nodes to service data mapped to a single node in the overlay. Moreover, the overlay supports different query types, distinguishing queries to popular or valuable data from queries to unpopular or less valuable data. We prove via a rigorous stochastic analysis that any query, regardless of type, will be successfully serviced with high probability. Further, we show that for a network with ¢ nodes, the hop complexity of the protocol is £¥¤§¦©¨���¢� � with high probability. We also define bandwidth complexity, a measure of congestion at any node, and prove that it is £¥¤§¦©¨�����¢� � with high probability. We provide a detailed simulation of the system and show that it conforms closely to our theoretical guarantees. I.
False Negative Problem of Counting Bloom Filter
"... Abstract—Bloom filter is effective, spaceefficient data structure for concisely representing a data set and supporting approximate membership queries. Traditionally, researchers often believe that it is possible that a Bloom filter returns a false positive, but it will never return a false negative ..."
Abstract

Cited by 8 (2 self)
 Add to MetaCart
(Show Context)
Abstract—Bloom filter is effective, spaceefficient data structure for concisely representing a data set and supporting approximate membership queries. Traditionally, researchers often believe that it is possible that a Bloom filter returns a false positive, but it will never return a false negative under wellbehaved operations. By investigating the mainstream variants, however, we observe that a Bloom filter does return false negatives in many scenarios. In this work, we show that the undetectable incorrect deletion of false positive items and detectable incorrect deletion of multiaddress items are two general causes of false negative in a Bloom filter. We then measure the potential and exposed false negatives theoretically and practically. Inspired by the fact that the potential false negatives are usually not fully exposed, we propose a novel Bloom filter scheme, which increases the ratio of bits set to a value larger than one without decreasing the ratio of bits set to zero. Mathematical analysis and comprehensive experiments show that this design can reduce the number of exposed false negatives as well as decrease the likelihood of false positives. To the best of our knowledge, this is the first work dealing with both the false positive and false negative problems of Bloom filter systematically when supporting standard usages of item insertion, query, and deletion operations. Index Terms—Bloom filter, false negative, multichoice counting Bloom filter. Ç 1