Measurement, Modeling, and Analysis of a Peer-to-Peer File-Sharing Workload
, 2003
Cited by 333 (6 self)
Peer-to-peer (P2P) file sharing accounts for an astonishing volume of current Internet traffic. This paper probes deeply into modern P2P file sharing systems and the forces that drive them. By doing so, we seek to increase our understanding of P2P file sharing workloads and their implications for future multimedia workloads. Our research uses a three-tiered approach. First, we analyze a 200-day trace of over 20 terabytes of Kazaa P2P traffic collected at the University of Washington. Second, we develop a model of multimedia workloads that lets us isolate, vary, and explore the impact of key system parameters. Our model, which we parameterize with statistics from our trace, lets us confirm various hypotheses about file-sharing behavior observed in the trace. Third, we explore the potential impact of locality-awareness in Kazaa.
Network Applications of Bloom Filters: A Survey
- Internet Mathematics
, 2002
Cited by 257 (12 self)
A Bloom filter is a simple space-efficient randomized data structure for representing a set in order to support membership queries. Bloom filters allow false positives but the space savings often outweigh this drawback when the probability of an error is controlled. Bloom filters have been used in database applications since the 1970s, but only in recent years have they become popular in the networking literature. The aim of this paper is to survey the ways in which Bloom filters have been used and modified in a variety of network problems, with the aim of providing a unified mathematical and practical framework for understanding them and stimulating their use in future applications.
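The membership-query behavior described in this abstract can be illustrated with a minimal Python sketch. It is not taken from the survey: the class name `BloomFilter`, the parameters `m` (bits) and `k` (hash functions), the SHA-256 double-hashing scheme, and the helper `false_positive_rate` are all assumptions made for illustration. The helper uses the standard approximation (1 - e^(-kn/m))^k for the false-positive probability after n insertions.

```python
import hashlib
import math


class BloomFilter:
    """Illustrative Bloom filter with m bits and k hash functions (hypothetical names)."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m
        self.k = k
        self.bits = bytearray((m + 7) // 8)  # m bits, all initially zero

    def _positions(self, item: str) -> list:
        # Assumption: derive k bit positions by double hashing one SHA-256 digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item: str) -> None:
        # Set the k bits selected for this item.
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item: str) -> bool:
        # True if all k bits are set: may be a false positive, never a false negative.
        return all((self.bits[p // 8] >> (p % 8)) & 1 for p in self._positions(item))


def false_positive_rate(m: int, k: int, n: int) -> float:
    """Standard approximation of the false-positive probability after n insertions."""
    return (1.0 - math.exp(-k * n / m)) ** k


if __name__ == "__main__":
    bf = BloomFilter(m=1024, k=7)
    for i in range(100):
        bf.add(f"item-{i}")
    print("item-3" in bf)  # an inserted item is always reported as present
    print(false_positive_rate(1024, 7, 100))
```

Every inserted item is reported as present, while an item that was never inserted is reported as present only with the small probability estimated by `false_positive_rate`. That is the space-for-accuracy trade-off the abstract refers to.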
Controlling the Cost of Reliability in Peer-to-Peer Overlays
, 2003
Cited by 65 (4 self)
Structured peer-to-peer overlay networks provide a useful substrate for building distributed applications but there are general concerns over the cost of maintaining these overlays. The current approach is to configure the overlays statically and conservatively to achieve the desired reliability even under uncommon adverse conditions. This results in high cost in the common case, or poor reliability in worse than expected conditions. We analyze the cost of overlay maintenance in realistic dynamic environments and design novel techniques to reduce this cost by adapting to the operating conditions. With our techniques, the concerns over the overlay maintenance cost are no longer warranted. Simulations using real traces show that they enable high reliability and performance even in very adverse conditions with low maintenance cost.
The Bloomier Filter: An Efficient Data Structure for Static Support Lookup Tables
- In Proceedings of the Fifteenth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA
, 2004
Cited by 47 (0 self)
We introduce the Bloomier filter, a data structure for compactly encoding a function with static support in order to support approximate evaluation queries. Our construction generalizes the classical Bloom filter, an ingenious hashing scheme heavily used in networks and databases, whose main attribute -- space efficiency -- is achieved at the expense of a tiny false-positive rate. Whereas Bloom filters can handle only set membership queries, our Bloomier filters can deal with arbitrary functions. We give several designs varying in simplicity and optimality, and we provide lower bounds to prove the (near) optimality of our constructions.
Theory and network applications of dynamic bloom filters
- In Proceedings of the 25th Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM
, 2006
Cited by 17 (4 self)
A bloom filter is a simple, space-efficient, randomized data structure for concisely representing a static data set, in order to support approximate membership queries. It has great potential for distributed applications where systems need to share information about what resources they have. The space efficiency is achieved at the cost of a small probability of false positives in membership queries. However, for many applications the space savings and short locating time consistently outweigh this drawback. In this paper, we introduce dynamic bloom filters (DBF) to support concise representation and approximate membership queries of dynamic sets, and study the false positive probability and union algebra operations. We prove that DBF can control the false positive probability at a low level by adjusting the number of standard bloom filters used according to the actual size of the current dynamic set. The space complexity is also acceptable if the actual size of the dynamic set does not deviate too much from the predefined threshold. Furthermore, we present multidimension dynamic bloom filters (MDDBF) to support concise representation and approximate membership queries of dynamic sets in multiple attribute dimensions, and study the false positive probability and union algebra operations through mathematical analysis and experimentation. We also explore the optimization approach and three network applications of bloom filters, namely bloom joins, informed search, and global index implementation. Our simulation shows that informed search based on bloom filters can obtain higher recall and query success rate than the blind search protocol.
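The idea of adjusting the number of standard bloom filters to the size of the set can be sketched roughly as follows. This is a simplified illustration and not the authors' algorithm: the names `_StandardFilter` and `DynamicBloomFilter`, the per-filter `capacity` threshold, and the hashing scheme are assumptions, and deletion and union operations are omitted.

```python
import hashlib


class _StandardFilter:
    """Fixed-size bloom filter that counts its insertions (hypothetical helper)."""

    def __init__(self, m: int, k: int) -> None:
        self.m, self.k = m, k
        self.bits = bytearray((m + 7) // 8)
        self.count = 0  # number of items inserted so far

    def _positions(self, item: str) -> list:
        # Assumption: k positions via double hashing of one SHA-256 digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item: str) -> None:
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)
        self.count += 1

    def __contains__(self, item: str) -> bool:
        return all((self.bits[p // 8] >> (p % 8)) & 1 for p in self._positions(item))


class DynamicBloomFilter:
    """Grows by appending a fresh standard filter once the active one is full."""

    def __init__(self, m: int, k: int, capacity: int) -> None:
        self.m, self.k, self.capacity = m, k, capacity
        self.filters = [_StandardFilter(m, k)]

    def add(self, item: str) -> None:
        active = self.filters[-1]
        if active.count >= self.capacity:
            # Active filter reached its threshold: start a new one.
            active = _StandardFilter(self.m, self.k)
            self.filters.append(active)
        active.add(item)

    def __contains__(self, item: str) -> bool:
        # An item is reported present if any constituent filter reports it.
        return any(item in f for f in self.filters)


if __name__ == "__main__":
    dbf = DynamicBloomFilter(m=256, k=4, capacity=10)
    for i in range(35):
        dbf.add(f"resource-{i}")
    print(len(dbf.filters))      # number of standard filters currently in use
    print("resource-7" in dbf)   # inserted items are always found
```

Because a query must consult every constituent filter, the overall false-positive probability grows with the number of filters. In this sketch the per-filter `capacity` threshold is the knob that keeps each filter's own rate low.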
Scooped, Again
- in Second International Workshop on Peer-to-Peer Systems (IPTPS 2003), ser. Lecture Notes in Computer Science
, 2003
Cited by 6 (0 self)
The Peer-to-Peer (p2p) and Grid infrastructure communities are tackling an overlapping set of problems. In addressing these problems, p2p solutions are usually motivated by elegance or research interest. Grid researchers, under pressure from thousands of scientists with real file sharing and computational needs, are pooling their solutions from a wide range of sources in an attempt to meet user demand. Driven by this need to solve large scientific problems quickly, the Grid is being constructed with the tools at hand: FTP or RPC for data transfer, centralization for scheduling and authentication, and an assumption of correct, obedient nodes. If history is any guide, the World Wide Web depicts viscerally that systems that address user needs can have enormous staying power and affect future research. The Grid infrastructure is a great customer waiting for future p2p products. By no means should we make them our only customers, but we should at least put them on the list. If p2p research does not at least address the Grid, it may eventually be sidelined by de facto distributed algorithms that are less elegant but were used to solve Grid problems. In essence, we'll have been scooped, again.
The Dynamic Bloom Filters
- In Proc. IEEE Infocom
, 2006
Cited by 5 (0 self)
A Bloom filter is an effective, space-efficient data structure for concisely representing a set and supporting approximate membership queries. Traditionally, the Bloom filter and its variants just focus on how to represent a static set and decrease the false positive probability to a sufficiently low level. By investigating mainstream applications based on the Bloom filter, we reveal that dynamic data sets are more common and important than static sets. However, existing variants of the Bloom filter cannot support dynamic data sets well. To address this issue, we propose dynamic Bloom filters to represent dynamic sets as well as static sets and design necessary item insertion, membership query, item deletion, and filter union algorithms. The dynamic Bloom filter can control the false positive probability at a low level by expanding its capacity as the set cardinality increases. Through comprehensive mathematical analysis, we show that the dynamic Bloom filter uses less expected memory than the Bloom filter when representing dynamic sets with an upper bound on set cardinality, and also that the dynamic Bloom filter is more stable than the Bloom filter due to infrequent reconstruction when addressing dynamic sets without an upper bound on set cardinality. Moreover, the analysis results hold in standalone applications as well as distributed applications. Index Terms: Bloom filters, dynamic Bloom filters, information representation.
Query protocols for highly resilient peer-to-peer networks
- In Proc. of ISCA PDCS’06
, 2006
Cited by 4 (0 self)
The decentralized and ad hoc nature of peer-to-peer (P2P) networks means that both the structure of the network and the content stored within it are highly variable. Real-world studies indicate that only a small number of peers remain persistent over significant time periods, and that the perceived importance of objects stored in the network, measured in terms of access or update frequency, may not follow a uniform distribution. In this paper, we present WARP, a P2P system that exploits these distinctions as an integral part of its design. WARP employs a novel fault-tolerant mechanism to manage the dynamic nature of node arrivals and departures by allowing multiple physical nodes to service data mapped to a single node in the overlay. Moreover, the overlay supports different query types, distinguishing queries to popular or valuable data from queries to unpopular or less valuable data. We prove via a rigorous stochastic analysis that any query, regardless of type, will be successfully serviced with high probability. Further, we show that the hop complexity of the protocol is bounded, with high probability, by a function of the number of nodes in the network [expression garbled in source]. We also define bandwidth complexity, a measure of congestion at any node, and prove a corresponding high-probability bound for it [expression garbled in source]. We provide a detailed simulation of the system and show that it conforms closely to our theoretical guarantees.
DDLS: Extending open hypermedia systems into peer-to-peer environments
Cited by 1 (1 self)
Peer-to-peer (P2P) computing is primarily characterised by decentralisation, scalability, anonymity, self-organisation and ad hoc connectivity. It has attracted considerable attention in open hypermedia research due to its potential for supporting collaboration among a community of people sharing a similar knowledge background. The aim of this research is to investigate the feasibility and potential benefits of incorporating the P2P paradigm in open hypermedia systems to support resource sharing-based collaboration. This is accomplished by utilising a distributed dynamic link service (DDLS) as a test bed, addressing issues that arise from implementing the paradigm, and demonstrating the efficiency of proposed techniques through simulation. This research begins with the development of a prototype DDLS using the open hypermedia paradigm for storing and presenting resources and a centralised P2P model which adopts a central service directory for publishing and discovering resources in a well-arranged environment. This is enhanced by an operational analysis and feature comparison between prototypes based on the traditional client-server and the centralised P2P models. Various P2P models are analysed to identify the key characteristics of
Distributed, Secure Load Balancing with Skew, Heterogeneity, and Churn
, 2004
Cited by 1 (0 self)
Numerous proposals exist for load balancing in peer-to-peer (p2p) networks. Some focus on namespace balancing, making the distance between nodes as uniform as possible. This technique works well under ideal conditions, but not under those found empirically. Instead, researchers have found heavy-tailed query distributions (skew), high rates of node join and leave (churn), and wide variation in node network and storage capacity (heterogeneity). Other approaches tackle these less-than-ideal conditions, but give up on important security properties. We propose an algorithm that both facilitates good performance and does not dilute security. Our algorithm, kChoices, achieves load balance by greedily matching nodes' target workloads with actual applied workloads through limited sampling, and limits any fundamental decrease in security by basing each node's set of potential identifiers on a single certificate. Our algorithm compares favorably to four others in trace-driven simulations. We have implemented our algorithm and found that it improved aggregate throughput by 20% in a widely heterogeneous system in our experiments.

