Trace driven analysis of the long term evolution of gnutella peer-to-peer traffic
- In Proceedings of the Passive and Active Measurement Conference (PAM’07), 2007
Cited by 9 (3 self)
Abstract. Peer-to-Peer (P2P) applications, such as Gnutella, are evolving to address some of the observed performance issues. In this paper, we analyze Gnutella behavior in 2003, 2005, and 2006. During this time, the protocol evolved from v0.4 to v0.6 to address problems with overhead of overlay maintenance and query traffic bandwidth. The goal of this paper is to understand whether the newer protocols address the prior concerns. We observe that the new architecture alleviated the bandwidth consumption for low capacity peers while increasing the bandwidth consumption at high capacity peers. We measured a decrease in incoming query rate. However, highly connected ultra-peers must maintain many connections to which they forward all queries, thereby increasing the outgoing query traffic. We also show that these changes have not significantly improved search performance. The effective success rate experienced at a forwarding peer has only increased from 3.5% to 6.9%. Over 90% of queries forwarded by a peer do not result in any query hits. With an average query size of over 100 bytes and 30 neighbors for an ultra-peer, this results in almost 1 GB of wasted bandwidth in a 24-hour session. We outline solution approaches to solve this problem and make P2P systems viable for a diverse range of applications.
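The wasted-bandwidth figure in the abstract above can be sanity-checked with a back-of-envelope calculation. The sketch below is illustrative only: the abstract gives rounded bounds ("over 100 bytes", "over 90%", "almost 1 GB") and does not state the query rate, so the code assumes 1 GB = 10^9 bytes, assumes every query is forwarded to all 30 neighbors, and derives the query rate that those rounded numbers would imply. The function names are hypothetical.

```python
# Back-of-envelope check of the wasted-bandwidth estimate.
# All defaults are the rounded figures from the abstract; the query rate
# itself is NOT given there and is derived here purely for illustration.

SECONDS_PER_DAY = 24 * 60 * 60


def wasted_bytes(query_rate, query_size=100, neighbors=30,
                 miss_fraction=0.9, duration=SECONDS_PER_DAY):
    """Bytes spent forwarding queries that never produce a query hit.

    query_rate    -- queries per second arriving at the ultra-peer (assumed)
    query_size    -- average query size in bytes
    neighbors     -- number of neighbors each query is forwarded to
    miss_fraction -- fraction of forwarded queries with no hits
    duration      -- session length in seconds
    """
    return query_rate * duration * query_size * neighbors * miss_fraction


def implied_query_rate(wasted=1e9, query_size=100, neighbors=30,
                       miss_fraction=0.9, duration=SECONDS_PER_DAY):
    """Query rate (queries/s) that would yield `wasted` bytes per session."""
    return wasted / (duration * query_size * neighbors * miss_fraction)


if __name__ == "__main__":
    rate = implied_query_rate()
    print(f"implied query rate: {rate:.2f} queries/s")
    print(f"wasted per session: {wasted_bytes(rate) / 1e9:.2f} GB")
```

Under these assumptions, a sustained rate of only a few queries per second is enough to reach the quoted order of magnitude, which is consistent with the abstract's point that the waste comes from fan-out rather than from an unusually high query load.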
Load reduction in the kad peer-to-peer system
- In Fifth International Workshop on Databases, Information Systems and Peer-to-Peer Computing (DBISP2P), 2007
Cited by 9 (4 self)
Abstract. Distributed hash tables (DHTs) have been actively studied in literature and many different proposals have been made on how to organize peers in a DHT. However, very few DHTs have been implemented in real systems and deployed on a large scale. One exception is kad, a DHT based on Kademlia, which is part of eDonkey, a peer-to-peer file sharing system with several million simultaneous users. In this paper, we investigate the publishing and searching mechanisms in kad. We designed and implemented Mistral, a content spy that can capture up to ten million references to published content in several hours. At first evaluation, we notice that publishing new content in a kad system is much more expensive than searching and retrieving existing content. Indeed, measurements show that of all the Internet traffic generated by kad-based peer-to-peer networks, 90% is for publishing and 10% for retrieving.
Understanding the practical limits of the gnutella p2p system: An analysis of query terms and object name distributions
- In Proceedings of the ACM/SPIE Multimedia Computing and Networking (MMCN ’08), 2008
Cited by 4 (0 self)
A number of prior efforts analyzed the behavior of popular peer-to-peer (P2P) systems and proposed ways for maintaining the overlays as well as methods for searching for contents using these overlays. However, little was known about how successful users could be in locating the shared objects in these systems. There might be a mismatch between the way content creators named objects and the way such objects were queried by the consumers. Our aim was to examine the terms used in the queries and shared object names in the Gnutella file-sharing system. We analyzed the object names of over 20 million objects collected from 40,000 peers as well as terms from over 230,000 queries. We observed that almost half (44.4%) of the queries had no matching objects in the system regardless of the overlay or search mechanism used to locate the objects. We also evaluated the query success rates against random peer groups of various sizes (200, 1K, 2K, 3K, 4K, 5K, 10K and 20K peers sampled from the full 40,000 peers). We showed that the success rates increased rapidly from 200 to 5,000 peers, but only exhibited modest improvements when increasing the number of peers beyond 5,000. Finally, we observed Zipf-like distributions for the query terms and the object names. However, the relative popularity of a term in the object names did not correlate with the term's popularity in the query workload. This observation affected the ability of hybrid P2P systems to guide searches by creating a synopsis of the peer object names. A synopsis created by using the distribution of terms in the object names need not represent relevant terms for the query. Our results can be used to guide the design of future P2P systems that are optimized for the observed object names and user query behavior.
A Case for Unstructured Distributed Hash Tables
Cited by 1 (1 self)
Structured peer-to-peer overlays support compelling applications such as large-scale file systems and distributed backup using the distributed hash table (DHT) interface. While unstructured file-sharing systems continue to flourish, wide adoption of structured applications has been elusive. We explore an alternative path to deployment of these applications by asking the question, can structured applications be run on top of unstructured overlays? We build an unstructured distributed hash table (UDHT) on top of state of the art search and topology management mechanisms, and evaluate whether it can sufficiently emulate properties of DHTs to support structured applications.
Long Term Study of Peer Behavior in the KAD DHT
Cited by 1 (0 self)
Abstract—Distributed hash tables (DHTs) have been actively studied in literature and many different proposals have been made on how to organize peers in a DHT. However, very few DHTs have been implemented in real systems and deployed on a large scale. One exception is KAD, a DHT based on Kademlia, which is part of eDonkey, a peer-to-peer file sharing system with several million simultaneous users. We have been crawling a representative subset of KAD every five minutes for six months and obtained information about geographical distribution of peers, session times, daily usage, and peer lifetime. We have found that session times are Weibull distributed and we show how this information can be exploited to make the publishing mechanism much more efficient. Peers are identified by the so-called KAD ID, which up to now was assumed to be persistent. However, we observed that a fraction of peers changes their KAD ID as frequently as once a session. This change of KAD IDs makes it difficult to characterize end-user behavior. For this reason we have been crawling the entire KAD network once a day for more than a year to track end-users with static IP addresses, which allows us to estimate end-user lifetime and the fraction of end-users changing their KAD ID. Index Terms—Distributed hash table, distributed systems, measurement, peer-to-peer.
First Issue Coordinators
Responsibility for the contents rests upon the authors and not upon IARIA, nor on IARIA volunteers, staff, or contractors. IARIA is the owner of the publication and of editorial aspects. IARIA reserves the right to update the content for quality improvements. Abstracting is permitted with credit to the source. Libraries are permitted to photocopy or print, providing the reference is mentioned and that the resulting material is made available at no cost. Reference should mention:
Searching for Rare Objects using Index Replication
Abstract—Searching for objects is a fundamental problem for popular peer-to-peer file-sharing networks that contribute to much of the traffic on today’s Internet. While existing protocols can effectively locate highly popular files, studies show that they fail to locate a significant portion of existing files in the network. High recall for these “rare” objects would drastically improve the user experience, and make these networks the ideal distribution infrastructure for user-generated content such as home videos and photo albums. In this paper, we examine simple techniques that can improve search recall for rare objects while minimizing the overhead incurred by participating peers. We propose several strategies for multi-hop index replication, and demonstrate their effectiveness and efficiency through both analysis and simulation. We further evaluate our simple techniques using detailed traces from a real Gnutella network, and show that they improve the performance of these overlays by orders of magnitude in both lookup success and overhead.
BIDIR-SAM: Large-Scale Content Distribution in Structured Overlay Networks
, 2009
IPTV, software replication and other large-scale distribution tasks urge the need for efficient multicast mechanisms in overlay networks. Current multicast solutions on the application layer are either efficient, structured, but inflexible, or flexible, unstructured, but of lesser efficiency. This paper introduces Scalable Adaptive Multicast on BI-DIRectional shared trees, a new structured but flexible approach to content distribution. BIDIR-SAM is the first DHT-based overlay multicast that distributes any source multicast data according to source-specific shortest path trees. Built upon bi-directional shared prefix trees, the approach distributes packets uniquely via fully redundant paths, and allows for highly flexible network adaptivity. Guided by an overlay abstraction, it operates directly on top of a prefix routing and does not rely on any kind of rendezvous point or bootstrapping.
Characterization and Management of Popular Content in KAD
Abstract—The endeavor of this work is to study the impact of content popularity in a large-scale Peer-to-Peer network, namely KAD. Armed with the insights gained from an extensive measurement campaign, we pinpoint several deficiencies of the present KAD design in handling popular content, and provide a series of solutions to address such shortcomings. Among them, we design and evaluate an adaptive load balancing mechanism. Our mechanism is backward compatible with KAD, as it only modifies its inner algorithms, and presents several desirable properties: (i) it drives the process that selects the number and location of peers responsible to store references to objects, based on their popularity; (ii) it solves problems related to saturated peers, that would otherwise entail a significant drop in the diversity of references to objects, and (iii) if coupled with an enhanced content search procedure, it allows a more fair and efficient usage of peer resources, at a reasonable cost. Our evaluation uses a trace-driven simulator that features realistic peer churn and a precise implementation of the inner components of KAD.
P2Prec: a Social-based P2P Recommendation System for Large-scale Data Sharing
, 2010
Abstract. We propose P2Prec, a P2P recommendation system for large-scale data sharing, which exploits friendship links. The main idea is to recommend high quality contents related to query topics and contents of friends (or friends of friends), who are expert on the topics related to the query. Expertise is implicitly deduced based on the contents stored by a user. To exploit friendship links, we rely on Friend-Of-A-Friend (FOAF) descriptions. To disseminate information about experts, we propose new semantic-based gossip algorithms that provide scalability, robustness, simplicity and load balancing. By using information retrieval techniques, we propose an efficient query routing algorithm that recommends the best peers to serve a query. In our experimental evaluation, using the TREC09 dataset and Wiki vote social network, we show that using semantic gossiping increases recall by a factor of 2.5 compared with well known random gossiping. Furthermore, P2Prec has the ability to get reasonable recall with acceptable query processing load and network traffic.

