Results 1 - 4 of 4
Assisted Peer-to-Peer Search with Partial Indexing
, 2007
Cited by 22 (1 self)
In the past few years, peer-to-peer (P2P) networks have become a promising paradigm for building a wide variety of distributed systems and applications. The most popular P2P application to date is file sharing, e.g., Gnutella and Kazaa. These systems are usually referred to as unstructured, and search in unstructured P2P networks usually involves flooding or random walks. In structured P2P networks (DHTs), by contrast, search is usually performed by looking up a distributed inverted index. The efficiency of the search mechanism is the key to the scalability of a P2P content sharing system. So far, neither unstructured nor structured P2P networks alone can solve the search problem in a satisfactory way. In this paper, we propose to combine the strengths of both unstructured and structured P2P networks to achieve more efficient search. Specifically, we propose to enhance search in unstructured P2P overlay networks by building a partial index of shared data using a structured P2P network. The index maintains two types of information: the top interests of peers and globally unpopular data, both characterized by data properties. The proposed search protocol, assisted search with partial indexing, uses the index to improve search in three ways. First, the index helps peers find other peers with similar interests, so that the unstructured search overlay is formed to reflect peer interests. Second, the index provides search hints for data that are difficult to locate by exploiting peer interest locality, and these hints can be used for second-chance search. Third, the index helps to locate unpopular data items. Experiments based on a P2P file sharing trace show that assisted search with a lightweight partial indexing service achieves a significantly higher success rate in locating data than Gnutella and a hit-rate-based protocol in unstructured P2P systems, while incurring low search latency and overhead.
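The three uses of the partial index described in this abstract can be pictured with a small sketch. It is only an illustration under stated assumptions: a plain dictionary stands in for the structured P2P network (DHT), the names `PartialIndex`, `assisted_search`, and `overlay_search` are hypothetical, and the order in which the index is consulted after the overlay search fails is a guess, since the abstract does not specify it.

```python
class PartialIndex:
    """Toy stand-in for the DHT-backed partial index (assumption: a dict replaces the DHT)."""

    def __init__(self):
        self.interests = {}  # data property -> peers whose top interests include it
        self.unpopular = {}  # data property -> peers holding globally unpopular items

    def publish_interest(self, peer, prop):
        self.interests.setdefault(prop, set()).add(peer)

    def publish_unpopular(self, peer, prop):
        self.unpopular.setdefault(prop, set()).add(peer)


def assisted_search(prop, overlay_search, index):
    """Search the unstructured overlay first, then fall back to the partial index.

    overlay_search is any callable returning the set of peers that answered
    the query in the unstructured overlay (hypothetical interface).
    """
    hits = overlay_search(prop)
    if hits:
        return set(hits), "overlay"
    # Assumed order: try the unpopular-data entries before the interest hints.
    holders = index.unpopular.get(prop)
    if holders:
        return set(holders), "unpopular-index"
    hints = index.interests.get(prop)
    if hints:
        # Peers interested in this property are candidates for a second-chance search.
        return set(hints), "interest-hint"
    return set(), "miss"
```

In this sketch the index stays partial by construction: peers publish only their top interests and their unpopular items, so popular data is still expected to be found through the overlay itself.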
Fast and low-cost search Schemes by exploiting localities in P2P networks
- J. PARALLEL DISTRIB. COMPUT.
, 2005
Cited by 9 (5 self)
Existing peer-to-peer (P2P) search algorithms generally target either the performance objective of improving search quality from a client's perspective, or the objective of reducing search cost from an Internet management perspective. Most existing work on designing and optimizing search algorithms in unstructured P2P networks addresses the trade-off between the two performance objectives. In contrast, our goal in this study is to attempt to achieve both objectives. Motivated by our observations on the content locality in the peer community and the localities of search interests of individual peers, we propose content-abundant cluster-selectively prefetching indices from responding peers (CAC-SPIRP), a fast and low-cost P2P searching algorithm. Our algorithm consists of two components. The first component aims to reduce the search cost by constructing a CAC, where content-abundant peers self-identify and self-organize themselves into an inter-connected cluster providing a pool of popular objects to be frequently accessed by the peer community. A query will be first routed to the CAC, and is most likely to be satisfied there, significantly reducing the amount of network traffic and the search scope. The second component in our algorithm, called SPIRP, is client oriented and aims to improve the quality of P2P search. A client individually identifies a small group of peers who have the same interests as itself and prefetches their entire file indices of the related interests, minimizing unnecessary outgoing queries and significantly reducing query response time. Building SPIRP on the CAC Internet infrastructure, our algorithm combines the merits of the two components to achieve both performance objectives. Our trace-driven simulations show that CAC-SPIRP significantly improves the overall performance from both the client's perspective and the Internet management perspective.
LightFlood: Minimizing Redundant Messages and Maximizing the Scope of Peer-to-Peer Search
Cited by 4 (0 self)
Abstract—Flooding is a fundamental file search operation in unstructured peer-to-peer (P2P) file sharing systems, in which a peer starts the file search procedure by broadcasting a query to its neighbors, who continue to propagate it to their neighbors. This procedure repeats until a time-to-live (TTL) counter is decremented to 0. Flooding can seriously limit system scalability, because the number of redundant query messages grows exponentially during the message propagation. Our study shows that more than 70 percent of the generated messages are redundant in a flooding with a TTL of 7 in a moderately connected Gnutella network. Existing efforts to address this issue have been focused on limiting the use of the flooding operation. We propose a new flooding scheme, called LightFlood, with the objective of minimizing the number of redundant messages and retaining a similar message-propagating scope as that of the standard flooding. In the scheme, each peer keeps track of the connectivities of every immediate and next indirect neighbor peers, which can be acquired locally. LightFlood identifies the neighbor with the highest connectivity and uses the link to that neighbor to form a suboverlay within the existing P2P overlay. In LightFlood, flooding is divided into two stages. The first stage is a standard flooding with a limited number of TTL hops, where a message can spread to a sufficiently large scope with a small number of redundant messages. In the second stage, message propagating is only conducted along the suboverlay, significantly reducing the number of redundant messages. Our analysis and simulation experiments show that the LightFlood scheme provides a low-overhead broadcast facility that can be effectively used in P2P search. For example, compared with standard flooding with seven TTL hops, we show that LightFlood with an additional two to three hops can reduce up to 69 percent
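To make the notion of redundant messages in TTL-limited flooding concrete, the following sketch simulates standard flooding on a small overlay and counts how many query messages arrive at peers that have already seen the query. It models only the baseline flooding described in this abstract, not LightFlood's suboverlay stage; the function name `flood` and the adjacency-list representation are assumptions made for illustration.

```python
from collections import deque


def flood(graph, source, ttl):
    """Simulate standard flooding from `source` with a time-to-live of `ttl` hops.

    graph maps each peer to a list of its neighbors (hypothetical representation).
    Returns (reached_peers, total_messages, redundant_messages), where a message
    is redundant if its receiver has already seen the query.
    """
    seen = {source}
    total = 0
    redundant = 0
    # Each queue entry: (peer holding the query, peer it came from, hops left).
    queue = deque([(source, None, ttl)])
    while queue:
        peer, sender, hops_left = queue.popleft()
        if hops_left == 0:
            continue  # TTL exhausted: this peer does not forward the query
        for neighbor in graph[peer]:
            if neighbor == sender:
                continue  # never send the query back where it came from
            total += 1
            if neighbor in seen:
                redundant += 1  # receiver drops the duplicate
            else:
                seen.add(neighbor)
                queue.append((neighbor, peer, hops_left - 1))
    return seen, total, redundant
```

On a fully connected overlay of four peers, `flood(graph, 0, 2)` reaches every peer after the first hop with 3 messages, and all 6 messages sent in the second hop are redundant, which shows how quickly duplicates come to dominate once most peers have been reached.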
Using query transformation to improve Gnutella search performance
Abstract—Gnutella peers independently choose the way in which objects are named as well as queried. Using a long term analysis of the files shared and queries issued, we show that this flexibility leads to a mismatch between the way that objects were named and the way that users were issuing search queries. Thirty percent of the failed queries contained keywords that were not present in any file name, while the remaining queries failed because no file name contained all the keywords in a particular query. Our earlier analysis of files shared in the popular iTunes music file sharing system showed that standardizing the file names to make them easier to search is not a viable alternative. Instead, we transform the queries to better match the objects available in the system. We investigated spell correction (using file name information from the neighborhood) as well as removing query keywords. We consider the results from the transformed query to be relevant to the intent of the original query if the transformed query used many of the original keywords and the number of matching files closely matched the number of matches for typical successful queries. Our approach is practical and uses information available within the immediate neighborhood of an ultra-peer. An overlay agnostic analysis shows that our transformation improves success rates from 45% to between 72.5% and 91.2%. Using our Hybrid mechanism as a Gnutella middleware, our transformation produced relevant results for about 61% of the failed queries.
Keywords: unstructured peer-to-peer, query transformation
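The keyword-removal idea mentioned in this abstract can be sketched as follows. This is a simplified illustration, not the paper's Hybrid mechanism: it assumes a query matches a file only when every keyword appears as a token of the file name, and it drops only those keywords that appear in no known file name. The helper names `tokens`, `matches`, and `drop_unknown_keywords` are hypothetical, and spell correction is not covered.

```python
import re


def tokens(file_name):
    """Split a file name into lowercase alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", file_name.lower()))


def matches(query_keywords, file_names):
    """Return the file names that contain every query keyword."""
    if not query_keywords:
        return []  # an empty query matches nothing in this sketch
    wanted = [k.lower() for k in query_keywords]
    return [name for name in file_names if all(k in tokens(name) for k in wanted)]


def drop_unknown_keywords(query_keywords, file_names):
    """Remove keywords that occur in no file name known in the neighborhood."""
    vocabulary = set()
    for name in file_names:
        vocabulary |= tokens(name)
    return [k for k in query_keywords if k.lower() in vocabulary]
```

For example, with the file names `pink_floyd-time.mp3` and `beatles help.mp3`, the query `["pink", "floyd", "tiem"]` matches nothing, while the transformed query `["pink", "floyd"]` matches the first file.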