Peer-to-Peer Information Retrieval Using Self-Organizing Semantic Overlay Networks (2003)

by Chunqiang Tang, Zhichen Xu, Sandhya Dwarkadas
Results 1 - 10 of 133

MAAN: A multi-attribute addressable network for grid information services

by Min Cai, Martin Frank, Jinbo Chen, Pedro Szekely - Journal of Grid Computing, 2003
Cited by 59 (3 self)
Recent structured Peer-to-Peer (P2P) systems such as Distributed Hash Tables (DHTs) offer scalable key-based lookup for distributed resources. However, they cannot be simply applied to grid information services because grid resources need to be registered and searched using multiple attributes. This paper proposes a Multi-Attribute Addressable Network (MAAN) that extends Chord to support multi-attribute and range queries. MAAN addresses range queries by mapping attribute values to the Chord identifier space via uniform locality-preserving hashing. It uses an iterative or single-attribute-dominated query routing algorithm to resolve multi-attribute queries. Each node in MAAN has only O(log N) neighbors for N nodes. The number of routing hops to resolve a multi-attribute range query is O(log N + N × s_min), where s_min is the minimum range selectivity on all attributes. When s_min = ε, it is logarithmic in the number of nodes, which is scalable to a large number of nodes and attributes. We also measured the performance of our MAAN implementation and the experimental results are consistent with our theoretical analysis.
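
The mapping described in this abstract can be illustrated with a small sketch. It assumes a numeric attribute that is uniformly distributed over a known domain [v_min, v_max] and an m-bit identifier space; the function names, the 16-bit default, and the linear scaling are illustrative assumptions, not MAAN's actual implementation. The last helper only evaluates the growth term log2(N) + N × s_min from the stated bound, with constants ignored.

```python
import math


def locality_preserving_hash(value, v_min, v_max, m_bits=16):
    """Map value in [v_min, v_max] onto [0, 2**m_bits - 1], preserving order.

    Assumption: a linear scaling, which spreads keys evenly only when the
    attribute values themselves are uniformly distributed.
    """
    if v_max <= v_min:
        raise ValueError("v_max must exceed v_min")
    if not v_min <= value <= v_max:
        raise ValueError("value outside attribute domain")
    fraction = (value - v_min) / (v_max - v_min)
    return int(fraction * (2 ** m_bits - 1))


def range_query_interval(low, high, v_min, v_max, m_bits=16):
    """A value range maps to one contiguous interval of identifiers."""
    return (
        locality_preserving_hash(low, v_min, v_max, m_bits),
        locality_preserving_hash(high, v_min, v_max, m_bits),
    )


def selectivity(low, high, v_min, v_max):
    """Fraction of the attribute domain covered by the range [low, high]."""
    return (high - low) / (v_max - v_min)


def hop_growth_term(num_nodes, s_min):
    """Evaluate log2(N) + N * s_min (constant factors ignored)."""
    return math.log2(num_nodes) + num_nodes * s_min
```

Because the hash preserves order, a range query only needs to visit the nodes whose identifiers fall inside the returned interval, which is why the hop count grows with N × s_min.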

Hybrid Global-Local Indexing for Efficient Peer-To-Peer Information Retrieval

by Chunqiang Tang, Sandhya Dwarkadas, 2004
Cited by 52 (1 self)
Content-based full-text search still remains a particularly challenging problem in peer-to-peer (P2P) systems. Traditionally, there have been two index partitioning structures: partitioning based on the document space or partitioning based on keywords. The former requires search of every node in the system to answer a query whereas the latter transmits a large amount of data when processing multi-term queries. In this paper, we propose eSearch, a P2P keyword search system based on a novel hybrid indexing structure. In eSearch, each node is responsible for certain terms. Given a document, eSearch uses a modern information retrieval algorithm to select a small number of top (important) terms in the document and publishes the complete term list for the document to nodes responsible for those top terms. This selective replication of term lists allows a multi-term query to proceed local to the nodes responsible for query terms. We also propose automatic query expansion to alleviate the degradation of quality of search results due to the selective replication, overlay source multicast to reduce the cost of disseminating term lists, and techniques to balance term list distribution across nodes.
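
A minimal in-memory sketch of the hybrid indexing idea in this abstract follows. Everything in it is an illustrative assumption: raw term frequency stands in for the "modern information retrieval algorithm" that picks top terms, a SHA-1 hash modulo the node count stands in for the overlay's term-to-node mapping, and the class and function names are hypothetical. It is not eSearch's implementation.

```python
import hashlib
from collections import Counter, defaultdict


def node_for_term(term, num_nodes):
    """Hypothetical placement: hash the term onto one of num_nodes nodes."""
    digest = hashlib.sha1(term.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


class HybridIndex:
    def __init__(self, num_nodes, top_k=2):
        self.num_nodes = num_nodes
        self.top_k = top_k
        # One dict per node: term -> {doc_id: complete term set of the doc}
        self.nodes = [defaultdict(dict) for _ in range(num_nodes)]

    def publish(self, doc_id, text):
        """Store the document's full term list at the nodes of its top terms."""
        counts = Counter(text.lower().split())
        top_terms = [term for term, _ in counts.most_common(self.top_k)]
        full_terms = frozenset(counts)
        for term in top_terms:
            node = self.nodes[node_for_term(term, self.num_nodes)]
            node[term][doc_id] = full_terms
        return top_terms

    def search(self, query):
        """Each responsible node checks all query terms against its local copies."""
        terms = query.lower().split()
        hits = set()
        for term in terms:
            node = self.nodes[node_for_term(term, self.num_nodes)]
            for doc_id, full_terms in node.get(term, {}).items():
                if all(t in full_terms for t in terms):
                    hits.add(doc_id)
        return hits


index = HybridIndex(num_nodes=8, top_k=2)
index.publish("d1", "chord chord chord ring ring lookup")
index.publish("d2", "ring ring ring token token chord")
```

In this sketch a query such as "chord lookup" is answered entirely at the node responsible for "chord", because that node holds d1's complete term list. A query for "lookup" alone finds nothing, since "lookup" was never a top term; this is the loss in result quality that the abstract's query expansion is meant to offset.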

Ekta: An Efficient DHT Substrate for Distributed Applications in Mobile Ad Hoc Networks

by Himabindu Pucha, Saumitra M. Das, Y. Charlie Hu - In Proceedings of the 6th IEEE Workshop on Mobile Computing Systems and Applications (WMCSA 2004), English Lake District, 2004
Cited by 39 (0 self)
Distributed Hash Tables (DHTs) have proven to be a novel and efficient platform for building a variety of scalable and robust distributed applications like content sharing and location in the Internet. Similar to those in the Internet, distributed applications and network services in mobile ad hoc networks (MANETs) can potentially benefit from the deployment of a DHT. However, bandwidth limitations, node mobility, and multi-access interference pose unique challenges to deploying such DHTs in MANETs.

Exploiting semantic proximity in peer-to-peer content searching

by Spyros Voulgaris - In 10th International Workshop on Future Trends in Distributed Computing Systems (FTDCS 2004), Suzhou, 2004
Cited by 36 (10 self)
A lot of recent work has dealt with improving performance of content searching in peer-to-peer file sharing systems. In this paper we attack this problem by modifying the overlay topology describing the peer relations in the system. More precisely, we create a semantic overlay, linking nodes that are “semantically close”, by which we mean that they are interested in similar documents. This semantic overlay provides the primary search mechanism, while the initial peer-to-peer system provides the fail-over search mechanism. We focus on implicit approaches for discovering semantic proximity. We evaluate and compare three candidate methods, and review open questions.

Federated search of text-based digital libraries in hierarchical peer-to-peer networks

by Jie Lu, Jamie Callan - In Advances in Information Retrieval, 27th European Conference on IR Research (ECIR), 2005
Cited by 32 (3 self)
Peer-to-peer architectures are a potentially powerful model for developing large-scale networks of text-based digital libraries, but peer-to-peer networks have so far provided very limited support for text-based federated search of digital libraries using relevance-based ranking. This paper addresses the problems of resource representation, resource ranking and selection, and result merging for federated search of text-based digital libraries in hierarchical peer-to-peer networks. Existing approaches to text-based federated search are adapted and new methods are developed for resource representation and resource selection according to the unique characteristics of hierarchical peer-to-peer networks. Experimental results demonstrate that the proposed approaches offer a better combination of accuracy and efficiency than more common alternatives for federated search in peer-to-peer networks.

On Scaling Latent Semantic Indexing for Large Peer-To-Peer Systems

by Chunqiang Tang, Sandhya Dwarkadas, Zhichen Xu - Proc. 27th Annual International ACM SIGIR Conference, 2004
Cited by 26 (0 self)
The exponential growth of data demands scalable infrastructures capable of indexing and searching rich content such as text, music, and images. A promising direction is to combine information retrieval with peer-to-peer technology for scalability, fault-tolerance, and low administration cost. One pioneering work along this direction is pSearch [32, 33]. pSearch places documents onto a peer-to-peer overlay network according to semantic vectors produced using Latent Semantic Indexing (LSI). The search cost for a query is reduced since documents related to the query are likely to be co-located on a small number of nodes. Unfortunately, because of its reliance on LSI, pSearch also inherits the limitations of LSI. (1) When the corpus is large and heterogeneous, LSI's retrieval quality is inferior to methods such as Okapi. (2) The Singular Value Decomposition (SVD) used in LSI is unscalable in terms of both memory consumption and computation time.

An Adaptive Protocol for Efficient Support of Range Queries in DHT-based Systems

by Jun Gao, Peter Steenkiste - In ICNP ’04: Proceedings of the 12th IEEE International Conference on Network Protocols, 2004
Cited by 25 (1 self)
In recent years, Distributed Hash Tables (DHTs) have been proposed as a fundamental building block for large scale distributed applications. Important functionalities such as searching have been added to the DHT’s basic lookup capability. However, supporting range queries efficiently remains a difficult problem. In this paper, we describe an adaptive mechanism that relies on a logical tree data structure, the Range Search Tree (RST), to support range queries efficiently. Nodes in the RST automatically group registrations based on their values. Queries are decomposed into a small number of sub-queries for efficient resolution. The system dynamically optimizes itself to minimize the registration and query cost based on observed load. The system is fully distributed and avoids bottleneck problems encountered in traditional tree-based systems. Extensive simulation results validate the effectiveness of the system.

Etna: a fault-tolerant algorithm for atomic mutable dht data

by Athicha Muthitacharoen, Seth Gilbert, Robert Morris, 2004
Cited by 22 (2 self)
This paper presents Etna, an algorithm for atomic reads and writes of replicated data stored in a distributed hash table. Etna correctly handles dynamically changing sets of replica hosts, and is optimized for reads, writes, and reconfiguration, in that order. Etna maintains a series of replica configurations as nodes in the system change, using new sets of replicas from the pool supplied by the distributed hash table system. It uses the Paxos protocol to ensure consensus on the members of each new configuration. For simplicity and performance, Etna serializes all reads and writes through a primary during the lifetime of each configuration. As a result, Etna completes read and write operations in only a single round from the primary. Experiments in an environment with high network delays show that Etna’s read latency is determined by round-trip delay in the underlying network, while write and reconfiguration latency is determined by the transmission time required to send data to each replica. Etna’s write latency is about the same as that of a non-atomic replicating DHT, and Etna’s read latency is about twice that of a non-atomic DHT due to Etna assembling a quorum for every read.

Vbi-tree: A peer-to-peer framework for supporting multi-dimensional indexing schemes

by H. V. Jagadish - In Proc. Intl. Conf. on Data Engineering (ICDE), 2006
Cited by 20 (1 self)
Multi-dimensional data indexing has received much attention in a centralized database. However, not so much work has been done on this topic in the context of Peer-to-Peer systems. In this paper, we propose a new Peer-to-Peer framework based on a balanced tree structure overlay, which can support extensible centralized mapping methods and query processing based on a variety of multidimensional tree structures, including R-Tree, X-Tree, SS-Tree, and M-Tree. Specifically, in a network with N nodes, our framework guarantees that point queries and range queries can be answered within O(log N) hops. We also provide an effective load balancing strategy to allow nodes to balance their work load efficiently. An experimental assessment validates the practicality of our proposal.

Bookmark-driven query routing in peer-to-peer web search

by Matthias Bender, Sebastian Michel, Gerhard Weikum, Christian Zimmer - Proceedings of the SIGIR Workshop on Peer-to-Peer Information Retrieval, pp. 46–57, 2004
Cited by 19 (12 self)
We consider the problem of collaborative Web search and query routing strategies in a peer-to-peer (P2P) environment. In our architecture every peer has a full-fledged search engine with a (thematically focused) crawler and a local index whose contents may be tailored to the user’s specific interest profile. Peers are autonomous and post meta-information about their bookmarks and index lists to a global directory, which is efficiently implemented in a decentralized manner using Chord-style distributed hash tables. A query posed by one peer is first evaluated locally; if the result is unsatisfactory the query is forwarded to selected peers. These peers are chosen based on a benefit/cost measure where benefit reflects the thematic similarity of peers’ interest profiles, derived from bookmarks, and cost captures estimated peer load and response time. The meta-information that is needed for making these query routing decisions is efficiently looked up in the global directory; it can also be cached and proactively disseminated for higher availability and reduced network load.
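
The benefit/cost peer selection described in this abstract could look roughly like the sketch below. The abstract does not give the actual measure, so the specifics here are assumptions made for illustration: Jaccard overlap of bookmark sets as the benefit, 1 + load + response time as the cost, and the `PeerInfo` record with its fields as a hypothetical directory entry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeerInfo:
    """Hypothetical directory entry posted by a peer."""
    name: str
    bookmarks: frozenset
    load: float           # assumed load estimate in [0, 1]
    response_time: float  # assumed response time estimate in seconds


def jaccard(a, b):
    """Set overlap, used here as a stand-in for thematic similarity."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_peers(my_bookmarks, peers, top_n=2):
    """Return the names of the top_n peers by benefit divided by cost."""
    def score(peer):
        benefit = jaccard(my_bookmarks, peer.bookmarks)
        cost = 1.0 + peer.load + peer.response_time  # never zero
        return benefit / cost

    ranked = sorted(peers, key=score, reverse=True)
    return [peer.name for peer in ranked[:top_n]]


mine = frozenset({"a.org", "b.org", "c.org"})
peers = [
    PeerInfo("A", frozenset({"a.org", "b.org", "c.org"}), 0.9, 2.0),
    PeerInfo("B", frozenset({"a.org", "b.org"}), 0.1, 0.1),
    PeerInfo("C", frozenset({"x.org"}), 0.0, 0.1),
]
```

With these example values, peer B outranks peer A even though A's bookmarks match exactly, because A's higher load and response time raise its cost.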