Hybrid Global-Local Indexing for Efficient Peer-To-Peer Information Retrieval (2004)
Citations: 52 (1 self)
BibTeX
@MISC{Tang04hybridglobal-local,
author = {Chunqiang Tang and Sandhya Dwarkadas},
title = {Hybrid Global-Local Indexing for Efficient Peer-To-Peer Information Retrieval},
year = {2004}
}
Abstract
Content-based full-text search still remains a particularly challenging problem in peer-to-peer (P2P) systems. Traditionally, there have been two index partitioning structures---partitioning based on the document space or partitioning based on keywords. The former requires search of every node in the system to answer a query whereas the latter transmits a large amount of data when processing multi-term queries. In this paper, we propose eSearch---a P2P keyword search system based on a novel hybrid indexing structure. In eSearch, each node is responsible for certain terms. Given a document, eSearch uses a modern information retrieval algorithm to select a small number of top (important) terms in the document and publishes the complete term list for the document to nodes responsible for those top terms. This selective replication of term lists allows a multi-term query to proceed local to the nodes responsible for query terms. We also propose automatic query expansion to alleviate the degradation of quality of search results due to the selective replication, overlay source multicast to reduce the cost of disseminating term lists, and techniques to balance term list distribution across nodes.
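The hybrid structure described in the abstract can be illustrated with a toy, single-process sketch. This is not the authors' implementation: the names `HybridIndex`, `node_for`, `top_terms`, and `NUM_NODES` are hypothetical, a hash modulo a fixed node count stands in for the overlay lookup, and raw term frequency stands in for the information retrieval algorithm that eSearch uses to pick top terms. Query expansion, multicast, and load balancing are left out.

```python
import hashlib
from collections import Counter, defaultdict

NUM_NODES = 8  # hypothetical overlay size, chosen only for illustration


def node_for(term):
    """Map a term to a node ID with a stable hash (stand-in for an overlay lookup)."""
    digest = hashlib.sha1(term.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % NUM_NODES


def top_terms(term_counts, k):
    """Pick the k most frequent terms (assumption: frequency approximates importance)."""
    ranked = sorted(term_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


class HybridIndex:
    def __init__(self):
        # node_id -> term -> {doc_id: complete term list of that document}
        self.nodes = defaultdict(lambda: defaultdict(dict))

    def publish(self, doc_id, text, k=2):
        """Replicate the document's complete term list to the nodes of its top-k terms."""
        counts = Counter(text.lower().split())
        for term in top_terms(counts, k):
            self.nodes[node_for(term)][term][doc_id] = counts

    def search(self, query):
        """Visit only the nodes responsible for the query terms and score locally."""
        terms = query.lower().split()
        scores = {}
        for term in terms:
            postings = self.nodes.get(node_for(term), {}).get(term, {})
            for doc_id, counts in postings.items():
                # The replicated term list covers every query term, so no
                # posting lists have to be shipped between nodes.
                scores[doc_id] = sum(counts[t] for t in terms)
        return sorted(scores, key=lambda d: (-scores[d], d))


index = HybridIndex()
index.publish("d1", "peer peer peer index index search")
index.publish("d2", "index index index query query peer")
print(index.search("peer index"))
print(index.search("search"))
```

In this sketch, a query on a term that was not selected as a top term for any document (here `search`) returns nothing, which is the kind of quality loss the abstract says automatic query expansion is meant to alleviate.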