Results 1 - 10
of
58
Mercury: Supporting scalable multi-attribute range queries
- In SIGCOMM
, 2004
"... This paper presents the design of Mercury, a scalable protocol for supporting multi-attribute rangebased searches. Mercury differs from previous range-based query systems in that it supports multiple attributes as well as performs explicit load balancing. Efficient routing and load balancing are imp ..."
Abstract
-
Cited by 197 (5 self)
- Add to MetaCart
This paper presents the design of Mercury, a scalable protocol for supporting multi-attribute rangebased searches. Mercury differs from previous range-based query systems in that it supports multiple attributes as well as performs explicit load balancing. Efficient routing and load balancing are implemented using novel light-weight sampling mechanisms for uniformly sampling random nodes in a highly dynamic overlay network. Our evaluation shows that Mercury is able to achieve its goals of logarithmic-hop routing and near-uniform load balancing. We also show that a publish-subscribe system based on the Mercury protocol can be used to construct a distributed object repository providing efficient and scalable object lookups and updates. By providing applications a range-based query language to express their subscriptions to object updates, Mercury considerably simplifies distributed state management. Our experience with the design and implementation of a simple distributed multiplayer game built on top of this object management framework shows that indicates that this indeed is a useful building block for distributed applications. Keywords: Range queries, Peer-to-peer systems, Distributed applications, Multiplayer games 1
BATON: A Balanced Tree Structure for Peer-to-Peer Networks
- In VLDB
, 2005
"... We propose a balanced tree structure overlay on a peer-to-peer network capable of supporting both exact queries and range queries efficiently. In spite of the tree structure causing distinctions to be made between nodes at different levels in the tree, we show that the load at each node is approxima ..."
Abstract
-
Cited by 66 (11 self)
- Add to MetaCart
We propose a balanced tree structure overlay on a peer-to-peer network capable of supporting both exact queries and range queries efficiently. In spite of the tree structure causing distinctions to be made between nodes at different levels in the tree, we show that the load at each node is approximately equal. In spite of the tree structure providing precisely one path between any pair of nodes, we show that sideways routing tables maintained at each node provide sufficient fault tolerance to permit efficient repair. Specifically, in a network with N nodes, we guarantee that both exact queries and range queries can be answered in O(logN) steps and also that update operations (to both data and network) have an amortized cost of O(logN). An experimental assessment validates the practicality of our proposal. 1
One torus to rule them all: Multi-dimensional queries in p2p systems
- In WebDB ’04: Proceedings of the 7th International Workshop on the Web and Databases
, 2004
"... Peer-to-peer systems enable access to data spread over an extremely large number of machines. Most P2P systems support only simple lookup queries. However, many new applications, such as P2P photo sharing and massively multiplayer games, would benefit greatly from support for multidimensional range ..."
Abstract
-
Cited by 47 (1 self)
- Add to MetaCart
Peer-to-peer systems enable access to data spread over an extremely large number of machines. Most P2P systems support only simple lookup queries. However, many new applications, such as P2P photo sharing and massively multiplayer games, would benefit greatly from support for multidimensional range queries. We show how such queries may be supported in a P2P system by adapting traditional spatialdatabase technologies with novel P2P routing networks and load-balancing algorithms. We show how to adapt two popular spatial-database solutions – kd-trees and space-filling curves – and experimentally compare their effectiveness. 1.
A casestudy in building layered DHT applications
- In Proceedings of the 2005 SIGCOMM (Aug. 2005). [11] CHAWATHE, Y., RATNASAMY, S., BRESLAU, L., LANHAM, N., AND SHENKER, S. Making Gnutella-like P2P systems scalable.In Proceedings of the 2003 SIGCOMM
, 2003
"... Recent research has shown that one can use Distributed Hash Tables (DHTs) to build scalable, robust and efficient applications. One question that is often left unanswered is that of simplicity of implementation and deployment. In this paper, we explore a case study of building an application for whi ..."
Abstract
-
Cited by 41 (2 self)
- Add to MetaCart
Recent research has shown that one can use Distributed Hash Tables (DHTs) to build scalable, robust and efficient applications. One question that is often left unanswered is that of simplicity of implementation and deployment. In this paper, we explore a case study of building an application for which ease of deployment dominated the need for high performance. The application we focus on is Place Lab, an end-user positioning system. We evaluate whether it is feasible to use DHTs as an application-independent building block to implement a key component of Place Lab: its “mapping infrastructure.” We present Prefix Hash Trees, a data structure used by Place Lab for geographic range queries that is built entire on top of a standard DHT. By strictly layering Place Lab’s data structures on top of a generic DHT service, we were able to decouple the deployment and management of Place Lab from that of the underlying DHT. We identify the characteristics of Place Lab that made it amenable for deploying in this layered manner, and comment on its effect on performance.
Indexing Data-Oriented Overlay Networks
- In VLDB
, 2005
"... The application of structured overlay networks to implement index structures for data-oriented applications such as peer-to-peer databases or peer-to-peer information retrieval, requires highly efficient approaches for overlay construction, as changing application requirements frequently lead ..."
Abstract
-
Cited by 30 (9 self)
- Add to MetaCart
The application of structured overlay networks to implement index structures for data-oriented applications such as peer-to-peer databases or peer-to-peer information retrieval, requires highly efficient approaches for overlay construction, as changing application requirements frequently lead to re-indexing of the data and hence (re- )construction of overlay networks. This problem has so far not been addressed in the literature and thus we describe an approach for the efficient construction of data-oriented, structured overlay networks from scratch in a self-organized way. Standard maintenance algorithms for overlay networks cannot accomplish this efficiently, as they are inherently sequential. Our proposed algorithm is completely decentralized, parallel, and can construct a new overlay network with short latency. At the same time it ensures good loadbalancing for skewed data key distributions which result from preserving key order relationships as necessitated by data-oriented applications. We provide both a theoretical analysis of the basic algorithms and a complete system implementation that has been tested on PlanetLab. We use this implementation to support peer-to-peer information retrieval and database applications.
Lsh forest: self-tuning indexes for similarity search
- In WWW
, 2005
"... We consider the problem of indexing high-dimensional data for answering (approximate) similarity-search queries. Similarity indexes prove to be important in a wide variety of settings: Web search engines desire fast, parallel, main-memory-based indexes for similarity search on text data; database sy ..."
Abstract
-
Cited by 29 (0 self)
- Add to MetaCart
We consider the problem of indexing high-dimensional data for answering (approximate) similarity-search queries. Similarity indexes prove to be important in a wide variety of settings: Web search engines desire fast, parallel, main-memory-based indexes for similarity search on text data; database systems desire disk-based similarity indexes for high-dimensional data, including text and images; peer-to-peer systems desire distributed similarity indexes with low communication cost. We propose an indexing scheme called LSH Forest which is applicable in all the above contexts. Our index uses the well-known technique of locality-sensitive hashing (LSH), but improves upon previous designs by (a) eliminating the different data-dependent parameters for which LSH must be constantly hand-tuned, and (b) improving on LSH’s performance guarantees for skewed data distributions while retaining the same storage and query overhead. We show how to construct this index in main memory, on disk, in parallel systems, and in peer-to-peer systems. We evaluate the design with experiments on multiple text corpora and demonstrate both the self-tuning nature and the superior performance of LSH Forest.
Using a distributed quadtree index in peer-to-peer networks
- VLDB Journal
, 2007
"... Abstract Peer-to-peer (P2P) networks have become a powerful means for online data exchange. Currently, users are primarily utilizing these networks to perform exact-match queries and retrieve complete files. However, future more data intensive applications, such as P2P auction networks, P2P job-sear ..."
Abstract
-
Cited by 25 (5 self)
- Add to MetaCart
Abstract Peer-to-peer (P2P) networks have become a powerful means for online data exchange. Currently, users are primarily utilizing these networks to perform exact-match queries and retrieve complete files. However, future more data intensive applications, such as P2P auction networks, P2P job-search networks, P2P multi-player games, will require the capability to respond to more complex queries such as range queries involving numerous data types including those that have a spatial component. In this paper, a distributed quadtree index that adapts the MX-CIF quadtree is described that enables more powerful accesses to data in P2P networks. This index has been implemented for various prototype P2P applications and results of experiments are presented. Our index is easy to use, scalable, and exhibits good load-balancing properties. Similar indices can be constructed for various multi-dimensional data types with both spatial and non-spatial components. This work was supported in part by the National Science Foundation under grants EIA-99-00268, EIA-00-91474, and CCF-05-15241 as well as Microsoft Research.
P-Ring: An efficient and robust P2P range index structure
- In SIGMOD
, 2007
"... Peer-to-peer systems have emerged as a robust, scalable and decentralized way to share and publish data. In this paper, we propose P-Ring, a new P2P index structure that supports both equality and range queries. P-Ring is fault-tolerant, provides logarithmic search performance even for highly skewed ..."
Abstract
-
Cited by 19 (1 self)
- Add to MetaCart
Peer-to-peer systems have emerged as a robust, scalable and decentralized way to share and publish data. In this paper, we propose P-Ring, a new P2P index structure that supports both equality and range queries. P-Ring is fault-tolerant, provides logarithmic search performance even for highly skewed data distributions and efficiently supports large sets of data items per peer. We experimentally evaluate P-Ring using both simulations and a real distributed deployment on PlanetLab, and we compare its performance with
A scalable P2P platform for the knowledge Grid
- IEEE Transactions on Knowledge and Data Engineering
, 2005
"... Abstract—The Knowledge Grid needs to operate with a scalable platform to provide large-scale intelligent services. A key function of such a platform is to efficiently support various complex queries in a dynamic large-scale network environment. This paper proposes a platform to support index-based p ..."
Abstract
-
Cited by 17 (7 self)
- Add to MetaCart
Abstract—The Knowledge Grid needs to operate with a scalable platform to provide large-scale intelligent services. A key function of such a platform is to efficiently support various complex queries in a dynamic large-scale network environment. This paper proposes a platform to support index-based path queries by incorporating a semantic overlay with an underlying structured P2P network that provides object location and management services. Various distributed indexing structures can be dynamically formed by publishing semantic objects as indexing nodes. Queries are forwarded along the chains of semantic object pointers to search for objects. We investigate the deployment of a scalable distributed trie index for broadcast queries on key strings, propose a decentralized load balancing method for solving the problem of uneven load distribution incurred by heterogeneity of loads and node capacities and by the distributed trie index, and give an approach for improving the availability of the semantic overlay and its trie index. Experiments demonstrate the scalability of the proposed platform. Index Terms—Peer-to-peer, semantic overlay, knowledge grid, path query, distributed trie index, load balancing, replication.
Efficient skyline query processing on peer-to-peer networks
- In IEEE International Conference on Data Engineering (ICDE) (2007
, 2007
"... Skyline query has been gaining much interest in database research communities in recent years. Most existing studies focus mainly on centralized systems, and resolving the problem in a distributed environment such as a peer-to-peer (P2P) network is still an emerging topic. The desiderata of efficien ..."
Abstract
-
Cited by 17 (2 self)
- Add to MetaCart
Skyline query has been gaining much interest in database research communities in recent years. Most existing studies focus mainly on centralized systems, and resolving the problem in a distributed environment such as a peer-to-peer (P2P) network is still an emerging topic. The desiderata of efficient skyline querying in P2P environment include: 1) progressive returning of answers, 2) low processing cost in terms of number of peers accessed and search messages, 3) balanced query loads among the peers. In this paper, we propose a solution that satisfies the three desiderata. Our solution is based on a balanced tree structured P2P network. By partitioning the skyline search space adaptively based on query accessing patterns, we are able to alleviate the problem of “hot ” spots present in the skyline query processing. By being able to estimate the peer nodes within the query subspaces, we are able to control the amount of query forwarding, limiting the number of peers involved and the amount of messages transmitted in the network. Load balancing is achieved in query load conscious data space splitting/merging during the joining/departure of nodes and through dynamic load migration. Experiments on real and synthetic datasets confirm the effectiveness and scalability of our algorithm on P2P networks. 1.

