Load balancing for term-distributed parallel retrieval, 2006
Cited by 11 (0 self)
Large-scale web and text retrieval systems deal with amounts of data that greatly exceed the capacity of any single machine. To handle the necessary data volumes and query throughput rates, parallel systems are used, in which the document and index data are split across tightly-clustered distributed computing systems. The index data can be distributed either by document or by term. In this paper we examine methods for load balancing in term-distributed parallel architectures, and propose a suite of techniques for reducing net querying costs. In combination, the techniques we describe allow a 30% improvement in query throughput when tested on an eight-node parallel computer system.
Categories and Subject Descriptors: H.3.1 [Information Storage and Retrieval]: Content analysis and indexing – indexing methods; H.3.2 [Information Storage and Retrieval]:
be cited as a National Research Council report.
Geographic information systems are a part of the "building whole systems" tradition of computer science, combining aspects from many disciplines within the field. To understand the research needs at the intersection of computing and GIS, we need to understand the goals and purpose of GIS users and developers. A GIS is not a "database with spatial spice" or "nomadic computing with location thrown in". Geographic information systems, like statistical systems or software development systems, serve the needs of real users and define a framework for modeling the world. This paper attempts to characterize GIS in the context of conventional computer science thinking. I also address several specific research themes: encouraging "whole system" development, scientific and end-user computing, and architectures for loosely-coupled distributed computing.
A Reliable Storage Management Layer for
In 12th ACM International Conference on Information and Knowledge Management, 2003
We present a storage management layer that facilitates the implementation of parallel information retrieval systems, and related applications, on networks of workstations. The storage management layer automates the process of adding and removing nodes, and implements a dispersed mirroring strategy to improve reliability. When nodes are added and removed, the document collection managed by the system is redistributed for load balancing purposes. The use of dispersed mirroring minimizes the impact of node failures and system modifications on query performance.

