The state of the art in distributed query processing (2000)
| Venue: | ACM Computing Surveys |
| Citations: | 181 - 2 self |
BibTeX
@ARTICLE{Kossmann00thestate,
author = {Donald Kossmann},
title = {The state of the art in distributed query processing},
journal = {ACM Computing Surveys},
year = {2000},
volume = {32},
pages = {422--469}
}
Abstract
Distributed data processing is fast becoming a reality. Businesses want to have it for many reasons, and they often must have it in order to stay competitive. While much of the infrastructure for distributed data processing is already in place (e.g., modern network technology), there are a number of issues which still make distributed data processing a complex undertaking: (1) distributed systems can become very large, involving thousands of heterogeneous sites including PCs and mainframe server machines; (2) the state of a distributed system changes rapidly because the load of sites varies over time and new sites are added to the system; (3) legacy systems need to be integrated; such legacy systems usually have not been designed for distributed data processing and now need to interact with other (modern) systems in a distributed environment. This paper presents the state of the art of query processing for distributed database and information systems. The paper presents the "textbook" architecture for distributed query processing and a series of techniques that are particularly useful for distributed database systems. These techniques include special join techniques, techniques to exploit intra-query parallelism, techniques to reduce communication costs, and techniques to exploit caching and replication of data. Furthermore, the paper discusses different kinds of distributed systems such as client-server, middleware (multi-tier), and heterogeneous database systems and shows how query processing works in these systems.
Categories and subject descriptors: E.5 [Data]: Files; H.2.4 [Database Management Systems]: distributed databases, query processing; H.2.5 [Heterogeneous Databases]: data translation
General terms: algorithms; performance
Additional key words and phrases: query optimization; query execution; client-server databases; middleware; multi-tier architectures; database application systems; wrappers; replication; caching; economic models for query processing; dissemination-based information systems







