Results 1 - 10 of 80
The state of the art in distributed query processing
ACM Computing Surveys, 2000
Cited by 181 (2 self)
Distributed data processing is fast becoming a reality. Businesses want to have it for many reasons, and they often must have it in order to stay competitive. While much of the infrastructure for distributed data processing is already in place (e.g., modern network technology), there are a number of issues which still make distributed data processing a complex undertaking: (1) distributed systems can become very large, involving thousands of heterogeneous sites including PCs and mainframe server machines; (2) the state of a distributed system changes rapidly because the load of sites varies over time and new sites are added to the system; (3) legacy systems need to be integrated. Such legacy systems usually have not been designed for distributed data processing and now need to interact with other (modern) systems in a distributed environment. This paper presents the state of the art of query processing for distributed database and information systems. The paper presents the "textbook" architecture for distributed query processing and a series of techniques that are particularly useful for distributed database systems. These techniques include special join techniques, techniques to exploit intra-query parallelism, techniques to reduce communication costs, and techniques to exploit caching and replication of data. Furthermore, the paper discusses different kinds of distributed systems such as client-server, middleware (multi-tier), and heterogeneous database systems and shows how query processing works in these systems.
Categories and subject descriptors: E.5 [Data]: Files; H.2.4 [Database Management Systems]: distributed databases, query processing; H.2.5 [Heterogeneous Databases]: data translation
General terms: algorithms; performance
Additional key words and phrases: query optimization; query execution; client-server databases; middleware; multi-tier architectures; database application systems; wrappers; replication; caching; economic models for query processing; dissemination-based information systems
Temporal and Real-Time Databases: A Survey
IEEE Transactions on Knowledge and Data Engineering, 1995
Cited by 154 (9 self)
A temporal database contains time-varying data. In a real-time database, transactions have deadlines or timing constraints. In this paper we review the substantial research in these two heretofore separate research areas. We first characterize the time domain, then investigate temporal and real-time data models. We evaluate temporal and real-time query languages along several dimensions. Temporal and real-time DBMS implementation is examined. We conclude with a summary of the major accomplishments of the research to date, and list several research questions that should be addressed next.
Keywords: object-oriented database, relational databases, query language, temporal data model, time-constrained database, transaction time, user-defined time, valid time
1 Introduction
Time is an important aspect of all real-world phenomena. Events occur at specific points in time; objects and the relationships among objects exist over time. The ability to model this temporal dimension of the real worl...
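The keywords above distinguish valid time (when a fact holds in the modeled world) from transaction time (when the database recorded it). The following is a minimal sketch of that distinction, not a data model taken from the survey. The `Fact` class, the `value_as_of` helper, and the sample addresses are all hypothetical, and the store is assumed to be append-only, with the most recently recorded fact winning.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Fact:
    key: str
    value: str
    valid_from: date   # first day the fact holds in the modeled world
    valid_to: date     # first day it no longer holds (exclusive bound)
    recorded_on: date  # transaction time: when the database learned it


def value_as_of(facts: List[Fact], key: str,
                valid_at: date, known_at: date) -> Optional[str]:
    """Value for `key` on `valid_at`, using only facts recorded by `known_at`."""
    candidates = [
        f for f in facts
        if f.key == key
        and f.recorded_on <= known_at
        and f.valid_from <= valid_at < f.valid_to
    ]
    if not candidates:
        return None
    # Assumption of this sketch: the most recently recorded fact wins.
    return max(candidates, key=lambda f: f.recorded_on).value


facts = [
    # Recorded in January 2020: Alice lives in Boston, with no known end date.
    Fact("alice", "Boston", date(2020, 1, 1), date.max, date(2020, 1, 5)),
    # Recorded in June 2021: Alice has lived in Denver since 1 June 2021.
    Fact("alice", "Denver", date(2021, 6, 1), date.max, date(2021, 6, 10)),
]

# What the database believed on 5 June 2021 about 1 July 2021.
before_update = value_as_of(facts, "alice", date(2021, 7, 1), date(2021, 6, 5))
# What it believed on 1 July 2021 about the same day.
after_update = value_as_of(facts, "alice", date(2021, 7, 1), date(2021, 7, 1))
```

Here `before_update` is "Boston" and `after_update` is "Denver": the valid-time question is the same in both calls, and only the transaction-time cutoff differs.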
A survey of schema versioning issues for database systems
Information and Software Technology, 1995
Cited by 110 (3 self)
Schema versioning is one of a number of related areas dealing with the same general problem, that of using multiple heterogeneous schemata for various database-related tasks. In particular, schema versioning, and its weaker companion, schema evolution, deal with the need to retain current data and software system functionality in the face of changing database structure. Schema versioning and schema evolution offer a solution to the problem by enabling intelligent handling of any temporal mismatch between data and data structure. This survey discusses the modelling, architectural and query language issues relating to the support of evolving schemata in database systems. An indication of the future directions of schema versioning research is also given.
The GMAP: a versatile tool for physical data independence
VLDB Journal, 1996
Cited by 73 (1 self)
Physical data independence is touted as a central feature of modern database systems. It allows users to frame queries in terms of the logical structure of the data, letting a query processor automatically translate them into optimal plans that access physical storage structures. Both relational and object-oriented systems, however, force users to frame their queries in terms of a logical schema that is directly tied to physical structures. We present an approach that eliminates this dependence. All storage structures are defined in a declarative language based on relational algebra as functions of a logical schema. We present an algorithm, integrated with a conventional query optimizer, that translates queries over this logical schema into plans that access the storage structures. We also show how to compile update requests into plans that update all relevant storage structures consistently and optimally. Finally, we report on experiments with a prototype implementation ...
An architecture for multi-user software development environments
Computing Systems, 1993
Cited by 60 (29 self)
We present an architecture for multi-user software development environments (MUSDEs), covering general, process-centered, and rule-based MUSDEs. Our architecture is founded on componentization, with particular concern for the capability to replace the synchronization component (to allow experimentation with novel concurrency control mechanisms) with minimal effects on other components while still supporting integration. The architecture has been implemented for the Marvel SDE.
Transactional Client-Server Cache Consistency: Alternatives and Performance
ACM Transactions on Database Systems, 1997
Cited by 58 (3 self)
M. J. Franklin et al.
1. INTRODUCTION
1.1 Client-Server Database System Architectures
Advances in distributed computing and object-orientation have combined to bring about the development of a new class of database systems. These systems employ a client-server computing model to provide both responsiveness to users and support for complex, shared data in a distributed environment. Current relational DBMS products are based on a query-shipping approach in which most query processing is performed at servers; clients are primarily used to manage the user interface. In contrast, object-oriented database systems (OODBMS), whi...
Performance and Scalability of Client-Server Database Architectures
1992
Cited by 49 (16 self)
Recent developments in software and hardware have changed the way database systems are built and operate. In this paper we present database architectures based on the Client-Server paradigm and study their performance and scalability under different query/update workloads. The architectures are: Standard Client-Server, Client-Server with Multiple Disks, and Enhanced Client-Server. Data replication and client query result caching are used as the main mechanisms to improve the query throughput. The role of the server is to maintain system-wide data consistency and, in the case of Enhanced Client-Server, to selectively propagate updates on demand. Our study shows that, except for the case of mostly update workloads, the Standard Client-Server architecture is outperformed by the other two architectures by one or more orders of magnitude. The Client-Server with Multiple Disks architecture offers performance comparable to that achieved by the Enhanced Client-Server for up to 100 clients, but the latter scales much better for larger numbers of clients.
Local Disk Caching for Client-Server Database Systems
In Proc. of the Conf. on Very Large Data Bases (VLDB), 1993
Cited by 41 (8 self)
The performance and scalability of a client-server database system can be improved by employing client disks for caching. Client disk caching is particularly useful due to the lower cost per byte (compared to memory) and non-volatility of disk storage. Because of performance considerations, however, disk caching is not a straightforward extension of memory caching. In this paper, we examine the performance impacts of adding client disks to the storage hierarchy of a client-server DBMS and investigate the tradeoffs inherent in keeping a large volume of disk-cached data consistent. We describe and analyze four algorithms for managing disk caches. We also address two extensions to cache management algorithms that arise due to the performance characteristics of large disk caches: (1) the need for methods to reduce the work performed by the server for ensuring transaction durability, and (2) techniques for bringing a large disk-resident cache up-to-date after an extended off-line period.
HAC: Hybrid Adaptive Caching for Distributed Storage Systems
In Proc. 17th ACM Symp. on Operating System Principles (SOSP), 1997
Cited by 41 (10 self)
This paper presents HAC, a novel technique for managing the client cache in a distributed, persistent object storage system. HAC is a hybrid between page and object caching that combines the virtues of both while avoiding their disadvantages. It achieves the low miss penalties of a page-caching system, but is able to perform well even when locality is poor, since it can discard pages while retaining their hot objects. It realizes the potentially lower miss rates of object-caching systems, yet avoids their problems of fragmentation and high overheads. Furthermore, HAC is adaptive: when locality is good it behaves like a page-caching system, while if locality is poor it behaves like an object-caching system. It is able to adjust the amount of cache space devoted to pages dynamically so that space in the cache can be used in the way that best matches the needs of the application. The paper also presents results of experiments that indicate that HAC outperforms other object storage systems across a wide range of cache sizes and workloads; it performs substantially better on the expected workloads, which have low to moderate locality. Thus we show that our hybrid, adaptive approach is the cache management technique of choice for distributed, persistent object systems.
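The idea of discarding a page while retaining its hot objects can be illustrated with a toy cache. This is a simplified sketch and not the HAC algorithm from the paper. The class name `HybridCache`, the capacity measured in objects, the plain access counters, and the fixed `hot_threshold` are all assumptions made for illustration.

```python
class HybridCache:
    """Toy client cache: fetches whole pages, but can keep single hot objects."""

    def __init__(self, capacity, hot_threshold=2):
        self.capacity = capacity            # maximum number of cached objects
        self.hot_threshold = hot_threshold  # accesses needed to count as hot
        self.pages = {}     # page_id -> {obj_id: value}, pages cached intact
        self.retained = {}  # obj_id -> value, hot objects of discarded pages
        self.usage = {}     # obj_id -> access count

    def size(self):
        return len(self.retained) + sum(len(p) for p in self.pages.values())

    def read(self, obj_id):
        """Return the cached value, or None on a miss."""
        if obj_id in self.retained:
            self.usage[obj_id] += 1
            return self.retained[obj_id]
        for page in self.pages.values():
            if obj_id in page:
                self.usage[obj_id] += 1
                return page[obj_id]
        return None  # caller would fetch the page and call load_page

    def load_page(self, page_id, objects):
        """Install a fetched page, then free space if over capacity."""
        fresh = {k: v for k, v in objects.items() if k not in self.retained}
        self.pages[page_id] = fresh
        for obj_id in fresh:
            self.usage.setdefault(obj_id, 0)
        while self.size() > self.capacity:
            victims = [p for p in self.pages if p != page_id]
            if victims:
                # Discard the least-used intact page, keeping its hot objects.
                self._discard_page(min(victims, key=self._page_heat))
            elif self.retained:
                # No intact page left to discard: drop the coldest kept object.
                coldest = min(self.retained, key=self.usage.get)
                del self.retained[coldest]
                del self.usage[coldest]
            else:
                break  # the new page alone exceeds capacity; keep it anyway

    def _page_heat(self, page_id):
        return sum(self.usage[k] for k in self.pages[page_id])

    def _discard_page(self, page_id):
        for obj_id, value in self.pages.pop(page_id).items():
            if self.usage[obj_id] >= self.hot_threshold:
                self.retained[obj_id] = value  # hot object survives its page
            else:
                del self.usage[obj_id]         # cold object is dropped


cache = HybridCache(capacity=4, hot_threshold=2)
cache.load_page("p1", {"a": 1, "b": 2})
cache.read("a"); cache.read("a")                   # "a" becomes hot
cache.load_page("p2", {"c": 3, "d": 4})
cache.read("c"); cache.read("c"); cache.read("c")  # "c" becomes hot
cache.load_page("p3", {"e": 5, "f": 6})            # forces p1 and p2 out
```

After the last call, only page `p3` is cached intact, while the hot objects `a` and `c` remain readable and the cold objects `b` and `d` are gone. With good locality (whole pages hot) the toy behaves like a page cache, and with poor locality it degenerates toward caching individual objects.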
Access Support Relations: An Indexing Method for Object Bases
Information Systems, 1992
Cited by 38 (6 self)
In this work access support relations are introduced as a means for optimizing query processing in object-oriented database systems. The general idea is to maintain separate structures (dissociated from the object representation) to redundantly store those object references that are frequently traversed in database queries. The proposed access support relation technique is no longer restricted to relate an object (tuple) to an atomic value (attribute value) as in conventional indexing. Rather, access support relations relate objects with each other and can span over reference chains which may contain collection-valued components in order to support queries involving path expressions. We present several alternative extensions and decompositions of access support relations for a given path expression, the best of which has to be determined according to the application-specific database usage profile. An analytical performance analysis of access support relations is developed. This analytical cost model is, in particular, used to determine the best access support relation extension and decomposition with respect to specific database configuration and usage characteristics.
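As a rough illustration of the general idea, the sketch below materializes the reference chains for one path expression into a separate table and uses it to answer a backward query without traversing the objects. It is a minimal sketch under stated assumptions, not the paper's design: objects are plain dictionaries with an `id` field, the path `divisions.manager.name` and all sample data are hypothetical, and only complete chains are stored, which is just one of the possible extensions.

```python
from collections import defaultdict


def key_of(value):
    """Objects are identified by their id; atomic values stand for themselves."""
    return value["id"] if isinstance(value, dict) else value


def build_access_support_relation(roots, path):
    """Return one tuple per complete reference chain along `path`."""
    rows = []

    def walk(value, depth, chain):
        chain = chain + (key_of(value),)
        if depth == len(path):
            rows.append(chain)
            return
        nxt = value.get(path[depth])
        if nxt is None:
            return  # incomplete chain: not stored in this sketch
        # A collection-valued attribute contributes one chain per member.
        for target in (nxt if isinstance(nxt, list) else [nxt]):
            walk(target, depth + 1, chain)

    for root in roots:
        walk(root, 0, ())
    return rows


def index_roots_by_end(rows):
    """Map the last column of each chain back to the root objects reaching it."""
    index = defaultdict(set)
    for row in rows:
        index[row[-1]].add(row[0])
    return index


# Hypothetical object base: companies -> divisions -> manager -> name.
alice = {"id": "e1", "name": "Alice"}
bob = {"id": "e2", "name": "Bob"}
d1 = {"id": "d1", "manager": alice}
d2 = {"id": "d2", "manager": bob}
d3 = {"id": "d3", "manager": alice}
c1 = {"id": "c1", "divisions": [d1, d2]}
c2 = {"id": "c2", "divisions": [d3]}

rows = build_access_support_relation([c1, c2], ["divisions", "manager", "name"])
by_name = index_roots_by_end(rows)
companies_with_alice = sorted(by_name["Alice"])
```

Each row has the form (company, division, manager, name), so a query such as "which companies have a division managed by Alice" becomes a lookup on the last column. The cost is redundancy: the relation has to be kept consistent whenever a reference on the path changes.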

