Query optimization in database systems
ACM Computing Surveys, 1984
Cited by 194 (0 self)
Efficient methods of processing unanticipated queries are a crucial prerequisite for the success of generalized database management systems. A wide variety of approaches to improve the performance of query evaluation algorithms have been proposed: logic-based and semantic transformations, fast implementations of basic operations, and combinatorial or heuristic algorithms for generating alternative access plans and choosing among them. These methods are presented in the framework of a general query evaluation procedure using the relational calculus representation of queries. In addition, nonstandard query optimization issues such as higher level query evaluation, query optimization in distributed databases, and use of database machines are addressed. The focus, however, is on query optimization in centralized database systems.
The state of the art in distributed query processing
ACM Computing Surveys, 2000
Cited by 182 (2 self)
Distributed data processing is fast becoming a reality. Businesses want to have it for many reasons, and they often must have it in order to stay competitive. While much of the infrastructure for distributed data processing is already in place (e.g., modern network technology), there are a number of issues which still make distributed data processing a complex undertaking: (1) distributed systems can become very large, involving thousands of heterogeneous sites including PCs and mainframe server machines; (2) the state of a distributed system changes rapidly because the load of sites varies over time and new sites are added to the system; (3) legacy systems need to be integrated; such legacy systems usually have not been designed for distributed data processing and now need to interact with other (modern) systems in a distributed environment. This paper presents the state of the art of query processing for distributed database and information systems. The paper presents the "textbook" architecture for distributed query processing and a series of techniques that are particularly useful for distributed database systems. These techniques include special join techniques, techniques to exploit intra-query parallelism, techniques to reduce communication costs, and techniques to exploit caching and replication of data. Furthermore, the paper discusses different kinds of distributed systems such as client-server, middleware (multi-tier), and heterogeneous database systems and shows how query processing works in these systems.
Categories and subject descriptors: E.5 [Data]: Files; H.2.4 [Database Management Systems]: distributed databases, query processing; H.2.5 [Heterogeneous Databases]: data translation
General terms: algorithms; performance
Additional key words and phrases: query optimization; query execution; client-server databases; middleware; multi-tier architectures; database application systems; wrappers; replication; caching; economic models for query processing; dissemination-based information systems
Concurrency control in advanced database applications
ACM Computing Surveys, 1991
Cited by 160 (16 self)
Concurrency control has been thoroughly studied in the context of traditional database applications such as banking and airline reservations systems. There are relatively few studies, however, that address the concurrency control issues of advanced database applications such as CAD/CAM and software development environments. The
Mariposa: A new architecture for distributed data
Proc. 10th Int. Conf. on Data Engineering, 1994
Cited by 42 (3 self)
We describe the design of Mariposa, an experimental distributed data management system that provides high performance in an environment of high data mobility and heterogeneous host capabilities. The Mariposa design unifies the approaches taken by distributed file systems and distributed databases. In addition, Mariposa provides a general, flexible platform for the development of new algorithms for distributed query optimization, storage management, and scalable data storage structures. This flexibility is primarily due to a unique rule-based design that permits autonomous, local-knowledge decisions to be made regarding data placement, query execution location, and storage management.
Distributed Active Catalogs and Meta-Data Caching in Descriptive Name Services
In IEEE International Conference on Distributed Computing Systems, 1993
Cited by 30 (7 self)
Today's global internetworks challenge the ability of name services and other information services to locate data quickly. We introduce a distributed active catalog and meta-data caching for optimizing queries in this environment. Our active catalog constrains the search space for a query by returning a list of data repositories where the answer to the query is likely to be found. Meta-data caching improves performance by keeping frequently used characterizations of the search space close to the user, and eliminating active catalog communication and processing costs. When searching for query responses, our techniques contact only the small percentage of the data repositories with actual responses, resulting in search times of a few seconds. We implemented a distributed active catalog and meta-data caching in a prototype descriptive name service called "Nomenclator." We present performance results for Nomenclator in a search space of 1000 data repositories.
A Publish & Subscribe Architecture for Distributed Metadata Management
2002
Cited by 15 (3 self)
The emergence of electronic marketplaces and other electronic services and applications on the Internet is creating a growing demand for effective management of resources. Due to the nature of the Internet such information changes rapidly. Furthermore, such information must be available for a large number of users and applications, and copies of pieces of information should be stored near the users that need this particular information. In this paper, we present the architecture of MDV, a distributed metadata management system. MDV has a 3-tier architecture and supports caching and replication in the middle-tier so that queries can be evaluated locally. Users and applications specify the information they need and that is replicated using a specialized subscription language. In order to keep replicas up-to-date and initiate the replication of new and relevant information, MDV implements a novel, scalable publish & subscribe algorithm. We describe this algorithm in detail, show how it can be implemented using a standard relational database system, and present the results of performance experiments conducted using our prototype implementation.
THALIA: Test harness for the assessment of legacy information integration approaches
In Proceedings of the International Conference on Data Engineering (ICDE), 2005
Cited by 10 (1 self)
We introduce our new, publicly available testbed and benchmark called THALIA (Test Harness for the Assessment of Legacy information Integration Approaches) for testing and evaluating integration
Descriptive Name Services For Large Internets
1993
Cited by 8 (2 self)
This thesis addresses the challenge of locating people, resources, and other objects in the global Internet. As the Internet grows beyond a million hosts in tens of thousands of organizations, it is increasingly difficult to locate any particular object. Hierarchical name services are frustrating, because users must guess the unique names for objects or navigate the name space to find information. Descriptive (i.e. relational) name services offer the promise of simple resource location through a non-procedural query language. Users locate resources by describing resource attributes. This thesis makes the promise of descriptive name services real by providing fast query processing in large internets. The key to speed in descriptive query processing is constraining the search space using two new techniques, called an active catalog and meta-data caching. The active catalog constrains the search space for a query by returning a list of data repositories where the answer to the query is li...
An Object Management System for Multi-User Programming Environments
1991
Cited by 6 (1 self)
same services in different ways). This is in contrast to the monolithic approach in which components are tightly-connected and interdependent. In particular, when the components interconnect in a layered fashion, the higher the layer is, the more semantics it has about the domain. In programming environments, the highest layer is the human programmer, the middle layer is the environment itself, with limited knowledge, and the lowest layers that support the environment have minimal semantic knowledge about the domain. The various layers should have great flexibility in implementing their functionality. This is the intuition for the first model presented in chapter 2 and its implementation in MARVEL, presented in 6. The second approach is to exploit the object-oriented paradigm in distribution, i.e., impose object-oriented decomposition on the participating nodes in a distributed system as a coarser layer of abstraction on top of the data itself. This
AQuES: An Agent-based Query Evaluation System
In Proc. Int'l. Conf. on Cooperative Information Systems, 1997
Cited by 2 (1 self)
Optimization for distributed and parallel database systems is an overly complex problem. This complexity arises due to the various kinds of parallelism and the coexistence of heterogeneous hardware modules. The query optimizers for those systems must exploit parallel algorithms for algebra operators, independent data parallelism and resource scheduling. In addition, unexpected deviations of the workload at runtime enforce new, dynamic execution strategies. This paper reports ongoing work in developing an agent-based query optimization and execution system to cope with this problem. It is based on the principles of distributed, symmetric processes which work in a cooperative, yet independent manner. This approach enables dynamic adaptation to changes in the execution environment with local and simple actions. We present the architecture of the components and outline the principle of distributed query execution within this software system. Keywords: parallel, distributed DBMS; cooper...

