The design and implementation of hierarchical software systems with reusable components
ACM Transactions on Software Engineering and Methodology, 1992
Cited by 347 (71 self)
We present a domain-independent model of hierarchical software system design and construction that is based on interchangeable software components and large-scale reuse. The model unifies the conceptualizations of two independent projects, Genesis and Avoca, that are successful examples of software component/building-block technologies and domain modeling. Building-block technologies exploit large-scale reuse, rely on open architecture software, and elevate the granularity of programming to the subsystem level. Domain modeling formalizes the similarities and differences among systems of a domain. We believe our model is a blueprint for achieving software component technologies in many domains.
Query optimization in database systems
ACM Computing Surveys, 1984
Cited by 194 (0 self)
Efficient methods of processing unanticipated queries are a crucial prerequisite for the success of generalized database management systems. A wide variety of approaches to improve the performance of query evaluation algorithms have been proposed: logic-based and semantic transformations, fast implementations of basic operations, and combinatorial or heuristic algorithms for generating alternative access plans and choosing among them. These methods are presented in the framework of a general query evaluation procedure using the relational calculus representation of queries. In addition, nonstandard query optimization issues such as higher level query evaluation, query optimization in distributed databases, and use of database machines are addressed. The focus, however, is on query optimization in centralized database systems.
The state of the art in distributed query processing
ACM Computing Surveys, 2000
Cited by 182 (2 self)
Distributed data processing is fast becoming a reality. Businesses want to have it for many reasons, and they often must have it in order to stay competitive. While much of the infrastructure for distributed data processing is already in place (e.g., modern network technology), there are a number of issues which still make distributed data processing a complex undertaking: (1) distributed systems can become very large, involving thousands of heterogeneous sites including PCs and mainframe server machines; (2) the state of a distributed system changes rapidly because the load of sites varies over time and new sites are added to the system; (3) legacy systems need to be integrated; such legacy systems usually have not been designed for distributed data processing and now need to interact with other (modern) systems in a distributed environment. This paper presents the state of the art of query processing for distributed database and information systems. The paper presents the "textbook" architecture for distributed query processing and a series of techniques that are particularly useful for distributed database systems. These techniques include special join techniques, techniques to exploit intra-query parallelism, techniques to reduce communication costs, and techniques to exploit caching and replication of data. Furthermore, the paper discusses different kinds of distributed systems such as client-server, middleware (multi-tier), and heterogeneous database systems and shows how query processing works in these systems.
Categories and subject descriptors: E.5 [Data]: Files; H.2.4 [Database Management Systems]: distributed databases, query processing; H.2.5 [Heterogeneous Databases]: data translation
General terms: algorithms; performance
Additional key words and phrases: query optimization; query execution; client-server databases; middleware; multi-tier architectures; database application systems; wrappers; replication; caching; economic models for query processing; dissemination-based information systems
Query Optimization
1996
Cited by 102 (2 self)
Imagine yourself standing in front of an exquisite buffet filled with numerous delicacies. Your goal is to try them all out, but you need to decide in what order. What exchange of tastes will maximize the overall pleasure of your palate? Although much less pleasurable and subjective, that is the type of problem that query optimizers are called to solve. Given a query, there are many plans that a database management system (DBMS) can follow to process it and produce its answer. All plans are equivalent in terms of their final output but vary in their cost, i.e., the amount of time that they need to run. What is the plan that needs the least amount of time? Such query optimization is absolutely necessary in a DBMS. The cost difference between two alternatives can be enormous. For example, consider the following database schema, which will be...
Reducing the braking distance of an SQL query engine
In Proc. of the 24th VLDB Conf., 1998
Cited by 84 (6 self)
In a recent paper, we proposed adding a STOP AFTER clause to SQL to permit the cardinality of a query result to be explicitly limited by query writers and query tools. We demonstrated the usefulness of having this clause, showed how to extend a traditional cost-based query optimizer to accommodate it, and demonstrated via DB2-based simulations that large performance gains are possible when STOP AFTER queries are explicitly supported by the database engine. In this paper, we present several new strategies for efficiently processing STOP AFTER queries. These strategies, based largely on the use of range partitioning techniques, offer significant additional savings for handling STOP AFTER queries that yield sizeable result sets. We describe classes of queries where such savings would indeed arise and present experimental measurements that show the benefits and tradeoffs associated with the new processing strategies.
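The abstract does not spell out the partitioning strategies, so the following is only a minimal Python sketch of the general idea, under stated assumptions: `stop_after_sort`, `stop_after_partitioned`, and the `cutoff` parameter are hypothetical names, and the cutoff is assumed to come from some estimate such as a histogram. It contrasts sorting the whole input with sorting only the range partition expected to contain the first N rows.

```python
import heapq


def stop_after_sort(rows, key, n):
    """Baseline: sort the whole input, then keep the first n rows."""
    return sorted(rows, key=key)[:n]


def stop_after_partitioned(rows, key, n, cutoff):
    """Range-partition on a guessed cutoff and sort only the low partition.

    `cutoff` is an assumed estimate of the n-th smallest key. If the guess
    is too small, the missing rows are taken from the high partition.
    """
    low = [r for r in rows if key(r) <= cutoff]
    if len(low) >= n:
        # The low partition alone contains the answer.
        return sorted(low, key=key)[:n]
    # Guess was too small: every low row qualifies, top up from the rest.
    high = [r for r in rows if key(r) > cutoff]
    missing = n - len(low)
    return sorted(low, key=key) + heapq.nsmallest(missing, high, key=key)


if __name__ == "__main__":
    data = [42, 7, 19, 3, 88, 61, 25, 14]
    print(stop_after_sort(data, lambda r: r, 3))
    print(stop_after_partitioned(data, lambda r: r, 3, cutoff=20))
```

If the guessed cutoff admits at least N rows, only that partition is sorted; otherwise the sketch falls back to the remaining rows, so the result matches the baseline either way.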
An economic paradigm for query processing and data migration in Mariposa
Proc. 3rd International Conf. Parallel and Distributed Information Systems, 1994
Cited by 80 (1 self)
Many new database applications require very large volumes of data. Mariposa is a data base system under construction at Berkeley responding to this need. Mariposa objects can be stored over thousands of autonomous sites and on memory hierarchies with very large capacity. This scale of the system leads to complex query execution and storage management issues, unsolvable in practice with traditional techniques. We propose an economic paradigm as the solution. A query receives a budget which it spends to obtain the answers. Each site attempts to maximize income by buying and selling storage objects, and processing queries for locally stored objects. We present the protocols which underlie the Mariposa economy.
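The protocols themselves are not given in this abstract. As a rough illustration only, the Python sketch below assumes a hypothetical broker that collects bids from sites and picks the fastest bid the query's budget can afford. `Bid`, `choose_bid`, and the price and delay fields are invented for this example and are not Mariposa's actual interface.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Bid:
    """A hypothetical offer from one site to process a query."""
    site: str
    price: float  # what the site charges
    delay: float  # how long the site says it will take


def choose_bid(bids: List[Bid], budget: float) -> Optional[Bid]:
    """Pick the fastest bid whose price fits the budget.

    Ties on delay go to the cheaper bid. Returns None when no site
    is affordable.
    """
    affordable = [b for b in bids if b.price <= budget]
    if not affordable:
        return None
    return min(affordable, key=lambda b: (b.delay, b.price))


if __name__ == "__main__":
    offers = [Bid("a", 5.0, 2.0), Bid("b", 3.0, 4.0), Bid("c", 9.0, 1.0)]
    print(choose_bid(offers, budget=6.0))
```

A larger budget lets the query buy a faster answer, which is the trade-off the economic framing is meant to expose.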
Vertical Partitioning Algorithms for Database Design
ACM Transactions on Database Systems, 1984
Cited by 75 (8 self)
This paper addresses the vertical partitioning of a set of logical records or a relation into fragments. The rationale behind vertical partitioning is to produce fragments, groups of attribute columns, that “closely match” the requirements of transactions. Vertical partitioning is applied in three contexts: a database stored on devices of a single type, a database stored in different memory levels, and a distributed database. In a two-level memory hierarchy, most transactions should be processed using the fragments in primary memory. In distributed databases, fragment allocation should maximize the amount of local transaction processing. Fragments may be nonoverlapping or overlapping. A two-phase approach for the determination of fragments is proposed; in the first phase, the design is driven by empirical objective functions which do not require specific cost information. The second phase performs cost optimization by incorporating the knowledge of a specific application environment. The algorithms presented in this paper have been implemented, and examples of their actual use are shown.
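The abstract does not name the empirical objective functions of the first phase. One measure that fits the description, since it needs only transaction access patterns and frequencies and no cost figures, is pairwise attribute affinity: how often two attributes are used together. The Python sketch below is an illustration under that assumption; `attribute_affinity` and the sample workload are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def attribute_affinity(transactions):
    """Sum, for each attribute pair, the frequencies of the transactions
    that access both attributes.

    `transactions` is a list of (attribute_set, frequency) pairs.
    Returns a dict keyed by alphabetically ordered attribute pairs.
    """
    affinity = defaultdict(int)
    for attrs, freq in transactions:
        for a, b in combinations(sorted(attrs), 2):
            affinity[(a, b)] += freq
    return dict(affinity)


if __name__ == "__main__":
    workload = [
        ({"id", "name"}, 10),
        ({"id", "salary"}, 5),
        ({"id", "name", "salary"}, 2),
    ]
    for pair, weight in sorted(attribute_affinity(workload).items()):
        print(pair, weight)
```

Pairs with high affinity are candidates for the same fragment, while pairs that rarely occur together can be split apart.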
Data allocation in distributed database systems
ACM Transactions on Database Systems, 1988
Cited by 61 (1 self)
The problem of allocating the data of a database to the sites of a communication network is investigated. This problem deviates from the well-known file allocation problem in several aspects. First, the objects to be allocated are not known a priori; second, these objects are accessed by schedules that contain transmissions between objects to produce the result. A model that makes it possible to compare the cost of allocations is presented; the cost can be computed for different cost functions and for processing schedules produced by arbitrary query processing algorithms. For minimizing the total transmission cost, a method is proposed to determine the fragments to be allocated from the relations in the conceptual schema and the queries and updates executed by the users. For the same cost function, the complexity of the data allocation problem is investigated. Methods for obtaining optimal and heuristic solutions under various ways of computing the cost of an allocation are presented and compared. Two different approaches to the allocation management problem are presented and their merits are discussed.
Buffer management in relational database systems
ACM Transactions on Database Systems, 1986
Cited by 46 (1 self)
The hot-set model, characterizing the buffer requirements of relational queries, is presented. This model allows the system to determine the optimal buffer space to be allocated to a query; it can also be used by the query optimizer to derive efficient execution plans accounting for the available buffer space, and by a query scheduler to prevent thrashing. The hot-set model is compared with the working-set model. A simulation study is presented.
Categories and Subject Descriptors: H.2.4 [Database Management]: Systems-query processing
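As a small illustration of why the buffer requirements of a query can have sharp thresholds, the Python sketch below counts page faults for a looping scan, such as repeatedly reading the inner relation of a nested-loop join. It assumes LRU replacement and uses a hypothetical helper, `lru_faults`; it is not the paper's simulation.

```python
from collections import OrderedDict


def lru_faults(references, frames):
    """Count page faults for a reference string under LRU replacement.

    `frames` is the number of buffer pages available (at least 1).
    """
    buffer = OrderedDict()  # least recently used page comes first
    faults = 0
    for page in references:
        if page in buffer:
            buffer.move_to_end(page)  # mark as most recently used
        else:
            faults += 1
            if len(buffer) >= frames:
                buffer.popitem(last=False)  # evict least recently used
            buffer[page] = True
    return faults


if __name__ == "__main__":
    loop = list(range(5)) * 3  # scan the same 5 pages three times
    for frames in (3, 4, 5, 6):
        print(frames, lru_faults(loop, frames))
```

For a 5-page loop repeated three times, 4 frames give 15 faults and 5 frames give 5: one page short of the loop size, every reference misses, and extra frames beyond the loop size bring no further benefit.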
Mariposa: A new architecture for distributed data
Proc. 10th Int. Conf. on Data Engineering, 1994
Cited by 42 (3 self)
We describe the design of Mariposa, an experimental distributed data management system that provides high performance in an environment of high data mobility and heterogeneous host capabilities. The Mariposa design unifies the approaches taken by distributed file systems and distributed databases. In addition, Mariposa provides a general, flexible platform for the development of new algorithms for distributed query optimization, storage management, and scalable data storage structures. This flexibility is primarily due to a unique rule-based design that permits autonomous, local-knowledge decisions to be made regarding data placement, query execution location, and storage management.

