Query evaluation techniques for large databases
- ACM COMPUTING SURVEYS
, 1993
Cited by 592 (7 self)
Database management systems will continue to manage large data volumes. Thus, efficient algorithms for accessing and manipulating large sets and sequences will be required to provide acceptable performance. The advent of object-oriented and extensible database systems will not solve this problem. On the contrary, modern data models exacerbate it: In order to manipulate large sets of complex objects as efficiently as today’s database systems manipulate simple records, query processing algorithms and software will become more complex, and a solid understanding of algorithm and architectural issues is essential for the designer of database management software. This survey provides a foundation for the design and implementation of query execution facilities in new database management systems. It describes a wide array of practical query evaluation techniques for both relational and post-relational database systems, including iterative execution of complex query evaluation plans, the duality of sort- and hash-based set matching algorithms, types of parallel query execution and their implementation, and special operators for emerging database application domains.
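The duality of sort- and hash-based set matching mentioned in this abstract can be illustrated with a small sketch. It is not taken from the survey; the function names and the row-as-dict representation are assumptions made for illustration. Both functions compute the same equi-join, one by building and probing a hash table, the other by sorting both inputs and merging runs of equal keys.

```python
from collections import defaultdict


def hash_join(left, right, key):
    """Build a hash table on `left`, then probe it with each row of `right`."""
    table = defaultdict(list)
    for row in left:
        table[row[key]].append(row)
    return [{**l, **r} for r in right for l in table.get(r[key], [])]


def sort_merge_join(left, right, key):
    """Sort both inputs on `key`, then merge runs of equal key values."""
    left = sorted(left, key=lambda row: row[key])
    right = sorted(right, key=lambda row: row[key])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the end of the run of equal keys on each side.
            i_end = i
            while i_end < len(left) and left[i_end][key] == lk:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end][key] == lk:
                j_end += 1
            # Emit every pairing within the two runs.
            for l in left[i:i_end]:
                for r in right[j:j_end]:
                    out.append({**l, **r})
            i, j = i_end, j_end
    return out
```

The two functions may return rows in a different order, so any comparison of their output should ignore ordering.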
Weaving Relations for Cache Performance
, 2001
Cited by 83 (14 self)
Relational database systems have traditionally optimized for I/O performance and organized records sequentially on disk pages using the N-ary Storage Model (NSM) (a.k.a., slotted pages). Recent research, however, indicates that cache utilization and performance are becoming increasingly important on modern platforms. In this paper, we first demonstrate that in-page data placement is the key to high cache performance and that NSM exhibits low cache utilization on modern platforms. Next, we propose a new data organization model called PAX (Partition Attributes Across), that significantly improves cache performance by grouping together all values of each attribute within each page. Because PAX only affects layout inside the pages, it incurs no storage penalty and does not affect I/O behavior. According to our experimental results, when compared to NSM (a) PAX exhibits superior cache and memory bandwidth utilization, saving at least 75% of NSM's stall time due to data cache accesses, (b) range selection queries and updates on memory-resident relations execute 17-25% faster, and (c) TPC-H queries involving I/O execute 11-48% faster.
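The difference between the two in-page layouts can be sketched as follows. This is a simplified illustration, not the paper's page format: real pages carry headers, offsets, and variable-length handling, and the names `nsm_page`, `pax_page`, and `pax_record` are hypothetical.

```python
def nsm_page(records):
    """NSM: whole records are laid out one after another in the page."""
    return [value for record in records for value in record]


def pax_page(records):
    """PAX: the same records stay in the same page, but all values of
    each attribute are grouped together (one group per attribute)."""
    return [list(column) for column in zip(*records)]


def pax_record(page, i):
    """Reassemble record i by taking position i from every attribute group."""
    return tuple(group[i] for group in page)
```

A scan that reads a single attribute touches one contiguous group under the PAX-style layout, whereas under the NSM-style layout the wanted values are interleaved with the other attributes of each record.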
Data allocation in distributed database systems
- ACM Transactions on Database Systems
, 1988
Cited by 61 (1 self)
The problem of allocating the data of a database to the sites of a communication network is investigated. This problem deviates from the well-known file allocation problem in several aspects. First, the objects to be allocated are not known a priori; second, these objects are accessed by schedules that contain transmissions between objects to produce the result. A model that makes it possible to compare the cost of allocations is presented; the cost can be computed for different cost functions and for processing schedules produced by arbitrary query processing algorithms. For minimizing the total transmission cost, a method is proposed to determine the fragments to be allocated from the relations in the conceptual schema and the queries and updates executed by the users. For the same cost function, the complexity of the data allocation problem is investigated. Methods for obtaining optimal and heuristic solutions under various ways of computing the cost of an allocation are presented and compared. Two different approaches to the allocation management problem are presented and their merits are discussed.
A Comprehensive Approach to Horizontal Class Fragmentation in a Distributed Object Based System
- International Journal of Distributed and Parallel Databases
, 1995
Cited by 40 (6 self)
Optimal application performance on a Distributed Object Based System (DOBS) requires class fragmentation and the development of allocation schemes to place fragments at distributed sites so data transfer is minimized. Fragmentation enhances application performance by reducing the amount of irrelevant data accessed and the amount of data transferred unnecessarily between distributed sites. Algorithms for effecting horizontal and vertical fragmentation of relations exist, but fragmentation techniques for class objects in a distributed object based system are yet to appear in the literature. This paper first reviews a taxonomy of the fragmentation problem in a distributed object base. The paper then contributes by presenting a comprehensive set of algorithms for horizontally fragmenting the four realizable class models on the taxonomy. The fundamental approach is top-down, where the entity of fragmentation is the class object. Our approach consists of first generating primary horizontal fragments of a class based only on applications accessing this class, and secondly generating derived horizontal fragments of the class arising from primary fragments of its subclasses, its complex attributes (contained classes), and/or its complex methods classes.
Implementation techniques of complex objects
- Proceedings of the International Conference on Very Large Data Bases
, 1986
Cited by 38 (1 self)
Efficient support for retrieval and update of complex objects is a unifying requirement of many areas of computing such as business, artificial intelligence, office automation, and computer aided design. In this paper, we investigate and analyze a range of alternative techniques for the storage of complex objects. These alternatives vary between the direct storage representation of complex objects and the fully decomposed storage representation of complex objects. Qualitative arguments for each of the strategies are discussed. Analytical results and initial implementation results based on fully decomposed schemes are presented.
A Distributed Execution Environment for Large-Scale Workflow Management Systems with Subnets and Server Migration
- IFCIS Conf. on Cooperative Information Systems (CoopIS)
, 1997
Cited by 25 (5 self)
If the number of users within a workflow management system (WFMS) increases, a central workflow server (WF-server) and a single local area network (LAN) may become overloaded. The approach presented in this paper describes an execution environment which is able to manage a growing number of users by adding new servers and subnets. The basic idea is to decompose processes into parts which are controlled by different WF-servers. That is, during the execution of a workflow instance its execution (step) control may migrate from one WF-server to another. By selecting the appropriate physical servers (for hosting the WF-servers) in the appropriate LANs, communication costs and individual WF-server workload can be reduced significantly. 1. Introduction For a couple of years there has been a growing interest in using WFMS for implementing process-oriented application systems. As the benefit of such application systems increases with the number of applications being served, the number...
Data Page Layouts for Relational Databases on Deep Memory Hierarchies
, 2002
Cited by 24 (2 self)
Relational database systems have traditionally optimized for I/O performance and organized records sequentially on disk pages using the N-ary Storage Model (NSM) (a.k.a., slotted pages).
A Mixed Fragmentation Methodology for Initial Distributed Database Design
, 1995
Cited by 20 (2 self)
We define mixed fragmentation as a process of simultaneously applying horizontal and vertical fragmentation on a relation. It can be achieved in one of two ways: by performing horizontal fragmentation followed by vertical fragmentation or by performing vertical fragmentation followed by horizontal fragmentation. The need for mixed fragmentation arises in distributed databases because database users usually access subsets of data which are vertical and horizontal fragments of global relations and there is a need to process queries or transactions that would access these fragments optimally. We present algorithms for generating candidate vertical and horizontal fragmentation schemes and propose a methodology for distributed database design using these fragmentation schemes. When applied together, these schemes form a grid. This grid, consisting of cells, is then merged to form mixed fragments so as to minimize the number of disk accesses required to process the distributed transactions....
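How a horizontal scheme and a vertical scheme combine into a grid can be sketched as below. This is only an illustration under stated assumptions: the function `grid_cells`, the dictionary-based rows, and the named predicates and attribute groups are hypothetical, and the merging step that the abstract describes (combining cells into mixed fragments to minimize disk accesses) is not shown.

```python
def grid_cells(rows, predicates, attribute_groups):
    """Cross a horizontal scheme (named row predicates) with a vertical
    scheme (named attribute groups); each pair yields one grid cell."""
    cells = {}
    for h_name, predicate in predicates.items():
        # Horizontal step: keep only the rows satisfying this predicate.
        selected = [row for row in rows if predicate(row)]
        for v_name, attrs in attribute_groups.items():
            # Vertical step: project the selected rows onto one attribute group.
            cells[(h_name, v_name)] = [
                {a: row[a] for a in attrs} for row in selected
            ]
    return cells
```

In this sketch each attribute group repeats the key attribute so that a row can be reassembled from its cells; that choice is an assumption of the example, not something the abstract states.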
An Objective Function for Vertically Partitioning Relations in Distributed Databases and its Analysis
, 1992
Cited by 17 (0 self)
The design of distributed databases is an optimization problem requiring solutions to several interrelated problems including: data fragmentation, allocation, and local optimization. Each problem can be solved with several different approaches, thereby making distributed database design a very difficult task. Although there is a large body of work on the design of data fragmentation, most of it is either ad hoc solutions or formal solutions for special cases (e.g., binary vertical partitioning). In this paper, we address the general vertical partitioning problem formally. We first provide a comparison of work in the area of data clustering and distributed databases to highlight the thrust of this work. We derive an objective function that generalizes and subsumes earlier work on vertical partitioning in databases. The objective function developed in this paper provides a basis for developing heuristic algorithms for vertical partitioning. The objective function also facilitates ...
Data Morphing: An Adaptive, Cache-Conscious Storage Technique
- In Proc. VLDB
, 2003
Cited by 15 (0 self)
The number of processor cache misses has a critical impact on the performance of DBMSs running on servers with large main-memory configurations.

