Results 1 - 10 of 27
Query evaluation techniques for large databases
- ACM COMPUTING SURVEYS, 1993
Cited by 592 (7 self)
Database management systems will continue to manage large data volumes. Thus, efficient algorithms for accessing and manipulating large sets and sequences will be required to provide acceptable performance. The advent of object-oriented and extensible database systems will not solve this problem. On the contrary, modern data models exacerbate it: In order to manipulate large sets of complex objects as efficiently as today’s database systems manipulate simple records, query processing algorithms and software will become more complex, and a solid understanding of algorithm and architectural issues is essential for the designer of database management software. This survey provides a foundation for the design and implementation of query execution facilities in new database management systems. It describes a wide array of practical query evaluation techniques for both relational and post-relational database systems, including iterative execution of complex query evaluation plans, the duality of sort- and hash-based set matching algorithms, types of parallel query execution and their implementation, and special operators for emerging database application domains.
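The duality of sort- and hash-based set matching mentioned in this abstract can be illustrated with a small sketch. It is not taken from the survey: the function names and the in-memory `(key, payload)` representation are assumptions made for illustration, and both functions ignore the memory limits and disk I/O that the survey is concerned with.

```python
from collections import defaultdict


def hash_join(left, right):
    """Match (key, payload) pairs by building a hash table on the left input."""
    table = defaultdict(list)
    for key, payload in left:
        table[key].append(payload)
    return [(key, l, r) for key, r in right for l in table.get(key, [])]


def sort_merge_join(left, right):
    """Match the same pairs by sorting both inputs and merging them."""
    left = sorted(left, key=lambda pair: pair[0])
    right = sorted(right, key=lambda pair: pair[0])
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the end of the run of equal keys on each side.
            i_end = i
            while i_end < len(left) and left[i_end][0] == lk:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end][0] == lk:
                j_end += 1
            for _, l in left[i:i_end]:
                for _, r in right[j:j_end]:
                    out.append((lk, l, r))
            i, j = i_end, j_end
    return out
```

Both functions return the same set of matches; only the order of the output and the way equal keys are brought together differ.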
A Survey of Adaptive Sorting Algorithms
- 1992
Cited by 55 (3 self)
Introduction and Survey; F.2.2 [Analysis of Algorithms and Problem Complexity]: Nonnumerical Algorithms and Problems --- Sorting and Searching; E.5 [Data]: Files --- Sorting/searching; G.3 [Mathematics of Computing]: Probability and Statistics --- Probabilistic algorithms; E.2 [Data Storage Representation]: Composite structures, linked representations.
General Terms: Algorithms, Theory.
Additional Key Words and Phrases: Adaptive sorting algorithms, Comparison trees, Measures of disorder, Nearly sorted sequences, Randomized algorithms.
CONTENTS
INTRODUCTION: I.1 Optimal adaptivity; I.2 Measures of disorder; I.3 Organization of the paper
1. WORST-CASE ADAPTIVE (INTERNAL) SORTING ALGORITHMS: 1.1 Generic Sort; 1.2 Cook-Kim division; 1.3 Partition Sort; 1.4 Exponential Search; 1.5 Adaptive Merging
2. EXPECTED-CASE ADAPTIV
Multidatabase Query Optimization
- Distributed and Parallel Databases, 1997
Cited by 13 (3 self)
A multidatabase system (MDBS) allows users to simultaneously access heterogeneous and autonomous databases using an integrated schema and a single global query language. The query optimization problem in MDBSs is quite different from the query optimization problem in distributed homogeneous databases due to schema heterogeneity and autonomy of local database systems. In this work, we consider the optimization of query distribution in the case of data replication and the optimization of intersite joins, that is, the join of the results returned by the local sites in response to the global subqueries. The algorithms presented for the optimization of intersite joins try to maximize the parallelism in execution and take the federated nature of the problem into account. It has also been shown through a comparative performance study that the proposed intersite join optimization algorithms are efficient. The approach presented can easily be generalized to any operation required for intersi...
TDBM: A DBM library with atomic transactions
- In Summer '92 USENIX, 1992
Cited by 12 (1 self)
The dbm database library [1] introduced disk-based extensible hashing to UNIX. The library consists of functions to use a simple database consisting of key/value pairs. A number of work-alikes have been developed, offering additional features [5] and free source code [14,25]. Recently, a new package was developed that also offers improved performance [19]. None of these implementations, however, provides fault-tolerant behaviour. In many applications, a single high-level operation may cause many database items to be updated, created, or deleted. If the application crashes while processing the operation, the database could be left in an inconsistent state. Current versions of dbm do not handle this problem. Existing dbm implementations do not support concurrent access, even though the use of lightweight processes in a UNIX environment is growing. To address these deficiencies, tdbm was developed. Tdbm is a transaction processing database with a dbm-like interface. It provides nested atomic transactions, volatile and persistent databases, and support for very large objects and distributed operation. This paper describes the design and implementation of tdbm and examines its performance.
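As a rough illustration of what nested atomic transactions over a key/value store mean, here is a minimal in-memory sketch. It is not tdbm's interface: the `Txn` class and its method names are hypothetical, the store is an ordinary dict, and durability, crash recovery and concurrency control are all left out.

```python
_DELETED = object()  # marker for a key deleted inside a transaction


class Txn:
    """Hypothetical nested transaction over a dict-like store (illustration only)."""

    def __init__(self, parent):
        self.parent = parent  # either the backing dict or an enclosing Txn
        self.writes = {}      # uncommitted changes made in this transaction

    def begin(self):
        """Start a child transaction nested inside this one."""
        return Txn(self)

    def get(self, key, default=None):
        if key in self.writes:
            value = self.writes[key]
            return default if value is _DELETED else value
        return self.parent.get(key, default)

    def put(self, key, value):
        self.writes[key] = value

    def delete(self, key):
        self.writes[key] = _DELETED

    def commit(self):
        """Publish changes to the parent; only a top-level commit touches the store."""
        if isinstance(self.parent, Txn):
            self.parent.writes.update(self.writes)
        else:
            for key, value in self.writes.items():
                if value is _DELETED:
                    self.parent.pop(key, None)
                else:
                    self.parent[key] = value
        self.writes = {}

    def abort(self):
        """Discard every change made in this transaction."""
        self.writes = {}
```

A child's commit only becomes visible in the store once every enclosing transaction has committed, and aborting any level discards the work beneath it.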
Back to the Future: Dynamic Hierarchical Clustering
- PROC. OF ICDE, 1995
Cited by 9 (1 self)
We describe a new method for dynamically clustering hierarchical data which maintains good clustering within disk pages in the presence of insertions and deletions. This simple but effective method, which we call Enc, encodes the insertion order of children with respect to their parents and concatenates the insertion numbers to form a compact key for the data. This compact key is stored only in the indexing structure and does not affect the logical database schema. Experimental results show that our Enc method is very efficient for hierarchical queries and performs reasonably well for random access queries.
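The key construction described in this abstract can be sketched as follows. The abstract does not give the byte-level encoding, so this sketch assumes a tuple of insertion numbers stands in for the compact key; the `Node` class and `subtree` helper are hypothetical names introduced for illustration.

```python
class Node:
    """Hierarchical item whose key is its parent's key plus its insertion number."""

    def __init__(self, name, parent=None):
        self.name = name
        self.children_inserted = 0  # counts insertions; never reused after deletes
        if parent is None:
            self.key = ()
        else:
            parent.children_inserted += 1
            # Concatenate the parent's key with this child's insertion number.
            self.key = parent.key + (parent.children_inserted,)


def subtree(nodes, key):
    """Return the nodes at or below `key`, in key order."""
    ordered = sorted(nodes, key=lambda n: n.key)
    return [n for n in ordered if n.key[:len(key)] == key]


root = Node("root")
a = Node("a", root)
b = Node("b", root)
a1 = Node("a1", a)  # inserted after b, but its key still sorts next to a
```

Sorting by key places every subtree in one contiguous run, which is the property an index needs in order to keep a parent and its descendants on neighbouring pages.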
On-line Reorganization of Sparsely-populated B+-trees
- In Proceedings of ACM/SIGMOD Annual Conference on Management of Data, 1996
Cited by 5 (3 self)
In this paper, we present an efficient method for online reorganization of sparsely-populated B+-trees. It reorganizes the leaves first, compacting in short operations groups of leaves with the same parent. After compacting, optionally, the new leaves may swap locations or be moved into empty pages so that they are in key order on the disk. After the leaves are reorganized, the method shrinks the tree by making a copy of the upper part of the tree while leaving the leaves in place. A new concurrency method is introduced so that only a minimum number of pages are locked during reorganization. During leaf reorganization, Forward Recovery is used to save all work already done while maintaining consistency after system crashes. A heuristic algorithm is developed to reduce the number of swaps needed during leaf reorganization, so that better concurrency and easier recovery can be achieved. A detailed description of switching from the old B+-tree to the new B+-tree is describe...
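The leaf compaction step alone can be pictured with a short sketch. It assumes leaves are plain Python lists of keys under one parent, and the `capacity` and `fill` parameters are hypothetical; the locking, forward recovery and swap heuristics from the paper are not modelled.

```python
def compact_leaves(leaves, capacity, fill=1.0):
    """Repack sparsely filled sibling leaves into as few leaves as possible.

    `leaves` is a list of key lists in key order, all under the same parent.
    `capacity` is the maximum number of keys per leaf, and `fill` is the
    fraction of that capacity to use after compaction.
    """
    keys = [key for leaf in leaves for key in leaf]
    per_leaf = max(1, int(capacity * fill))
    return [keys[i:i + per_leaf] for i in range(0, len(keys), per_leaf)]
```

For example, five sparse leaves holding five keys in total collapse into two leaves when each leaf can hold four keys.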
METU Object-Oriented DBMS
- In Advances in Object-Oriented Database Systems, 1994
Cited by 5 (2 self)
METU Object-Oriented DBMS includes the implementation of a database kernel, an object-oriented SQL-like language and a graphical user interface. Kernel functions are divided between a SQL Interpreter and a C++ compiler. Thus the interpretation of functions is avoided, increasing the efficiency of the system. The functions compiled by C++ are used by the system through the Function Manager. The system is realized on Exodus Storage Manager (ESM), thus exploiting some of the kernel functions readily provided by ESM. The additional functions provided by the MOOD kernel are the optimization and interpretation of SQL statements, dynamic linking of functions, and catalog management. An original query optimization strategy based on the object-oriented features of the language is developed. For this purpose, formulas for the selectivity of a path expression and for the cost of forward and backward path traversals are derived, and join sizes are estimated. New strategies for ordering the joins and path expressions are also developed. A graphical user interface, namely MoodView, is implemented on the MOOD kernel. MoodView provides the database programmer with tools and functionalities for every phase of OODBMS application development. The current version of MoodView allows a database user to design, browse, and modify the database schema interactively. MoodView can automatically generate graphical displays for complex and multimedia database objects, which can be updated through the object browser. Furthermore, a database administration tool, a full-screen text editor, a SQL-based query manager, and a graphical indexing tool for spatial data, i.e., R-trees, are also implemented.
Efficient Differential Timeslice Computation
- IEEE Transactions on Knowledge and Data Engineering, 1994
Cited by 5 (0 self)
Transaction-time databases record all previous database states and are ever-growing, leading to potentially huge quantities of data. For that reason, efficient query processing is of particular importance. Due to the large size of transaction-time relations, it is advantageous to utilize cheap write-once storage media for storage. This is facilitated by adopting a log-based storage structure. Timeslices, i.e., relation states or snapshots, are computed by traversing the logs, using previously computed and cached timeslices as outsets. When computing a new timeslice, the cache will contain two candidate outsets: an earlier outset and a later outset. We provide efficient means of always picking the optimal one. Specifically, we define and investigate the use of a new data structure, the B+tree-like Insertion Tree (I-tree), for this purpose. The cost of using an I-tree for picking the optimal outset is similar to that of using a B+tree. Being sparse, I-trees require li...
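The choice between the earlier and the later outset can be sketched as below. This is not the I-tree from the paper: a binary search over a sorted list of log timestamps stands in for it, and the sketch assumes that the cost of reaching the target timeslice is simply the number of log records that must be traversed. The function names are hypothetical.

```python
from bisect import bisect_right


def records_between(log_times, t1, t2):
    """Count log records with a timestamp in the half-open interval (lo, hi]."""
    lo, hi = sorted((t1, t2))
    return bisect_right(log_times, hi) - bisect_right(log_times, lo)


def pick_outset(log_times, cached_times, target):
    """Return the cached timeslice time that needs the fewest log records.

    `log_times` and `cached_times` are sorted lists of timestamps.
    Returns None when no timeslice is cached.
    """
    i = bisect_right(cached_times, target)
    candidates = []
    if i > 0:
        candidates.append(cached_times[i - 1])  # earlier outset: roll forward
    if i < len(cached_times):
        candidates.append(cached_times[i])      # later outset: roll backward
    if not candidates:
        return None
    return min(candidates, key=lambda c: records_between(log_times, c, target))
```

Note that the outset closest in time is not necessarily the cheapest one: what matters under this cost assumption is how many log records lie between the outset and the target.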
Decoupling partitioning and grouping: Overcoming shortcomings of spatial indexing with bucketing
- Also University of Maryland Computer Science TR-4526
Cited by 4 (3 self)
The principle of decoupling the partitioning and grouping processes that form the basis of most spatial indexing methods that use tree directories of buckets is explored. The decoupling is designed to overcome the following drawbacks of traditional solutions: (1) multiple postings in disjoint space decomposition methods that lead to balanced trees such as the hB-tree where a node split in the event of node overflow may be such that one of the children of the node that was split becomes a child of both of the nodes resulting from the split; (2) multiple coverage and nondisjointness of methods based on object hierarchies such as the R-tree which lead to nonunique search paths; (3) directory nodes with similarly-shaped hyper-rectangle bounding boxes with minimum occupancy in disjoint space decomposition methods such as those based on quadtrees and k-d trees that make use of regular decomposition. The first two drawbacks are shown to be overcome by the BV-tree where as a result of decoupling the partitioning and grouping processes, the union of the regions associated with the nodes at a given level of the directory does not necessarily contain all of the data points although all searches take the same amount of time. The BV-tree is not plagued by the third drawback. The third drawback is
Implementing Deletion in B+-Trees
Cited by 3 (0 self)
This paper describes algorithms for key deletion in B+-trees. There are published algorithms and pseudocode for searching and inserting keys, but deletion, due to its greater complexity and perceived lesser importance, is glossed over completely or left as an exercise for the reader. To remedy this situation, we provide a well-documented flowchart, algorithm, and pseudocode for deletion, their relation to search and insertion algorithms, and a reference to a freely available, complete B+-tree library written in the C programming language.

