Query evaluation techniques for large databases
- ACM COMPUTING SURVEYS
, 1993
Cited by 592 (7 self)
Database management systems will continue to manage large data volumes. Thus, efficient algorithms for accessing and manipulating large sets and sequences will be required to provide acceptable performance. The advent of object-oriented and extensible database systems will not solve this problem. On the contrary, modern data models exacerbate it: In order to manipulate large sets of complex objects as efficiently as today’s database systems manipulate simple records, query processing algorithms and software will become more complex, and a solid understanding of algorithm and architectural issues is essential for the designer of database management software. This survey provides a foundation for the design and implementation of query execution facilities in new database management systems. It describes a wide array of practical query evaluation techniques for both relational and post-relational database systems, including iterative execution of complex query evaluation plans, the duality of sort- and hash-based set matching algorithms, types of parallel query execution and their implementation, and special operators for emerging database application domains.
The Modified Object Buffer: A Storage Management Technique for Object-Oriented Databases
, 1995
Cited by 25 (0 self)
Object-oriented databases store many small objects on disks. Disks perform poorly when reading and writing individual small objects. This thesis presents a new storage management architecture that substantially improves disk performance of a distributed object-oriented database system. The storage architecture is built around a large modified object buffer (MOB) that is stored in primary memory. The MOB provides volatile storage for modified objects. Modified objects are placed in the MOB instead of being immediately written out to disk. Modifications are written to disk lazily as the MOB fills up and space is required for new modifications. The MOB improves performance because even if an object is modified many times in a short period of time, the object has to be written out to disk only once. Furthermore, by the time an object modification has to be flushed from the MOB, many modifications to other objects on the same page may have accumulated. All of these modifications can be writ...
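The buffering idea in this abstract can be illustrated with a small sketch. This is not the thesis's implementation: the class name `ModifiedObjectBuffer`, the capacity measured in objects, the dictionary standing in for the disk, and the choice to flush the page of the oldest buffered modification are all assumptions made for illustration. It shows only the two effects the abstract names: repeated modifications of one object reach disk once, and modifications to objects on the same page go out together.

```python
from collections import OrderedDict


class ModifiedObjectBuffer:
    """Toy in-memory buffer that delays writes of modified objects (illustrative only)."""

    def __init__(self, capacity, disk):
        self.capacity = capacity      # assumed unit: number of buffered objects
        self.disk = disk              # hypothetical disk: page_id -> {obj_id: value}
        self.pending = OrderedDict()  # obj_id -> (page_id, latest value), oldest first
        self.page_writes = 0          # page writes issued so far

    def modify(self, obj_id, page_id, value):
        # Make room only when a new object arrives and the buffer is full.
        if obj_id not in self.pending and len(self.pending) >= self.capacity:
            self._flush_oldest_page()
        # A later modification replaces the buffered one, so only the
        # latest value is ever written to disk.
        self.pending[obj_id] = (page_id, value)

    def _flush_oldest_page(self):
        # Assumed policy: take the page of the oldest buffered modification
        # and write every buffered modification for that page in one write.
        victim_page = next(iter(self.pending.values()))[0]
        batch = {o: v for o, (p, v) in self.pending.items() if p == victim_page}
        self.disk.setdefault(victim_page, {}).update(batch)
        for obj_id in batch:
            del self.pending[obj_id]
        self.page_writes += 1
```

With a capacity of three, modifying object `a` twice and objects `b` (same page) and `c` (another page) once each causes no disk write; a fourth object then triggers a single page write that carries the latest values of both `a` and `b`.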
Optimism vs. Locking: A Study of Concurrency Control for Client-Server Object-Oriented Databases
, 1997
Cited by 24 (0 self)
Many client-server object-oriented database systems (OODBs) run applications at clients and perform all accesses on cached copies of database objects. Moving both data and computation to the clients can improve response time, throughput, and scalability. For applications with good locality of reference, retaining cached state across transaction boundaries can result in further performance and scaling benefits. This thesis examines the question of what concurrency control scheme is best able to realize these potential benefits. It describes a new optimistic concurrency control scheme called AOCC (Adaptive Optimistic Concurrency Control) and compares its performance with that of ACBL (Adaptive-Granularity Callback Locking), the scheme shown to have the best performance in previous studies. Like all optimistic schemes, AOCC synchronizes transactions at the commit point, aborting transactions when synchronization fails; ACBL, like other locking schemes, synchronizes transactions while they execute. Earlier
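The contrast the abstract draws, synchronizing at the commit point rather than during execution, can be sketched with a generic version-check validation. This is a textbook-style optimistic scheme, not AOCC itself, whose details the abstract does not give; the `Store` and `Transaction` classes and the per-object version counters are hypothetical names introduced for illustration.

```python
class Store:
    """Hypothetical server state: object id -> (version, value)."""

    def __init__(self):
        self.data = {}

    def read(self, key):
        return self.data.get(key, (0, None))


class Transaction:
    """Runs against cached reads and buffers writes until commit."""

    def __init__(self, store):
        self.store = store
        self.read_versions = {}  # key -> version seen at first read
        self.writes = {}         # key -> new value, buffered until commit

    def read(self, key):
        if key in self.writes:
            return self.writes[key]
        version, value = self.store.read(key)
        self.read_versions.setdefault(key, version)
        return value

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Validation happens only here, at the commit point.
        for key, seen in self.read_versions.items():
            current, _ = self.store.read(key)
            if current != seen:
                return False  # conflict detected: abort, install nothing
        # Validation succeeded: install buffered writes with new versions.
        for key, value in self.writes.items():
            version, _ = self.store.read(key)
            self.store.data[key] = (version + 1, value)
        return True
```

Two transactions that both read and update the same object run without blocking each other; the first to commit succeeds and the second fails validation and aborts. A locking scheme would instead make one of them wait while they execute.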
Implementing Hypertext Database Relationships through Aggregations and Exceptions
, 1991
Cited by 17 (2 self)
In order to combine hypertext with database facilities, we show how to extract an effective storage structure from given instance relationships. The schema of the structure recognizes clusters and exceptions. Extracting high-level structures is useful for providing a high performance browsing environment as well as efficient physical database design, especially when handling large amounts of data. This paper focuses on a clustering method, ACE, which generates aggregations and exceptions from the original graph structure in order to capture high level relationships. The problem of minimizing the cost function is NP-complete. We use a heuristic approach based on an extended Kernighan-Lin algorithm. We demonstrate our method on a hypertext application and on a standard random graph, compared with its analytical model. The storage reductions of input database size in main memory were 77.2% and 12.3%, respectively. It was also useful for secondary storage organization for efficient retriev...
Integrated Document Caching and Prefetching in Storage Hierarchies Based on Markov-Chain Predictions
, 1998
Cited by 12 (0 self)
Large multimedia document archives may hold a major fraction of their data in tertiary storage libraries for cost reasons. This paper develops an integrated approach to the vertical data migration between the tertiary, secondary, and primary storage in that it reconciles speculative prefetching, to mask the high latency of the tertiary storage, with the replacement policy of the document caches at the secondary and primary storage level, and also considers the interaction of these policies with the tertiary and secondary storage request scheduling. The integrated migration policy is based on a continuous-time Markov chain model for predicting the expected number of accesses to a document within a specified time horizon. Prefetching is initiated only if that expectation is higher than those of the documents that need to be dropped from secondary storage to free up the necessary space. In addition, the possible resource contention at the tertiary and secondary storage is tak...
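The prefetching rule stated in this abstract can be sketched as a simple comparison, assuming the expected access counts have already been computed by some model. The function name `should_prefetch`, the tuple layout, and the choice to consider victims in ascending order of expected accesses are assumptions for illustration; the Markov-chain prediction and the contention handling are not modeled here.

```python
def should_prefetch(candidate, cached, free_space):
    """Decide whether to prefetch a document into secondary storage.

    candidate:  (size, expected_accesses) of the document to prefetch
    cached:     dict doc_id -> (size, expected_accesses) of cached documents
    free_space: currently unused cache space
    Returns (decision, victims), where victims are the documents to drop.
    """
    size, expectation = candidate
    victims = []
    # Assumed victim order: documents with the fewest expected accesses first.
    by_expectation = sorted(cached.items(), key=lambda item: item[1][1])
    for doc_id, (doc_size, doc_expectation) in by_expectation:
        if free_space >= size:
            break
        # Prefetch only if the candidate is expected to be accessed more
        # often than every document that would have to be dropped.
        if doc_expectation >= expectation:
            return False, []
        victims.append(doc_id)
        free_space += doc_size
    if free_space >= size:
        return True, victims
    return False, []
```

For a cache holding documents with expectations 0.2, 0.5, and 0.9, a candidate with expectation 0.6 is prefetched when dropping only the two least-valued documents frees enough room, and rejected as soon as making room would require dropping the document with expectation 0.9.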
Vertical data migration in large near-line document archives based on Markov-chain predictions
- In VLDB
, 1997
Cited by 11 (0 self)
Large multimedia document archives hold most of their data in near-line tertiary storage libraries for cost reasons. This paper develops an integrated approach to the vertical data migration between the tertiary and secondary storage in that it reconciles speculative preloading, to mask the high latency of the tertiary storage, with the replacement policy of the secondary storage. In addition, it considers the interaction of these policies with the tertiary storage scheduling and controls preloading aggressiveness by taking contention for tertiary storage drives into account. The integrated migration policy is based on a continuous-time Markov-chain (CTMC) model for predicting the expected number of accesses to a document within a specified time horizon. The parameters of the CTMC model, the probabilities of co-accessing certain documents and the interaction times between successive accesses, are dynamically estimated and adjusted to evolving workload patterns by keeping online statistics. The integrated policy for vertical data migration has been implemented in a prototype system. Detailed simulation studies with Web-server-like synthetic workloads indicate significant gains in terms of client response time. The studies also show that the overhead of the statistical bookkeeping and the computations for the access predictions is affordable.
A Highly Effective Partition Selection Policy for Object Database Garbage Collection
- IEEE Transactions on Knowledge and Data Engineering
, 1998
Cited by 1 (1 self)
We investigate methods to improve the performance of algorithms for automatic storage reclamation of object databases. These algorithms are based on a technique called partitioned garbage collection, in which a subset of the entire database is collected independently of the rest. We evaluate how different application, database system, and garbage collection implementation parameters affect the performance of garbage collection in object database systems. We focus specifically on investigating the policy that is used to select which partition in the database should be collected. Three of the policies that we investigate are based on the intuition that the values of overwritten pointers provide good hints about where to find garbage. A fourth policy investigated chooses the partition with the greatest presence in the I/O buffer. Using simulations based on a synthetic database, we show that one of our policies requires less I/O to collect more garbage than any existing implementabl...
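The intuition that overwritten pointers hint at where garbage lies can be sketched as a per-partition counter. The abstract does not spell out the three pointer-based policies, so the following is one plausible reading and not the paper's method: the class `OverwriteTracker`, its method names, and the rule "collect the partition that the most overwritten pointers used to point into" are all assumptions.

```python
from collections import Counter


class OverwriteTracker:
    """Counts overwritten pointers per target partition (illustrative only)."""

    def __init__(self, partition_of):
        self.partition_of = partition_of  # hypothetical map: object id -> partition id
        self.overwrites = Counter()       # partition id -> overwritten pointers into it

    def pointer_overwritten(self, old_target):
        # The old pointer value suggests its target may have become garbage.
        if old_target is not None:
            self.overwrites[self.partition_of[old_target]] += 1

    def select_partition(self):
        # Assumed rule: pick the partition with the highest count so far.
        if not self.overwrites:
            return None
        partition, _ = self.overwrites.most_common(1)[0]
        return partition

    def collected(self, partition):
        # Reset the hint count once the partition has been collected.
        self.overwrites.pop(partition, None)
```

A collector would call `pointer_overwritten` from its write barrier, ask `select_partition` when a collection is due, and call `collected` afterwards so that stale hints do not keep steering it to the same partition.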

