A Two-Phase Approach to Data Allocation in Distributed Databases
1995
Cited by 1 (0 self)
In this paper, we propose a two-phase approach to the problem of optimal allocation of data objects (fragments) on a network in a distributed database system. In the first phase, we perform fragment clustering, in which we form groupings of fragments that tend to be accessed together. In the second phase, we use a "divide and conquer" search technique to allocate clusters to the computing nodes (sites) in the network. We show, via complexity analysis, that the combined process of clustering and data allocation takes time that is polynomial with respect to the number of objects and sites. We also show, via experimental analysis, that our approach produces solutions that are close to optimal for a wide range of fragmentations, queries and network structures.
1 Introduction
Data allocation is a critical aspect of distributed database systems: a poorly designed data allocation can lead to inefficient computation, high access costs, and high network loads [15, 16] whereas a welldesig...
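The clustering phase mentioned in the abstract above can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a hypothetical workload format (a list of fragment lists with access frequencies), a hypothetical `threshold` parameter, and a simple union-find merge of fragment pairs whose co-access count reaches that threshold.

```python
from collections import defaultdict
from itertools import combinations


def co_access_counts(queries):
    """Count how often each pair of fragments appears in the same query."""
    counts = defaultdict(int)
    for fragments, frequency in queries:
        for a, b in combinations(sorted(set(fragments)), 2):
            counts[(a, b)] += frequency
    return counts


def cluster_fragments(queries, threshold):
    """Group fragments whose pairwise co-access count is >= threshold.

    Assumption: a plain union-find merge stands in for whatever
    clustering criterion the paper actually uses.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Register every fragment so unmerged ones form singleton clusters.
    for fragments, _ in queries:
        for f in fragments:
            find(f)

    for (a, b), count in co_access_counts(queries).items():
        if count >= threshold:
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for f in list(parent):
        groups[find(f)].add(f)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical workload: (fragments touched by a query, query frequency).
queries = [
    (["F1", "F2"], 10),
    (["F2", "F3"], 1),
    (["F3", "F4"], 8),
    (["F1", "F2", "F5"], 2),
]
clusters = cluster_fragments(queries, threshold=5)
print(clusters)
```

With this toy workload, F1 and F2 are grouped together, as are F3 and F4, while F5 stays on its own. A second phase would then assign each cluster, rather than each fragment, to a site.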
New Objective Function for Vertical Partitioning in Database System. © Thanh Hung Ngo
Cited by 1 (1 self)
In this paper we introduce an objective function for vertical partitioning in database systems. It is built on a new evaluative criterion: cache hit probability. Testing the validity of the derived evaluative formula via program simulation shows its high accuracy. Index terms: vertical partitioning, objective function, evaluation criterion, cache hit probability.
A New Technique for Database Fragmentation in Distributed Systems
Improving the performance of a database system is one of the key research issues nowadays. Distributed processing is an effective way to improve the reliability and performance of a database system. Distribution of data is a collection of fragmentation, allocation and replication processes. Previous research provided fragmentation solutions based on empirical data about the type and frequency of the queries submitted to a centralized system. These solutions are not suitable at the initial stage of a database design for a distributed system. In this paper we present a fragmentation technique that can be applied at the initial stage as well as in later stages of a distributed database system for partitioning the relations. Allocation of fragments is done simultaneously in our algorithm. Results show that the proposed technique can properly solve the initial fragmentation problem of relational databases for distributed systems.
Vertical partitioning of relational OLTP databases using integer programming
A way to optimize performance of relational row-store databases is to reduce the row widths by vertically partitioning tables into table fractions in order to minimize the number of irrelevant columns/attributes read by each transaction. This paper considers vertical partitioning algorithms for relational row-store OLTP databases with an H-store-like architecture, meaning that we would like to maximize the number of single-sited transactions. We present a model for the vertical partitioning problem that, given a schema together with a vertical partitioning and a workload, estimates the costs (bytes read/written by storage layer access methods and bytes transferred between sites) of evaluating the workload on the given partitioning. The cost model allows for arbitrarily prioritizing load balancing of sites vs. total cost minimization. We show that finding a minimum-cost vertical partitioning in this model is NP-hard and therefore the problem should obviously not be solved manually by a human DBA. We present two algorithms returning solutions in which single-sitedness of read queries is preserved while allowing column replication (which may allow a drastically reduced cost compared to disjoint partitioning). The first algorithm is a quadratic integer program that finds optimal minimum-cost solutions with respect to the model, and the second algorithm is a more scalable heuristic based on simulated annealing. Experiments show that the algorithms can reduce the cost of the model objective by 37% when applied to the TPC-C benchmark and the heuristic is shown to obtain solutions with costs close to the ones found using the quadratic program.
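To make the idea of a byte-based cost model concrete, here is a minimal sketch of a read-cost estimate for a vertical partitioning. It is a deliberately simplified stand-in, not the paper's model: it ignores writes, inter-site transfer, replication and load balancing, and the function name `partition_read_cost`, the column widths and the workload are all hypothetical.

```python
def partition_read_cost(column_widths, partitioning, workload):
    """Estimate bytes read per workload run under a vertical partitioning.

    Assumption: a query that needs any column of a table fraction reads
    the full row width of that fraction, once per execution.
    """
    total = 0
    for columns, frequency in workload:
        for fraction in partitioning:
            if fraction & columns:  # the query touches this fraction
                row_width = sum(column_widths[c] for c in fraction)
                total += frequency * row_width
    return total


# Hypothetical schema: column name -> width in bytes.
column_widths = {"id": 4, "name": 32, "balance": 8, "notes": 200}

# Hypothetical workload: (columns needed, executions per time unit).
workload = [
    ({"id", "balance"}, 100),
    ({"id", "name", "notes"}, 5),
]

single_table = [{"id", "name", "balance", "notes"}]
split = [{"id", "balance"}, {"name", "notes"}]

cost_single = partition_read_cost(column_widths, single_table, workload)
cost_split = partition_read_cost(column_widths, split, workload)
print(cost_single, cost_split)
```

In this toy case the frequent narrow query no longer reads the wide `notes` column after the split, which is the effect the abstract describes as minimizing irrelevant columns read per transaction. A search procedure such as simulated annealing would evaluate a function like this over many candidate partitionings.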

