Results 1 - 4 of 4
Network-Aware Join Processing in Global-Scale Database Federations
Cited by 3 (1 self)
Abstract: We introduce join scheduling algorithms that employ a balanced network utilization metric to optimize the use of all network paths in a global-scale database federation. This metric allows algorithms to exploit excess capacity in the network, while avoiding narrow, long-haul paths. We give a two-approximate, polynomial-time algorithm for serial (left-deep) join schedules. We also present extensions to this algorithm that explore parallel schedules, reduce resource usage, and define tradeoffs between computation and network utilization. We evaluate these techniques within the SkyQuery federation of Astronomy databases using spatial-join queries submitted by SkyQuery’s users. Experiments show that our algorithms realize near-optimal network utilization with minor computational overhead.
unknown title
We present distributed query scheduling algorithms that minimize network utilization for spatial joins in the SkyQuery federation of Astronomy databases. Unlike existing works that measure the quality of join schedules based on query response time, our metric both minimizes network utilization and balances the utilization of heterogeneous network paths. Preliminary experiments show that our algorithms reduce network utilization dramatically when compared with SkyQuery’s existing scheduling algorithm.
Vertical partitioning impact on performance and manageability of distributed database systems (A Comparative study of some vertical partitioning algorithms)
- 18TH NATIONAL COMPUTER CONFERENCE, 2006
Users of distributed database systems often observe performance problems such as unexpectedly low throughput or high latency. Determining the cause of these problems can be a very hard task. Bottlenecks can occur in any of the components through which the data flows: the applications, the operating systems, the network interfaces, and the hardware. Horizontal and vertical partitioning are important aspects of physical design in relational database systems and have a significant impact on performance. Distribution design involves deciding on the fragmentation and allocation of data across the sites of a computer network. In this paper we address the fragmentation phase: we discuss the vertical partitioning problem in distributed database design through a comparative study of different vertical partitioning algorithms, with the goal of reaching the most efficient vertical fragmentation scheme, one that leads to proper data allocation and replication.
HYRISE—A Main Memory Hybrid Storage Engine
In this paper, we describe a main memory hybrid database system called HYRISE, which automatically partitions tables into vertical partitions of varying widths depending on how the columns of the table are accessed. For columns accessed as part of analytical queries (e.g., via sequential scans), narrow partitions perform better because, when scanning a single column, cache locality improves if the values of that column are stored contiguously. In contrast, for columns accessed as part of OLTP-style queries, wider partitions perform better because such transactions frequently insert, delete, update, or access many of the fields of a row, and co-locating those fields leads to better cache locality. Using a highly accurate model of cache misses, HYRISE is able to predict the performance of different partitionings and to select the best partitioning using an automated database design algorithm. We show that, on a realistic workload derived from customer applications, HYRISE can achieve a 20% to 400% performance improvement over pure all-column or all-row designs, and that it is both more scalable and produces better designs than previous vertical partitioning approaches for main memory systems.
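The narrow-versus-wide trade-off described in this abstract can be illustrated with a toy count of cache lines touched. This is only a sketch and is not HYRISE's cache-miss model: the 64-byte line size, the function names, and the simplifications (contiguous, aligned storage; each attribute fits in one cache line) are all assumptions made for illustration.

```python
import math

CACHE_LINE_BYTES = 64  # assumed cache-line size; not taken from the paper


def scan_cache_lines(num_rows: int, partition_width: int) -> int:
    """Cache lines touched when scanning ONE attribute across all rows.

    Assumes the partition stores its rows contiguously, each row taking
    partition_width bytes, and that the scanned attribute fits in one line.
    """
    if partition_width >= CACHE_LINE_BYTES:
        # Each row spans at least a full line, so every row costs one line.
        return num_rows
    # Rows are packed several to a line, so the whole partition is read.
    return math.ceil(num_rows * partition_width / CACHE_LINE_BYTES)


def row_fetch_cache_lines(partition_widths: list[int]) -> int:
    """Cache lines touched when fetching ALL fields of a single row.

    The row is split across partitions; each partition contributes the
    lines needed to cover its share of the row (alignment is ignored).
    """
    return sum(math.ceil(w / CACHE_LINE_BYTES) for w in partition_widths)


if __name__ == "__main__":
    rows = 1_000_000
    # Analytical scan of one 4-byte column: narrow vs. 128-byte-wide partition.
    print(scan_cache_lines(rows, 4))      # narrow, column-like partition
    print(scan_cache_lines(rows, 128))    # wide, row-like partition
    # OLTP-style fetch of a full 128-byte row with 32 four-byte fields.
    print(row_fetch_cache_lines([128]))     # one wide partition
    print(row_fetch_cache_lines([4] * 32))  # 32 single-column partitions
```

Under these assumptions the scan touches far fewer lines with the narrow partition, while the full-row fetch touches far fewer lines with the single wide partition, which matches the direction of the argument in the abstract.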

