Results 1 – 7 of 7
Fast Concurrent Access to Parallel Disks
"... High performance applications involving large data sets require the efficient and flexible use of multiple disks. In an external memory machine with D parallel, independent disks, only one block can be accessed on each disk in one I/O step. This restriction leads to a load balancing problem that is ..."
Abstract

Cited by 49 (11 self)
High performance applications involving large data sets require the efficient and flexible use of multiple disks. In an external memory machine with D parallel, independent disks, only one block can be accessed on each disk in one I/O step. This restriction leads to a load balancing problem that is perhaps the main inhibitor for the efficient adaptation of single-disk external memory algorithms to multiple disks. We solve this problem for arbitrary access patterns by randomly mapping blocks of a logical address space to the disks. We show that a shared buffer of O(D) blocks suffices to support efficient writing. The analysis uses the properties of negative association to handle dependencies between the random variables involved. This approach might be of independent interest for probabilistic analysis in general. If two randomly allocated copies of each block exist, N arbitrary blocks can be read within ⌈N/D⌉ + 1 I/O steps with high probability. The redundancy can be further reduced from 2 to 1 + 1/r for any integer r without a big impact on reading efficiency. From the point of view of external memory models, these results rehabilitate Aggarwal and Vitter's "single-disk multi-head" model [1] that allows access to D arbitrary blocks in each I/O step. This powerful model can be emulated on the physically more realistic independent disk model [2] with small constant overhead factors. Parallel disk external memory algorithms can therefore be developed in the multi-head model first. The emulation result can then be applied directly or further refinements can be added.
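The read bound above can be illustrated with a small simulation: each block has two replicas on randomly chosen disks, and each request is sent to the replica disk with the shorter queue. This greedy rule is a sketch of the idea only (the paper's bound uses an optimal assignment of requests to replicas); `read_schedule_steps` and its parameters are hypothetical names for illustration.

```python
import random

def read_schedule_steps(n_blocks, n_disks, copies=2, seed=0):
    """Simulate reading n_blocks when each block has `copies` replicas on
    randomly chosen distinct disks.  Each request greedily goes to the
    replica disk with the fewest queued requests; the number of parallel
    I/O steps equals the maximum resulting queue length."""
    rng = random.Random(seed)
    load = [0] * n_disks  # queued read requests per disk
    for _ in range(n_blocks):
        # disks holding this block's replicas (uniform, without replacement)
        choices = rng.sample(range(n_disks), copies)
        target = min(choices, key=lambda d: load[d])
        load[target] += 1
    return max(load)

# With two replicas per block, the greedy schedule stays very close to
# the trivial lower bound of ceil(N/D) steps; with one copy (no choice),
# the imbalance is noticeably larger.
steps = read_schedule_steps(10_000, 16, copies=2)
single = read_schedule_steps(10_000, 16, copies=1)
```

The two-choice effect here is the same phenomenon exploited in the balls-into-bins literature: giving each request a second candidate disk shrinks the worst-case queue from roughly N/D + O(√(N/D · log D)) to N/D plus a small additive term.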
Perfectly balanced allocation
 in Proceedings of the 7th International Workshop on Randomization and Approximation Techniques in Computer Science, Princeton, NJ, 2003, Lecture Notes in Comput. Sci. 2764
, 2003
"... Abstract. We investigate randomized processes underlying load balancing based on the multiplechoice paradigm: m balls have to be placed in n bins, and each ball can be placed into one out of 2 randomly selected bins. The aim is to distribute the balls as evenly as possible among the bins. Previousl ..."
Abstract

Cited by 17 (1 self)
We investigate randomized processes underlying load balancing based on the multiple-choice paradigm: m balls have to be placed in n bins, and each ball can be placed into one out of 2 randomly selected bins. The aim is to distribute the balls as evenly as possible among the bins. Previously, it was known that a simple process that places the balls one by one in the least loaded bin can achieve a maximum load of m/n + Θ(log log n) with high probability. Furthermore, it was known that it is possible to achieve (with high probability) a maximum load of at most ⌈m/n⌉ + 1 using maximum flow computations. In this paper, we extend these results in several aspects. First of all, we show that if m ≥ cn log n for some sufficiently large c, then a perfect distribution of balls among the bins can be achieved (i.e., the maximum load is ⌈m/n⌉) with high probability. The bound for m is essentially optimal, because it is known that if m ≤ c′n log n for some sufficiently small constant c′, the best possible maximum load that can be achieved is ⌈m/n⌉ + 1 with high probability. Next, we analyze a simple, randomized load balancing process based on a local search paradigm. Our first result here is that this process always converges to a best possible load distribution. Then, we study the convergence speed of the process. We show that if m is sufficiently large compared to n, then no matter with which ball distribution the system starts, if the imbalance is Δ, then the process needs only Δ · n^O(1) steps to reach a perfect distribution, with high probability. We also prove a similar result for m ≈ n, and show that if m = O(n log n / log log n), then an optimal load distribution (which has the maximum load of ⌈m/n⌉ + 1) is reached by the random process after a polynomial number of steps, with high probability.
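The basic multiple-choice placement process described above is easy to sketch: each ball samples d candidate bins and joins the least loaded one. The function below is an illustrative simulation only (names and parameters are our own, not from the paper), but it reproduces the qualitative gap between one and two choices.

```python
import random

def max_load(m, n, choices=2, seed=1):
    """Place m balls into n bins sequentially; each ball goes to the least
    loaded of `choices` bins picked uniformly at random (with replacement).
    Returns the maximum bin load after all balls are placed."""
    rng = random.Random(seed)
    bins = [0] * n
    for _ in range(m):
        picks = [rng.randrange(n) for _ in range(choices)]
        best = min(picks, key=lambda b: bins[b])
        bins[best] += 1
    return max(bins)

# With two choices the maximum load stays m/n + O(log log n);
# with one choice the deviation grows like sqrt((m/n) log n).
two = max_load(50_000, 100, choices=2)
one = max_load(50_000, 100, choices=1)
```

Note that this greedy sequential process attains m/n + Θ(log log n), not the perfect ⌈m/n⌉ distribution of the paper's main result; reaching that requires either the flow-based assignment or the local-search rebalancing process analyzed above.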
Reconciling Simplicity and Realism in Parallel Disk Models
 Parallel Computing
, 2001
"... For the design and analysis of algorithms that process huge data sets, a machine model is needed that handles parallel disks. There seems to be a dilemma between simple and flexible use of such a model and accurate modelling of details of the hardware. This paper explains how many aspects of this pr ..."
Abstract

Cited by 16 (4 self)
For the design and analysis of algorithms that process huge data sets, a machine model is needed that handles parallel disks. There seems to be a dilemma between simple and flexible use of such a model and accurate modelling of details of the hardware. This paper explains how many aspects of this problem can be resolved. The programming model implements one large logical disk allowing concurrent access to arbitrary sets of variable size blocks. This model can be implemented efficiently on multiple independent disks even if zones with different speeds, communication bottlenecks, and failed disks are allowed. These results not only provide useful algorithmic tools but also imply a theoretical justification for studying external memory algorithms using simple abstract models.
Algorithms for Scalable Storage Servers
 In SOFSEM 2004: Theory and Practice of Computer Science
, 2004
"... We survey a set of algorithmic techniques that make it possible to build a high performance storage server from a network of cheap components. Such a storage server oers a very simple programming model. To the clients it looks like a single very large disk that can handle many requests in parall ..."
Abstract

Cited by 5 (1 self)
We survey a set of algorithmic techniques that make it possible to build a high performance storage server from a network of cheap components. Such a storage server offers a very simple programming model. To the clients it looks like a single very large disk that can handle many requests in parallel with minimal interference between the requests.
Minimum Convex-Cost Tension Problems on Series-Parallel Graphs
, 2003
"... We present briefly some results we obtained with known methods to solve minimum cost tension problems, comparing their performance on nonspecific graphs and on seriesparallel graphs. These graphs are shown to be of interest to approximate many tension problems, like synchronization
in hypermedia d ..."
Abstract

Cited by 4 (3 self)
We briefly present some results we obtained with known methods for solving minimum cost tension problems, comparing their performance on non-specific graphs and on series-parallel graphs. These graphs are shown to be of interest for approximating many tension problems, such as synchronization in hypermedia documents. We propose a new aggregation method to solve the minimum convex piecewise linear cost tension problem on series-parallel graphs in O(m³) operations.
Minimum Convex Piecewise Linear Cost Tension Problem on Quasi-k Series-Parallel Graphs
, 2003
"... This article proposes an extension, combined with the outofkilter technique, of the aggregation method (that solves the minimum convex piecewise linear cost tension problem, or CPLCT, on seriesparallel graphs) to solve CPLCT on quasi seriesparallel graphs. To make this algorithm efficient, the k ..."
Abstract

Cited by 2 (2 self)
This article proposes an extension, combined with the out-of-kilter technique, of the aggregation method (which solves the minimum convex piecewise linear cost tension problem, or CPLCT, on series-parallel graphs) to solve CPLCT on quasi series-parallel graphs. To make this algorithm efficient, the key point is to find a "good" way of decomposing the graph into series-parallel subgraphs. Decomposition techniques, based on the recognition of series-parallel graphs, are thoroughly discussed.
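The recognition step that the decomposition techniques rely on can be sketched with the standard reduction characterization: a two-terminal graph is series-parallel iff repeatedly merging parallel edges and contracting degree-2 internal vertices reduces it to a single edge between the terminals. The helper below (`is_ttsp`, a hypothetical name) is a minimal sketch of that test, not the authors' decomposition algorithm.

```python
from collections import defaultdict

def is_ttsp(edges, s, t):
    """Test whether the multigraph `edges` is two-terminal series-parallel
    with terminals s and t, by exhaustively applying parallel reductions
    (collapse multi-edges) and series reductions (bypass a degree-2
    non-terminal vertex).  A TTSP-graph reduces to the single edge (s, t)."""
    mult = defaultdict(int)  # edge multiplicity, keyed by sorted endpoint pair
    for u, v in edges:
        mult[tuple(sorted((u, v)))] += 1
    changed = True
    while changed:
        changed = False
        # parallel reduction: replace k parallel edges by one
        for e in list(mult):
            if mult[e] > 1:
                mult[e] = 1
                changed = True
        # series reduction: bypass one degree-2 vertex other than s and t
        deg, nbrs = defaultdict(int), defaultdict(set)
        for (a, b), k in mult.items():
            deg[a] += k; deg[b] += k
            nbrs[a].add(b); nbrs[b].add(a)
        for v in list(deg):
            if v not in (s, t) and deg[v] == 2 and len(nbrs[v]) == 2:
                x, y = nbrs[v]
                del mult[tuple(sorted((v, x)))]
                del mult[tuple(sorted((v, y)))]
                mult[tuple(sorted((x, y)))] += 1
                changed = True
                break
    live = list(mult)
    return live == [tuple(sorted((s, t)))] and mult[live[0]] == 1
```

A decomposition procedure of the kind discussed in the article would record, rather than discard, each series and parallel reduction performed here, yielding the series-parallel subgraph structure on which the aggregation method operates.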
Aggregation Approach for the Minimum Binary Cost Tension Problem
, 2005
"... The aggregation technique, dedicated to twoterminal seriesparallel graphs (or TTSPgraphs) and introduced lately to solve the minimum piecewise linear cost tension problem, is adapted here to solve the minimum binary cost tension problem (or BCT problem). Even on TTSPgraphs, the BCT problem has b ..."
Abstract
The aggregation technique, dedicated to two-terminal series-parallel graphs (or TTSP-graphs) and recently introduced to solve the minimum piecewise linear cost tension problem, is adapted here to solve the minimum binary cost tension problem (or BCT problem). Even on TTSP-graphs, the BCT problem has been proved to be NP-complete. As far as we know, aggregation is the only algorithm, besides mixed integer programming, proposed to solve the BCT problem exactly on TTSP-graphs. A comparison of the efficiency of both methods is presented here.