A Parabolic Load Balancing Method
, 1995
Cited by 11 (5 self)
This paper presents a diffusive load balancing method for scalable multicomputers. In contrast to other provably correct schemes, the method scales to large numbers of processors with no increase in run time. In contrast to other scalable schemes, the method is provably correct, and the paper analyzes the rate of convergence. To control aggregate CPU idle time, it can be useful to balance the load to a specifiable accuracy. The method achieves arbitrary accuracy by proper consideration of numerical error and stability. This paper presents the method, proves correctness, convergence and scalability, and simulates applications to generic problems in computational fluid dynamics (CFD). The applications reveal some useful properties. The method can preserve adjacency relationships among elements of an adapting computational domain. This makes it useful for partitioning unstructured computational grids in concurrent computations. The method can execute asynchronously to balanc...
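The general idea of diffusive load balancing can be illustrated with a minimal sketch. The code below is not the paper's method: it assumes a ring of processors and a simple explicit first-order diffusion step, in which each processor moves a fraction `alpha` of its load difference toward each neighbor. The names `diffuse_step`, `balance`, `alpha`, and `tol` are hypothetical and chosen only for this illustration.

```python
def diffuse_step(load, alpha=0.25):
    """One explicit diffusion sweep on a ring (assumed topology).

    Each processor i exchanges alpha * (difference) with both ring
    neighbors, so the total load is conserved.
    """
    n = len(load)
    return [
        load[i] + alpha * (load[(i - 1) % n] - 2 * load[i] + load[(i + 1) % n])
        for i in range(n)
    ]


def balance(load, alpha=0.25, tol=1e-6, max_steps=10_000):
    """Repeat diffusion sweeps until the load spread drops below tol."""
    steps = 0
    while max(load) - min(load) > tol and steps < max_steps:
        load = diffuse_step(load, alpha)
        steps += 1
    return load, steps


if __name__ == "__main__":
    balanced, steps = balance([8.0, 0.0, 0.0, 0.0])
    print(balanced, steps)
```

The tolerance `tol` plays the role of the "specifiable accuracy" mentioned in the abstract. An explicit step like this one is only stable for small `alpha` (0.25 is used here as a conservative assumption), which is one reason a real method has to consider numerical stability.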
Locality Optimizations For Adaptive Irregular Scientific Codes
, 2000
Cited by 10 (0 self)
Irregular scientific codes experience poor cache performance due to their memory access patterns. We examine several data and computation locality transformations including GPART, a new technique based on hierarchical clustering. GPART constructs quality partitions quickly by clustering multiple neighboring nodes in a few passes, with priority on nodes with high degree. Overhead is kept low by considering only edges between partitions. We develop compiler analyses and transformations in SUIF to automatically apply locality transformations, and propose user annotations to locate coordinate information needed by geometric partitioning algorithms. We experimentally evaluate locality optimizations for both static and adaptive codes, where connection patterns dynamically change at intervals during program execution. We derive a simple cost model to guide locality optimizations when access patterns change. Experiments on several irregular scientific codes show locality optimization t...
Dynamic Load Balancing of Distributed SPMD Computations with Explicit Message-Passing
- In Proceedings of the IEEE Workshop on Heterogeneous Computing
, 1997
Cited by 9 (1 self)
Distributed systems have the potential to become an alternative platform for parallel computations. However, there are still many obstacles to overcome; one of the most serious is that distributed systems typically consist of shared heterogeneous components with highly variable computational power. In this paper we present a load balancing support that checks the load status and, if necessary, adapts the workload to dynamic platform conditions through data migrations from overloaded to underloaded nodes. Unlike task migration supports for task parallelism and other data migration frameworks for master/slave-based parallel applications, our support works for the entire class of SPMD regular applications with explicit communications, such as linear algebra problems, partial differential equation solvers, and image processing algorithms. Although we considered several variants (three activation mechanisms, three load monitoring techniques and four decision policies), we implemented only th...
Load-Balancing Iterative Computations on Heterogeneous Clusters
Cited by 9 (2 self)
We focus on mapping iterative algorithms onto heterogeneous clusters. The application data is partitioned over the processors, which are arranged along a virtual ring. At each iteration, independent calculations are carried out in parallel, and some communications take place between consecutive processors in the ring. The question is to determine how to slice the application data into chunks, and assign these chunks to the processors, so that the total execution time is minimized. A major
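As a rough illustration of the slicing question, the sketch below splits a number of data items among processors in proportion to their relative speeds. This is an assumed baseline, not the algorithm from the paper: it ignores the communication between consecutive processors on the ring, which the abstract identifies as part of the problem. The function name `slice_by_speed` and its parameters are hypothetical.

```python
def slice_by_speed(n_items, speeds):
    """Split n_items into integer chunks proportional to processor speeds.

    Assumption: computation time dominates and communication is ignored.
    """
    total = sum(speeds)
    shares = [n_items * s / total for s in speeds]  # ideal fractional shares
    chunks = [int(x) for x in shares]               # round each share down
    leftover = n_items - sum(chunks)
    # Give the remaining items to the processors with the largest
    # fractional remainders (largest-remainder rounding).
    order = sorted(
        range(len(speeds)),
        key=lambda i: shares[i] - chunks[i],
        reverse=True,
    )
    for i in order[:leftover]:
        chunks[i] += 1
    return chunks


if __name__ == "__main__":
    print(slice_by_speed(100, [1, 1, 2]))
```

With speeds `[1, 1, 2]`, the fastest processor receives half of the items, so all three finish their independent calculations at about the same time under the stated assumption.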
Automatic Selection of Load Balancing Parameters Using Compile-time and Run-time Information
, 1996
Cited by 8 (7 self)
Clusters of workstations are emerging as an important architecture. Programming tools that aid in distributing applications on workstation clusters must address problems of mapping the application, heterogeneity and maximizing system utilization in the presence of varying resource availability. Both computation and communication capabilities may vary with time due to other applications competing for resources, so dynamic load balancing is a key requirement. For greatest benefit, the tool must support a relatively wide class of applications running on clusters with a range of computation and communication capabilities. We have developed a system that supports dynamic load balancing of distributed applications consisting of parallelized DOALL and DOACROSS loops. The focus of this paper is on how the system automatically determines key load balancing parameters using run-time information and information provided by programming tools such as a parallelizing compiler. The parameters discussed...
Heterogeneous Partitioning in a Workstation Network
- Proc. of 1994 Heterogeneous Computing Workshop
, 1994
Cited by 6 (0 self)
In this paper, we present several heterogeneous partitioning algorithms for parallel numerical applications. The goal is to adapt the partitioning to dynamic and unpredictable load changes on the nodes. The methods are based on existing homogeneous algorithms like orthogonal recursive bisection, parallel strips, and scattering. We apply these algorithms to a parallel numerical application in a network of heterogeneous workstations. The behavior of the individual methods in a system with dynamical load changes and heterogeneous nodes is investigated. In addition, our new methods are compared with the conventional methods for homogeneous partitioning.
1 Introduction
Workstations offer more and more performance, exceeding average requirements of users. Therefore, an increasing fraction of computing power will be available for other users by network interconnections. Distributed systems, consisting of a heterogeneous collection of general purpose computer systems connected by a network, p...
A Decomposition Advisory System for Heterogeneous Data-Parallel Processing
, 1994
Cited by 6 (1 self)
Networked computing has become a popular method for using parallelism to solve a variety of computationally intense problems. However, high communication costs and processor heterogeneity may limit performance unless the problem space is carefully partitioned. We propose a decomposition advisory system that is designed to help choose the best data partitioning strategy. The goal of this research is to determine the partitioning scheme(s) expected to yield the best performance for a particular data-parallel problem with known regular communication patterns on a collection of heterogeneous processors. Given information about the problem space and the network, the system provides a ranking of standard partitioning methods.
1 Introduction
High performance computing, once only within the scope of supercomputers and expensive parallel machines, has become attainable through the use of networks of independent, possibly heterogeneous, computers. However, heterogeneous processing presents a n...
Optimal dynamic remapping of parallel computations
- IEEE Transactions on Computers
, 1990
Cited by 6 (1 self)
A large class of computations are characterized by a sequence of phases, with phase changes occurring unpredictably. We consider the decision problem regarding the remapping of workload to processors in a parallel computation when (i) the utility of remapping and the future behavior of the workload is uncertain, and (ii) phases exhibit stable execution requirements during a given phase, but requirements may change radically between phases. For these problems a workload assignment generated for one phase may hinder performance during the next phase. This problem is treated formally for a probabilistic model of computation with at most two phases. We address the fundamental problem of balancing the expected remapping performance gain against the delay cost. Stochastic dynamic programming is used to show that the remapping decision policy minimizing the expected running time of the computation has an extremely simple structure: the optimal decision at any decision step is followed by comparing the probability of remapping gain against a threshold. However, threshold calculation requires a priori estimation of the performance gain achieved by remapping. Because this gain may not be predictable, we examine the performance of a heuristic policy that does not require estimation of the gain. In most cases we find nearly optimal performance if remapping
Critical-Path and Priority Based Algorithms for Scheduling Workflows with Parameter Sweep Tasks on Global Grids
Cited by 6 (0 self)
Parameter-sweep has been widely adopted in large numbers of scientific applications. Parameter-sweep features need to be incorporated into Grid workflows so as to increase the scale and scope of such applications. New scheduling mechanisms and algorithms are required to provide an optimized policy for resource allocation and task arrangement in such a case. This paper addresses scheduling sequential parameter-sweep tasks in a fine-grained manner. The optimization is produced by pipelining the subtasks and dispatching each of them onto well-selected resources. Two types of scheduling algorithms are discussed and customized to suit the characteristics of parameter-sweep tasks, and their effectiveness is compared under a variety of scenarios.
Data Redistribution Algorithms For Heterogeneous Processor Rings
, 2004
Cited by 6 (4 self)
We consider the problem of redistributing data on homogeneous and heterogeneous rings of processors. The problem arises in several applications, each time a load-balancing mechanism is invoked (but we do not discuss the load-balancing mechanism itself). We provide algorithms that aim at optimizing the data redistribution, both for unidirectional and bi-directional rings, and we give complete proofs of correctness. One major contribution of the paper is that we are able to prove the optimality of the proposed algorithms in all cases except that of a bi-directional heterogeneous ring, for which the problem remains open.

