Results 11 - 20 of 67
Critical-Path and Priority Based Algorithms for Scheduling Workflows with Parameter Sweep Tasks on Global Grids
"... Parameter-sweep has been widely adopted in large numbers of scientific applications. Parameter-sweep features need to be incorporated into Grid workflows so as to increase the scale and scope of such applications. New scheduling mechanisms and algorithms are required to provide optimized policy for ..."
Abstract
-
Cited by 15 (0 self)
- Add to MetaCart
(Show Context)
Parameter sweep has been widely adopted in a large number of scientific applications. Parameter-sweep features need to be incorporated into Grid workflows to increase the scale and scope of such applications, and new scheduling mechanisms and algorithms are required to provide optimized policies for resource allocation and task arrangement in this setting. This paper addresses scheduling sequential parameter-sweep tasks in a fine-grained manner. The optimization is achieved by pipelining the subtasks and dispatching each of them onto well-selected resources. Two types of scheduling algorithms are discussed and customized to the characteristics of parameter sweeps, and their effectiveness is compared under a variety of scenarios.
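As a rough illustration of the fine-grained pipelining idea, here is a sketch that pins each subtask (stage) of the sequential task to a resource so that successive parameter values stream through the stages; the stage costs, resource speeds, and greedy assignment rule are assumptions for illustration, not the paper's algorithms.

```python
# Illustrative sketch: pin each stage of a sequential parameter-sweep task to a
# resource; many parameter values then flow through the stages like a pipeline,
# whose throughput is limited by the most loaded resource.

def assign_stages(stage_costs, resource_speeds):
    """Greedily assign stages to resources so the heaviest per-resource load
    (and hence the pipeline cycle time) stays small."""
    load = [0.0] * len(resource_speeds)
    assignment = []
    for cost in stage_costs:
        # choose the resource whose load after adding this stage is smallest
        best = min(range(len(resource_speeds)),
                   key=lambda r: load[r] + cost / resource_speeds[r])
        load[best] += cost / resource_speeds[best]
        assignment.append(best)
    return assignment, max(load)   # max(load) approximates the pipeline cycle time

if __name__ == "__main__":
    stages = [4.0, 1.0, 2.0, 3.0]   # work per subtask of the sequential task
    speeds = [1.0, 2.0, 1.5]        # relative speeds of three resources
    mapping, cycle = assign_stages(stages, speeds)
    print("stage -> resource:", mapping, "cycle time ~", cycle)
```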
A Parabolic Load Balancing Method
, 1995
"... This paper presents a diffusive load balancing method for scalable multicomputers. In contrast to other schemes which are provably correct the method scales to large numbers of processors with no increase in run time. In contrast to other schemes which are scalable the method is provably correct and ..."
Abstract
-
Cited by 15 (5 self)
- Add to MetaCart
This paper presents a diffusive load balancing method for scalable multicomputers. In contrast to other schemes which are provably correct, the method scales to large numbers of processors with no increase in run time; in contrast to other schemes which are scalable, the method is provably correct, and the paper analyzes its rate of convergence. To control aggregate CPU idle time it can be useful to balance the load to a specifiable accuracy. The method achieves arbitrary accuracy by proper consideration of numerical error and stability. This paper presents the method; proves correctness, convergence, and scalability; and simulates applications to generic problems in computational fluid dynamics (CFD). The applications reveal some useful properties. The method can preserve adjacency relationships among elements of an adapting computational domain, which makes it useful for partitioning unstructured computational grids in concurrent computations. The method can execute asynchronously to balance ...
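A minimal sketch of the first-order diffusive iteration such a method builds on: each processor repeatedly moves a fixed fraction of the load difference to every neighbor, so loads evolve like a discretized parabolic (heat) equation. The topology, diffusion coefficient, and stopping tolerance below are illustrative assumptions.

```python
# Synchronous first-order diffusion: w_i <- w_i + sum_j alpha * (w_j - w_i).

def diffuse(loads, neighbors, alpha=0.25, tol=1e-3, max_steps=1000):
    """loads: per-processor work; neighbors: adjacency list of the network."""
    loads = list(loads)
    for _ in range(max_steps):
        flows = [0.0] * len(loads)
        for i, nbrs in enumerate(neighbors):
            for j in nbrs:
                flows[i] += alpha * (loads[j] - loads[i])
        loads = [l + f for l, f in zip(loads, flows)]
        if max(loads) - min(loads) < tol:    # balanced to the requested accuracy
            break
    return loads

if __name__ == "__main__":
    ring = [[1, 3], [0, 2], [1, 3], [0, 2]]   # 4 processors on a ring
    print(diffuse([10.0, 0.0, 6.0, 4.0], ring))
```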
Evaluation of a semi-static approach to mapping dynamic iterative tasks onto heterogeneous computing systems
- J. Parallel Distrib. Comput
, 1999
"... Abstract—To minimize the execution time of an iterative application in a heterogeneous parallel computing environment, an appropriate mapping scheme is needed for matching and scheduling the subtasks of the application onto the processors. When some of the characteristics of the application subtasks ..."
Abstract
-
Cited by 11 (5 self)
- Add to MetaCart
(Show Context)
To minimize the execution time of an iterative application in a heterogeneous parallel computing environment, an appropriate mapping scheme is needed for matching and scheduling the subtasks of the application onto the processors. When some characteristics of the application subtasks are unknown a priori and change from iteration to iteration during execution, a semi-static methodology can be employed that starts with an initial mapping but dynamically decides whether to perform a remapping between iterations of the application, by observing the effects of these dynamic parameters on the application's execution time. The objective of this study is to implement and evaluate such a semi-static methodology. To analyze the effectiveness of the proposed scheme, it is compared with two extreme approaches: a completely dynamic approach using a fast mapping heuristic and an ideal approach that uses a genetic algorithm on-line but ignores the time for remapping. Experimental results indicate that the semi-static approach outperforms the dynamic approach and is reasonably close to the ideal but infeasible approach.
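A minimal sketch of the semi-static control loop described above: keep the current mapping between iterations and pay the remapping cost only when the observed slowdown is expected to outweigh it. The timing model, threshold, and remapping heuristic are assumptions for illustration, not the paper's specific policy.

```python
def run_semi_static(iterations, time_iteration, remap, remap_cost, threshold=1.2):
    """time_iteration(mapping) -> measured time of one iteration under mapping;
    remap(mapping) -> an improved mapping from a fast heuristic."""
    mapping = remap(None)                   # initial mapping from the heuristic
    best = float("inf")
    prev = None
    total = 0.0
    for _ in range(iterations):
        # Remap only if the last iteration degraded enough that the expected
        # per-iteration saving outweighs the one-time remapping overhead.
        if prev is not None and prev > threshold * best and prev - best > remap_cost:
            mapping = remap(mapping)
            total += remap_cost
        prev = time_iteration(mapping)      # execute (and time) one iteration
        best = min(best, prev)
        total += prev
    return total
```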
Automatic Selection of Load Balancing Parameters Using Compile-time and Run-time Information
, 1996
"... Clusters of workstations are emerging as an important architecture. Programming tools that aid in distributing applications on workstation clusters must address problems of mapping the application, heterogeneity and maximizing system utilization in the presence of varying resource availability. Both ..."
Abstract
-
Cited by 11 (9 self)
- Add to MetaCart
(Show Context)
Clusters of workstations are emerging as an important architecture. Programming tools that aid in distributing applications on workstation clusters must address the problems of mapping the application, heterogeneity, and maximizing system utilization in the presence of varying resource availability. Both computation and communication capabilities may vary with time as other applications compete for resources, so dynamic load balancing is a key requirement. For greatest benefit, the tool must support a relatively wide class of applications running on clusters with a range of computation and communication capabilities. We have developed a system that supports dynamic load balancing of distributed applications consisting of parallelized DOALL and DOACROSS loops. The focus of this paper is on how the system automatically determines key load balancing parameters using run-time information and information provided by programming tools such as a parallelizing compiler. The parameters discussed ...
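One of the simplest parameters such a system must set is how many DOALL loop iterations to hand each workstation per scheduling step. A minimal sketch, assuming chunk sizes proportional to recently observed per-worker throughput; the measurement and the proportional rule are illustrative assumptions, not the paper's policy.

```python
def chunk_sizes(total_iters, observed_rates):
    """Split total_iters among workers in proportion to observed rates (iters/sec)."""
    total_rate = sum(observed_rates)
    chunks = [int(total_iters * r / total_rate) for r in observed_rates]
    # give any rounding remainder to the currently fastest worker
    chunks[observed_rates.index(max(observed_rates))] += total_iters - sum(chunks)
    return chunks

if __name__ == "__main__":
    # e.g. three workstations, one slowed by a competing job
    print(chunk_sizes(1000, [120.0, 45.0, 118.0]))
```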
Locality Optimizations For Adaptive Irregular Scientific Codes
, 2000
"... Irregular scientific codes experience poor cache performance due to their memory access patterns. We examine several data and computation locality transformations including GPART, a new technique based on hierarchical clustering. GPART constructs quality partitions quickly by clustering multiple n ..."
Abstract
-
Cited by 11 (0 self)
- Add to MetaCart
(Show Context)
Irregular scientific codes experience poor cache performance due to their memory access patterns. We examine several data and computation locality transformations including GPART, a new technique based on hierarchical clustering. GPART constructs quality partitions quickly by clustering multiple neighboring nodes in a few passes, with priority on nodes with high degree. Overhead is kept low by considering only edges between partitions. We develop compiler analyses and transformations in SUIF to automatically apply locality transformations, and propose user annotations to locate coordinate information needed by geometric partitioning algorithms. We experimentally evaluate locality optimizations for both static and adaptive codes, where connection patterns dynamically change at intervals during program execution. We derive a simple cost model to guide locality optimizations when access patterns change. Experiments on several irregular scientific codes show locality optimization t...
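A rough sketch in the spirit of the clustering pass described above (not GPART itself): nodes are visited in order of decreasing degree and each seed absorbs unassigned neighbors up to a size cap, so the resulting cluster ids can be used to renumber data for locality. The size cap and single-pass structure are simplifying assumptions.

```python
def cluster_pass(adj, max_size=4):
    """adj: adjacency list (list of neighbor lists). Returns node -> cluster id."""
    order = sorted(range(len(adj)), key=lambda n: len(adj[n]), reverse=True)
    cluster = [-1] * len(adj)
    next_id = 0
    for seed in order:
        if cluster[seed] != -1:
            continue
        cluster[seed] = next_id          # start a new cluster at a high-degree node
        size = 1
        for nbr in adj[seed]:            # absorb unassigned neighbors up to the cap
            if size >= max_size:
                break
            if cluster[nbr] == -1:
                cluster[nbr] = next_id
                size += 1
        next_id += 1
    return cluster

if __name__ == "__main__":
    adj = [[1, 2], [0, 2, 3], [0, 1, 4], [1, 5], [2, 5], [3, 4]]   # small mesh fragment
    print(cluster_pass(adj))
```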
Dynamic Load Balancing of Distributed SPMD Computations with Explicit Message-Passing
- In Proceedings of the IEEE Workshop on Heterogeneous Computing
, 1997
"... Distributed systems have the potentiality of becoming an alternative platform for parallel computations. However, there are still many obstacles to overcome, one of the most serious is that distributed systems typically consist of shared heterogeneous components with highly variable computational po ..."
Abstract
-
Cited by 9 (1 self)
- Add to MetaCart
Distributed systems have the potential to become an alternative platform for parallel computations. However, there are still many obstacles to overcome, one of the most serious being that distributed systems typically consist of shared heterogeneous components with highly variable computational power. In this paper we present a load balancing support that checks the load status and, if necessary, adapts the workload to dynamic platform conditions through data migrations from overloaded to underloaded nodes. Unlike task migration supports for task parallelism and other data migration frameworks for master/slave-based parallel applications, our support works for the entire class of regular SPMD applications with explicit communications, such as linear algebra problems, partial differential equation solvers, and image processing algorithms. Although we considered several variants (three activation mechanisms, three load monitoring techniques, and four decision policies), we implemented only the ...
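A minimal sketch of the migration decision the abstract outlines: compare each node's current share of the block-distributed data with a target proportional to its measured speed, and move rows from overloaded to underloaded nodes. The speed estimates, row granularity, and greedy pairing below are illustrative assumptions.

```python
def plan_migrations(rows_per_node, speeds):
    """Return a list of (from_node, to_node, n_rows) moves toward speed-proportional shares."""
    total_rows, total_speed = sum(rows_per_node), sum(speeds)
    targets = [total_rows * s / total_speed for s in speeds]
    surplus = [(i, rows_per_node[i] - t) for i, t in enumerate(targets)]
    donors = [(i, d) for i, d in surplus if d > 0]
    receivers = [(i, -d) for i, d in surplus if d < 0]
    moves = []
    for src, give in donors:
        for k, (dst, need) in enumerate(receivers):
            if give <= 0:
                break
            n = int(min(give, need))
            if n > 0:
                moves.append((src, dst, n))
                give -= n
                receivers[k] = (dst, need - n)
    return moves

if __name__ == "__main__":
    print(plan_migrations([400, 400, 400], [1.0, 0.5, 2.0]))
```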
Heterogeneous Partitioning in a Workstation Network
- Proc. of 1994 Heterogeneous Computing Workshop
, 1994
"... In this paper, we present several heterogeneous partitioning algorithms for parallel numerical applications. The goal is to adapt the partitioning to dynamic and unpredictable load changes on the nodes. The methods are based on existing homogeneous algorithms like orthogonal recursive bisection, par ..."
Abstract
-
Cited by 7 (0 self)
- Add to MetaCart
(Show Context)
In this paper, we present several heterogeneous partitioning algorithms for parallel numerical applications. The goal is to adapt the partitioning to dynamic and unpredictable load changes on the nodes. The methods are based on existing homogeneous algorithms such as orthogonal recursive bisection, parallel strips, and scattering. We apply these algorithms to a parallel numerical application in a network of heterogeneous workstations. The behavior of the individual methods in a system with dynamic load changes and heterogeneous nodes is investigated. In addition, our new methods are compared with the conventional methods for homogeneous partitioning.
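A minimal sketch of a heterogeneous "parallel strips" partitioning in the spirit of the methods described above: strip widths are made proportional to the speed currently observed on each workstation. The speeds and the row granularity are illustrative assumptions.

```python
def weighted_strips(n_rows, speeds):
    """Return (start, end) row ranges, one strip per node, with widths ~ speed."""
    total = sum(speeds)
    bounds, acc, start = [], 0.0, 0
    for s in speeds[:-1]:
        acc += s
        end = int(round(n_rows * acc / total))
        bounds.append((start, end))
        start = end
    bounds.append((start, n_rows))        # last node takes the remainder
    return bounds

if __name__ == "__main__":
    # four workstations, one temporarily loaded by another user
    print(weighted_strips(1024, [1.0, 1.0, 0.4, 2.0]))
```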
Data Redistribution Algorithms For Heterogeneous Processor Rings
, 2004
"... We consider the problem of redistributing data on homogeneous and heterogeneous ring of processors. The problem arises in several applications, each time after that a load-balancing mechanism is invoked (but we do not discuss the load-balancing mechanism itself). We provide algorithms that aim at op ..."
Abstract
-
Cited by 7 (5 self)
- Add to MetaCart
(Show Context)
We consider the problem of redistributing data on homogeneous and heterogeneous rings of processors. The problem arises in several applications, each time a load-balancing mechanism is invoked (but we do not discuss the load-balancing mechanism itself). We provide algorithms that aim at optimizing the data redistribution, for both unidirectional and bidirectional rings, and we give complete proofs of correctness. One major contribution of the paper is that we are able to prove the optimality of the proposed algorithms in all cases except that of a bidirectional heterogeneous ring, for which the problem remains open.
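For the homogeneous unidirectional case, a classical prefix-sum argument determines how much each processor must forward to its successor; a minimal sketch follows (not necessarily the paper's algorithm, and assuming the loads sum to a multiple of the ring size).

```python
def ring_sends(loads):
    """Return t[i] = number of items processor i sends to processor (i+1) mod n."""
    n = len(loads)
    target = sum(loads) // n                # assumes loads sum to a multiple of n
    # conservation: loads[i] + t[i-1] - t[i] = target  =>  t[i] = t[i-1] + loads[i] - target
    prefix = [0] * n
    for i in range(1, n):
        prefix[i] = prefix[i - 1] + loads[i] - target
    shift = -min(prefix)                    # make all sends nonnegative with minimal volume
    return [p + shift for p in prefix]

if __name__ == "__main__":
    print(ring_sends([6, 2, 7, 1]))         # target of 4 items per processor
```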
Run-time Support for Parallelization of Data-Parallel Applications on Adaptive and Nonuniform Computational Environments
- Journal of Parallel and Distributed Computing
, 1995
"... In this paper we discuss the runtime support required for the parallelization of unstructured dataparallel applications on nonuniform and adaptive environments. The approach presented is reasonably general and is applicable to a wide variety of regular as well as irregular applications. We present p ..."
Abstract
-
Cited by 6 (2 self)
- Add to MetaCart
In this paper we discuss the runtime support required for the parallelization of unstructured data-parallel applications on nonuniform and adaptive environments. The approach presented is reasonably general and is applicable to a wide variety of regular as well as irregular applications. We present performance results for the solution of an unstructured mesh on a cluster of heterogeneous workstations.
Decentralized Remapping of Data Parallel Applications in Distributed Memory Multiprocessors
- Concurrency: Practice and Experience
, 1997
"... In this paper we present a decentralized remapping method for data parallel applications on distributed memory multiprocessors. The method uses a generalized dimensionexchange (GDE) algorithm periodically during the execution of an application to balance (remap) the system's workload. We implem ..."
Abstract
-
Cited by 6 (3 self)
- Add to MetaCart
In this paper we present a decentralized remapping method for data parallel applications on distributed memory multiprocessors. The method uses a generalized dimension-exchange (GDE) algorithm periodically during the execution of an application to balance (remap) the system's workload. We implemented this remapping method in parallel WaTor simulations and parallel image thinning applications, and found it to be effective in reducing the computation time. The average performance gain is about 20% in the WaTor simulation of a 256 × 256 ocean grid on 16 processors, and up to 8% in the thinning of a typical image of size 128 × 128 on 8 processors. The performance gains due to remapping in the image thinning case are reasonably substantial given that the application by its very nature does not necessarily favor remapping. We also implemented this remapping method, using up to 32 processors, for partitioning and re-partitioning of grids in computational fluid dynamics. It w...
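A minimal sketch of a dimension-exchange sweep on a hypercube, which the generalized (GDE) scheme parameterizes with an exchange factor; lambda_ = 0.5 gives the plain dimension-exchange step. The hypercube topology and the load values here are illustrative assumptions.

```python
def dimension_exchange(loads, dims, lambda_=0.5):
    """loads: list of length 2**dims; performs one full sweep over the dims dimensions."""
    loads = list(loads)
    for d in range(dims):
        for i in range(len(loads)):
            j = i ^ (1 << d)                 # hypercube neighbor along dimension d
            if i < j:                        # handle each pair once
                li, lj = loads[i], loads[j]
                loads[i] = (1 - lambda_) * li + lambda_ * lj
                loads[j] = (1 - lambda_) * lj + lambda_ * li
    return loads

if __name__ == "__main__":
    # with lambda_ = 0.5, one sweep fully balances a hypercube
    print(dimension_exchange([8.0, 0.0, 4.0, 4.0, 2.0, 6.0, 0.0, 8.0], dims=3))
```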