Load-balancing iterative computations in heterogeneous clusters with shared communication links (2003)

by A Legrand, H Renard, Y Robert, F Vivien

Results 1 - 7 of 7

Efficient assignment and scheduling for heterogeneous dsp systems

by Zili Shao, Qingfeng Zhuge, Edwin H.-M. Sha - IEEE Trans. on Parallel and Distributed Systems, 2005
Cited by 6 (2 self)
This paper addresses high-level synthesis for real-time digital signal processing (DSP) architectures using heterogeneous functional units (FUs). For such special-purpose architecture synthesis, an important problem is how to assign a proper FU type to each operation of a DSP application and generate a schedule in such a way that all requirements are met and the total cost is minimized. We propose a two-phase approach to solve this problem. In the first phase, we solve the heterogeneous assignment problem: given the types of heterogeneous FUs, a Data-Flow Graph (DFG) in which each node has different execution times and costs (which may relate to power, reliability, etc.) for different FU types, and a timing constraint, how to assign a proper FU type to each node such that the total cost is minimized while the timing constraint is satisfied. In the second phase, based on the assignments obtained in the first phase, we propose a minimum resource scheduling algorithm to generate a schedule and a feasible configuration that uses as few resources as possible. We prove that the heterogeneous assignment problem is NP-complete. Efficient algorithms are proposed to find an optimal solution when the given DFG is a simple path or a tree. Three other algorithms are proposed to solve the general problem. The experiments show that our algorithms can effectively reduce the total cost compared with previous work.
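
The simple-path case of the assignment problem described in this abstract can be illustrated with a small dynamic program. This is only a sketch under stated assumptions, not the authors' algorithm: the function name `assign_path`, the input layout, and the restriction to non-negative integer execution times are all hypothetical choices made for illustration. On a simple path, the completion time is the sum of the node execution times, so the problem reduces to picking one (time, cost) option per node.

```python
from math import inf

def assign_path(options, deadline):
    """Pick one FU type per node of a simple path (illustrative sketch).

    options[i] is a list of (time, cost) pairs, one per FU type, for node i.
    Times are assumed to be non-negative integers.
    Returns (min_cost, chosen_types), or (inf, None) if no assignment
    meets the deadline.
    """
    # best[t] = (cost, choices): cheapest way to run the nodes seen so far
    # with a total execution time of exactly t.
    best = {0: (0, [])}
    for node_opts in options:
        nxt = {}
        for t, (cost, picks) in best.items():
            for fu, (dt, dc) in enumerate(node_opts):
                nt = t + dt
                if nt > deadline:
                    continue  # this choice would violate the timing constraint
                cand = (cost + dc, picks + [fu])
                if nt not in nxt or cand[0] < nxt[nt][0]:
                    nxt[nt] = cand
        best = nxt
    if not best:
        return inf, None
    return min(best.values(), key=lambda entry: entry[0])


# Two nodes, two FU types each: (time, cost) per type.
example = [[(1, 10), (3, 2)], [(2, 8), (4, 1)]]
print(assign_path(example, 5))
```

The table is indexed by elapsed time, so the running time grows with the deadline value; the abstract does not say how the paper's path and tree algorithms are organized.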

Mapping and load-balancing iterative computations on heterogeneous clusters

by Arnaud Legrand, Hélène Renard, Yves Robert, Frédéric Vivien
Cited by 5 (2 self)
This paper is devoted to mapping iterative algorithms onto heterogeneous clusters. The application data is partitioned over the processors, which are arranged along a virtual ring. At each iteration, independent calculations are carried out in parallel, and some communications take place between consecutive processors in the ring. The question is to determine how to slice the application data into chunks, and to assign these chunks to the processors, so that the total execution time is minimized. One major difficulty is to embed a processor ring into a network that typically is not fully connected, so that some communication links have to be shared by several processor pairs. We establish a complexity result that assesses the difficulty of this problem, and we design a practical heuristic that provides efficient mapping, routing, and data distribution schemes.
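
To make the data-slicing question in this abstract concrete, here is a minimal sketch that splits identical data items among processors of different speeds. It deliberately ignores communication costs, ring embedding, and link sharing, which are the hard part of the problem the abstract describes, so it is not the paper's heuristic. The name `slice_data` and the use of per-item cycle times are assumptions made for illustration.

```python
import heapq

def slice_data(n_items, cycle_times):
    """Distribute n_items identical items over heterogeneous processors.

    cycle_times[i] is the (assumed) time processor i needs per item.
    Each item goes to the processor that would finish it earliest, which
    minimizes the largest per-processor compute time when communication
    is ignored.
    """
    counts = [0] * len(cycle_times)
    # Heap entries: (finish time if this processor takes one more item, index).
    heap = [(w, i) for i, w in enumerate(cycle_times)]
    heapq.heapify(heap)
    for _ in range(n_items):
        finish, i = heapq.heappop(heap)
        counts[i] += 1
        heapq.heappush(heap, (finish + cycle_times[i], i))
    return counts


# Three processors; the first is four times faster than the last.
print(slice_data(10, [1, 2, 4]))
```

With communication taken into account, the chunk sizes would also depend on which links each pair of consecutive ring processors shares.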

Mapping and Load-Balancing Iterative Computations

by Arnaud Legrand, Hélène Renard, Yves Robert, Frédéric Vivien - IEEE Transactions on Parallel and Distributed Systems, 2004
Cited by 4 (0 self)
This paper is devoted to mapping iterative algorithms onto heterogeneous clusters. The application data is partitioned over the processors, which are arranged along a virtual ring. At each iteration, independent calculations are carried out in parallel, and some communications take place between consecutive processors in the ring. The question is to determine how to slice the application data into chunks, and to assign these chunks to the processors, so that the total execution time is minimized. One major difficulty is to embed a processor ring into a network that typically is not fully connected, so that some communication links have to be shared by several processor pairs. We establish a complexity result that assesses the difficulty of this problem, and we design a practical heuristic that provides efficient mapping, routing, link-sharing, and data distribution schemes.

A First Step Towards Automatically Building Network Representations

by Lionel Eyraud-Dubois, Arnaud Legrand, Martin Quinson, Frédéric Vivien
Cited by 1 (0 self)
To fully harness Grids, users or middleware must have some knowledge of the topology of the platform interconnection network. As such knowledge is usually not available, one must use tools that automatically build a topological network model through some measurements. In this article, we define a methodology to assess the quality of these network model building tools, and we apply this methodology to representatives of the main classes of model builders and to two new algorithms. We show that none of the main existing techniques builds models that make it possible to accurately predict the running time of simple application kernels on actual platforms. However, some of the new algorithms we propose give excellent results in a wide range of situations. Keywords: network model, topology reconstruction, Grids.

Assessing the Quality of Automatically Built Network Representations

by Lionel Eyraud-Dubois (École Normale Supérieure, Lyon)
Cited by 1 (0 self)
In order to use Grid resources efficiently, users or middleware must use some network information, and in particular some knowledge of the platform network. As such knowledge is usually not available, one must use tools that automatically build a topological network model through some measurements. Our aim is to define a methodology to assess the quality of these network model building tools, and to apply this methodology to representatives of the main classes of model builders. Using this approach, we show that none of the main existing techniques builds models that make it possible to accurately predict the running time of simple application kernels on actual platforms.

Providing Access to HPC Servers on the Grid

by unknown authors, 2004
d ' ctivity

Load Balancing Hybrid Programming Models for SMP Clusters and Fully Permutable Loops

by unknown authors
This paper focuses on load-balancing issues associated with hybrid programming models for the parallelization of fully permutable nested loops onto SMP clusters. Hybrid parallel programming models usually suffer from intrinsic load imbalance between threads, mainly because most existing message-passing libraries provide limited multi-threading support, allowing only the master thread to perform inter-node message-passing communication. To mitigate this effect, we propose a generic method for applying static load balancing to the coarse-grain hybrid model, so that the computational load is distributed appropriately among the working threads. We experimentally evaluate the efficiency of the proposed scheme against a micro-kernel benchmark, and demonstrate the potential of such load-balancing schemes for extracting maximum performance from hybrid parallel programs.