## ANDES: Evaluating Mapping Strategies with Synthetic Programs (1997)

Venue: Euromicro J. of Systems Architecture

Citations: 1 (1 self)

### BibTeX

@ARTICLE{Kitajima97andes:evaluating,

author = {João Paulo Kitajima and Brigitte Plateau and Pascal Bouvry and Denis Trystram},

title = {ANDES: Evaluating Mapping Strategies with Synthetic Programs},

journal = {Euromicro J. of Systems Architecture},

year = {1997},

pages = {42--351}

}

### Abstract

This paper presents the ANDES performance evaluation tool. ANDES is based on the synthetic execution of parallel programs and is used for the evaluation of mapping strategies. The Meganode, a distributed-memory parallel computer, is considered as our target architecture. ANDES takes into account a benchmark of quantitative models of parallel algorithms and a set of mapping strategies (greedy and iterative algorithms are used). We show how this tool allows an extensive comparison of mapping strategies by using the benchmark, the mapping strategies and different cost functions.

**1 Introduction.** Distributed memory multiprocessors (DMM) are the current trend in high-performance parallel computers. They represent a good balance between cost and performance, mainly because of the connection of several commercial, general-purpose and relatively cheap microprocessors. A distributed memory multiprocessor is a computer composed of autonomous processors connected by a high-speed communication network....

### Citations

267 | A Taxonomy of Scheduling in General-Purpose Distributed Computing Systems
- Casavant, Kuhl
- 1988
Citation Context: ...e for practical problems. Several algorithms can be found in the literature for solving the mapping problem. We can roughly distinguish two classes of methods, namely, exact algorithms and heuristics [3]. Exact algorithms can only be used when the space of solutions is small enough, for instance when only a few tasks have to be allocated to a machine with a small number of processors. Exact algorithm... |

165 |
On the mapping problem
- Bokhari
- 1981
Citation Context: ...cessors to locally improve the solution. Most such algorithms use random perturbations to leave local minima and to jump to better solutions. A well-known iterative algorithm is the Bokhari algorithm [1]. Its cost function (called cardinality) takes into account the number of tasks correctly mapped on the processor network and uses pair-wise exchanges of tasks to improve it. The basic hypothesis is t... |
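The exchange step described in this context is simple to sketch. The following is a minimal illustration, not the original implementation: a `cardinality` function counting task-graph edges whose endpoints land on adjacent processors, improved by pair-wise swaps until no swap helps. The random-perturbation phase mentioned above is omitted for brevity.

```python
from itertools import combinations

def cardinality(mapping, task_edges, proc_edges):
    """Number of task-graph edges mapped onto adjacent processors."""
    return sum(1 for (a, b) in task_edges
               if (mapping[a], mapping[b]) in proc_edges
               or (mapping[b], mapping[a]) in proc_edges)

def pairwise_exchange(mapping, task_edges, proc_edges):
    """Bokhari-style local search: try every pair of tasks, keep a swap
    of their processors only if it raises the cardinality, and stop at a
    local optimum. Mutates `mapping` in place and returns the best value."""
    best = cardinality(mapping, task_edges, proc_edges)
    improved = True
    while improved:
        improved = False
        for a, b in combinations(sorted(mapping), 2):
            mapping[a], mapping[b] = mapping[b], mapping[a]
            cand = cardinality(mapping, task_edges, proc_edges)
            if cand > best:
                best, improved = cand, True
            else:  # undo a non-improving swap
                mapping[a], mapping[b] = mapping[b], mapping[a]
    return best
```

For example, on a chain of four processors with task edges t0–t1 and t1–t2, starting from the scrambled mapping {t0: 0, t1: 2, t2: 1}, the search reaches the full cardinality of 2.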

159 |
Multiprocessor scheduling with the aid of network flow algorithms
- Stone
- 1977
Citation Context: ...he number of tasks is less than twice the number of processors and if at most two tasks are allocated to one processor. • Algorithms based on a minimal cut of bipartite graphs can also be used [10] [14]. Greedy algorithms are easy to implement and have a polynomial complexity (often less than O(|T|³); for instance LPTF is of O(|T| log |T|) complexity). 2.2 Iterative algorithms All iterative algori... |

122 |
Heuristic algorithms for task assignment in distributed systems
- Lo
- 1988
Citation Context: ...if the number of tasks is less than twice the number of processors and if at most two tasks are allocated to one processor. • Algorithms based on a minimal cut of bipartite graphs can also be used [10] [14]. Greedy algorithms are easy to implement and have a polynomial complexity (often less than O(|T|³); for instance LPTF is of O(|T| log |T|) complexity). 2.2 Iterative a... |

106 |
Practical multiprocessor scheduling algorithms for efficient parallel processing
- Kasahara, Narita
- 1984
Citation Context: ...terature. Other solutions can be obtained by mixing the preceding algorithms: an initial solution could be obtained by simulated annealing, and then tabu search or branch-and-bound algorithms [6] could be used to improve the mapping. 2.4 Quality of the solution Most solutions of the mapping problem are based on the optimization of cost functions. Let us denote by z such a function. Under the ... |

80 | Models of machines and computation for mapping in multicomputers
- Norman, Thanisch
- 1993
Citation Context: ...some realistic (or almost realistic) components. This experimental approach is rather new, considering that mapping strategies are normally compared according to different values of the cost function [12]. Future work is planned. ANDES currently runs on a Transputer machine. It will be ported to the IBM SP-1 multiprocessor. With the SP-1 version, mapping, scheduling and load balancing strategies will ... |

25 |
Parallel machine scheduling with nonsimultaneous machine available time; Discrete Applied Mathematics 30
- Lee
- 1991
Citation Context: ...ng Time First) is a heuristic whose criterion is restricted to load balancing. It is well known that its worst-case performance is about 4/3 of the optimal when considering independent tasks [8]. • Lo presents in [11] an algorithm based on a maximal matching which minimizes the costs of communications between tasks. This algorithm is optimal for UET (Unitary Execution Time) tasks if the nu... |
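LPTF (Largest Processing Time First) itself is only a few lines: sort the tasks by decreasing processing time, then always hand the next task to the currently least-loaded processor. A minimal heap-based sketch (illustrative, not the paper's code; the heap gives the O(|T| log |T|) behavior mentioned elsewhere in the text):

```python
import heapq

def lptf(durations, n_procs):
    """Largest Processing Time First: longest tasks first, each placed on
    the currently least-loaded processor. Returns the makespan and the
    per-processor task lists."""
    loads = [(0, p) for p in range(n_procs)]   # (load, processor) min-heap
    heapq.heapify(loads)
    schedule = [[] for _ in range(n_procs)]
    for task, time in sorted(durations.items(), key=lambda kv: -kv[1]):
        load, p = heapq.heappop(loads)         # least-loaded processor
        schedule[p].append(task)
        heapq.heappush(loads, (load + time, p))
    return max(load for load, _ in loads), schedule
```

With durations {5, 4, 3, 3, 3} on two processors, LPTF yields a makespan of 10 against an optimum of 9 (5+4 versus 3+3+3), within the 4/3 worst-case bound cited above.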

24 |
Processor and link assignment in multicomputers using simulated annealing
- Bollinger, Midkiff
- 1988

22 |
occam 2 Reference Manual
- INMOS Limited
- 1988
Citation Context: ...y 0.5. Originally, three types of inputs/outputs can be described: (1) boolean descriptions (like AND, OR). An AND input models a join of control threads, and an OR input models an Occam2 alternative [9]. An AND output models a fork of control threads, and an OR output models the CASE instruction; (2) global operations, like data broadcast. A data broadcast can be considered as an AND output, but th... |

4 |
Algorithms for static task assignment and symmetric contraction in distributed systems
- Lo
- 1988
Citation Context: ...istic whose criterion is restricted to load balancing. It is well known that its worst-case performance is about 4/3 of the optimal when considering independent tasks [8]. • Lo presents in [11] an algorithm based on a maximal matching which minimizes the costs of communications between tasks. This algorithm is optimal for UET (Unitary Execution Time) tasks if the number of tasks is less tha... |

3 |
Tabu Search, a chapter in Modern Heuristic Techniques for Combinatorial Problems
- Glover, Laguna
- 1992
Citation Context: ...ty decreasing with the temperature. It corresponds mathematically to giving the search a chance to leave a local minimum of the function being optimized. 2.2.3 Tabu search Tabu search is an iterative meta-heuristic [5]. It tries to find the best neighbor of a given solution. To avoid cycling and local optima, a tabu list is established. This tabu list contains information concerning the last moves. A tabu move is n... |
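The mechanism described in this context can be sketched generically. Here it is applied to a toy load-balancing cost; the move representation, tenure value and iteration count are illustrative choices, not taken from the paper:

```python
from collections import deque

def tabu_search(assign, times, n_procs, iters=30, tenure=4):
    """Tabu search for load balancing: a move reassigns one task to another
    processor. The best non-tabu move is always taken, even if it worsens
    the cost; reversing a recent move is forbidden by the tabu list."""
    def cost(a):
        loads = [0] * n_procs
        for t, p in a.items():
            loads[p] += times[t]
        return max(loads) - min(loads)

    tabu = deque(maxlen=tenure)          # recent (task, old_processor) moves
    best, best_cost = dict(assign), cost(assign)
    cur = dict(assign)
    for _ in range(iters):
        moves = [(t, p) for t in cur for p in range(n_procs)
                 if p != cur[t] and (t, p) not in tabu]
        if not moves:
            break
        t, p = min(moves, key=lambda m: cost({**cur, m[0]: m[1]}))
        tabu.append((t, cur[t]))         # forbid moving t straight back
        cur[t] = p
        if cost(cur) < best_cost:
            best, best_cost = dict(cur), cost(cur)
    return best, best_cost
```

Starting with tasks of weight 4, 3, 2 and 1 all on one of two processors, the search finds the perfectly balanced 5/5 split even though intermediate moves may worsen the cost.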

3 |
ALPES: a tool for the performance evaluation of parallel programs
- Kitajima, Tron, et al.
- 1993
Citation Context: ...of the mapping algorithms is presented in order to show that the tool is useful. Finally, some conclusions and perspectives are presented. ANDES is an evolution of the ALPES environment presented in [7], which was based on the generation of source files of synthetic programs. The new approach is based on a more efficient synthetic execution, controlled by a kernel that accepts a synthetic workload d... |

2 |
PYRROS: static scheduling and code generation for message passing multiprocessors
- Yang, Gerasoulis
- 1992
Citation Context: ...strategy. Therefore, a clustering algorithm is used to group the computation nodes of the DAG into clusters. In ANDES, clustering is done using the PYRROS DSC (Dominant Sequence Clustering) algorithm [15]. The DSC algorithm "performs a sequence of clustering refinement steps and at each refinement step, it tries to zero an edge to reduce the parallel time" [15]. It has O((v + e) log v) time complexity a... |

1 |
Outils pour la programmation d'un multiprocesseur a memoires distribuees
- Pazat
- 1989
Citation Context: ...to improve it iteratively using a neighborhood relation. This solution leads directly to a local optimum. 2.2.2 Simulated annealing One of the most popular iterative methods is simulated annealing [2][13]. This method is based on an analogy with statistical physics: the annealing technique allows a metal with as regular a structure as possible to be obtained. It consists of heating the metal and r... |
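The acceptance rule behind this analogy is the classical Metropolis criterion: always accept improving moves, and accept a worsening move of size delta with probability exp(-delta / T). A minimal sketch with geometric cooling (the parameter values are illustrative, not from the paper):

```python
import math
import random

def simulated_annealing(state, cost, neighbor,
                        t0=10.0, alpha=0.95, steps=500, seed=1):
    """Metropolis-style annealing: improving moves are always accepted;
    worsening moves are accepted with probability exp(-delta / T), and the
    temperature T decreases geometrically so escapes from local minima
    become rarer as the run cools down."""
    rng = random.Random(seed)
    cur, cur_cost = state, cost(state)
    best, best_cost = cur, cur_cost
    t = t0
    for _ in range(steps):
        nxt = neighbor(cur, rng)
        delta = cost(nxt) - cur_cost
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            cur, cur_cost = nxt, cur_cost + delta
            if cur_cost < best_cost:
                best, best_cost = cur, cur_cost
        t *= alpha                       # geometric cooling schedule
    return best, best_cost
```

Plugging in any mapping-style cost and a swap- or move-based neighborhood recovers the scheme described above; the tracked `best` is returned rather than the final state, since late high-temperature wanderings may end somewhere worse.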