Results 1–9 of 9
Nearest Neighbor Algorithms for Load Balancing in Parallel Computers
, 1995
Abstract

Cited by 19 (2 self)
With nearest neighbor load balancing algorithms, a processor makes balancing decisions based on localized workload information and manages workload migrations within its neighborhood. This paper compares two fairly well-known nearest neighbor algorithms, the dimension-exchange (DE, for short) and the diffusion (DF, for short) methods, and several of their variants: the average dimension-exchange (ADE), the optimally-tuned dimension-exchange (ODE), the local average diffusion (ADF) and the optimally-tuned diffusion (ODF). The measures of interest are their efficiency in driving any initial workload distribution to a uniform distribution and their ability to control the growth of the variance among the processors' workloads. The comparison is made with respect to both one-port and all-port communication architectures and in consideration of various implementation strategies, including synchronous/asynchronous invocation policies and static/dynamic random workload behaviors. It t...
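The two basic schemes compared in this abstract can be sketched in a few lines. The sketch below is illustrative and assumes a ring for diffusion and a hypercube for dimension exchange; the function names, the fixed exchange parameter of 1/2, and the diffusion parameter `alpha` are assumptions for the example, not values taken from the paper.

```python
def diffusion_sweep(loads, neighbors, alpha):
    """One synchronous diffusion (DF) sweep:
    w_i <- w_i + alpha * sum_{j in N(i)} (w_j - w_i).
    Total workload is conserved; the variance shrinks each sweep."""
    return [w + alpha * sum(loads[j] - w for j in neighbors[i])
            for i, w in enumerate(loads)]

def dimension_exchange_sweep(loads, d):
    """One dimension-exchange (DE) sweep on a d-dimensional hypercube:
    along each dimension, paired processors average their loads
    (exchange parameter lambda = 1/2, as in the ADE variant)."""
    loads = list(loads)
    for k in range(d):
        for i in range(len(loads)):
            j = i ^ (1 << k)  # partner across dimension k
            if i < j:
                avg = (loads[i] + loads[j]) / 2.0
                loads[i] = loads[j] = avg
    return loads
```

On a 4-node ring with loads [4, 0, 0, 0] and alpha = 0.25, one diffusion sweep yields [2, 1, 0, 1]; on a 2-cube with loads [8, 0, 0, 0], a single DE sweep over both dimensions already reaches the uniform distribution [2, 2, 2, 2].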
Software Support For Parallel Processing Of Irregular And Dynamic Computations
, 1996
Abstract

Cited by 3 (0 self)
Many real-world scientific computations are irregular and dynamic, which poses a great challenge to parallelization. In this thesis we study the efficient mapping of a subclass of these problems, namely the "stepwise slowly changing" problems, onto distributed memory multiprocessors using the task graph scheduling approach. There exists a large class of applications which belong to this category. Intuitively, the irregularity requires sophisticated mapping algorithms, and the "slowness" of the changes in the computational structures between steps allows the scheduling cost to be amortized, justifying the approach. We study three representative and widely-used applications: the N-body simulation in astrophysics, the Vortex-Sheet Roll-Up, and the Contour Dynamics computation from Computational Fluid Dynamics. We sta...
Mapping Large-Scale FEM-Graphs to Highly Parallel Computers with Grid-Like Topology by Self-Organization
, 1994
Abstract

Cited by 2 (1 self)
We consider the problem of mapping large-scale FEM graphs for the solution of partial differential equations to highly parallel distributed memory computers. Typically, these programs show a low-dimensional grid-like communication structure. We argue that the conventional domain decomposition methods usually employed today are not well suited for future highly parallel computers, as they do not take the interconnection structure of the parallel computer into account, resulting in a large communication overhead. Therefore, we propose a new mapping heuristic which performs both partitioning of the solution domain and processor allocation in one integrated step. Our procedure is based on the ability of Kohonen neural networks to exploit topological similarities between an input space and a grid-like structured network to compute a neighborhood-preserving mapping between the set of discretization points and the parallel computer. We report results of mapping up to 44,000-node FEM graph...
Properties of the Task Allocation Problem
, 1996
Abstract

Cited by 2 (2 self)
This paper is structured as follows. Section 2 introduces the application and machine representations that are used to model the performance characteristics of static parallel applications on parallel machines. Section 3 gives a detailed study of the structure of the phase space (or landscape) of the TAP. Section 4 is dedicated to the geometrical phase transition occurring in the TAP. Section 5 presents the experimental methods used: Simulated Annealing (SA) [8] for finding optima, and Weinberger correlation [21] for characterising the phase space structure. Section 6 presents experimental results, which are discussed in Section 7. Finally, some concluding remarks and directions for future work are given in Section 8.

2 Application and Machine Models
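The abstract names Simulated Annealing as the optimization method for the task allocation problem (TAP). A minimal sketch of that idea, on a toy instance, is given below; the cost function (edges cut plus load imbalance), the 8-task line graph, and all parameter values are illustrative assumptions, not the instance or settings from the paper.

```python
import math
import random

def simulated_annealing(cost, neighbor, x0, t0=1.0, cooling=0.995, steps=5000):
    """Generic SA: always accept improvements; accept a worse state
    with probability exp(-delta / T), where T cools geometrically."""
    x, cx = x0, cost(x0)
    best, cbest = x, cx
    t = t0
    for _ in range(steps):
        y = neighbor(x)
        cy = cost(y)
        if cy <= cx or random.random() < math.exp(-(cy - cx) / t):
            x, cx = y, cy
            if cx < cbest:
                best, cbest = x, cx
        t = max(t * cooling, 1e-9)  # floor avoids division by zero
    return best, cbest

# Toy TAP instance: 8 tasks forming a line graph, mapped to 2 processors.
edges = [(i, i + 1) for i in range(7)]

def tap_cost(assign):
    """Communication cost (edges cut) plus load imbalance."""
    cut = sum(1 for a, b in edges if assign[a] != assign[b])
    imbalance = abs(assign.count(0) - assign.count(1))
    return cut + imbalance

def flip_one(assign):
    """Neighbor move: reassign one randomly chosen task."""
    a = list(assign)
    i = random.randrange(len(a))
    a[i] = 1 - a[i]
    return tuple(a)

random.seed(0)
x0 = tuple(random.randint(0, 1) for _ in range(8))
best, cbest = simulated_annealing(tap_cost, flip_one, x0)
```

For this line graph the best achievable cost is 1 (a contiguous 4/4 split cuts exactly one edge with zero imbalance), so the annealer's result can be checked against that bound.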
Mapping Large Parallel Simulation Programs To Multicomputer Systems
 High Performance Computing Conference 94, Part of the SCS 1994 Simulation Multiconference, La Jolla
, 1994
Abstract
We consider the problem of mapping parallel simulation programs to distributed memory parallel machines. Since a large fraction of computer simulations consists of solving partial differential equations, the communication patterns of the resulting parallel programs can be exploited to construct efficient mappings which lead to low communication overhead. We report on the application of Kohonen networks to find such mappings.

1 INTRODUCTION

Most computer simulations deal with physical processes in space and time modeled by a set of partial differential equations (PDEs). Discretization leads to a large sparse system of linear equations or, in the case of nonlinear PDEs or discretization in time, to a sequence of such systems. The simulation can therefore easily be parallelized along the spatial structure of the model, resulting in programs that can be considered as grid-like graphs with vertices as computational nodes and edges as communication relations between them. Ideally, the paralle...
Mapping Tasks To Processors With The Aid Of Kohonen Networks
Abstract
To execute a parallel program on a multicomputer system, the tasks of the program have to be mapped to the individual processors of the parallel machine. To keep communication delays low, communicating tasks should be placed close together. Since both the communication structure of the program and the interconnection structure of the parallel machine can be represented as graphs, the mapping problem can be regarded as a graph embedding problem minimizing communication costs. As a new heuristic approach to this NP-hard problem, we apply Kohonen's self-organizing maps to establish a topology-preserving embedding. Results from simulation experiments are presented and compared to other approaches to this problem.

1 INTRODUCTION

The ever increasing demand for computational power has led to the development of large scale parallel computers, which already represent the prevailing supercomputer architecture. Their broader acceptance and dissemination, however, is hampered by a serious lack of softw...
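The topology-preserving embedding described in this abstract (and in the FEM-graph and simulation-mapping entries above) can be sketched with a standard Kohonen self-organizing map: the processor grid plays the role of the map, and mesh vertices are the training inputs. Everything below is a generic SOM sketch under assumed names and parameters (`lr0`, `sigma0`, learning-rate and neighborhood decay schedules), not the authors' actual procedure.

```python
import math
import random

def train_som(points, grid_w, grid_h, iters=2000, lr0=0.5, sigma0=2.0):
    """Train a 2-D Kohonen map: grid nodes (processors) adapt toward
    sampled mesh vertices; the winner and its grid neighbours move most,
    so nearby regions of the mesh land on nearby processors."""
    random.seed(1)  # fixed seed for reproducibility
    nodes = {(gx, gy): [random.random(), random.random()]
             for gx in range(grid_w) for gy in range(grid_h)}
    for t in range(iters):
        px, py = random.choice(points)
        frac = t / iters
        lr = lr0 * (1.0 - frac) + 0.01      # decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 0.3  # shrinking neighborhood
        # winner = node closest to the sample in input (mesh) space
        win = min(nodes, key=lambda g: (nodes[g][0] - px) ** 2
                                     + (nodes[g][1] - py) ** 2)
        for g, w in nodes.items():
            d2 = (g[0] - win[0]) ** 2 + (g[1] - win[1]) ** 2  # grid distance
            h = math.exp(-d2 / (2.0 * sigma * sigma))
            w[0] += lr * h * (px - w[0])
            w[1] += lr * h * (py - w[1])
    return nodes

def processor_of(vertex, nodes):
    """Assign a mesh vertex to the processor whose node lies nearest."""
    vx, vy = vertex
    return min(nodes, key=lambda g: (nodes[g][0] - vx) ** 2
                                  + (nodes[g][1] - vy) ** 2)
```

After training on the vertices of a 2-D mesh, `processor_of` realizes the embedding: vertices that are close in the mesh are assigned to processors that are close on the grid, which is exactly the property that keeps communication local.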
A. Tentner (ed.): High Performance Computing 1994, Proc. of the SCS Simulation Multiconference 1994, San Diego, 11.–15. April 1994, pp. 285–290.
 High Performance Computing Conference 94, Part of the SCS 1994 Simulation Multiconference, La Jolla
, 1994
Abstract
We consider the problem of mapping parallel simulation programs to distributed memory parallel machines. Since a large fraction of computer simulations consists of solving partial differential equations, the communication patterns of the resulting parallel programs can be exploited to construct efficient mappings which lead to low communication overhead. We report on the application of Kohonen networks to find such mappings.
Large Scale Simulations of Complex Systems Part I: Conceptual Framework
, 1997
Abstract
In this working document, we report on a new approach to high performance simulation. The main inspiration for this approach is the concept of complex systems: disparate elements with well-defined interaction rules and nonlinear emergent macroscopic behavior. We provide arguments and mechanisms to abstract temporal and spatial locality from the application and to incorporate this locality into the complete design cycle of modeling and simulation on parallel architectures. Although the main application area discussed here is physics, the presented Virtual Particle (VIP) paradigm, in the context of Dynamic Complex Systems (DCS), is applicable to other areas of compute-intensive applications. Part I deals with the concepts behind the VIP and DCS models. A formal approach to the mapping of application task graphs onto machine task graphs is presented. The major part of Section 3 has recently (July 1997) been accepted for publication in Complexity. In Part II we will elaborate on the execution behavior of