Results 1 - 10 of 92
Improved Schemes for Mapping Arbitrary Algorithms Onto Processor Meshes
, 1995
"... We address the problem of efficient schemes for mapping arbitrary parallel algorithms onto distributed memory message passing multiprocessors with mesh topologies. We analyze a particular mapping scheme, find the reasons for its low efficiency, and show that mapped algorithms tend to be both wider a ..."
Cited by 4 (1 self)
Fault tolerant mapping onto VLSI/WSI processor arrays
- PROC. 20TH EUROMICRO CONF
, 1994
"... This paper deals with efficient methods for mapping arbitrary parallel algorithms onto faulty general purpose VLSI/WSI data-driven array. First, a brief overview of several architectural designs of the array is given. Next, three directions for the algorithmic improvement of a certain mapping scheme ..."
Cited by 3 (3 self)
Program Mapping onto Network Processors by Recursive Bipartitioning and Refining
- DAC 2007
, 2007
"... Mapping packet processing applications onto embedded network processors (NP) is a challenging task due to the unique constraints of NP systems and the characteristics of network application domains. A remarkable difference with general multiprocessor task scheduling is that NPs are often programmed ..."
Cited by 3 (2 self)
Automated Mapping of Structured Communication Graphs onto Mesh Interconnects
"... Network contention has an increasingly adverse effect on the performance of parallel applications with increasing size of parallel machines. Machines of the petascale era are forcing application developers to map tasks intelligently to job partitions to achieve the best performance possible. This ..."
This paper presents a framework for automated mapping of parallel applications with structured communication graphs to two and three dimensional mesh networks. We present several heuristic techniques for mapping 2D object graphs to 2D and 3D processor graphs and compare their performance with other
Performance Analysis of Mesh Interconnection Networks with Deterministic Routing
- IEEE Transactions on Parallel and Distributed Systems
, 1994
"... This paper develops detailed analytical performance models for k-ary n-cube networks with single-flit or infinite buffers, wormhole routing, and the non-adaptive deadlock-free routing scheme proposed by Dally and Seitz. In contrast to previous performance studies of such networks, the system is mode ..."
Cited by 71 (2 self)
is modeled as a closed queueing network that (1) includes the effects of blocking and pipelining of messages in the network, (2) allows for arbitrary source-destination probability distributions, and (3) explicitly models the virtual channels used in the deadlock-free routing algorithm. The models are used
An Improved Mapping of Cyclic Elimination onto Hypercubes using Data Replication
"... In this paper, we propose a new mapping of the Cyclic Elimination (CE) algorithm for the solution of block tridiagonal linear system of equations onto hypercube multiprocessors. Unlike the previous mapping schemes, in our mapping of the CE algorithm all communications are restricted to physically ad ..."
Arbitrary–Lagrangian–Eulerian One–Step WENO Finite Volume Schemes on Unstructured Triangular Meshes
- Communications in Computational Physics 2013
"... In this article we present a new class of high order accurate Arbitrary–Eulerian–Lagrangian (ALE) one–step WENO finite volume schemes for solving non-linear hyperbolic systems of conservation laws on moving two dimensional unstructured triangular meshes. A WENO reconstruction algorithm i ..."
Cited by 5 (3 self)
General Mapping of Feed-Forward Neural Networks onto an MIMD Computer
, 1995
"... This paper describes a scheme for mapping the back propagation algorithm onto an MIMD computer with 2D-torus network. We propose a new strategy that allows arbitrary assignment of processors to the multiple degrees of back propagation parallelism (training set parallelism, pipelining and node parall ..."
Cited by 2 (0 self)
A Performance Model and Code Overlay Generator for Scratchpad Enhanced Embedded Processors
"... Software managed scratchpad memories (SPMs) provide improved performance and power in embedded processors by reducing required hardware resources. Performance depends strongly on the scheme used to map code and data onto the SPM, but generating optimal mappings can be extremely difficult. Here we a ..."