Compiling Fortran 90D/HPF for distributed memory MIMD computers
- Journal of Parallel and Distributed Computing
, 1994
Cited by 41 (3 self)
This paper describes the design of the Fortran 90D/HPF compiler, a source-to-source parallel compiler for distributed memory systems being developed at Syracuse University. Fortran 90D/HPF is a data parallel language with special directives to specify data alignment and distributions. A systematic methodology to process distribution directives of Fortran 90D/HPF is presented. Furthermore, techniques for data and computation partitioning, communication detection and generation, and the run-time support for the compiler are discussed. Finally, initial performance results for the compiler are presented. We believe that the methodology to process data distribution, computation partitioning, communication system design and the overall compiler design can be used by the implementors of compilers for HPF.
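As a rough illustration of what processing a distribution directive involves, the sketch below maps the global indices of a one-dimensional array to owning processors under an HPF-style BLOCK distribution (block size ceil(n/p)). It is not taken from the paper: the function names are hypothetical, indices are 0-based for convenience, and only the simplest one-dimensional case is shown.

```python
# Hypothetical helpers illustrating an HPF-style BLOCK distribution of a
# 1-D array of n elements over p processors. Not code from the compiler.

def block_size(n, p):
    """Elements per processor: ceil(n / p)."""
    return -(-n // p)

def owner(i, n, p):
    """Rank of the processor that owns global index i (0-based)."""
    return i // block_size(n, p)

def local_index(i, n, p):
    """Position of global index i inside its owner's local block."""
    return i % block_size(n, p)

def local_range(rank, n, p):
    """Half-open range of global indices stored on processor `rank`."""
    b = block_size(n, p)
    lo = min(rank * b, n)
    hi = min(lo + b, n)
    return range(lo, hi)
```

Under an owner-computes rule, a processor would iterate only over `local_range(rank, n, p)`, and any reference to an index whose `owner` differs from `rank` would need communication.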
Empirical Analysis of Overheads in Cluster Environments
- CONCURRENCY: PRACTICE AND EXPERIENCE
, 1995
Cited by 23 (5 self)
In concurrent computing environments that are based on heterogeneous processing elements interconnected by general-purpose networks, several classes of overheads contribute to lowered performance. To gain deeper insight into the exact nature of these overheads, and to develop strategies to alleviate them, we have conducted empirical studies of selected applications representing different classes of concurrent programs. These analyses have identified load imbalance, the parallelism model adopted, communication delay and throughput, and system factors as the primary factors affecting performance in cluster environments. Based on the degree to which these factors affect specific classes of applications, we propose a combination of model selection criteria, partitioning strategies, and software system heuristics to reduce overheads and enhance performance in network-based environments. We demonstrate that agenda parallelism and load balancing strategies contribu...
EcliPSe: A System for High Performance Concurrent Simulation
, 1991
Cited by 20 (10 self)
This paper describes our approach from the system point of view. The programming interface is described in detail in the next section, following which the design and salient implementation aspects are discussed. Representative results from a few simulation systems are then reported, and the concluding section discusses some of the critical issues in such an approach, the implications for applications other than stochastic simulation, and ongoing and future work.
Architecture Independent Massive Parallelization of Divide-and-Conquer Algorithms
- Mathematics of Program Construction, Lecture Notes in Computer Science 947
, 1995
Cited by 9 (1 self)
We present a strategy to develop, in a functional setting, correct, efficient and portable Divide-and-Conquer (DC) programs for massively parallel architectures. Starting from an operational DC program, mapping sequences to sequences, we apply a set of semantics-preserving transformation rules, which transform the parallel control structure of DC into a sequential control flow, thereby making the implicit data parallelism in a DC scheme explicit. In the next phase of our strategy, the parallel architecture is fully expressed, where 'architecture dependent' higher-order functions are introduced. Then -- due to the rising communication complexities on particular architectures -- topology-dependent communication patterns are optimized in order to reduce the overall communication costs. The advantages of this approach are manifold and are demonstrated with a set of non-trivial examples. 1 Introduction It is well-known that the main problems in exploiting the power of modern parallel sys...
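The idea of turning the recursive control structure of a DC program into a sequential loop over data-parallel steps can be sketched as follows. This is only an illustration in Python, not the paper's functional notation or its transformation rules: `dc_reduce` and `flat_reduce` are hypothetical names, and the input length is assumed to be a power of two.

```python
def dc_reduce(op, xs):
    """Recursive divide-and-conquer: split in halves, solve, combine.
    Assumes len(xs) is a power of two."""
    if len(xs) == 1:
        return xs[0]
    mid = len(xs) // 2
    return op(dc_reduce(op, xs[:mid]), dc_reduce(op, xs[mid:]))

def flat_reduce(op, xs):
    """The same combine tree evaluated bottom-up by a sequential loop.
    Each iteration is one elementwise step over adjacent pairs, which
    is the part that could run in parallel across all pairs."""
    level = list(xs)
    while len(level) > 1:
        level = [op(level[i], level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]
```

For power-of-two inputs both functions apply `op` along the same balanced tree, so they agree even when `op` is not associative; the loop form simply exposes each tree level as a single bulk operation.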
Software and hardware requirements for some applications of parallel computing to industrial problems
, 1995
Cited by 8 (5 self)
We discuss the hardware and software requirements that appear relevant for a set of industrial applications of parallel computing. These are divided into 33 separate categories, and come from a recent survey of industry in New York State. The software discussion includes data parallel languages, message passing, databases, and high-level integration systems. The analysis is based on a general classification of problem architectures originally developed for academic applications of parallel computing. Suitable hardware architectures are suggested for each application. The general discussion is crystallized with three case studies: computational chemistry, computational fluid dynamics, including manufacturing, and Monte Carlo methods.
An Application Perspective on High-Performance Computing and Communications
, 1996
Cited by 6 (5 self)
We review possible and probable industrial applications of HPCC, focusing on the software and hardware issues. Thirty-three separate categories are illustrated by detailed descriptions of five areas: computational chemistry; Monte Carlo methods from physics to economics; manufacturing and computational fluid dynamics; command and control or crisis management; and multimedia services to client computers and set-top boxes. The hardware varies from tightly-coupled parallel supercomputers to heterogeneous distributed systems. The software models range from HPF and data parallelism to distributed information systems and object/dataflow parallelism on the Web. We find that in each case, it is reasonably clear that "HPCC works in principle," and postulate that this knowledge can be used in a new generation of software infrastructure based on the WebWindows approach discussed in an accompanying paper.
Massive parallelization of divide-and-conquer algorithms over powerlists. Science of Computer Programming, 26:59--78
- In 4th Principles and Practice of Parallel Programming
, 1996
Cited by 4 (0 self)
It contains all proofs of the introduced transformation rules as well as programming examples on a SIMD computer.
DCL: Protocols and Primitives for Distributed and Concurrent Computing in Networked Environments
- International Conference on Computing and Information
, 1993
Cited by 3 (2 self)
We present design and implementation strategies for providing general purpose distributed computing primitives on computer networks. This suite of primitives is intended to be a framework within which distributed and concurrent applications may be built in networked environments, in the absence of a distributed operating system. The proposed constructs are derived from typical application requirements, and include group communications, synchronization and recovery, and integrated distributed primitives such as mutual exclusion and consensus. We define an extensible suite of general purpose distributed computing primitives, discuss algorithms for their implementation, and present performance results and experiences. 1. Introduction Distributed applications that execute on networks of computer systems are rapidly increasing in number and variety. Applications that are inherently distributed are evolving, and traditional applications are changing to exploit the many benefits of distribut...
FAILURE-RESILIENT COMPUTATIONS IN THE EcliPSe SYSTEM
- in Proceedings of the International Conference on Parallel Processing
, 1994
Cited by 3 (2 self)
Local or wide-area connected workstation cluster-based computation systems are inherently failure-prone, particularly for long-running computations. In this work we introduce a variety of features for failure resilience in the EcliPSe system for replicative applications. Key characteristics of fault-tolerant EcliPSe are ease of use, low state-saving costs, system scalability and good performance. 1 INTRODUCTION Cluster computing, a low-cost alternative to supercomputers, involves the use of workstation clusters to solve compute-intensive problems with solutions that are amenable to distribution [15]. In recent years, this mode of computation has grown to envelop an increasing number of applications, mainly for scientific problems. At the present time, heterogeneous workstation clusters are not ideal replacements for supercomputers, mainly because of their low interconnection bandwidth and reliability. Today's relatively low speed networks and communication protocols, not really design...
Strategies For The Modelling And Simulation Of Asynchronous Computer Architectures
, 1995
Cited by 2 (1 self)
Preface; Acknowledgements; 1 Introduction; 1.1 Background; 1.2 Motivation and Objectives; 1.3 Structure of the Thesis; 1.3.1 Related Publications; 2 The Quest for High Performance; 2.1 Introduction; 2.2 Bit and Instruction Level Parallelism; 2.3 Reduced Instruction Set Computers; 2.4 The Limits of Sequential Computation; 2.5 Parallel Computer Architectures; 2.5.1 SIMD; 2.5.2 MIMD; 2.5.2.1 Shared Memory MIMD Architectures; 2.5.2.2 Distributed M...

