Results 1–10 of 14
Distributed pC++: Basic Ideas for an Object Parallel Language
 Scientific Programming
Abstract

Cited by 102 (2 self)
pC++ is an object-parallel extension to the C++ programming language. This paper describes the current language definition and illustrates the programming style. Examples of parallel linear algebra operations are presented, and a fast Poisson solver is described in complete detail.
A Unified Vector/Scalar Floating-Point Architecture
, 1989
Abstract

Cited by 32 (9 self)
research relevant to the design and application of high performance scientific computers. We test our ideas by designing, building, and using real systems. The systems we build are research prototypes; they are not intended to become products. There is a second research laboratory located in Palo Alto, the Systems Research Center (SRC). Other Digital research groups are located in Paris (PRL) and in Cambridge,
The Unsymmetric Lanczos Algorithms And Their Relations To Padé Approximation, Continued Fractions, And The QD Algorithm
, 1990
Abstract

Cited by 23 (4 self)
First, several algorithms based on the unsymmetric Lanczos process are surveyed: the biorthogonalization (BO) algorithm for constructing a tridiagonal matrix T similar to a given matrix A (whose extreme spectrum is typically sought); the "BOBC algorithm", which directly generates the LU factors of T; and the Biores (Lanczos/Orthores), Biomin (Lanczos/Orthomin, or biconjugate gradient (BCG)), and Biodir (Lanczos/Orthodir) algorithms for solving a nonsymmetric system of linear equations. The possibilities of breakdown in these algorithms are discussed and brought into relation. Then the connections to formal orthogonal polynomials, Padé approximation, continued fractions, and the qd algorithm are reviewed. They allow us to deepen our understanding of breakdowns. Next, three types of (bi)conjugate gradient squared (CGS) algorithms are presented: Biores², Biomin² (standard CGS), and Biodir². Finally, fast Hankel solvers related to the Lanczos process are described.
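The biorthogonalization (BO) step named in this abstract can be sketched as a two-sided Lanczos recurrence. The NumPy fragment below is our own illustrative sketch, not code from the paper: the function name `lanczos_bo` and all implementation details are assumptions, and it deliberately omits the look-ahead machinery needed to survive the breakdowns the survey discusses.

```python
import numpy as np

def lanczos_bo(A, v, w, m):
    """Two-sided (BO) Lanczos sketch: builds biorthogonal bases V, W
    and a tridiagonal T with W^T A V = T. No look-ahead: a serious
    breakdown (w_hat^T v_hat = 0) simply raises."""
    n = A.shape[0]
    V = np.zeros((n, m)); W = np.zeros((n, m)); T = np.zeros((m, m))
    V[:, 0] = v / np.linalg.norm(v)
    W[:, 0] = w / (w @ V[:, 0])          # enforce w_1^T v_1 = 1
    beta_prev = delta_prev = 0.0
    for j in range(m):
        alpha = W[:, j] @ (A @ V[:, j])  # diagonal entry T[j, j]
        T[j, j] = alpha
        v_hat = A @ V[:, j] - alpha * V[:, j]
        w_hat = A.T @ W[:, j] - alpha * W[:, j]
        if j > 0:                        # three-term recurrence
            v_hat -= beta_prev * V[:, j - 1]
            w_hat -= delta_prev * W[:, j - 1]
        if j == m - 1:
            break
        s = w_hat @ v_hat
        if abs(s) < 1e-14:
            raise ZeroDivisionError("serious breakdown: look-ahead needed")
        delta = np.sqrt(abs(s)); beta = s / delta
        V[:, j + 1] = v_hat / delta
        W[:, j + 1] = w_hat / beta
        T[j + 1, j] = delta; T[j, j + 1] = beta
        beta_prev, delta_prev = beta, delta
    return V, W, T
```

Running m = n steps on a well-conditioned nonsymmetric A (and barring breakdown) yields a T whose eigenvalues coincide with those of A, which is how the extreme spectrum mentioned in the abstract is approximated for m < n.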
Overview of recent supercomputers
, 1997
Abstract

Cited by 21 (2 self)
In this report we give an overview of parallel and vector computers which are currently available or will become available within a short time frame from vendors; no attempt is made to list all machines that are still in the research phase. The machines are described according to their architectural class. Shared and distributed memory SIMD and MIMD machines are discerned. The information about each machine is kept as compact as possible. Moreover, no attempt is made to quote prices as these are often even more elusive than the performance of a system.
High Performance Distributed Computing
 Syracuse University
, 1995
Abstract

Cited by 4 (4 self)
High Performance Distributed Computing (HPDC) is driven by the rapid advance of two related technologies: those underlying computing and communications, respectively. These technology pushes are linked to application pulls, which vary from the use of a cluster of some 20 workstations simulating fluid flow around an aircraft, to the complex linkage of several hundred million advanced PCs around the globe to deliver and receive multimedia information. The review of base technologies and exemplar applications is followed by a brief discussion of software models for HPDC, which are illustrated by two extremes: PVM and the conjectured future World Wide Web based WebWork concept. The narrative is supplemented by a glossary describing the diverse concepts used in HPDC.
MultiEDA: A Programming Environment for Parallel Computations
Abstract

Cited by 3 (2 self)
This report presents the software implementation of the Extended Dataflow Actor model, EDA, using the Parallel Virtual Machine, PVM, system [2]. A formal description of the EDA model can be found in [1]. The goal of our research has been to develop MultiEDA, a programming environment for testing and evaluating the different aspects of the EDA model using PVM. Several applications were tested in this environment using a cluster of workstations. The remainder of this report is organized as follows. Section 2 briefly introduces the EDA model. In Section 3, an overview of the most important features of the PVM system is given. Section 4 describes the MultiEDA (mEDA) environment, and the next section presents some applications which were developed using mEDA. Our conclusions are given in Section 6.
Basic Issues and Current Status of Parallel Computing
, 1995
Abstract

Cited by 2 (1 self)
The best enterprises have both a compelling need pulling them forward and an innovative technological solution pushing them on. In high-performance computing, we have the need for increased computational power in many applications, and the inevitable long-term solution is massive parallelism. In the short term, the relation between pull and push may seem unclear, as novel algorithms and software are needed to support parallel computing. However, eventually parallelism will be present in all computers, including those in your children's video game, your personal computer or workstation, and the central supercomputer. The technological driving force is VLSI, or very large scale integration, the same technology that has created the personal computer and workstation market over the last decade. In 1980, the Intel 8086 used 50,000 transistors, while in 1992 the latest Digital Alpha RISC chip contains 1.7 × 10⁶ transistors, a factor of 30 increase. In 1995, the 167 MHz UltraSPARC contained 5.2 × 10⁶ transistors, divided roughly 2:1 between CPU and cache.
Array Processing Machines
 Budach (Ed.), Fundamentals of Computation Theory 1985, Cottbus, GDR, Springer-Verlag, LNCS 199
, 1984
Abstract

Cited by 2 (0 self)
We present a new model of parallel computation called the "array processing machine" or APM (for short). The APM was designed to closely model the architecture of existing vector and array processors, and to provide a suitable unifying framework for the complexity theory of parallel combinatorial and numerical algorithms. After an introduction to the model and its basic programming techniques, we show that the APM can efficiently simulate a variety of extant models of parallel computation and vector processing. In particular, it is shown that APMs satisfy Goldschlager's "parallel computation thesis".
Massively parallel algorithms for real-time wavefront control of a dense adaptive optics system
 J. Opt. Soc. Am. A (submitted)
Abstract

Cited by 1 (1 self)
In this paper, massively parallel algorithms and architectures for real-time wavefront control of a dense adaptive optics system (SELENE) are presented. We have already shown that the computation of a near-optimal control algorithm for SELENE can be reduced to the solution of a discrete Poisson equation on a regular domain. Although this represents an optimal computation, due to the large size of the system and the high sampling-rate requirement, the implementation of this control algorithm poses a computationally challenging problem, since it demands a sustained computational throughput on the order of 10 GFlops. We develop a novel algorithm, designated the Fast Invariant Imbedding algorithm, which offers a massive degree of parallelism with simple communication and synchronization requirements. Due to these features, our algorithm is significantly more efficient than other fast Poisson solvers for implementation on massively parallel architectures. We also discuss two massively parallel, algorithmically specialized architectures for low-cost and optimal implementation of the Fast Invariant Imbedding algorithm.
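For context, the kind of fast Poisson solver this abstract compares against can be sketched for the standard 5-point Laplacian with zero Dirichlet boundary on a square grid using the discrete sine transform. This is a generic textbook-style illustration under our own assumptions (function name `fast_poisson`, SciPy's DST routines), not the paper's Fast Invariant Imbedding algorithm.

```python
import numpy as np
from scipy.fft import dstn, idstn

def fast_poisson(f, h):
    """Solve -Laplacian(u) = f (5-point stencil, zero Dirichlet
    boundary) on an n-by-n interior grid with spacing h, in
    O(n^2 log n) time via the 2-D discrete sine transform (DST-I)."""
    n = f.shape[0]
    k = np.arange(1, n + 1)
    # 1-D eigenvalues of the discrete Dirichlet Laplacian
    lam = (2.0 - 2.0 * np.cos(np.pi * k / (n + 1))) / h**2
    F = dstn(f, type=1)                    # expand RHS in sine eigenvectors
    U = F / (lam[:, None] + lam[None, :])  # divide by 2-D eigenvalues
    return idstn(U, type=1)                # transform back to grid values
```

The sine modes are exact eigenvectors of the discrete operator, so the solve is a transform, a pointwise division, and an inverse transform, which is what makes such solvers attractive on parallel hardware.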