Results 1-10 of 21,013
Improving graph coloring on distributed-memory parallel computers
in 18th International Conference on High Performance Computing (HiPC), 2011
"... Graph coloring is a combinatorial optimization problem that classically appears in distributed computing to identify the sets of tasks that can be safely performed in parallel. Although many efficient sequential algorithms are known for this NP-Complete problem, distributed variants are ch ..."
Cited by 3 (2 self)
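The abstract above frames coloring as identifying sets of tasks that can safely run in parallel: each color class is an independent set of the conflict graph. A minimal sequential sketch of first-fit greedy coloring (my own illustration, not the paper's distributed algorithm):

```python
def greedy_coloring(adj):
    """First-fit greedy coloring: give each vertex the smallest
    color not already used by a colored neighbor."""
    color = {}
    for v in adj:  # the iteration order determines the coloring
        used = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color

# Tasks that conflict (share an edge) receive different colors;
# all tasks of one color can then execute in parallel.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
print(greedy_coloring(adj))  # {0: 0, 1: 1, 2: 2, 3: 0}
```

Greedy first-fit uses at most one more color than the maximum degree; the distributed variants the paper studies must reach a consistent coloring without this global sequential order.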
Multiclass Classification of Distributed Memory Parallel Computations
"... High Performance Computing (HPC) is a field concerned with solving large-scale problems in science and engineering. However, the computational infrastructure of HPC systems can also be misused, as demonstrated by the recent commoditization of cloud computing resources on the black market. As a first step towards addressing this, we introduce a machine learning approach for classifying distributed parallel computations based on communication patterns between compute nodes. We first provide relevant background on message passing and computational equivalence classes called dwarfs and describe our ..."
Efficient Algorithms for Data Distribution on Distributed Memory Parallel Computers
IEEE Transactions on Parallel and Distributed Systems, 1997
"... Data distribution has been one of the most important research topics in parallelizing compilers for distributed memory parallel computers. A good data distribution scheme should consider both the computation load balance and the communication overhead. In this paper, we show that data redistribution ..."
Cited by 13 (1 self)
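The distribution schemes such compilers weigh trade load balance against communication. As a small illustration (my own sketch, not from the paper), the owner of array index i under the two classic mappings of an n-element array onto p processors:

```python
def block_owner(i, n, p):
    """Processor owning index i under a block distribution:
    contiguous chunks of ceil(n/p) elements per processor."""
    chunk = -(-n // p)  # ceiling division
    return i // chunk

def cyclic_owner(i, p):
    """Processor owning index i under a cyclic distribution:
    indices dealt round-robin across processors."""
    return i % p

n, p = 10, 3
print([block_owner(i, n, p) for i in range(n)])  # contiguous blocks
print([cyclic_owner(i, p) for i in range(n)])    # round-robin
```

Block distribution keeps neighboring indices local (good for stencil-like communication); cyclic distribution balances load when work per index varies systematically, which is why redistribution between the two can pay off.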
Automatic Data and Computation Decomposition on Distributed Memory Parallel Computers
ACM Trans. Programming Languages and Systems, 2002
"... On shared memory parallel computers (SMPCs) it is natural to focus on decomposing the computation (mainly by distributing the iterations of the nested Do-loops). In contrast, on distributed memory parallel computers (DMPCs) the decomposition of computation and the distribution of data must both be h ..."
Cited by 13 (0 self)
A9639989 Aerodynamic Shape Optimization of Supersonic Aircraft Configurations via an Adjoint Formulation on Distributed Memory Parallel Computers
"... shape optimization of supersonic aircraft configurations via an adjoint formulation on distributed memory parallel computers ..."
An Efficient Sparse Matrix-Vector Multiplication on Distributed Memory Parallel Computers
, 2006
"... The matrix-vector product is one of the most important computational components of Krylov methods. This kernel is an irregular problem, which has led to the development of several compressed storage formats. We design a data structure for a distributed matrix to compute the matrix-vector product efficiently on distributed memory parallel computers using MPI. We conduct numerical experiments on several different sparse matrices and show the parallel performance of our sparse matrix-vector product routines. Key words: Sparse matrices, matrix-vector product, sparse storage formats, distributed computing ..."
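The compressed storage formats mentioned in this abstract include CSR (Compressed Sparse Row). A minimal single-node sketch of a CSR matrix-vector product, with the MPI distribution layer omitted (the paper's own distributed data structure is not reproduced here):

```python
def csr_matvec(indptr, indices, data, x):
    """Compute y = A @ x for A stored in CSR form: the nonzeros of
    row i are data[indptr[i]:indptr[i+1]], with column numbers in
    the matching slice of indices."""
    n = len(indptr) - 1
    y = [0.0] * n
    for i in range(n):
        for k in range(indptr[i], indptr[i + 1]):
            y[i] += data[k] * x[indices[k]]
    return y

# A = [[2, 0, 1],
#      [0, 3, 0],
#      [4, 0, 5]]
indptr = [0, 2, 3, 5]
indices = [0, 2, 1, 0, 2]
data = [2.0, 1.0, 3.0, 4.0, 5.0]
print(csr_matvec(indptr, indices, data, [1.0, 1.0, 1.0]))  # [3.0, 3.0, 9.0]
```

In a distributed setting each rank typically owns a block of rows plus the remote entries of x its columns touch, which is exactly the irregular communication pattern the abstract alludes to.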
Scalable Parallel Matrix Multiplication on Distributed Memory Parallel Computers
 Journal of Parallel and Distributed Computing
"... Consider any known sequential algorithm for matrix multiplication over an arbitrary ring with time complexity O(N^α), where 2 < α ≤ 3. We show that such an algorithm can be parallelized on a distributed memory parallel computer (DMPC) in O(log N) time by using N^α/log N processors. Such a parallel comp ..."
Cited by 8 (0 self)
The spectral decomposition of nonsymmetric matrices on distributed memory parallel computers
SIAM J. Sci. Comput., 1997
"... The implementation and performance of a class of divide-and-conquer algorithms for computing the spectral decomposition of nonsymmetric matrices on distributed memory parallel computers are studied in this paper. After presenting a general framework, we focus on a spectral divide-and-conqu ..."
Cited by 34 (10 self)
Solving Linear and Quadratic Matrix Equations on Distributed Memory Parallel Computers
"... We discuss the parallel implementation of numerical solution methods for linear and quadratic matrix equations occurring frequently in control theory. In particular, we consider equations related to analysis and synthesis of continuous-time, linear time-invariant control systems. These are the Sylves ... efficient and usually outperform methods based on the QR or QZ algorithms even in sequential computing environments. We discuss the implementation of these methods on distributed memory parallel computers employing MPI and ScaLAPACK. ..."
Cited by 4 (1 self)
Efficient All-to-All Broadcast Schemes in Distributed-Memory Parallel Computers
"... Distributed-memory parallel computers are parallel computers in which each processor has its own private memory. In such a system, processors communicate by message passing over an interconnection network. One important method for sharing distributed data is broadcasting; all-to-all broadcasting is ..."
Cited by 1 (0 self)
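A common way to realize all-to-all broadcast on a message-passing machine is a ring schedule: in each of p-1 rounds, every processor forwards the block it received in the previous round to its successor. A single-process simulation of that schedule (an illustrative sketch, not necessarily the scheme this paper proposes):

```python
def ring_allgather(blocks):
    """Simulate all-to-all broadcast on a ring of p processors.
    In each of p-1 rounds, processor r sends the block it most
    recently received to processor (r + 1) % p."""
    p = len(blocks)
    known = [{r: blocks[r]} for r in range(p)]  # each rank starts with its own block
    sending = list(range(p))                    # block id each rank sends next round
    for _ in range(p - 1):
        # rank r receives from its predecessor (r - 1) % p
        incoming = [(r, sending[(r - 1) % p]) for r in range(p)]
        for r, b in incoming:
            known[r][b] = blocks[b]
        sending = [b for _, b in incoming]      # forward what was just received
    return known

result = ring_allgather(["a", "b", "c", "d"])
# after p-1 rounds every processor holds all four blocks
print(all(len(k) == 4 for k in result))
```

The ring schedule needs p-1 communication steps with one block sent per link per step; schemes like those surveyed in such papers trade step count against message size on richer interconnection networks.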