B.A.Abderazek: Performance Enhancement for Matrix Multiplication on a SMP PC Cluster, IPSJ SIG Technical Report, 2005
- Department of Computer Science & Engineering at Visvesvaraya National Institute of Technology, Nagpur (India).
Our study proposes a Reducing-size Task Assignation technique (RTA), a novel approach to solving the grain-size problem for the hybrid MPI-OpenMP thread-to-thread (hybrid TC) programming model when performing distributed matrix multiplication on SMP PC clusters. With RTA, hybrid TC achieves acceptable computation performance while retaining its dynamic task scheduling capability, and it yields a 22% performance improvement over the pure MPI model on a 16-node cluster of dual-processor Xeon SMPs. Moreover, we provide formulas to predict hybrid TC performance in different circumstances.
Parallel Homologous Search with Hirschberg Algorithm: A Hybrid MPI-Pthreads Solution
In this paper, we apply two different parallel programming models, the message passing model using the Message Passing Interface (MPI) and the multithreaded model using Pthreads, to protein sequence homologous search. The search uses the Hirschberg algorithm for pairwise sequence alignment. The performance of the hybrid MPI-Pthreads implementation is compared with that of an implementation using the pure message passing model (MPI). The evaluation results show a 50% decrease in computing time when the parallel homologous search is implemented using MPI-Pthreads rather than MPI alone.
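For readers unfamiliar with the algorithm named in this abstract, the following is a minimal serial sketch of Hirschberg's linear-space pairwise alignment. It is not the paper's implementation: the scoring constants (`MATCH`, `MISMATCH`, `GAP`), the function names, and the example sequences are illustrative assumptions, and the MPI-Pthreads distribution of the search is not shown.

```python
# Illustrative scoring scheme (an assumption, not taken from the paper).
MATCH, MISMATCH, GAP = 1, -1, -1


def nw_score_last_row(a, b):
    """Return the last row of the Needleman-Wunsch score matrix for a vs b.

    Only two rows are kept at a time, so memory is O(len(b)).
    """
    prev = [j * GAP for j in range(len(b) + 1)]
    for ch in a:
        curr = [prev[0] + GAP]
        for j, bj in enumerate(b, start=1):
            diag = prev[j - 1] + (MATCH if ch == bj else MISMATCH)
            curr.append(max(diag, prev[j] + GAP, curr[j - 1] + GAP))
        prev = curr
    return prev


def hirschberg(a, b):
    """Return one optimal global alignment of a and b as two gapped strings."""
    if not a:
        return "-" * len(b), b
    if not b:
        return a, "-" * len(a)
    if len(a) == 1:
        # Base case: pair the single character with a matching position in b
        # if one exists, otherwise with position 0. This is optimal for the
        # scoring above, where a mismatch costs no more than two gaps.
        j = max(b.find(a), 0)
        return "-" * j + a + "-" * (len(b) - j - 1), b

    mid = len(a) // 2
    # Forward scores for the top half, backward scores for the bottom half.
    left = nw_score_last_row(a[:mid], b)
    right = nw_score_last_row(a[mid:][::-1], b[::-1])
    # Split b where the combined score of the two halves is highest.
    k = max(range(len(b) + 1), key=lambda j: left[j] + right[len(b) - j])

    top_a, top_b = hirschberg(a[:mid], b[:k])
    bot_a, bot_b = hirschberg(a[mid:], b[k:])
    return top_a + bot_a, top_b + bot_b


def alignment_score(x, y):
    """Score two equal-length gapped strings under the same scheme."""
    total = 0
    for p, q in zip(x, y):
        if p == "-" or q == "-":
            total += GAP
        else:
            total += MATCH if p == q else MISMATCH
    return total


if __name__ == "__main__":
    row_a, row_b = hirschberg("AGTACGCA", "TATGC")
    print(row_a)
    print(row_b)
    print(alignment_score(row_a, row_b))
```

In a homologous search, a query sequence is aligned against many database sequences, and each of those pairwise alignments is independent of the others. That independence is what makes the workload a natural candidate for distribution across processes and threads.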
Optimization for Hybrid MPI-OpenMP Programs on a Cluster of SMP PCs
This paper applies a hybrid MPI-OpenMP programming model with a thread-to-thread communication method on a cluster of dual Intel Xeon processor SMPs connected by a Gigabit Ethernet network. The experiments include the well-known HPL and CG benchmarks. We also describe optimization techniques to get a high cache hit ratio with the given architecture. As a result, the hybrid model outperforms the pure MPI model by about 27% for CG and 12% for HPL. In addition, with relatively small programming effort, we have succeeded in reducing the cache miss ratio and thus raised performance for the CG benchmark by as much as 4.5 times in some cases.