Implementing a Low Cost, Low Latency Parallel Platform
- Parallel Computing, 1997
Cited by 13 (9 self)
The cost of high-performance parallel platforms prevents parallel processing techniques from spreading in present applications. Networks of Workstations (NOW) exploiting off-the-shelf communication hardware, high-end PCs and standard communication software provide much cheaper but poorly performing parallel platforms. In our NOW prototype called GAMMA (Genoa Active Message MAchine) every node is a PC running a Linux operating system kernel enhanced with efficient communication mechanisms based on the Active Message paradigm. Active Messages supply virtualization of the network interface close enough to the raw hardware to guarantee good performance. The preliminary performance measures obtained by GAMMA show how competitive such a cheap NOW is.
1 Introduction
Historically Local Area Network (LAN) device drivers in the Operating System (OS) kernel of a workstation have never been optimized like other devices whose performance is critical for user applications (such as disk drivers, memo...
TEG: A high-performance, scalable, multi-network point-to-point communications methodology
- In Proceedings, 11th European PVM/MPI Users’ Group Meeting, 2004
Cited by 13 (3 self)
TEG is a new component-based methodology for point-to-point messaging. Developed as part of the Open MPI project, TEG provides a configurable fault-tolerant capability for high-performance messaging that utilizes multi-network interfaces where available. Initial performance comparisons with other MPI implementations show comparable ping-pong latencies, but with bandwidths up to 30% higher.
Automatic Binding of Native Scientific Libraries to Java
- In Scientific Computing in Object-Oriented Parallel Environments (New, 1997
Cited by 11 (0 self)
We have created a tool for automatically binding existing native C libraries to Java. With the aid of the Java-to-C Interface generating tool (JCI) the abundance of existing C and Fortran-77 scientific libraries can more easily be made available to Java programmers. We have applied JCI to bind MPI, PBLAS, ScaLAPACK and other libraries to Java. The approach of automatic binding ensures both portability across different platforms and full compatibility with the library specifications. In order to evaluate the performance of Java code which accesses native libraries, we have run Java versions of parallel benchmarks from the ParkBench suite. The results obtained on a distributed-memory IBM SP2 machine demonstrate the viability of our approach.
1 Introduction
As a programming language, Java has the basic qualities needed for writing high-performance applications. With the maturing of compilation technology, such applications written in Java will doubtlessly appear. Since Java is a fa...
Scalable fault tolerant protocol for parallel runtime environments
- In Euro PVM/MPI, 2006
Cited by 11 (8 self)
The number of processors embedded on high performance computing platforms is growing daily to satisfy users' desire for solving larger and more complex problems. Parallel runtime environments have to support and adapt to the underlying libraries and hardware which require a high degree of scalability in dynamic environments. This paper presents the design of a scalable and fault tolerant protocol for supporting parallel runtime environment communications. The protocol is designed to support transmission of messages across multiple nodes within a self-healing topology to protect against recursive node and process failures. A formal protocol verification has validated the protocol for both the normal and failure cases. We have implemented multiple routing algorithms for the protocol and concluded that the variant rule-based routing algorithm yields the best overall results for damaged and incomplete topologies.
Heterogeneous MPI Application Interoperation and Process Management under PVMPI
- 1997
Cited by 10 (0 self)
Presently, different vendors' MPI implementations cannot interoperate directly with each other. As a result, performance of distributed computing across different vendors' machines requires use of a single MPI implementation, such as MPICH. This solution may be sub-optimal since it cannot utilize the vendors' own optimized MPI implementations. PVMPI, a software package currently under development at the University of Tennessee, provides the needed interoperability between different vendors' optimized MPI implementations. As the name suggests, PVMPI is a powerful combination of the proven and widely ported Parallel Virtual Machine (PVM) system and MPI. PVMPI is transparent to MPI applications, thus allowing intercommunication via all the MPI point-to-point calls. Additionally, PVMPI allows flexible control over MPI applications by providing access to all the process control and resource control functions available in the PVM virtual machine.
A Toolkit for Parallel Image Processing
- Proceedings of the SPIE Conference on Parallel and Distributed Methods for Image Processing, 1998
Cited by 9 (0 self)
In this paper, we present the design and implementation of a parallel image processing software library (the Parallel Image Processing Toolkit). The Toolkit not only supplies a rich set of image processing routines, it is designed principally as an extensible framework containing generalized parallel computational kernels to support image processing. Users can easily add their own image processing routines without knowledge or explicit use of the underlying data distribution mechanisms or parallel computing model. Shared memory and multi-level memory hierarchies are exploited to achieve high performance on each node, thereby minimizing overall parallel execution time. Multiple load balancing schemes have been implemented within the parallel framework that transparently distribute the computational load evenly on a distributed memory computing environment. Inside the Toolkit, a message-passing model of parallelism is designed around the Message Passing Interface (MPI) standard. Experime...
On the Parallelization of UCT
Cited by 9 (0 self)
We present three parallel algorithms for UCT. For 9×9 Go, they all improve the results of the programs that use them against GNU GO 3.6. The simplest one, the single-run algorithm, uses very few communications and shows improvements comparable to the more complex ones. Further improvements may be possible by sharing more information in the multiple-runs algorithm.
Open MPI: A high-performance, heterogeneous MPI
- In Proceedings of the Fifth International Workshop on Algorithms, Models and Tools for Parallel Computing on Heterogeneous Networks, 2006
Cited by 9 (1 self)
The growth in the number of generally available, distributed, heterogeneous computing systems places increasing importance on the development of user-friendly tools that enable application developers to efficiently use these resources. Open MPI provides support for several aspects of heterogeneity within a single, open-source MPI implementation. Through careful abstractions, heterogeneous support maintains efficient use of uniform computational platforms. We describe Open MPI’s architecture for heterogeneous network and processor support. A key design feature of this implementation is the transparency to the application developer while maintaining very high levels of performance. This is demonstrated with the results of several numerical experiments.
Blocking vs. non-blocking coordinated checkpointing for large-scale fault tolerant MPI
- In ACM/IEEE SuperComputing (SC, 2006
Cited by 8 (0 self)
A long-term trend in high-performance computing is the increasing number of nodes in parallel computing platforms, which entails a higher failure probability. Fault tolerant programming environments should be used to guarantee the safe execution of critical applications. Research in fault tolerant MPI has led to the development of several fault tolerant MPI environments. Different approaches are being proposed using a variety of fault tolerant message passing protocols based on coordinated checkpointing or message logging. The most popular approach is coordinated checkpointing. In the literature, two different concepts of coordinated checkpointing have been proposed: blocking and non-blocking. However, they have never been compared quantitatively, and their respective scalability remains unknown. The contribution of this paper is to provide the first comparison between these two approaches and a study of their scalability. We have implemented the two approaches within the MPICH environments and evaluated their performance using the NAS parallel benchmarks.

