A Grid-Enabled MPI: Message Passing in Heterogeneous Distributed Computing Systems
, 1998
Cited by 108 (14 self)
Application development for high-performance distributed computing systems, or computational grids as they are sometimes called, requires "grid-enabled" tools that hide mundane aspects of the heterogeneous grid environment without compromising performance. As part of an investigation of these issues, we have developed MPICH-G, a grid-enabled implementation of the Message Passing Interface (MPI) that allows a user to run MPI programs across multiple computers at different sites using the same commands that would be used on a parallel computer. This library extends the Argonne MPICH implementation of MPI to use services provided by the Globus grid toolkit. In this paper, we describe the MPICH-G implementation and present preliminary performance results.
CUMULVS: Providing Fault-Tolerance, Visualization and Steering of Parallel Applications
- International Journal of High Performance Computing Applications
, 1996
Cited by 103 (5 self)
The use of visualization and computational steering can often assist scientists in analyzing large-scale scientific applications. Fault-tolerance to failures is of great importance when running on a distributed system. However, the details of implementing these features are complex and tedious, leaving many scientists with inadequate development tools. CUMULVS is a library that enables programmers to easily incorporate interactive visualization and computational steering into existing parallel programs. The library is divided into two pieces: one for the application program and one for the, possibly commercial, visualization and steering front-end. Together these two libraries encompass all the connection and data protocols needed to dynamically attach multiple independent viewer front-ends to a running parallel application. Viewer programs can also steer one or more user-defined parameters to "close the loop" for computational experiments and analyses. CUMULVS allows the pr...
User's Guide for mpich, a Portable Implementation of MPI Version 1.2.1
, 1996
Cited by 101 (10 self)
Contents:
1 Introduction
2 Linking and running programs
2.1 Scripts to Compile and Link Applications
2.1.1 Fortran 90 and the MPI module
2.2 Compiling and Linking without the Scripts
2.3 Running with mpirun
2.3.1 SMP Clusters
2.3.2 Multiple Architectures
2.4 More detailed control
3 Special features of different systems
3.1 Workstation clusters
3.1.1 Checking your machines list
3.1.2 Using the Secure Shell
3.1.3 Using the Secure Server
...
MPICH-V: Toward a Scalable Fault Tolerant MPI for Volatile Nodes
- In Supercomputing
, 2002
Cited by 93 (10 self)
Global Computing platforms, large scale clusters and future TeraGRID systems gather thousands of nodes for computing parallel scientific applications. At this scale, node failures or disconnections are frequent events. This Volatility reduces the MTBF of the whole system in the range of hours or minutes.
Wide-coverage efficient statistical parsing with CCG and log-linear models
- COMPUTATIONAL LINGUISTICS
, 2007
Cited by 87 (20 self)
This paper describes a number of log-linear parsing models for an automatically extracted lexicalized grammar. The models are "full" parsing models in the sense that probabilities are defined for complete parses, rather than for independent events derived by decomposing the parse tree. Discriminative training is used to estimate the models, which requires incorrect parses for each sentence in the training data as well as the correct parse. The lexicalized grammar formalism used is Combinatory Categorial Grammar (CCG), and the grammar is automatically extracted from CCGbank, a CCG version of the Penn Treebank. The combination of discriminative training and an automatically extracted grammar leads to a significant memory requirement (over 20 GB), which is satisfied using a parallel implementation of the BFGS optimisation algorithm running on a Beowulf cluster. Dynamic programming over a packed chart, in combination with the parallel implementation, allows us to solve one of the largest-scale estimation problems in the statistical parsing literature in under three hours. A key component of the parsing system, for both training and testing, is a Maximum Entropy supertagger which assigns CCG lexical categories to words in a sentence. The supertagger makes the discriminative training feasible, and also leads to a highly efficient parser. Surprisingly,
An annotation language for optimizing software libraries
- In Second Conference on Domain Specific Languages
, 1999
Cited by 82 (15 self)
Rights to individual papers remain with the author or the author's employer. Permission is granted for noncommercial reproduction of the work for educational or research purposes. This copyright notice must be included in the reproduced paper. USENIX acknowledges all trademarks herein.
Distributed Computing in a Heterogeneous Computing Environment
- Recent Advances in Parallel Virtual Machine and Message Passing Interface, Lecture Notes in Computer Science
, 1998
Cited by 80 (8 self)
Distributed computing is a means to overcome the limitations of single computing systems. In this paper we describe how clusters of heterogeneous supercomputers can be used to run a single application or a set of applications. We concentrate on the communication problem in such a configuration and present a software library called PACX-MPI that was developed to allow a single system image from the point of view of an MPI programmer. We describe the concepts that have been implemented for heterogeneous clusters of this type and give a description of real applications using this library.
Managing Multiple Communication Methods in High-Performance Networked Computing Systems
- Journal of Parallel and Distributed Computing
, 1997
Cited by 79 (13 self)
Modern networked computing environments and applications often require, or can benefit from, the use of multiple communication substrates, transport mechanisms, and protocols, chosen according to where communication is directed, what is communicated, or when communication is performed. We propose techniques that allow multiple communication methods to be supported transparently in a single application, with either automatic or user-specified selection criteria guiding the methods used for each communication. We explain how communication link and remote service request mechanisms facilitate the specification and implementation of multimethod communication. These mechanisms have been implemented in the Nexus multithreaded runtime system, and we use this system to illustrate solutions to various problems that arise when implementing multimethod communication. We also illustrate the application of our techniques by describing a multimethod, multithreaded implementation of the Message Pas...
The LAM/MPI checkpoint/restart framework: System-initiated checkpointing
- in Proceedings, LACSI Symposium, Santa Fe
, 2003
Cited by 67 (7 self)
As high-performance clusters continue to grow in size and popularity, issues of fault tolerance and reliability are becoming limiting factors on application scalability. To address these issues, we present the design and implementation of a system for providing coordinated checkpointing and rollback recovery for MPI-based parallel applications. Our approach integrates the Berkeley Lab BLCR kernel-level process checkpoint system with the LAM implementation of MPI through a defined checkpoint/restart interface. Checkpointing is transparent to the application, allowing the system to be used for cluster maintenance and scheduling reasons as well as for fault tolerance. Experimental results show negligible communication performance impact due to the incorporation of the checkpoint support capabilities into LAM/MPI.
Exploiting Hierarchy in Parallel Computer Networks to Optimize Collective Operation Performance
, 2000
Cited by 67 (10 self)
The efficient implementation of collective communication operations has received much attention. Initial efforts modeled network communication and produced "optimal" trees based on those models. However, the models used by these initial efforts assumed equal point-to-point latencies between any two processes. This assumption is violated in heterogeneous systems such as clusters of SMPs and wide-area "computational grids", and as a result, collective operations that utilize the trees generated by these models perform suboptimally. In response, more recent work has focused on creating topology-aware trees for collective operations that minimize communication across slower channels (e.g., a wide-area network). While these efforts have significant communication benefits, they all limit their view of the network to only two layers. We present a strategy based upon a multilayer view of the network. By creating multilevel topology trees we take advantage of communication cost differences at every lev...
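The idea of a multilevel topology tree in the abstract above can be illustrated with a small sketch. This is not the authors' algorithm: it is a minimal Python illustration that assumes each process rank is labelled with a tuple of group identifiers from the slowest layer (for example a site) to the fastest (for example a machine). The function name `multilevel_broadcast_tree` and the `locations` mapping are hypothetical, and the flat fan-out inside each group is a simplification.

```python
from collections import defaultdict


def multilevel_broadcast_tree(root, locations):
    """Build broadcast edges that cross each network layer as rarely as possible.

    locations maps rank -> tuple of group ids, outermost (slowest) layer first.
    Returns a list of (sender, receiver, level) edges, where level is the index
    of the outermost layer at which sender and receiver differ; a level equal
    to the tuple length means both ranks share the innermost group.
    """
    depth = len(locations[root])
    edges = []

    def spread(leader, members, level):
        if level == depth:
            # Same innermost group: the leader sends to every other member.
            for rank in members:
                if rank != leader:
                    edges.append((leader, rank, level))
            return
        # Partition the members by their group id at this layer.
        groups = defaultdict(list)
        for rank in members:
            groups[locations[rank][level]].append(rank)
        for _, ranks in sorted(groups.items()):
            if leader in ranks:
                sub_leader = leader
            else:
                # One message per remote group crosses this (slow) layer.
                sub_leader = min(ranks)
                edges.append((leader, sub_leader, level))
            spread(sub_leader, ranks, level + 1)

    spread(root, sorted(locations), 0)
    return edges


if __name__ == "__main__":
    # Hypothetical layout: two sites, two machines per site.
    example = {
        0: ("siteA", "m1"), 1: ("siteA", "m1"), 2: ("siteA", "m2"),
        3: ("siteB", "m3"), 4: ("siteB", "m3"), 5: ("siteB", "m4"),
    }
    for sender, receiver, level in multilevel_broadcast_tree(0, example):
        print(f"{sender} -> {receiver} (layer {level})")
```

In this layout only one message crosses the site boundary, whereas a tree that ignores topology could send several. A practical implementation would likely replace the flat fan-out inside each group with a binomial or similar tree.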

