User's Guide for mpich, a Portable Implementation of MPI Version 1.2.1
, 1996
Cited by 101 (10 self)
1 Introduction
2 Linking and running programs
2.1 Scripts to Compile and Link Applications
2.1.1 Fortran 90 and the MPI module
2.2 Compiling and Linking without the Scripts
2.3 Running with mpirun
2.3.1 SMP Clusters
2.3.2 Multiple Architectures
2.4 More detailed control
3 Special features of different systems
3.1 Workstation clusters
3.1.1 Checking your machines list
3.1.2 Using the Secure Shell
3.1.3 Using the Secure Server ...
Vertical Profiling: Understanding the Behavior of Object-Oriented Applications
Cited by 47 (14 self)
Object-oriented programming languages provide a rich set of features that provide significant software engineering benefits. The increased productivity provided by these features comes at a justifiable cost in a more sophisticated runtime system whose responsibility is to implement these features efficiently. However, the virtualization introduced by this sophistication provides a significant challenge to understanding complete system performance, not found in traditionally compiled languages, such as C or C++. Thus, understanding system performance of such a system requires profiling that spans all levels of the execution stack, such as the hardware, operating system, virtual machine, and application.
Using Hardware Performance Monitors to Understand the Behavior of Java Applications
- IN PROC. OF THE THIRD USENIX VIRTUAL MACHINE RESEARCH AND TECHNOLOGY SYMP
, 2004
Cited by 33 (8 self)
Modern Java programs, such as middleware and application servers, include many complex software components. Improving the performance of these Java applications requires a better understanding of the interactions between the application, virtual machine, operating system, and architecture. Hardware performance monitors, which are available on most modern processors, provide facilities to obtain detailed performance measurements of long-running applications in real time. However, interpreting the data collected using hardware performance monitors is difficult because of the low-level nature of the data. We have
From Trace Generation to Visualization: A Performance Framework for Distributed Parallel Systems
- In Proc. of SC2000: High Performance Networking and Computing
, 2000
Cited by 20 (4 self)
In this paper we describe a trace analysis framework, from trace generation to visualization. It includes a unified tracing facility on IBM SP systems, a self-defining interval file format, an API for framework extensions, utilities for merging and statistics generation, and a visualization tool with preview and multiple time-space diagrams. The trace environment is extremely scalable, and combines MPI events with system activities in the same set of trace files, one for each SMP node. Since the amount of trace data may be very large, utilities are developed to convert and merge individual trace files into a self-defining interval trace file with multiple frame directories. The interval format allows the development of multiple time-space diagrams, such as thread-activity view, processor-activity view, etc., from the same interval file. A visualization tool, Jumpshot, is modified to visualize these views. A statistics utility is developed using the API, along with its graphics v...
Installation Guide to mpich, a Portable Implementation of MPI
, 1996
Cited by 18 (4 self)
1 Quick Start
2 Obtaining and Unpacking the Distribution
3 Documentation
4 Configuring mpich
4.1 Building a production mpich
4.2 Preparing mpich for TotalView debugging
4.3 What if there is no Fortran compiler?
4.4 Configuring with the Absoft Fortran Compiler
4.5 Fortran 90
4.6 Special issues for heterogeneous networks
4.7 Configuring with ssh
5 Compiling mpich
5.1 Getting tcl, tk, and wish
5.2 Building multiple devices or architectures
6 Running an MPI Program
7 MPE Library
7.1 Configure Options ...
Automating Vertical Profiling
, 2005
Cited by 16 (4 self)
Last year at OOPSLA we presented a methodology, vertical profiling, for understanding the performance of object-oriented programs. The key insight behind this methodology is that modern programs run on top of many layers (virtual machine, middleware, etc.) and thus we need to collect and combine information from all layers in order to understand system performance. Although our methodology was able to explain previously unexplained performance phenomena, it was extremely labor intensive. In this paper we describe and evaluate techniques for automating two significant activities of vertical profiling: trace alignment and correlation. Trace alignment aligns traces obtained from separate runs so that one can reason across the traces. We are not aware of any prior approach that effectively and automatically aligns traces. Correlation sifts through hundreds of metrics to find ones that have a bearing on a performance anomaly of interest. In prior work we found that statistical correlation was only sometimes effective. We have identified highly effective approaches for both activities. For aligning traces we explore dynamic time warping, and for correlation we explore eight correlators based on statistical correlation, distance measures, and piecewise linear segmentation. Although we explore these activities in the context of vertical profiling, both activities are widely applicable in the performance analysis area.
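The abstract above names dynamic time warping as the technique explored for trace alignment. As a rough illustration only, here is a minimal sketch of the classic textbook form of the algorithm, assuming each trace is a plain list of numeric samples and using absolute difference as the local distance. The function name `dtw_align` and the sample traces are hypothetical, and the paper's own formulation may differ.

```python
def dtw_align(a, b):
    """Align two numeric sequences with classic dynamic time warping.

    Returns (total_cost, path), where path is a list of index pairs
    (i, j) meaning a[i] is matched with b[j].
    Illustrative sketch; not the implementation from the paper.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # assumed local distance
            cost[i][j] = d + min(cost[i - 1][j],      # a[i-1] repeats a match
                                 cost[i][j - 1],      # b[j-1] repeats a match
                                 cost[i - 1][j - 1])  # both advance
    # Backtrack from the end to recover one optimal warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return cost[n][m], path


# Two hypothetical traces of the same metric; the second run
# spends one extra sample at the value 2.
run_a = [1, 2, 3, 4]
run_b = [1, 2, 2, 3, 4]
total, path = dtw_align(run_a, run_b)
```

With these inputs the extra sample in `run_b` is absorbed by matching `run_a[1]` twice, which is the kind of stretching that lets samples from separate runs be compared position by position.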
User's Guide for MPE: Extensions for MPI Programs
- Argonne National Laboratory
, 1998
Learning from the Success of MPI
, 2001
Cited by 10 (1 self)
The Message Passing Interface (MPI) has been extremely successful as a portable way to program high-performance parallel computers.
Sedna: A BPEL-based environment for visual scientific workflow modelling
- In Workflows for eScience - Scientific Workflows for Grids
, 2007
Cited by 9 (0 self)
Scientific Grid computing environments are increasingly adopting the Open Grid Services Architecture (OGSA), which is a service-oriented architecture for Grids. With the proliferation of OGSA, Grids effectively consist of a collection of Grid services, Web services with certain extensions providing additional
Performance Characteristics of a Cost-Effective Medium-Sized Beowulf Cluster Supercomputer
- Supercomputer, Research Letters in the Information and Mathematical Sciences Volume 5, June 2003, ISSN
, 2003
Cited by 8 (4 self)
This paper presents some performance results obtained from a new Beowulf cluster, the Helix, built at Massey University, Auckland, funded by the Allan Wilson Center for Evolutionary Ecology. Issues concerning network latency and the effect of the switching fabric and network topology on performance are discussed. In order to assess how the system performed using the message passing interface (MPI), two test suites (mpptest and jumpshot) were used to provide a comprehensive network performance analysis. The performance of an older fast-ethernet/single processor based cluster is compared to the new Gigabit/SMP cluster. The Linpack performance of Helix is investigated. The Linpack Rmax rating of 234.8 Gflops puts the cluster at third place in the Australia/New Zealand sublist of the Top500 supercomputers, an extremely good performance considering the commodity parts and its low cost (US$125,000).

