The Globus Project: A Status Report
, 1998
Cited by 267 (18 self)
The Globus project is a multi-institutional research effort that seeks to enable the construction of computational grids providing pervasive, dependable, and consistent access to high-performance computational resources, despite geographical distribution of both resources and users. Computational grid technology is being viewed as a critical element of future high-performance computing environments that will enable entirely new classes of computation-oriented applications, much as the World Wide Web fostered the development of new classes of information-oriented applications. In this paper, we report on the status of the Globus project as of early 1998. We describe the progress that has been achieved to date in the development of the Globus toolkit, a set of core services for constructing grid tools and applications. We also discuss the Globus Ubiquitous Supercomputing Testbed (GUSTO) that we have constructed to enable large-scale evaluation of Globus technologies, and review early exp...
GASS: A Data Movement and Access Service for Wide Area Computing Systems
- PROCEEDINGS OF THE SIXTH WORKSHOP ON I/O IN PARALLEL AND DISTRIBUTED SYSTEMS
, 1999
Cited by 149 (10 self)
In wide area computing, programs frequently execute at sites that are distant from their data. Data access mechanisms are required that place limited functionality demands on an application or host system yet permit high-performance implementations. To address these requirements, we propose a data movement and access service called Global Access to Secondary Storage (GASS). This service defines a global name space via Uniform Resource Locators and allows applications to access remote files via standard I/O interfaces. High performance is achieved by incorporating default data movement strategies that are specialized for I/O patterns common in wide area applications and by providing support for programmer management of data movement. GASS forms part of the Globus toolkit, a set of services for high-performance distributed computing. GASS itself makes use of Globus services for security and communication, and other Globus components use GASS services for executable staging and real-time remote monitoring. Application experiences demonstrate that the library has practical utility.
On Implementing MPI-IO Portably and with High Performance
- In Proceedings of the 6th Workshop on I/O in Parallel and Distributed Systems
, 1999
Cited by 137 (21 self)
We discuss the issues involved in implementing MPI-IO portably on multiple machines and file systems and also achieving high performance. One way to implement MPI-IO portably is to implement it on top of the basic Unix I/O functions (open, lseek, read, write, and close), which are themselves portable. We argue that this approach has limitations in both functionality and performance. We instead advocate an implementation approach that combines a large portion of portable code and a small portion of code that is optimized separately for different machines and file systems. We have used such an approach to develop a high-performance, portable MPI-IO implementation, called ROMIO. In addition to basic I/O functionality, we consider the issues of supporting other MPI-IO features, such as 64-bit file sizes, noncontiguous accesses, collective I/O, asynchronous I/O, consistency and atomicity semantics, user-supplied hints, shared file pointers, portable data representation, and file preallocati...
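To illustrate the portable baseline this abstract describes, here is a minimal sketch of a noncontiguous (strided) read built only on the basic Unix I/O calls, using Python's `os` wrappers for open, lseek, read, write, and close. The function name `read_strided` and the file layout are assumptions made for illustration; this is not ROMIO code. Each piece costs one seek and one read, which hints at the performance limitation the authors mention.

```python
import os
import tempfile


def read_strided(fd, offset, block_size, stride, count):
    """Read `count` blocks of `block_size` bytes, `stride` bytes apart."""
    pieces = []
    for i in range(count):
        # One lseek plus one read per piece: portable, but many system calls.
        os.lseek(fd, offset + i * stride, os.SEEK_SET)
        pieces.append(os.read(fd, block_size))
    return b"".join(pieces)


def demo():
    """Write a small file and read back every other 2-byte block."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "data.bin")
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        try:
            os.write(fd, b"AAxxBBxxCCxx")
            return read_strided(fd, offset=0, block_size=2, stride=4, count=3)
        finally:
            os.close(fd)
```

Calling `demo()` gathers the three 2-byte blocks at offsets 0, 4, and 8 into one contiguous bytes object.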
DataCutter: Middleware for Filtering Very Large Scientific Datasets on Archival Storage Systems
, 2000
Cited by 79 (13 self)
In this paper we present a middleware infrastructure, called DataCutter, that enables processing of scientific datasets stored in archival storage systems across a wide-area network. DataCutter provides support for subsetting of datasets through multidimensional range queries, and application-specific aggregation on scientific datasets stored in an archival storage system. We also present experimental results from a prototype implementation.
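A multidimensional range query of the kind mentioned here can be pictured as a box-intersection test. The sketch below is illustrative only: it assumes the dataset is split into chunks that each carry a bounding box, and the names `Box`, `intersects`, and `select_chunks` are hypothetical rather than part of DataCutter.

```python
from typing import Dict, List, NamedTuple, Tuple


class Box(NamedTuple):
    """Axis-aligned box given by inclusive low and high corners."""
    low: Tuple[float, ...]
    high: Tuple[float, ...]


def intersects(a: Box, b: Box) -> bool:
    """Two boxes overlap only if their extents overlap in every dimension."""
    return all(
        a_low <= b_high and b_low <= a_high
        for a_low, a_high, b_low, b_high in zip(a.low, a.high, b.low, b.high)
    )


def select_chunks(chunks: Dict[str, Box], query: Box) -> List[str]:
    """Return the names of chunks whose bounding box meets the query box."""
    return [name for name, box in chunks.items() if intersects(box, query)]


# Hypothetical 2-D dataset split into three chunks.
CHUNKS = {
    "c0": Box((0, 0), (9, 9)),
    "c1": Box((10, 0), (19, 9)),
    "c2": Box((0, 10), (9, 19)),
}
```

For example, `select_chunks(CHUNKS, Box((5, 5), (12, 8)))` keeps only the chunks that need to be read for that query, so the rest never leave the archive.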
The Globus Striped GridFTP Framework and Server
- In SC ’05: Proceedings of the 2005 ACM/IEEE conference on Supercomputing
, 2005
Cited by 62 (12 self)
The GridFTP extensions to the File Transfer Protocol define a general-purpose mechanism for secure, reliable, high-performance data movement. We report here on the Globus striped GridFTP framework, a set of client and server libraries designed to support the construction of data-intensive tools and applications. We describe the design of both this framework and a striped GridFTP server constructed within the framework. We show that this server is faster than other FTP servers in both single-process and striped configurations, achieving, for example, speeds of 27.3 Gbit/s memory-to-memory and 17 Gbit/s disk-to-disk over a 30 Gbit/s network with a 60 millisecond round-trip time. In another experiment, we show that the server can support 1800 concurrent clients without excessive load. We argue that this combination of performance and modular structure makes the Globus GridFTP framework both a good foundation on which to build tools and applications, and a unique testbed for the study of innovative data management techniques and network protocols.
A Framework for Reliable and Efficient Data Placement in Distributed Computing Systems
- Journal of Parallel and Distributed Computing
, 2005
Cited by 17 (2 self)
Data placement is an essential part of today’s distributed applications since moving the data close to the application has many benefits. The increasing data requirements of both scientific and commercial applications, and collaborative access to these data, make it even more important. In the current approach, data placement is regarded as a side effect of computation. Our goal is to make data placement a first-class citizen in distributed computing systems, just like computational jobs. Data placement jobs will be queued, scheduled, monitored, managed, and even checkpointed. Since data placement jobs have different characteristics than computational jobs, they cannot be treated in exactly the same way. For this purpose, we propose a framework that can be considered a “data placement subsystem” for distributed computing systems, similar to the I/O subsystem in operating systems. This framework includes a specialized scheduler for data placement, a high-level planner aware of data placement jobs, a resource broker/policy enforcer, and some optimization tools. Our system can perform reliable and efficient data placement, recover from all kinds of failures without any human intervention, and dynamically adapt to the environment at execution time. Key words. Distributed computing, reliable and efficient data placement, scheduling, run-time adaptation, protocol auto-tuning, data intensive applications, I/O subsystem.
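The idea of treating transfers as queued jobs that are retried on failure can be sketched in a few lines. This is a toy illustration under stated assumptions, not the scheduler the paper describes: `run_placement_queue`, the `transfer` callback, and the fixed retry limit are all hypothetical, and failures are modelled as `OSError`.

```python
from collections import deque


def run_placement_queue(jobs, transfer, max_attempts=3):
    """Run each data placement job, re-queueing it when a transfer fails.

    `transfer` is a caller-supplied function (hypothetical here) that moves
    the data for one job and raises OSError on failure.
    """
    pending = deque((job, 1) for job in jobs)
    done, failed = [], []
    while pending:
        job, attempt = pending.popleft()
        try:
            transfer(job)
            done.append(job)
        except OSError:
            if attempt < max_attempts:
                # Put the job back at the end of the queue for another try.
                pending.append((job, attempt + 1))
            else:
                failed.append(job)
    return done, failed


def make_flaky_transfer(failures_before_success):
    """Build a fake transfer that fails a set number of times per job."""
    remaining = dict(failures_before_success)

    def transfer(job):
        if remaining.get(job, 0) > 0:
            remaining[job] -= 1
            raise OSError(f"simulated failure for {job}")

    return transfer
```

With `make_flaky_transfer({"b": 1, "c": 5})`, job `b` succeeds on its second attempt while job `c` exhausts its attempts and is reported as failed, so the caller sees the outcome of every job without manual intervention.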
RAID-x: A New Distributed Disk Array for I/O-Centric Cluster Computing
- In Proceedings of the Ninth IEEE International Symposium on High Performance Distributed Computing
, 2000
Cited by 16 (2 self)
A new RAID-x (redundant array of inexpensive disks at level x) architecture is presented for distributed I/O processing on a serverless cluster of computers. The RAID-x architecture is based on a new concept of orthogonal striping and mirroring (OSM) across all distributed disks in the cluster. The primary advantages of this OSM approach lie in: (1) a significant improvement in parallel I/O bandwidth, (2) hiding disk mirroring overhead in the background, and (3) greatly enhanced scalability and reliability in cluster computing applications. All claimed advantages are substantiated with benchmark performance results on the Trojans cluster built at USC in 1999. Throughout the paper, we discuss the issues of scalable I/O performance, enhanced system reliability, and striped checkpointing on distributed RAID-x in a serverless cluster environment.
RFS: Efficient and Flexible Remote File Access for MPI-IO
- In Proceedings of the IEEE International Conference on Cluster Computing
, 2004
Cited by 15 (9 self)
Scientific applications often need to access remote file systems. Because of slow networks and large data sizes, however, remote I/O can become an even more serious performance bottleneck than local I/O. In this work, we present RFS, a high-performance remote I/O facility for ROMIO, which is a well-known MPI-IO implementation. Our simple, portable, and flexible design eliminates the shortcomings of previous remote I/O efforts. In particular, RFS improves remote I/O performance by adopting active buffering with threads (ABT), which hides I/O cost by aggressively buffering the output data using available memory and performing background I/O using threads while computation is taking place. Our experimental results show that RFS with ABT can significantly reduce the visible cost of remote I/O, achieving up to 92% of the theoretical peak throughput. The computation slowdown caused by concurrent I/O activities was 0.2–6.2%, which is dwarfed by the overall performance improvement in application turnaround time.
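The general idea behind active buffering with threads can be sketched as a producer that copies output into an in-memory queue and a background thread that drains it. This is a minimal sketch of the technique as described in the abstract, not the RFS implementation; the class `ActiveBufferWriter`, its buffer limit, and the in-memory sink standing in for a remote file are assumptions.

```python
import io
import queue
import threading


class ActiveBufferWriter:
    """Buffer output in memory and write it from a background thread."""

    _STOP = object()  # sentinel telling the writer thread to exit

    def __init__(self, sink, max_buffers=64):
        self._sink = sink
        self._queue = queue.Queue(maxsize=max_buffers)
        self._thread = threading.Thread(target=self._drain, daemon=True)
        self._thread.start()

    def write(self, data):
        # Copy the data so the caller can reuse its buffer immediately.
        # This blocks only when all buffer slots are full.
        self._queue.put(bytes(data))

    def _drain(self):
        # Runs in the background while the caller keeps computing.
        while True:
            item = self._queue.get()
            if item is self._STOP:
                break
            self._sink.write(item)

    def close(self):
        # Flush everything queued so far, then stop the writer thread.
        self._queue.put(self._STOP)
        self._thread.join()


def demo():
    sink = io.BytesIO()  # stands in for a slow remote file
    writer = ActiveBufferWriter(sink)
    for step in range(4):
        writer.write(f"step {step}\n".encode())
        # Computation for the next step would overlap with the write here.
    writer.close()
    return sink.getvalue()
```

A single consumer thread and a FIFO queue keep the output in the order it was produced, and `close()` ensures nothing is lost before the program exits.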
Scheduling Large Parametric Modelling Experiments on a Distributed Meta-computer
- In PCW'97
, 1997
Cited by 14 (1 self)
Nimrod is a tool which makes it easy to parallelise and distribute large computational experiments based on the exploration of a range of parameterised scenarios. Using Nimrod, it is possible to specify and generate a parametric experiment, and then control the execution of the code across distributed computers. Nimrod has been applied to a range of application areas, including Bioinformatics, Operations Research, Electronic CAD, Ecological Modelling and Computer Movies. Nimrod was extremely successful at generating work, but it contained no mechanisms for scheduling the computation on the underlying resources. Consequently, users would not have any idea when an experiment might complete. We are currently building a new version of Nimrod, called Nimrod/G. Nimrod/G will integrate Nimrod job generation techniques with Globus, an international project which is building the underlying infrastructure for large meta-computing applications. Using Globus, it will be possible for Nimrod users t...
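Generating a parametric experiment amounts to taking the cross product of the parameter ranges and producing one job per combination. The following is a small sketch of that idea only; `generate_jobs`, the command template, and the parameter names are hypothetical and do not reflect Nimrod's actual plan-file syntax.

```python
import itertools


def generate_jobs(command_template, parameters):
    """Expand a command template over every combination of parameter values."""
    names = list(parameters)
    jobs = []
    for values in itertools.product(*(parameters[name] for name in names)):
        binding = dict(zip(names, values))
        jobs.append(command_template.format(**binding))
    return jobs


# Hypothetical experiment: two parameters with two values each.
JOBS = generate_jobs(
    "sim --pressure {pressure} --temp {temp}",
    {"pressure": [1, 2], "temp": [300, 350]},
)
```

The four resulting command lines are independent of one another, which is what makes this kind of experiment easy to distribute and is also why a scheduler is needed to predict when the whole set will finish.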
Managing Clusters of Geographically Distributed High-Performance Computers
, 1999
Cited by 11 (1 self)
In this paper we describe the architecture of a software system for access and management of geographically distributed HPC systems. There are three key components: 1. The Computing Center Software (CCS) is resource management software for the access and system administration of HPC systems that are operated at a single site. It gives a vendor-independent interface to parallel systems and it provides tools for specifying, configuring and scheduling system components.

