Results 1 - 10 of 45
Managing Large-Scale Workflow Execution from Resource Provisioning to Provenance Tracking: The CyberShake Example
- In Proceedings of the Second IEEE International Conference on e-Science and Grid Computing, 2006
Cited by 22 (11 self)
This paper discusses the process of building an environment where large-scale, complex, scientific analysis can be scheduled onto a heterogeneous collection of computational and storage resources. The example application is the Southern California Earthquake Center (SCEC) CyberShake project, an analysis designed to compute probabilistic seismic hazard curves for sites in the Los Angeles area. We explain which software tools were used to build the system and describe their functionality and interactions. We show the results of running the CyberShake analysis, which included over 250,000 jobs, using resources available through SCEC and the TeraGrid.
The Design, Performance, and Use of DiPerF: An automated DIstributed PERformance testing Framework
- Journal of Grid Computing, Special Issue on Global and Peer-to-Peer Computing, 2006
Cited by 9 (5 self)
We present DiPerF, a DIstributed PERformance testing Framework, aimed at simplifying and automating performance evaluation of networked services. DiPerF coordinates a pool of machines that test a target service, collects and aggregates performance metrics, and generates performance statistics. The aggregate data collected provide information on service throughput, service response time, service 'fairness' when serving multiple clients concurrently, and the impact of network connectivity on service performance. We have tested DiPerF in various environments (PlanetLab, Grid3, and the University of Chicago CS Cluster) and with a large number of services. In this paper we provide data demonstrating that DiPerF is accurate (the aggregate client view matches the tested service view within a few percent) and scalable (DiPerF handles more than 10,000 clients and 100,000 transactions per second). Moreover, extensive use has demonstrated that the ability to automate extraction of service performance characteristics makes DiPerF a valuable tool. The main contribution of this paper is the DiPerF framework, a tool that allows automated large-scale testing of grid services, web services, network services, and distributed services in both LAN and WAN environments.
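The kind of aggregation this abstract describes can be illustrated with a minimal sketch. It is not DiPerF's code: the function name `aggregate`, the sample layout `(client_id, start_s, end_s)`, and the use of Jain's fairness index over per-client completed requests are all assumptions made for illustration, since the abstract does not say how fairness is computed.

```python
from collections import Counter


def aggregate(samples):
    """Aggregate per-request samples collected from many test clients.

    samples: iterable of (client_id, start_s, end_s) tuples (hypothetical layout).
    Returns throughput (requests/s), mean response time (s), and a fairness
    score. Fairness here is Jain's index over per-client completed requests,
    an assumed stand-in for whatever definition the paper uses.
    """
    samples = list(samples)
    if not samples:
        raise ValueError("no samples to aggregate")

    first_start = min(start for _, start, _ in samples)
    last_end = max(end for _, _, end in samples)
    span = last_end - first_start
    if span <= 0:
        raise ValueError("samples must cover a positive time span")

    # Number of completed requests observed for each client.
    counts = list(Counter(client for client, _, _ in samples).values())
    # Jain's index: 1.0 when every client is served equally, 1/n at worst.
    fairness = sum(counts) ** 2 / (len(counts) * sum(c * c for c in counts))

    return {
        "throughput": len(samples) / span,
        "mean_response_s": sum(end - start for _, start, end in samples) / len(samples),
        "fairness": fairness,
    }
```

For example, two clients that each complete two requests over a four-second window yield a throughput of one request per second and a fairness of 1.0; if one client completed three requests and the other only one, the fairness score would drop to 0.8.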
Flexible, Wide-Area Storage for Distributed Systems with WheelFS
Cited by 9 (3 self)
WheelFS is a wide-area distributed storage system intended to help multi-site applications share data and gain fault tolerance. WheelFS takes the form of a distributed file system with a familiar POSIX interface. Its design allows applications to adjust the tradeoff between prompt visibility of updates from other sites and the ability for sites to operate independently despite failures and long delays. WheelFS allows these adjustments via semantic cues, which provide application control over consistency, failure handling, and file and replica placement. WheelFS is implemented as a user-level file system and is deployed on PlanetLab and Emulab. Three applications (a distributed Web cache, an email service, and large file distribution) demonstrate that WheelFS's file system interface simplifies construction of distributed applications by allowing reuse of existing software. These applications would perform poorly with the strict semantics implied by a traditional file system interface, but by providing cues to WheelFS they are able to achieve good performance. Measurements show that applications built on WheelFS deliver comparable performance to services such as CoralCDN and BitTorrent that use specialized wide-area storage systems.
Coupling Prefix Caching and Collective Downloads for Remote Dataset Access
- In Proceedings of the 16th ACM International Conference on Supercomputing, 2006
Cited by 8 (7 self)
Scientific datasets are typically archived at mass storage systems or data centers close to supercomputers/instruments. End users of these datasets, however, usually perform parts of their workflows at their local computers. In such cases, client-side caching can offer significant gains by reducing the cost of wide-area data movement. Scientific data caches, however, traditionally cache entire datasets, which may not be necessary. In this paper, we propose a novel combination of prefix caching and collective download. Prefix caching allows the bootstrapping of dataset downloads by caching only a prefix of the dataset, while collective download facilitates efficient parallel patching of the missing suffix from an external data source. To estimate the optimal prefix size, we further present an analytical model that considers both the initial download overhead and the downloading speed. We implemented our proposed approach in the FreeLoader distributed cache prototype. Experimental results (using multiple scientific data repositories and data transfer tools, as well as a real-world scientific dataset access trace) demonstrate that prefix caching and collective download can be implemented efficiently, that our model can select an appropriate prefix size, and that the cache hit rate can be improved significantly without hurting the local access rate of cached datasets.
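The abstract does not give the analytical model, so the following is only a simplified sketch of how a prefix size could be estimated from the two quantities it names, the initial download overhead and the download speed. It assumes a client that reads the dataset sequentially at a constant rate while the suffix arrives from the remote source at a constant rate after a fixed startup delay; the function and parameter names are hypothetical and this is not the paper's formula.

```python
def min_prefix_bytes(dataset_bytes, client_rate, remote_rate, startup_s):
    """Smallest cached prefix that lets a sequential reader never stall.

    Assumed model (not taken from the paper):
      - the client consumes data at client_rate bytes/s from time 0;
      - the remote suffix starts arriving after startup_s seconds and then
        flows at remote_rate bytes/s.
    The prefix must cover the largest gap between bytes consumed and bytes
    delivered remotely, which occurs either when the remote stream starts
    or when the read finishes.
    """
    if dataset_bytes <= 0 or client_rate <= 0 or remote_rate <= 0 or startup_s < 0:
        raise ValueError("sizes and rates must be positive, startup non-negative")

    read_time = dataset_bytes / client_rate
    # Bytes consumed before any remote data arrives.
    need_at_startup = client_rate * min(startup_s, read_time)
    # Bytes still missing at the end if the remote source is the slower side.
    need_at_end = dataset_bytes - remote_rate * max(0.0, read_time - startup_s)
    return min(float(dataset_bytes), max(0.0, need_at_startup, need_at_end))
```

Under these assumptions, when the remote source is faster than the reader the prefix only has to hide the startup overhead; when it is slower, the prefix also has to absorb the rate difference over the whole read.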
GridTorrent: Optimizing data transfers in the Grid with collaborative sharing
Cited by 6 (6 self)
As Grid systems expand and become more popular, there is a growing need for efficient, scalable, and robust data transfer mechanisms that can deal effectively with large file transfers and flash crowd situations. In this paper, we address the problem of data transfer optimization by presenting GridTorrent, a modified BitTorrent protocol tightly coupled with modern Grid middleware components. GridTorrent can be used to transfer files directly from established GridFTP servers or from other GridTorrent peers that are simultaneously requesting the same information. The peer-to-peer approach enables the aggregate data transfer throughput to escalate, even when numerous requests rely on a single data source, and achieves better utilization of the available Grid resources. Experimental results obtained with a prototype implementation suggest that there are significant advantages to using GridTorrent to optimize data transfers. Moreover, GridTorrent is completely backwards-compatible with already deployed Grids.
Don’t Give Up on Distributed File Systems
- 2007
Cited by 5 (3 self)
Wide-area distributed applications often reinvent the wheel for their storage needs, each incorporating its own special-purpose storage manager to cope with distribution, intermittent failures, limited bandwidth, and high latencies. This paper argues that a distributed file system could provide a reusable solution to these problems by coupling a standard interface with a design suited to wide-area distribution. For concreteness, this paper presents such a file system, called WheelFS, which allows applications to control consistency through the use of semantic cues, and minimizes communication costs by adhering to the slogan "read globally, write locally". WheelFS could simplify distributed experiments, CDNs, and Grid applications.
Data Placement for Scientific Applications in Distributed Environments
Cited by 5 (0 self)
Scientific applications often perform complex computational analyses that consume and produce large data sets. We are concerned with data placement policies that distribute data in ways that are advantageous for application execution, for example, by placing data sets so that they may be staged into or out of computations efficiently or by replicating them for improved performance and reliability. In particular, we propose to study the relationship between data placement services and workflow management systems. In this paper, we explore the interactions between two services used in large-scale science today. We evaluate the benefits of prestaging data using the Data Replication Service versus using the native data stage-in mechanisms of the Pegasus workflow management system. We use the astronomy application, Montage, for our experiments and modify it to study the effect of input data size on the benefits of data prestaging. As the size of input data sets increases, prestaging using a data placement service can significantly improve the performance of the overall analysis.
Building a Generic SOAP Framework over Binary XML
- In The 15th IEEE International Symposium on High Performance Distributed Computing (HPDC-15), 2006
Cited by 4 (2 self)
The prevailing binding of SOAP to HTTP specifies that SOAP messages be encoded as an XML 1.0 document, which is then sent between client and server. XML processing, however, can be slow and memory intensive, especially for scientific data, and consequently SOAP has been regarded as an inappropriate protocol for scientific data. Efficiency considerations thus lead to the prevailing practice of separating data from the SOAP control channel. Instead, the data is stored in specialized binary formats and transmitted either via attachments or indirectly via a file sharing mechanism, such as GridFTP or HTTP. This separation invariably complicates development because of the multiple libraries and type systems to be handled; furthermore, it suffers from performance issues, especially when handling small binary data. As an alternative solution, binary XML provides a highly efficient encoding scheme for binary data in XML and SOAP messages, and with it we can gain high performance and unify the development environment without unduly impacting the web service protocol stack. In this paper we present our implementation of a generic SOAP engine that supports both textual XML and binary XML as the encoding scheme of the message. We also present our binary XML data model and encoding scheme. Our experiments show that for scientific applications binary XML together with the generic SOAP implementation not only eases development but also provides better performance and is more widely applicable than the commonly used separated schemes.
Toward Seamless Grid Data Access: Design and Implementation of GridFTP on .NET
- Proceedings of the 2005 Grid Workshop (Associated with Supercomputing 2005), Nov 2005
Cited by 4 (1 self)
To date, only Linux-/UNIX-based hosts have been participants in the Grid vision for seamless data access, because the necessary Grid data access protocols have not been implemented on Windows. As part of our larger effort at the University of Virginia to make the Windows platform a first-class participant in all aspects of Grids, this paper describes our experiences and lessons learned while implementing GridFTP on the Microsoft .NET Framework. Our implementation not only supports major extensions of GridFTP v1, it also uniquely implements some features of GridFTP v2 and introduces a new transfer mode specifically designed for the transfer of large collections of small files. Our measured performance is comparable to GT4 GridFTP on both single-stream and parallel-stream transfers and more efficient than GT4 GridFTP on directory tree transfers. We also identify issues specific to the .NET Framework/Windows platform with regard to security, and we identify limitations of the current GridFTP protocol. To our knowledge, the work described in this paper is the first comprehensive and evaluated implementation of GridFTP on .NET.
Using Overlays For Efficient Data Transfer Over Shared Wide-Area Networks
Cited by 4 (0 self)
Data-intensive applications frequently transfer large amounts of data over wide-area networks. The performance achieved in such settings can often be improved by routing data via intermediate nodes chosen to increase aggregate bandwidth. We explore the benefits of overlay network approaches by designing and implementing a service-oriented architecture that incorporates two key optimizations (multi-hop path splitting and multi-pathing) within the GridFTP file transfer protocol. We develop a file transfer scheduling algorithm that incorporates the two optimizations in conjunction with the use of available file replicas. The algorithm makes use of information from past GridFTP transfers to estimate network bandwidths and resource availability. The effectiveness of these optimizations is evaluated using several application file transfer patterns: one-to-all broadcast, all-to-one gather, and data redistribution, on a wide-area testbed. The experimental results show that our architecture and algorithm achieve significant performance improvement.
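The idea of routing through an intermediate node when that raises the achievable bandwidth can be sketched as follows. This is an illustrative toy, not the scheduling algorithm from the paper: the bandwidth table `bw`, the function names, the restriction to a single relay, and the assumption that parallel paths share no bottleneck are all hypothetical simplifications.

```python
def bottleneck(path, bw):
    """Estimated bandwidth of a path: the slowest hop along it."""
    return min(bw[(a, b)] for a, b in zip(path, path[1:]))


def best_single_relay_path(src, dst, relays, bw):
    """Pick the direct path or the best one-relay path by estimated bandwidth.

    bw maps (from_node, to_node) to an estimated bandwidth, e.g. derived
    from past transfers (hypothetical input format).
    """
    candidates = [[src, dst]] + [[src, relay, dst] for relay in relays]
    # Keep only paths for which every hop has a bandwidth estimate.
    usable = [p for p in candidates
              if all((a, b) in bw for a, b in zip(p, p[1:]))]
    if not usable:
        raise ValueError("no path with known bandwidth estimates")
    return max(usable, key=lambda p: bottleneck(p, bw))


def multipath_estimate(paths, bw):
    """Aggregate bandwidth when a file is split across several paths.

    Assumes the paths do not compete for a shared link or endpoint limit,
    which real deployments would need to check.
    """
    return sum(bottleneck(p, bw) for p in paths)
```

With estimates of 10 for A to B, 40 for A to C, and 30 for C to B, the relay path through C is preferred over the direct path, and splitting the transfer across both paths gives an optimistic combined estimate of 40.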

