Pegasus: a framework for mapping complex scientific workflows onto distributed systems
- SCIENTIFIC PROGRAMMING JOURNAL
, 2005
Cited by 145 (24 self)
This paper describes the Pegasus framework that can be used to map complex scientific workflows onto distributed resources. Pegasus enables users to represent the workflows at an abstract level without needing to worry about the particulars of the target execution systems. The paper describes general issues in mapping applications and the functionality of Pegasus. We present the results of improving application performance through workflow restructuring.
Mapping Abstract Complex Workflows onto Grid Environments
Cited by 141 (17 self)
In this paper we address the problem of automatically generating job workflows for the Grid. These workflows describe the execution of a complex application built from individual application components. In our work we have developed two workflow generators. The first (the Concrete Workflow Generator, CWG) maps an abstract workflow defined in terms of application-level components to the set of available Grid resources. The second (the Abstract and Concrete Workflow Generator, ACWG) takes a wider perspective: it not only performs the abstract-to-concrete mapping but also enables the construction of the abstract workflow based on the available components. This system operates in the application domain and chooses application components based on the application metadata attributes. We describe our current ACWG based on AI planning technologies and outline how these technologies can play a crucial role in developing complex application workflows in Grid environments. Although our work is preliminary, CWG has already been used to map high energy physics applications onto the Grid. In one particular experiment, a set of production runs lasted 7 days and resulted in the generation of 167,500 events by 678 jobs. Additionally, ACWG was used to map gravitational physics workflows with hundreds of nodes onto the available resources, resulting in 975 tasks, 1365 data transfers and 975 output files produced.
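The abstract-to-concrete mapping step described above can be illustrated with a minimal sketch. It is not the CWG or ACWG algorithm; the `AbstractTask` type, the `map_to_concrete` function, the two catalog dictionaries, and the "fewest transfers" site-selection rule are all hypothetical names and assumptions introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AbstractTask:
    """Hypothetical application-level task: no site or path information."""
    name: str       # application component to run
    inputs: tuple   # logical file names consumed
    outputs: tuple  # logical file names produced


def map_to_concrete(tasks, component_sites, file_locations):
    """Turn abstract tasks (given in dependency order) into concrete jobs.

    component_sites: component name -> list of sites where it is installed
    file_locations:  logical file name -> site currently holding a copy
    Both catalogs are assumptions of this sketch.
    """
    locations = dict(file_locations)  # copy so the caller's catalog is untouched
    jobs = []
    for task in tasks:
        sites = component_sites.get(task.name)
        if not sites:
            raise ValueError(f"no site provides component {task.name!r}")
        for f in task.inputs:
            if f not in locations:
                raise ValueError(f"input {f!r} has no known location")
        # Assumed heuristic: pick the site that already holds the most inputs.
        site = max(sites, key=lambda s: sum(locations[f] == s for f in task.inputs))
        # Stage in any input that lives elsewhere.
        for f in task.inputs:
            if locations[f] != site:
                jobs.append(("transfer", f, locations[f], site))
        jobs.append(("compute", task.name, site))
        # Outputs become available at the execution site.
        for f in task.outputs:
            locations[f] = site
    return jobs
```

For a two-step workflow in which `extract` can run where its input already resides but `analyze` is installed only at another site, the sketch emits one compute job, one data transfer, and a second compute job, which mirrors the mix of tasks and data transfers reported in the abstract.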
The Earth System Grid: Supporting the Next Generation of Climate Modeling Research
- Proceedings of the IEEE
, 2005
Cited by 30 (14 self)
Understanding the Earth’s climate system and how it might be changing is a preeminent scientific challenge. Global climate models are used to simulate past, present, and future climates, and experiments are executed continuously on an array of distributed supercomputers. The resulting data archive, spread over several sites, currently contains upwards of one hundred terabytes of simulation data and is growing rapidly. Looking towards mid-decade and beyond, we must anticipate and prepare for distributed climate research data holdings of many petabytes. The Earth System Grid (ESG) is a collaborative interdisciplinary project aimed at addressing the challenge of enabling management, discovery, access, and analysis of these critically important datasets in a distributed and heterogeneous computational environment. The problem is fundamentally a Grid problem. Building upon
Automatically composed workflows for grid environments
- IEEE INTELLIGENT SYSTEMS
, 2004
Cited by 17 (1 self)
This planning system uses heuristics to select application components and computing resources to help generate executable workflows for a grid, and provides different levels of support, depending on the information available.
Distributed Virtual Computer (DVC): Simplifying the Development of High Performance Grid Applications
- In Workshop on Grids and Advanced Networks
, 2004
Cited by 15 (2 self)
Distributed Virtual Computer (DVC) is a computing environment which simplifies the development and execution of distributed applications on computational grids. DVC provides a simple set of abstractions to simplify application management of naming, security, communication, and resources, easing use of highly dynamic and heterogeneous resource environments. These abstractions enable complex collections of grid resources to be used in a fashion similar to private user or workgroup resources. The DVC model is attractive for lambda-grids with circuit-switched optical networks, providing a structure for exploiting unique communication and security properties. Examples of DVCs include virtual clusters and virtual heterogeneous resource collections. We introduce the concept of a DVC, its system structure and mechanisms. We discuss the potential benefits of DVCs for application programmers.
Data Management Challenges of Data-Intensive Scientific Workflows
Cited by 13 (1 self)
Scientific workflows play an important role in today’s science. Many disciplines rely on workflow technologies to orchestrate the execution of thousands of computational tasks. Much research to date focuses on efficient, scalable, and robust workflow execution, especially in distributed environments. However, many challenges remain in the area of data management related to workflow creation, execution, and result management. In this paper we examine some of these issues in the context of the entire workflow lifecycle.
P2P Grid: Service Oriented Framework for Distributed Resource Management
- In Proceedings of the 2005 IEEE International Conference on Service Computing (SCC’05)
, 2005
Cited by 2 (0 self)
With the increasing number of computers on the Internet, there is a growing interest in harnessing the unused and inexpensive computational resources over the Internet. However, current approaches such as the Grid computing paradigm are not sufficient. We present our preliminary work that extends Peer-2-Peer (P2P) computing with a framework that allows Grid computing over the Internet.
Distributed Downloads of Bulk, Replicated Grid Data
- © 2004 Kluwer Academic Publishers. Printed in the Netherlands.
Data-sharing scientific communities use storage systems as distributed data stores by replicating content. In such highly replicated environments, a particular dataset can reside at multiple locations and can thus be downloaded from any one of them. Since datasets of interest are significantly large in size, improving download speeds either by server selection or by co-allocation can offer substantial benefits. In this paper, we present an architecture for co-allocating Grid data transfers across multiple connections, enabling the parallel download of datasets from multiple servers. We have developed several co-allocation strategies comprising simple brute-force, predictive and dynamic load balancing techniques as a means both to exploit rate differences among the various client–server links and to address dynamic rate fluctuations. We evaluate our approaches using the GridFTP data movement protocol in a wide-area testbed and present our results.
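To make the contrast between brute-force and dynamic co-allocation concrete, here is a small simulation sketch. It does not use GridFTP and is not the paper's implementation; the function names, the fixed block size, and the assumption of constant per-link rates are all illustrative choices made for this example.

```python
import heapq


def brute_force_split(size, n_servers):
    """Assign one equal byte range per server, ignoring link rates."""
    base, extra = divmod(size, n_servers)
    ranges, start = [], 0
    for i in range(n_servers):
        length = base + (1 if i < extra else 0)
        ranges.append((start, start + length))
        start += length
    return ranges


def dynamic_split(size, block, rates):
    """Simulate servers fetching fixed-size blocks as they become idle.

    rates: bytes per second of each client-server link (assumed constant
    here; a real transfer would observe fluctuating rates).
    Returns (bytes fetched per server, completion time in seconds).
    """
    fetched = [0] * len(rates)
    # Heap of (time at which the server becomes idle, server index).
    idle = [(0.0, i) for i in range(len(rates))]
    heapq.heapify(idle)
    offset, finish = 0, 0.0
    while offset < size:
        t, i = heapq.heappop(idle)        # next server to become idle
        n = min(block, size - offset)     # last block may be short
        offset += n
        fetched[i] += n
        done = t + n / rates[i]
        finish = max(finish, done)
        heapq.heappush(idle, (done, i))
    return fetched, finish
```

With a 100-byte dataset, 10-byte blocks, and links of 20 and 10 bytes per second, the dynamic scheme has the faster link fetch 70 bytes and finishes in 3.5 seconds, whereas an equal split would leave 50 bytes on the slower link and take 5 seconds.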
On-demand VPN Support for Grid Applications
Quality of Service delivered to Grids by packet-switched networks is one of the most important factors affecting performance and efficiency of both Grid applications and middleware. The use of Virtual Private Network services can improve the overall performance of Grids in many respects. In this paper, we show how the security, privacy and Quality of Service offered by scalable on-demand VPN services can be applied in large-scale Grid scenarios. We propose a novel network resource abstraction for resource discovery of on-demand Virtual Private Networks. It is implemented in a Grid Information Service prototype which was successfully tested both on dedicated infrastructures and production networks.
VEGA Infrastructure for Resource Discovery in Grids
- J. Comput. Sci. & Technol., Vol. 18, No. 4, pp. 413-422, July 2003
, 2003
Grids enable users to share and access large collections and various types of resources in wide areas, and how to locate resources in such dynamic, heterogeneous and autonomous distributed environments is a key and challenging issue. In this paper, a three-level decentralized and dynamic VEGA Infrastructure for Resource Discovery (VIRD) is proposed. In this architecture, every Border Grid Resource Name Server (BGRNS) or Grid Resource Name Server (GRNS) has its own local policies, governing information organization, management and searching. Changes in resource information are propagated dynamically among GRNS servers according to a link-state-like algorithm. A client can query its designated GRNS either recursively or iteratively. Optimizing techniques, such as shortcut, are adopted to make the dynamic framework more flexible and efficient. A simulator called SimVIRD is developed to verify the proposed architecture and algorithms. Experimental results indicate that this architecture could deliver good scalability and performance for grid resource discovery.

