The Grid Economy
- PROCEEDINGS OF THE IEEE, GRID COMPUTING (SECTION 5, CHAPTER 3)
- Cited by 77 (13 self)
This chapter identifies challenges in managing resources in a Grid computing environment and proposes computational economy as a metaphor for effective management of resources and application scheduling. It identifies distributed resource management challenges and requirements of economy-based Grid systems, and discusses various representative economy-based systems, both historical and emerging, for cooperative and competitive trading of resources such as CPU cycles, storage, and network bandwidth. It presents an extensible, service-oriented Grid architecture driven by Grid economy and an approach for its realization by leveraging various existing Grid technologies. It also presents commodity and auction models for resource allocation. The use of the commodity economy model for resource management and application scheduling in both computational and data grids is also presented.
OptorSim - A Grid Simulator for Studying Dynamic Data Replication Strategies
- International Journal of High Performance Computing Applications
, 2003
- Cited by 53 (4 self)
Computational Grids process large, computationally intensive problems on small data sets. In contrast, Data Grids process large computational problems that in turn require evaluating, mining and producing large amounts of data. Replication, creating geographically disparate identical copies of data, is regarded as one of the major optimisation techniques for reducing data access costs. In this paper, several replication algorithms are discussed. These algorithms were studied using the Grid simulator OptorSim. OptorSim provides a modular framework within which optimisation strategies can be studied under different Grid configurations. The goal is to explore the stability and transient behaviour of selected optimisation techniques. We detail the design and implementation of OptorSim and analyse various replication algorithms based on different Grid workloads. 1 Introduction Within the Grid community much work has been done on providing the basic infrastructure for a typical Grid environment. Globus [3], Condor [1] and recently the EU DataGrid [2] have contributed substantially to core Grid
Evaluating Scheduling and Replica Optimisation Strategies in OptorSim
- In 4th International Workshop on Grid Computing (Grid2003
, 2003
- Cited by 27 (4 self)
Grid computing is fast emerging as the solution to the problems posed by the massive computational and data handling requirements of many current international scientific projects. Simulation of the Grid environment is important to evaluate the impact of potential data handling strategies before being deployed on the Grid. In this paper, we look at the effects of various job scheduling and data replication strategies and compare them in a variety of Grid scenarios, evaluating several performance metrics. We use the Grid simulator OptorSim, and base our simulations on a world-wide Grid testbed for data intensive high energy physics experiments. Our results show that the choice of scheduling and data replication strategies can have a large effect on both job throughput and the overall consumption of Grid resources.
A Taxonomy of Data Grids for Distributed Data Sharing, Management and Processing
- ACM Comput. Surv
, 2006
- Cited by 27 (7 self)
Data Grids have been adopted as the platform for scientific communities that need to share, access, transport, process and manage large data collections distributed worldwide. They combine high-end computing technologies with high-performance networking and wide-area storage management techniques. In this paper, we discuss the key concepts behind Data Grids and compare them with other data sharing and distribution paradigms such as content delivery networks, peer-to-peer networks and distributed databases.
Increasing distributed storage survivability with a stackable raid-like file system
- In Proceedings of the 2005 IEEE/ACM Workshop on Cluster Security, in conjunction with the Fifth IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGrid 2005
, 2005
- Cited by 7 (3 self)
We have designed a stackable file system called Redundant Array of Independent Filesystems (RAIF). It combines the data survivability properties and performance benefits of traditional RAIDs with the unprecedented flexibility of composition, improved security, and ease of development of stackable file systems. RAIF can be mounted on top of any combination of other file systems including network, distributed, disk-based, and memory-based file systems. Existing encryption, compression, antivirus, and consistency checking stackable file systems can be mounted above and below RAIF, to efficiently cope with slow or insecure branches. Individual files can be distributed across branches, replicated, stored with parity, or stored with erasure correction coding to recover from failures on multiple branches. Per-file incremental recovery, storage type migration, and load-balancing are especially well suited for grid storage. In this paper we describe the current RAIF design, provide preliminary performance results and discuss current status and future directions.
An Economy-based Algorithm for Scheduling Data-Intensive Applications on Global Grids
, 2004
- Cited by 4 (3 self)
Data Grids have become the de facto platform for the next generation of eScience experiments that will be carried out through large collaborations spread around the world. As the number of entities within a data grid increases, scheduling of applications in order to make the most efficient use of the available resources such as computational, storage and network facilities becomes a challenge. Previous work has suggested a computational economy metaphor for resource management within compute and data grids. However, the issue of scheduling jobs that require distributed data within an economy-based data grid has not been studied in detail so far.
Co-scheduling of computation and data on computer clusters
- In Proceedings of the 17th International Conference on Scientific and Statistical Database Management (SSDBM
, 2005
- Cited by 3 (0 self)
Scientific investigations have to deal with rapidly growing amounts of data from simulations and experiments. During data analysis, scientists typically want to extract subsets of the data and perform computations on them. In order to speed up the analysis, computations are performed on distributed systems such as computer clusters or Grid systems. A well-known difficult problem is to build systems that execute the computations and data movement in a coordinated fashion. In this paper, we describe an architecture for executing co-scheduled tasks of computation and data movement on a computer cluster that takes advantage of two technologies currently being used in distributed Grid systems. The first is Condor, which manages the scheduling and execution of distributed computation, and the second is Storage Resource Managers (SRMs), which manage the space usage and content of storage systems. This is achieved by including the information about the availability of files on the nodes, provided by SRMs, in the advertised information that Condor uses for the purpose of matchmaking. The system is capable of dynamic load balancing by replicating popular files on idle nodes. To confirm the feasibility of our approach, a prototype system was built on a computer cluster. Several experiments based on real work logs were performed. We observed that without replication compute nodes are underutilized and job wait times in the scheduler’s queue are longer. This architecture can be used in wide-area Grid systems since the basic components are already used for the Grid. ∗ Visiting LBNL from the Computer Sciences Department, University of Wisconsin
Design patterns for self-organizing multiagent systems
- IN: PROCEEDINGS OF EEDAS
, 2007
- Cited by 3 (0 self)
Natural systems are currently being regarded as rich sources of inspiration for engineering artificial systems, particularly when adopting the multiagent system (MAS) paradigm. To promote a systematic reuse of mechanisms featured in self-organizing systems, we analyse a selected list of design patterns for recurrent problems in the literature. Starting from our reference MAS metamodel, we provide a complete characterization of each pattern according to a reference scheme: in particular we describe the problem, the solution with respect to our metamodel, the natural systems which have inspired the pattern and known applications. Furthermore, to contextualize the patterns within an engineering workflow, we briefly describe our methodological approach for designing self-organizing MAS.
Economy-Based Data Replication Broker Policies in Data Grids
, 2005
- Cited by 2 (1 self)
Data is being produced at a tremendous velocity and volume from scientific experiments in the fields of high energy physics, molecular docking, computer micro-tomography and many others. E-Science, and in particular the LHC (Large Hadron Collider), which will host experiments upon its completion in 2007, is expected to generate petabytes of data per year. There is thus an urgent requirement to obtain solutions to manage, distribute and access large sets of raw and processed data efficiently and effectively across the globe. A variant of the grid architecture for a distributed data management system, named the data grid, has been proposed to address these infrastructure issues of e-Science. Data replication is one of the key components in the proposed data grid architecture as it enhances data access and reliability. One of the key decision-making tools in a data replication scheme is the replication scheduler; the scheduler determines when and where to perform a replication over a network in order to fulfill its goals. The other key component is the resource broker, which determines how and when to acquire grid services and resources for higher level components. The following work introduces a novel approach to data resource broker policies for a replication scheduler under the tiered MONARC data grid structure, using a market economy resource management approach. The new approach extends the commodity market model of the market economy to take into account other factors such as server reliability, transfer speeds, link reliability and service costs. Furthermore a detailed and
Data-Driven Batch Scheduling
, 2005
- Cited by 2 (0 self)
In this paper, we develop data-driven strategies for batch computing schedulers. Current CPU-centric batch schedulers ignore the data needs within workloads and execute them by linking them transparently and directly to their needed data. When scheduled on remote computational resources, this elegant solution of direct data access can incur an order of magnitude performance penalty for data-intensive workloads. Adding data-awareness to batch schedulers allows a careful coordination of data and CPU allocation thereby reducing the cost of remote execution. We offer here new techniques by which batch schedulers can become data-driven. Such systems can use our analytical predictive models to select one of the four data-driven scheduling policies that we have created. Through simulation, we demonstrate the accuracy of our predictive models and show how they can reduce time to completion for some workloads by as much as 80%.

