Results 1 - 10 of 21
Flexible and efficient workflow deployment of data-intensive applications on grids with MOTEUR
- IN "INTERNATIONAL JOURNAL OF HIGH PERFORMANCE COMPUTING APPLICATIONS", TO APPEAR IN THE SPECIAL ISSUE ON WORKFLOW SYSTEMS IN GRID ENVIRONMENTS
, 2007
"... Workflows offer a powerful way to describe and deploy applications on grid infrastructures. Many workflow management systems have been proposed but there is still a lack of a system that would allow both a simple description of the dataflow of the application and an efficient execution on a grid pla ..."
Abstract
-
Cited by 83 (43 self)
- Add to MetaCart
(Show Context)
Workflows offer a powerful way to describe and deploy applications on grid infrastructures. Many workflow management systems have been proposed but there is still a lack of a system that would allow both a simple description of the dataflow of the application and an efficient execution on a grid platform. In this paper, we study the requirements of such a system, underlining the need for well-defined data composition strategies on the one hand and for a fully parallel execution on the other hand. As combining those features is not straightforward, we then propose algorithms to do so and we describe the design and implementation of MOTEUR, a workflow engine that fulfills those requirements. Performance results and overhead quantification are shown to evaluate MOTEUR with respect to existing comparable workflow systems on a production grid.
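The data composition strategies mentioned above are, in workflow engines of this family, typically a one-to-one pairing of input sets and an all-to-all combination, with each composed tuple becoming an independently schedulable service invocation. The sketch below illustrates the two compositions in Python; it is a minimal illustration of the concept, not MOTEUR's actual API, and the input names are made up.

```python
from itertools import product

def dot_compose(inputs_a, inputs_b):
    """One-to-one ("dot") composition: pair items of the two input sets by index."""
    return list(zip(inputs_a, inputs_b))

def cross_compose(inputs_a, inputs_b):
    """All-to-all ("cross") composition: pair every item of A with every item of B."""
    return list(product(inputs_a, inputs_b))

# Hypothetical input sets feeding two ports of a workflow service
images = ["img1.nii", "img2.nii"]
params = ["rigid", "affine"]

print(dot_compose(images, params))    # [('img1.nii', 'rigid'), ('img2.nii', 'affine')]
print(cross_compose(images, params))  # 4 tuples, i.e. 4 independent invocations
```

Because every composed tuple is data-independent of the others, all resulting invocations can be submitted to the grid at once, which is the fully parallel execution the abstract refers to.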
GridBot: Execution of Bags of Tasks in Multiple Grids
"... We present a holistic approach for efficient execution of bags-of-tasks (BOTs) on multiple grids, clusters, and volunteer computing grids virtualized as a single computing platform. The challenge is twofold: to assemble this compound environment and to employ it for execution of a mixture of through ..."
Abstract
-
Cited by 25 (2 self)
- Add to MetaCart
(Show Context)
We present a holistic approach for efficient execution of bags-of-tasks (BOTs) on multiple grids, clusters, and volunteer computing grids virtualized as a single computing platform. The challenge is twofold: to assemble this compound environment and to employ it for execution of a mixture of throughput- and performance-oriented BOTs, with a dozen to millions of tasks each. Our generic mechanism allows per-BOT specification of arbitrary dynamic scheduling and replication policies as a function of the system state, BOT execution state, and BOT priority. We implement our mechanism in the GridBot system and demonstrate its capabilities in a production setup. GridBot has executed hundreds of BOTs with over 9 million jobs during the last 3 months alone; these have been invoked on 25,000 hosts, 15,000 from the Superlink@Technion community grid and the rest from the Technion campus grid, local clusters, the Open Science Grid, EGEE, and the UW Madison pool.
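To make the per-BOT policy idea concrete, here is a small sketch of what a replication policy expressed as a function of system state, BOT execution state, and priority could look like. The state fields, thresholds, and the replication_factor function are all hypothetical; GridBot's actual policy language and mechanism are not reproduced here.

```python
from dataclasses import dataclass

@dataclass
class SystemState:
    idle_hosts: int
    avg_queue_wait_s: float

@dataclass
class BotState:
    tasks_left: int
    priority: int        # higher value = more important BOT
    tail_phase: bool     # few tasks remaining, latency-sensitive

def replication_factor(sys_state: SystemState, bot: BotState) -> int:
    """Hypothetical per-BOT replication policy: replicate aggressively only in
    the tail phase of a high-priority BOT, and only when idle hosts are plentiful."""
    if bot.tail_phase and bot.priority >= 5 and sys_state.idle_hosts > bot.tasks_left:
        return 3   # run three replicas of each remaining task
    return 1       # otherwise, no replication

print(replication_factor(SystemState(10_000, 40.0), BotState(50, 9, True)))  # -> 3
```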
Performance implications of virtualizing multicore cluster machines
- In HPCVirt, 2008
"... High performance computers are typified by cluster machines constructed from multicore nodes and using high performance interconnects like Infiniband. Virtualizing such ‘capacity computing’ platforms implies the shared use of not only the nodes and node cores, but also of the cluster interconnect (e ..."
Abstract
-
Cited by 22 (4 self)
- Add to MetaCart
(Show Context)
High performance computers are typified by cluster machines constructed from multicore nodes and using high performance interconnects like Infiniband. Virtualizing such ‘capacity computing’ platforms implies the shared use of not only the nodes and node cores, but also of the cluster interconnect (e.g., Infiniband). This paper presents a detailed study of the implications of sharing these resources, using the Xen hypervisor to virtualize platform nodes and exploiting Infiniband’s native hardware support for its simultaneous use by multiple virtual machines. Measurements are conducted with multiple VMs deployed per node, using modern techniques for hypervisor bypass for high performance network access, and evaluating the implications of resource sharing with different patterns of application behavior. Results indicate that multiple applications can share the cluster’s multicore nodes without undue effects on the performance of Infiniband access and use. Higher degrees of sharing are possible with communication-conscious VM placement and scheduling.
Scheduling in Data Intensive and Network Aware (DIANA) Grid Environments
- Journal of Grid Computing
"... In scientific environments such as High Energy Physics (HEP), hundreds of end-users may individually or collectively submit thousands of jobs that access subsets of the petabytes of HEP data distributed over the world. Given the large number of jobs that can result from the splitting process and the ..."
Abstract
-
Cited by 16 (0 self)
- Add to MetaCart
(Show Context)
In scientific environments such as High Energy Physics (HEP), hundreds of end-users may individually or collectively submit thousands of jobs that access subsets of the petabytes of HEP data distributed over the world. Given the large number of jobs that can result from the splitting process and the amount of data being used by these jobs, it is possible to submit the job clusters (batches of similar jobs) to some scheduler as a unique entity, with subsequent optimization in the handling of the input datasets. In this process, known as bulk scheduling, jobs compete for scarce compute and storage resources and this can distribute the load disproportionately among available Grid nodes. Moreover, Grid scheduling decisions are often made on the basis of jobs being either data- or computation-intensive: in data-intensive situations jobs may be pushed to the data and in computation-intensive situations data may be pulled to the jobs. This kind of scheduling, in which there is no consideration of network characteristics, can lead to performance degradation in a Grid environment and may result in large processing queues and job execution delays due to site overloads. Furthermore, previous approaches have been based on so-called greedy algorithms where a job is ...
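A network-aware scheduler of the kind argued for above has to weigh queuing and compute delay against the cost of moving input data over the network. The sketch below is an illustrative site-selection cost in Python with assumed weights, units, and parameter names; it is not DIANA's actual cost model.

```python
def site_cost(queue_length, cpu_speed, data_on_site_gb, total_data_gb, bandwidth_mbps,
              w_compute=1.0, w_network=1.0):
    """Illustrative network-aware cost of running a job at a candidate site:
    a relative queuing/compute estimate plus the time to stage missing input data."""
    compute_cost = queue_length / cpu_speed                 # relative wait estimate
    missing_gb = max(total_data_gb - data_on_site_gb, 0.0)
    transfer_cost = (missing_gb * 8_000) / bandwidth_mbps   # seconds to move the data
    return w_compute * compute_cost + w_network * transfer_cost

# Hypothetical candidates: a busy site that already holds most of the data,
# and an idle site that would have to pull all 100 GB over a slow link.
sites = {
    "site_a": site_cost(queue_length=200, cpu_speed=1.0, data_on_site_gb=90,
                        total_data_gb=100, bandwidth_mbps=1000),
    "site_b": site_cost(queue_length=20,  cpu_speed=1.2, data_on_site_gb=0,
                        total_data_gb=100, bandwidth_mbps=100),
}
print(min(sites, key=sites.get))   # picks site_a: pushing the job to the data wins here
```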
In Cloud, Do MTC or HTC Service Providers Benefit from the Economies of Scale?
"... Cloud computing, which is advocated as an economic platform for daily computing, has become a hot topic for both industrial and academic communities in the last couple of years. The basic idea behind cloud computing is that resource providers, which own the cloud platform, offer elastic resources to ..."
Abstract
-
Cited by 10 (2 self)
- Add to MetaCart
(Show Context)
Cloud computing, which is advocated as an economic platform for daily computing, has become a hot topic for both industrial and academic communities in the last couple of years. The basic idea behind cloud computing is that resource providers, which own the cloud platform, offer elastic resources to end users. In this paper, we intend to answer one key question to the success of cloud computing: in the cloud, do many-task computing (MTC) or high-throughput computing (HTC) service providers, which offer the corresponding computing service to end users, benefit from the economies of scale? To the best of our knowledge, no previous work designs and implements the enabling system to consolidate MTC and HTC workloads on the cloud platform; no one answers the above question. Our research contributions are three-fold: first, we propose an innovative usage model, called ...
Active CoordinaTion (ACT) - Towards Effectively Managing Virtualized Multicore Clouds
- In Cluster
, 2008
"... Abstract—A key benefit of utility data centers and cloud computing infrastructures is the level of consolidation they can offer to arbitrary guest applications, and the substantial saving in operational costs and resources that can be derived in the process. However, significant challenges remain be ..."
Abstract
-
Cited by 7 (2 self)
- Add to MetaCart
(Show Context)
A key benefit of utility data centers and cloud computing infrastructures is the level of consolidation they can offer to arbitrary guest applications, and the substantial saving in operational costs and resources that can be derived in the process. However, significant challenges remain before it becomes possible to effectively and at low cost manage virtualized systems, particularly in the face of increasing complexity of individual many-core platforms, and given the dynamic behaviors and resource requirements exhibited by cloud guest VMs. This paper describes the Active CoordinaTion (ACT) approach, aimed at a specific issue in the management domain: management actions must (1) typically touch upon multiple resources in order to be effective, and (2) be continuously refined in order to deal with the dynamism in the platform resource loads. ACT relies on the notion of Class-of-Service, associated with (sets of) guest VMs, based on which it maps VMs onto Platform Units, the latter encapsulating sets of platform resources of different types. Using these abstractions, ACT can perform active management in multiple ways, including a VM-specific approach and a black-box approach that relies on continuous monitoring of the guest VMs’ runtime behavior and on an adaptive resource allocation algorithm, termed the Multiplicative Increase, Subtractive Decrease Algorithm with Wiggle Room. In addition, ACT permits explicit external events to trigger VM- or application-specific resource allocations, e.g., leveraging emerging standards such as WSDM. The experimental analysis of the ACT prototype, built for Xen-based platforms, uses industry-standard benchmarks, including RUBiS, Hadoop, and SPEC. It demonstrates ACT’s ability to efficiently manage the aggregate platform resources according to the guest VMs’ relative importance (Class-of-Service), for both the black-box and the VM-specific approach.
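The abstract names a Multiplicative Increase, Subtractive Decrease Algorithm with Wiggle Room but does not spell it out. The following is one plausible reading of that name applied to a single VM's CPU share: grow the share multiplicatively when observed usage presses into the wiggle-room margin, shrink it by a fixed step when there is ample slack. All parameters and the adjust_allocation function are assumptions for illustration, not ACT's implementation.

```python
def adjust_allocation(allocation, observed_usage, wiggle=0.1, alpha=1.5, delta=5.0,
                      cap=100.0, floor=5.0):
    """Plausible MISD-style controller for one VM's CPU share (percent):
    multiplicative increase when usage nears the allocation minus the wiggle
    margin, subtractive decrease when usage leaves ample slack."""
    if observed_usage >= allocation * (1.0 - wiggle):
        allocation = min(allocation * alpha, cap)       # multiplicative increase
    elif observed_usage < allocation * (1.0 - 2 * wiggle):
        allocation = max(allocation - delta, floor)     # subtractive decrease
    return allocation

share = 20.0
for usage in [19.5, 30.0, 44.0, 10.0, 10.0]:   # assumed per-interval usage samples
    share = adjust_allocation(share, usage)
    print(round(share, 1))                     # share ramps up, then drifts back down
```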
Load Balancing for Parallel Branch and Bound
- In Proceedings of 10th Workshop on Preferences and Soft Constraints
, 2010
"... Abstract. A strategy for parallelization of a state-of-the-art Branch and Bound algorithm for weighted CSPs and other graphical model optimization tasks is introduced: independent worker nodes concurrently solve subproblems, managed by a Branch and Bound master node; the problem cost functions are u ..."
Abstract
-
Cited by 2 (1 self)
- Add to MetaCart
(Show Context)
A strategy for parallelization of a state-of-the-art Branch and Bound algorithm for weighted CSPs and other graphical model optimization tasks is introduced: independent worker nodes concurrently solve subproblems, managed by a Branch and Bound master node; the problem cost functions are used to predict subproblem complexity, enabling efficient load balancing, which is crucial for the performance of the parallelization process. Experimental evaluation on up to 20 nodes yields very promising results and suggests the effectiveness of the scheme. The system runs on loosely coupled commodity hardware, simplifying deployment on a larger scale in the future.
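The load-balancing idea, predicting each subproblem's complexity and spreading the predicted work evenly over workers, can be illustrated with a simple longest-processing-time assignment. The complexity estimator below is a placeholder (exponential in a made-up conditioning depth); the paper's actual cost-function-based predictor is not reproduced.

```python
import heapq

def balance(subproblems, num_workers, estimate_cost):
    """Greedy load balancing: hand each predicted-hardest subproblem to the
    currently least-loaded worker (longest-processing-time heuristic)."""
    ordered = sorted(subproblems, key=estimate_cost, reverse=True)
    heap = [(0.0, w) for w in range(num_workers)]   # (assigned load, worker id)
    heapq.heapify(heap)
    assignment = {w: [] for w in range(num_workers)}
    for sp in ordered:
        load, w = heapq.heappop(heap)
        assignment[w].append(sp["id"])
        heapq.heappush(heap, (load + estimate_cost(sp), w))
    return assignment

# Placeholder estimator: pretend subproblem hardness grows exponentially with depth.
subproblems = [{"id": i, "depth": d} for i, d in enumerate([3, 7, 2, 9, 5, 4])]
print(balance(subproblems, num_workers=3, estimate_cost=lambda sp: 2 ** sp["depth"]))
```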
A System for Exact and Approximate Genetic Linkage Analysis of SNP Data in Large Pedigrees
, 2012
"... The wide availability of dense single nucleotide polymorphism (SNP) data imposes computational bottlenecks on genetic linkage analysis of large pedigrees exceeding the capabilities of contemporary computers. Here we report Superlink-Online SNP, a new strong system for analysis of SNP data on large p ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
(Show Context)
The wide availability of dense single nucleotide polymorphism (SNP) data imposes computational bottlenecks on genetic linkage analysis of large pedigrees exceeding the capabilities of contemporary computers. Here we report Superlink-Online SNP, a new, powerful system for analysis of SNP data on large pedigrees. Superlink-Online SNP provides geneticists a collection of highly integrated services, including sifting of erroneous data, SNP clustering, exact and approximate LOD calculations, and maximum likelihood haplotyping. This integrated system facilitates a workflow for easier pinpointing of disease genes. Computations performed by Superlink-Online SNP are automatically parallelized using novel paradigms and executed on an unlimited number of private or public CPUs. One novel service is large-scale approximate Markov chain Monte Carlo (MCMC) analysis. The accuracy of the results is reliably estimated by running the same computation on multiple CPUs and evaluating the Gelman-Rubin score to discard unreliable results. Another service within the workflow is a novel parallelized exact algorithm for inferring maximum likelihood haplotypes. The reported system enables genetic analyses that were previously infeasible. Genetic linkage analysis is a statistical method for locating disease-susceptibility genes. Existing computer packages that perform exact genetic linkage analysis, such as Merlin, Allegro, GENEHUNTER, Superlink, and Vitesse, use either the ...
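The reliability check described above, running the same MCMC computation on several CPUs and comparing the resulting chains, corresponds to the standard Gelman-Rubin potential scale reduction factor. The sketch below computes the textbook R-hat with NumPy; the system's exact variant and discard threshold may differ (a common rule of thumb is to distrust results with R-hat above roughly 1.1).

```python
import numpy as np

def gelman_rubin(chains):
    """Standard Gelman-Rubin R-hat for independent chains of equal length;
    values close to 1 indicate the chains agree (have mixed well)."""
    chains = np.asarray(chains, dtype=float)     # shape (num_chains, length)
    _, n = chains.shape
    chain_means = chains.mean(axis=1)
    W = chains.var(axis=1, ddof=1).mean()        # within-chain variance
    B = n * chain_means.var(ddof=1)              # between-chain variance
    var_plus = (n - 1) / n * W + B / n
    return np.sqrt(var_plus / W)

rng = np.random.default_rng(0)
chains = rng.normal(0.0, 1.0, size=(4, 2000))                  # four well-mixed chains
print(gelman_rubin(chains))                                    # close to 1.0
print(gelman_rubin(chains + np.array([[0], [0], [0], [5]])))   # one stray chain -> much larger
```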
Resource Use Pattern Analysis for Opportunistic Grids
"... This work presents a method for predicting resource availability in opportunistic grids by means of Use Pattern Analysis (UPA), a technique based on non-supervised learning methods. The basic assumptions of the method and its capability to predict resource availability were demonstrated by simulatio ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
(Show Context)
This work presents a method for predicting resource availability in opportunistic grids by means of Use Pattern Analysis (UPA), a technique based on non-supervised learning methods. The basic assumptions of the method and its capability to predict resource availability were demonstrated by simulations, which also determined the most accurate learning techniques and distance metrics. The UPA method was implemented, and experiments showed the feasibility of its use in low-overhead scheduling of grid tasks and its superiority over other predictive and non-predictive methods.
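As a rough illustration of use-pattern analysis, the sketch below clusters historical daily idle-fraction profiles with a tiny k-means and forecasts the rest of the current day from the best-matching centroid. The features, cluster count, and both functions are illustrative assumptions, not the UPA implementation.

```python
import numpy as np

def kmeans(profiles, k=3, iters=50, seed=0):
    """Tiny k-means over daily idle-fraction profiles (rows = days, cols = hours)."""
    rng = np.random.default_rng(seed)
    centroids = profiles[rng.choice(len(profiles), size=k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((profiles[:, None, :] - centroids) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = profiles[labels == j].mean(axis=0)
    return centroids

def predict_idle(partial_day, centroids):
    """Match the hours observed so far against each centroid and return the
    matched centroid's remaining hours as the availability forecast."""
    h = len(partial_day)
    best = np.argmin(((centroids[:, :h] - partial_day) ** 2).sum(-1))
    return centroids[best, h:]

# Synthetic history: 60 days of hourly idle fractions for one machine.
history = np.clip(np.random.default_rng(1).normal(0.7, 0.2, size=(60, 24)), 0, 1)
centroids = kmeans(history, k=3)
print(predict_idle(history[0, :8], centroids))   # forecast idle fraction for hours 8..23
```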
Optimization of workload scheduling for multimedia cloud computing
- IEEE International Symposium on Circuits and Systems (ISCAS)
, 2013
"... Abstract—The cloud based multimedia applications have been widely adopted in recent years. Due to the large-scale and time-varying workload, an effective workload scheduling scheme is becoming a challenge faced by multimedia application providers. In this paper, we study the workload scheduling sche ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
(Show Context)
Cloud-based multimedia applications have been widely adopted in recent years. Due to their large-scale and time-varying workload, an effective workload scheduling scheme has become a challenge faced by multimedia application providers. In this paper, we study workload scheduling schemes for the multimedia cloud. Specifically, we examine and solve the response time minimization problem and the resource cost minimization problem, respectively. Moreover, we propose a greedy algorithm to efficiently schedule workload for a practical multimedia cloud. Simulation results demonstrate that the proposed workload scheduling schemes can optimally balance workload to achieve the minimal response time or the minimal resource cost for multimedia application providers.
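A greedy workload scheduler of the kind described can be sketched with a simple queueing estimate: route each chunk of request traffic to the server that would end up with the lowest mean response time. The M/M/1 model, rates, and function names below are assumptions for illustration, not the paper's formulation.

```python
def mm1_response_time(arrival_rate, service_rate):
    """M/M/1 mean response time; infinite if the server would be overloaded."""
    if arrival_rate >= service_rate:
        return float("inf")
    return 1.0 / (service_rate - arrival_rate)

def greedy_schedule(request_rates, service_rates):
    """Greedily route each workload chunk to the server that would have the
    lowest resulting mean response time (illustrative, not the paper's scheme)."""
    load = [0.0] * len(service_rates)
    placement = []
    for r in request_rates:
        best = min(range(len(service_rates)),
                   key=lambda i: mm1_response_time(load[i] + r, service_rates[i]))
        load[best] += r
        placement.append(best)
    return placement, [mm1_response_time(l, mu) for l, mu in zip(load, service_rates)]

# Hypothetical workload chunks (requests/s) and two servers with different capacities.
placement, times = greedy_schedule(request_rates=[30, 20, 25, 10], service_rates=[50, 60])
print(placement, [round(t, 3) for t in times])
```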