Data Management Challenges of Data-Intensive Scientific Workflows
Abstract
-
Cited by 44 (2 self)
Scientific workflows play an important role in today’s science. Many disciplines rely on workflow technologies to orchestrate the execution of thousands of computational tasks. Much research to date focuses on efficient, scalable, and robust workflow execution, especially in distributed environments. However, many challenges remain in the area of data management related to workflow creation, execution, and result management. In this paper we examine some of these issues in the context of the entire workflow lifecycle.
Grids and clouds: Making workflow applications work in heterogeneous distributed environments
- Int. J. High Perform. Comput. Appl, 2010
Scheduling and Management Techniques for Data-Intensive Application Workflows
Abstract
-
Cited by 3 (0 self)
This chapter presents a comprehensive survey of algorithms, techniques and frameworks used for scheduling and management of data-intensive application workflows. Many complex scientific experiments are expressed in the form of workflows for structured, repeatable, controlled, scalable and automated executions. This chapter focuses on workflows whose tasks process huge amounts of data, usually in the range from hundreds of megabytes to petabytes. Scientists are already using Grid systems that schedule these workflows onto globally distributed resources to optimize various objectives: minimize the total makespan of the workflow, minimize the cost and usage of network bandwidth, minimize the cost of computation and storage, meet the deadline of the application, and so forth. This chapter lists and describes the techniques used in each of these systems for processing huge amounts of data. A survey of workflow management techniques is useful for understanding how the Grid systems work, providing insights on performance optimization of scientific applications dealing with data-intensive workloads.
SOA-based Grid Job Management Framework. In Proc. 9th Workshop
Abstract
Dedicated to my Family. List of papers: This thesis is based on the following papers, which are referred to in the text by their Roman numerals. Project 1: In papers I and II, work on application execution environments is described. In paper I, we present tools for general-purpose solutions using portal technology, while paper II addresses access to grid resources within an application-specific problem-solving environment.
Planning and Scheduling Data Processing Workflows in the Cloud with Quality-of-Data Constraints
Abstract
Data-intensive and long-lasting applications running in the form of workflows are being increasingly dispatched to cloud computing systems. Current scheduling approaches for graphs of dependencies fail to deliver high resource efficiency while keeping computation costs low, especially for continuous data processing workflows, where the scheduler does not perform any reasoning about the impact new input data may have on the workflow's final output. To face such a stark challenge, we introduce a new scheduling criterion, Quality-of-Data (QoD), which describes the requirements about the data that are worth the triggering of tasks in workflows. Based on the QoD notion, we propose a novel service-oriented scheduler planner, for continuous data processing workflows, that is capable of enforcing QoD constraints and guiding the scheduling to attain resource efficiency, overall controlled performance, and task prioritization. To contrast the advantages of our scheduling model against others, we developed WaaS (Workflow-as-a-Service), a workflow coordinator system for the Cloud where data is shared among tasks via a cloud columnar database.
WaaS: Workflow-as-a-Service for the Cloud with Scheduling of Continuous and Data-intensive Workflows
Abstract
Data-intensive and long-lasting applications running in the form of workflows are being increasingly dispatched to cloud computing systems. Current scheduling approaches for graphs of dependencies fail to deliver high resource efficiency while keeping computation costs low, especially for continuous data processing workflows, where the scheduler does not perform any reasoning about the impact new input data may have on the workflow's final output. To face such a challenge, we introduce a new scheduling criterion, Quality-of-Data (QoD), which describes the requirements about the data that are worthy of the triggering of tasks in workflows. Based on the QoD notion, we propose a novel service-oriented scheduler planner, for continuous data processing workflows, that is capable of enforcing QoD constraints and guiding the scheduling to attain resource efficiency, overall controlled performance, and task prioritization. To contrast the advantages of our scheduling model against others, we developed WaaS (Workflow-as-a-Service), a workflow coordinator system for the Cloud where data is shared among tasks via a cloud columnar database.
Workflows with Model Selection: a Multilocus Approach to Phylogenetic Analysis
Abstract
The workflow model of description and execution of complex tasks can be of great use to design and parallelize scientific experiments, though it remains a scarcely studied area in its application to phylogenetic analysis. In order to remedy this situation, we study and identify sources of parallel tasks in the main reconstruction stages as well as in other indispensable problems on which it depends: model selection and sequence alignment. Finally, we present a general-purpose implementation for use in cluster environments and examine the performance of our method through application to very large sets of whole mitochondrial genomes, by which problems of biological interest can be solved with new-found efficiency and accuracy.
J. Parallel Distrib. Comput. ( ) – Contents lists available at ScienceDirect
Abstract
Journal homepage: www.elsevier.com/locate/jpdc
Time-division-multiplexed arbitration in silicon nanophotonic networks-on-chip
Ver. Author Date Comments
Abstract
1 P. Martin (UEDIN) 12/3/2012 Initial draft, basic layout.
2 P. Martin 28/3/2012 Annotations of content for delegation.