Results 1 - 10
of
67
Static Scheduling Algorithms for Allocating Directed Task Graphs to Multiprocessors
, 1999
"... Devices]: Modes of Computation---Parallelism and concurrency General Terms: Algorithms, Design, Performance, Theory Additional Key Words and Phrases: Automatic parallelization, DAG, multiprocessors, parallel processing, software tools, static scheduling, task graphs This research was supported ..."
Abstract
-
Cited by 142 (4 self)
- Add to MetaCart
Devices]: Modes of Computation---Parallelism and concurrency General Terms: Algorithms, Design, Performance, Theory Additional Key Words and Phrases: Automatic parallelization, DAG, multiprocessors, parallel processing, software tools, static scheduling, task graphs This research was supported by the Hong Kong Research Grants Council under contract numbers HKUST 734/96E, HKUST 6076/97E, and HKU 7124/99E. Authors' addresses: Y.-K. Kwok, Department of Electrical and Electronic Engineering, The University of Hong Kong, Pokfulam Road, Hong Kong; email: ykwok@eee.hku.hk; I. Ahmad, Department of Computer Science, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong. Permission to make digital / hard copy of part or all of this work for personal or classroom use is granted without fee provided that the copies are not made or distributed for profit or commercial advantage, the copyright notice, the title of the publication, and its date appear, and notice is given that copying is by permission of the ACM, Inc. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and / or a fee. 2000 ACM 0360-0300/99/1200--0406 $5.00 ACM Computing Surveys, Vol. 31, No. 4, December 1999 1.
COSYN: Hardware-Software Co-synthesis of Embedded Systems
, 1997
"... Hardware-software co-synthesis is the process of partitioning an embedded system specification into hardware and software modules to meet performance, power, cost, and reliability goals. In this paper, we present a hardware-software co-synthesis technique for real-time distributed embedded systems. ..."
Abstract
-
Cited by 79 (8 self)
- Add to MetaCart
Hardware-software co-synthesis is the process of partitioning an embedded system specification into hardware and software modules to meet performance, power, cost, and reliability goals. In this paper, we present a hardware-software co-synthesis technique for real-time distributed embedded systems. Our cosynthesis algorithm has the following features: 1) it allows the use of multiple types of processing elements (PEs) and inter-PE communication links, where the links can take various forms (point-to-point, bus, local area network, etc.), 2) it supports both concurrent and sequential modes of communication and computation, 3) it allows both preemptive and non-preemptive scheduling, 4) it employs the concept of an association array to tackle the problem of multi-rate systems (which are commonly found in multimedia applications), 5) it uses a scheduler based on dynamic deadline-based priority levels for an accurate performance estimation of a cosynthesis solution, 6) it uses a new dynamic...
Benchmarking and Comparison of the Task Graph Scheduling Algorithms
, 1999
"... The problem of scheduling a parallel program represented by a weighted directed acyclic graph (DAG) to a set of homogeneous processors for minimizing the completion time of the program has been extensively studied. The NP-completeness of the problem has stimulated researchers to propose a myriad of ..."
Abstract
-
Cited by 67 (2 self)
- Add to MetaCart
The problem of scheduling a parallel program represented by a weighted directed acyclic graph (DAG) to a set of homogeneous processors for minimizing the completion time of the program has been extensively studied. The NP-completeness of the problem has stimulated researchers to propose a myriad of heuristic algorithms. While most of these algorithms are reported to be efficient, it is not clear how they compare against each other. A meaningful performance evaluation and comparison of these algorithms is a complex task and it must take into account a number of issues. First, most scheduling algorithms are based upon diverse assumptions, making the performance comparison rather purposeless. Second, there does not exist a standard set of benchmarks to examine these algorithms. Third, most algorithms are evaluated using small problem sizes, and, therefore, their scalability is unknown. In this paper, we first provide a taxonomy for classifying various algorithms into distinct categories a...
On Exploiting Task Duplication in Parallel Program Scheduling
, 1998
"... One of the main obstacles in obtaining high performance from message-passing multicomputer systems is the inevitable communication overhead which is incurred when tasks executing on different processors exchange data. Given a task graph, duplication-based scheduling can mitigate this overhead by all ..."
Abstract
-
Cited by 53 (7 self)
- Add to MetaCart
One of the main obstacles in obtaining high performance from message-passing multicomputer systems is the inevitable communication overhead which is incurred when tasks executing on different processors exchange data. Given a task graph, duplication-based scheduling can mitigate this overhead by allocating some of the tasks redundantly on more than one processors. In this paper, we focus on the problem of using duplication in static scheduling of task graphs on parallel and distributed systems. We discuss five previously proposed algorithms, and examine their merits and demerits. We describe some of the essential principles for exploiting duplication in a more useful manner, and based on these principles propose an algorithm which outperforms the previous algorithms. The proposed algorithm generates optimal solutions for a number of task graphs. The algorithm assumes an unbounded number of processors. For scheduling on a bounded number of processors, we propose a second algorithm which...
Integrating Communication Protocol Selection with Partitioning in Hardware/Software Codesign
- in Proc. Int. Symp. System Level Synthesis
, 1998
"... This paper presents a codesign approach which incorporates communication protocol selection as a design parameter within hardware/software partitioning. The presented approach takes into account data transfer rates depending on communication protocol types and configurations, and different operating ..."
Abstract
-
Cited by 42 (1 self)
- Add to MetaCart
This paper presents a codesign approach which incorporates communication protocol selection as a design parameter within hardware/software partitioning. The presented approach takes into account data transfer rates depending on communication protocol types and configurations, and different operating frequencies of system components, i.e. CPUs, ASICs, and busses. It also takes into account the timing and area influences of drivers and driver calls needed to perform the communication. The approach is illustrated by a number of design space exploration experiments which use models of the PCI and USB communication protocols. 1. Introduction This paper presents an approach to hardware/software codesign which integrates communication protocol selection with hardware/software partitioning. The approach has been implemented as an extension to the LYCOS[6] cosynthesis system. Most current approaches to co-synthesis consider communication synthesis to be a final step in the co-synthesis traject...
Optimized Rapid Prototyping for Real-Time Embedded Heterogeneous Multiprocessors
, 1999
"... This paper presents an enhancement of our "Algorithm Architecture Adequation" (AAA) prototyping methodology which allows to rapidly develop and optimize the implementation of a reactive real-time dataflow algorithm on a embedded heterogeneous multiprocessor architecture, predict its real-time behavi ..."
Abstract
-
Cited by 32 (9 self)
- Add to MetaCart
This paper presents an enhancement of our "Algorithm Architecture Adequation" (AAA) prototyping methodology which allows to rapidly develop and optimize the implementation of a reactive real-time dataflow algorithm on a embedded heterogeneous multiprocessor architecture, predict its real-time behavior and automatically generate the corresponding distributed and optimized static executive. It describes a new optimization heuristic able to support heterogeneous architectures and takes into account accurately inter-processor communications, which are usually neglected but may reduce dramatically multiprocessor performances. 1 Introduction The increasing complexity of signal, image and control processing algorithms in embedded applications, requires high computational power to satisfy real-time constraints. This power can be achieved by parallel multiprocessors which are often heterogeneous in embedded system: they are made of different types of processors interconnected by different typ...
Efficient Scheduling of Arbitrary Task Graphs to Multiprocessors using A Parallel Genetic Algorithm
- JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING
, 1997
"... Given a parallel program represented by a task graph, the objective of a scheduling algorithm is to minimize the overall execution time of the program by properly assigning the nodes of the graph to the processors. This multiprocessor scheduling problem is NP-complete even with simplifying assumptio ..."
Abstract
-
Cited by 27 (5 self)
- Add to MetaCart
Given a parallel program represented by a task graph, the objective of a scheduling algorithm is to minimize the overall execution time of the program by properly assigning the nodes of the graph to the processors. This multiprocessor scheduling problem is NP-complete even with simplifying assumptions, and becomes more complex under relaxed assumptions such as arbitrary precedence constraints, and arbitrary task execution and communication times. The present literature on this topic is a large repertoire of heuristics that produce good solutions in a reasonable amount of time. These heuristics, however, have restricted applicability in a practical environment because they have a number of fundamental problems including high time complexity, lack of scalability, and no performance guarantee with respect to optimal solutions. Recently, genetic algorithms (GAs) have been widely reckoned as a useful vehicle for obtaining high quality or even optimal solutions for a broad range of combinato...
Benchmarking the task graph scheduling algorithms
- in IPPS/SPDP
, 1998
"... Abstract † The problem of scheduling a weighted directed acyclic graph (DAG) to a set of homogeneous processors to minimize the completion time has been extensively studied. The NPcompleteness of the problem has instigated researchers to propose a myriad of heuristic algorithms. While these algorith ..."
Abstract
-
Cited by 24 (2 self)
- Add to MetaCart
Abstract † The problem of scheduling a weighted directed acyclic graph (DAG) to a set of homogeneous processors to minimize the completion time has been extensively studied. The NPcompleteness of the problem has instigated researchers to propose a myriad of heuristic algorithms. While these algorithms are individually reported to be efficient, it is not clear how effective they are and how well they compare against each other. A comprehensive performance evaluation and comparison of these algorithms entails addressing a number of difficult issues. One of the issues is that a large number of scheduling algorithms are based upon radically different assumptions, making their comparison on a unified basis a rather intricate task. Another issue is that there is no standard set of benchmarks that can be used to evaluate and compare these algorithms. Furthermore, most algorithms are evaluated using small problem sizes, and it is not clear how their performance scales with the problem size. In this paper, we first provide a taxonomy for classifying various algorithms into different categories according to their assumptions and functionalities. We then propose a set of benchmarks which are of diverse structures without being biased towards a particular scheduling technique and still allow variations in important parameters. We have evaluated 15 scheduling algorithms, and compared them using the proposed benchmarks. Based upon the design philosophies and principles behind these algorithms, we interpret the results and discuss why some algorithms perform better than the others.
On Runtime Parallel Scheduling for Processor Load Balancing
- IEEE Trans. Parallel and Distributed Systems
, 1997
"... Parallel scheduling is a new approach for load balancing. In parallel scheduling, all processors cooperate to schedule work. Parallel scheduling is able to accurately balance the load by using global load information at compile-time or runtime. It provides high-quality load balancing. This paper pre ..."
Abstract
-
Cited by 22 (0 self)
- Add to MetaCart
Parallel scheduling is a new approach for load balancing. In parallel scheduling, all processors cooperate to schedule work. Parallel scheduling is able to accurately balance the load by using global load information at compile-time or runtime. It provides high-quality load balancing. This paper presents an overview of the parallel scheduling technique. Scheduling algorithms for tree, hypercube, and mesh networks are presented. These algorithms can fully balance the load and maximize locality 1. Introduction Static scheduling balances the workload before runtime and can be applied to problems with a predictable structure, which are called static problems. Dynamic scheduling performs scheduling activities concurrently at runtime, which applies to problems with an unpredictable structure, which are called dynamic problems. Static scheduling utilizes the knowledge of problem characteristics to reach a well-balanced load [1, 2, 3, 4]. However, it is not able to balance the load for dynami...
Analysis and Synthesis of Communication-Intensive Heterogeneous Real-Time Systems
- LINKÖPING STUDIES IN SCIENCE AND TECHNOLOGY, PH.D. DISSERTATION NO. 833
, 2003
"... EMBEDDED COMPUTER SYSTEMS are now everywhere: from alarm clocks to PDAs, from mobile phones to cars, almost all the devices we use are controlled by embedded computer systems. An important class of embedded computer systems is that of real-time systems, which have to fulfill strict timing requiremen ..."
Abstract
-
Cited by 18 (5 self)
- Add to MetaCart
EMBEDDED COMPUTER SYSTEMS are now everywhere: from alarm clocks to PDAs, from mobile phones to cars, almost all the devices we use are controlled by embedded computer systems. An important class of embedded computer systems is that of real-time systems, which have to fulfill strict timing requirements. As realtime systems become more complex, they are often implemented using distributed heterogeneous architectures. The main objective of this thesis is to develop analysis and synthesis methods for communication-intensive heterogeneous hard real-time systems. The systems are heterogeneous not only in terms of platforms and communication protocols, but also in terms of scheduling policies. Regarding this last aspect, in this thesis we consider time-driven systems, event-driven systems, and a combination of both, called multi-cluster systems. The analysis takes into

