Dataflow Process Networks
Proceedings of the IEEE, 1995
Cited by 232 (30 self)
We review a model of computation used in industrial practice in signal processing software environments and experimentally in other contexts. We give this model the name "dataflow process networks," and study its formal properties as well as its utility as a basis for programming language design. Variants of this model are used in commercial visual programming systems such as SPW from the Alta Group of Cadence (formerly Comdisco Systems), COSSAP from Synopsys (formerly Cadis), the DSP Station from Mentor Graphics, and Hypersignal from Hyperception. They are also used in research software such as Khoros from the University of New Mexico and Ptolemy from the University of California at Berkeley, among many others. Dataflow process networks are shown to be a special case of Kahn process networks, a model of computation in which a number of concurrent processes communicate through unidirectional FIFO channels; writes to a channel are non-blocking, and reads are blocking. In dataflow process networks, each process consists of repeated "firings" of a dataflow "actor". An actor defines an (often functional) quantum of computation. By dividing processes into actor firings, the considerable overhead of context switching incurred in most implementations of Kahn process networks is avoided. We relate dataflow process networks to other dataflow models, including those used in dataflow machines, such as static dataflow and the tagged-token model. We also relate dataflow process networks to functional languages such as Haskell, and show that modern language concepts such as higher-order functions and polymorphism can be used effectively in dataflow process networks. A number of programming examples using a visual syntax are given. This research is part of the Ptolemy project, whi...
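As a rough illustration of the actor-firing idea described in this abstract, the Python sketch below models channels as unbounded FIFO queues and runs each process as repeated firings of an actor whenever its inputs hold tokens. The names Actor, can_fire, fire, and run are hypothetical, and the one-token-per-input firing rule and round-robin scheduler are assumptions made for the example, not details taken from the paper.

```python
from collections import deque

class Actor:
    """Hypothetical dataflow actor: one firing consumes one token per input."""
    def __init__(self, func, inputs, output):
        self.func = func        # the (functional) quantum of computation
        self.inputs = inputs    # input FIFO channels, modeled as deques
        self.output = output    # output FIFO channel, modeled as a deque

    def can_fire(self):
        # Assumed firing rule: every input channel holds at least one token.
        return all(len(ch) > 0 for ch in self.inputs)

    def fire(self):
        # Consume one token from each input and append one result token.
        args = [ch.popleft() for ch in self.inputs]
        self.output.append(self.func(*args))  # appending never blocks

def run(actors):
    """Fire enabled actors round-robin until no actor can fire."""
    fired = True
    while fired:
        fired = False
        for actor in actors:
            if actor.can_fire():
                actor.fire()
                fired = True

# Example network: add two streams, then double each sum.
a, b, sums, out = deque([1, 2, 3]), deque([10, 20, 30]), deque(), deque()
adder = Actor(lambda x, y: x + y, [a, b], sums)
doubler = Actor(lambda x: 2 * x, [sums], out)
run([doubler, adder])   # the listing order does not change the result
result = list(out)
```

Because an actor only runs when its firing rule is satisfied, no firing ever has to suspend on a read, which is the sense in which a scheduler like this avoids per-process context switches.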
A Comparison of Eleven Static Heuristics for Mapping a Class of Independent Tasks onto Heterogeneous Distributed Computing Systems
2001
Cited by 155 (40 self)
This paper is organized as follows. Section 2 defines the computational environment parameters that were varied in the simulations. Descriptions of the 11 mapping heuristics are found in Section 3. Section 4 examines selected results from the simulation study. A list of implementation parameters and procedures that could be varied for each heuristic is presented in Section 5.
Static Scheduling Algorithms for Allocating Directed Task Graphs to Multiprocessors
1999
Cited by 142 (4 self)
Devices]: Modes of Computation---Parallelism and concurrency. General Terms: Algorithms, Design, Performance, Theory. Additional Key Words and Phrases: Automatic parallelization, DAG, multiprocessors, parallel processing, software tools, static scheduling, task graphs. This research was supported by the Hong Kong Research Grants Council under contract numbers HKUST 734/96E, HKUST 6076/97E, and HKU 7124/99E. Authors' addresses: Y.-K. Kwok, Department of Electrical and Electronic Engineering, The University of Hong Kong, Pokfulam Road, Hong Kong; email: ykwok@eee.hku.hk; I. Ahmad, Department of Computer Science, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong. ACM Computing Surveys, Vol. 31, No. 4, December 1999.
Dynamic Critical-Path Scheduling: An Effective Technique for Allocating Task Graphs to Multiprocessors
IEEE Transactions on Parallel and Distributed Systems, 1996
Cited by 100 (17 self)
In this paper, we propose a static scheduling algorithm for allocating task graphs to fully connected multiprocessors. We discuss six recently reported scheduling algorithms and show that each has some drawback that can lead to poor performance. The proposed algorithm, which is called the Dynamic Critical-Path (DCP) scheduling algorithm, is different from the previously proposed algorithms in a number of ways. First, it determines the critical path of the task graph and selects the next node to be scheduled in a dynamic fashion. Second, it rearranges the schedule on each processor dynamically in the sense that the positions of the nodes in the partial schedules are not fixed until all nodes have been considered. Third, it selects a suitable processor for a node by looking ahead at the potential start times of the remaining nodes on that processor, and schedules relatively less important nodes to the processors already in use. A global as well as a pair-wise comparison is c...
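The notion of a critical path that this abstract builds on can be sketched as a longest-path computation over a weighted task graph. The Python below is only a static critical-path calculation under assumed inputs (node computation costs and edge communication costs); it is not the DCP algorithm itself, which recomputes the path as scheduling proceeds. The function name critical_path and the example graph are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def critical_path(weights, edges):
    """Return (length, path) of the longest path through a task DAG.

    weights: {node: computation cost}
    edges:   {(u, v): communication cost from u to v}
    """
    preds = {n: set() for n in weights}
    for (u, v) in edges:
        preds[v].add(u)

    dist = {}    # longest path length ending at each node, node cost included
    parent = {}  # predecessor on that longest path, or None for an entry node
    for n in TopologicalSorter(preds).static_order():
        best, best_pred = 0, None
        for p in preds[n]:
            candidate = dist[p] + edges[(p, n)]
            if candidate > best:
                best, best_pred = candidate, p
        dist[n] = best + weights[n]
        parent[n] = best_pred

    end = max(dist, key=dist.get)
    length = dist[end]
    path = []
    while end is not None:      # walk back from the last node to an entry node
        path.append(end)
        end = parent[end]
    return length, path[::-1]

# Hypothetical four-node task graph.
weights = {"A": 2, "B": 3, "C": 4, "D": 1}
edges = {("A", "B"): 1, ("A", "C"): 5, ("B", "D"): 2, ("C", "D"): 1}
length, path = critical_path(weights, edges)
```

In this example the path through C dominates because of the large A-to-C communication cost; a dynamic scheme would re-evaluate such costs once nodes are placed on the same processor and communication becomes free.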
Simgrid: a Toolkit for the Simulation of Application Scheduling
Proceedings of the First IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGrid 2001), 2001
Cited by 99 (6 self)
Advances in hardware and software technologies have made it possible to deploy parallel applications over increasingly large sets of distributed resources. Consequently, the study of scheduling algorithms for such applications has been an active area of research. Given the nature of most scheduling problems, one must resort to simulation to effectively evaluate and compare the efficacy of these algorithms over a wide range of scenarios. It has thus become necessary to simulate those algorithms for increasingly complex distributed, dynamic, heterogeneous environments. In this paper we present Simgrid, a simulation toolkit for the study of scheduling algorithms for distributed applications. This paper gives the main concepts and models behind Simgrid, describes its API and highlights current implementation issues. We also give some experimental results and describe work that builds on Simgrid's functionalities.
Design of Embedded Systems: Formal Models, Validation, and Synthesis
Proceedings of the IEEE, 1999
Cited by 92 (8 self)
This paper addresses the design of reactive real-time embedded systems. Such systems are often heterogeneous in implementation technologies and design styles, for example by combining hardware ASICs with embedded software. The concurrent design process for such embedded systems involves solving the specification, validation, and synthesis problems. We review the variety of approaches to these problems that have been taken.
Performance-Effective and Low-Complexity Task Scheduling for Heterogeneous Computing
IEEE Transactions on Parallel and Distributed Systems, 2002
Cited by 92 (0 self)
Efficient application scheduling is critical for achieving high performance in heterogeneous computing environments. The application scheduling problem has been shown to be NP-complete in general cases as well as in several restricted cases. Because of its key importance, this problem has been extensively studied and various algorithms have been proposed in the literature, mainly for systems with homogeneous processors. Although there are a few algorithms in the literature for heterogeneous processors, they usually require significantly high scheduling costs and they may not deliver good quality schedules with lower costs. In this paper, we present two novel scheduling algorithms for a bounded number of heterogeneous processors with an objective to simultaneously meet high performance and fast scheduling time, which are called the Heterogeneous Earliest-Finish-Time (HEFT) algorithm and the Critical-Path-on-a-Processor (CPOP) algorithm. The HEFT algorithm selects the task with the highest upward rank value at each step and assigns the selected task to the processor that minimizes its earliest finish time, using an insertion-based approach. On the other hand, the CPOP algorithm uses the summation of upward and downward rank values for prioritizing tasks. Another difference is in the processor selection phase, which schedules the critical tasks onto the processor that minimizes the total execution time of the critical tasks. In order to provide a robust and unbiased comparison with the related work, a parametric graph generator was designed to generate weighted directed acyclic graphs with various characteristics.
The comparison study, based on both randomly generated graphs and the graphs of some real applications, shows that our scheduling algorithms significantly surpass previous approaches in terms of both quality and cost of schedules, which are mainly presented with schedule length ratio, speedup, frequency of best results, and average scheduling time metrics. Index Terms: DAG scheduling, task graphs, heterogeneous systems, list scheduling, mapping.
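The abstract refers to the upward rank without defining it. As it is commonly stated, the upward rank of a task is its average computation cost plus the maximum, over its successors, of the average communication cost to that successor plus the successor's upward rank; an exit task's rank is just its average computation cost. The Python sketch below computes that quantity and the resulting priority order. The function name upward_ranks and the example costs are hypothetical, and the insertion-based processor selection step is not shown.

```python
from functools import lru_cache

def upward_ranks(avg_cost, succ, avg_comm):
    """Compute the upward rank of every task in a DAG.

    avg_cost: {task: computation cost averaged over processors}
    succ:     {task: list of successor tasks}
    avg_comm: {(task, successor): average communication cost}
    """
    @lru_cache(maxsize=None)
    def rank(task):
        # Exit tasks have no successors, so the tail term defaults to zero.
        tail = max(
            (avg_comm[(task, s)] + rank(s) for s in succ.get(task, ())),
            default=0.0,
        )
        return avg_cost[task] + tail

    return {task: rank(task) for task in avg_cost}

# Hypothetical four-task graph with already-averaged costs.
avg_cost = {"t1": 4.0, "t2": 3.0, "t3": 5.0, "t4": 2.0}
succ = {"t1": ["t2", "t3"], "t2": ["t4"], "t3": ["t4"]}
avg_comm = {("t1", "t2"): 1.0, ("t1", "t3"): 2.0,
            ("t2", "t4"): 1.0, ("t3", "t4"): 3.0}

ranks = upward_ranks(avg_cost, succ, avg_comm)
# Tasks are considered in order of decreasing upward rank.
order = sorted(ranks, key=ranks.get, reverse=True)
```

Sorting by decreasing upward rank always places a task before its successors when costs are positive, so the priority list is also a valid topological order of the graph.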
COSYN: Hardware-Software Co-synthesis of Embedded Systems
1997
Cited by 79 (8 self)
Hardware-software co-synthesis is the process of partitioning an embedded system specification into hardware and software modules to meet performance, power, cost, and reliability goals. In this paper, we present a hardware-software co-synthesis technique for real-time distributed embedded systems. Our co-synthesis algorithm has the following features: 1) it allows the use of multiple types of processing elements (PEs) and inter-PE communication links, where the links can take various forms (point-to-point, bus, local area network, etc.), 2) it supports both concurrent and sequential modes of communication and computation, 3) it allows both preemptive and non-preemptive scheduling, 4) it employs the concept of an association array to tackle the problem of multi-rate systems (which are commonly found in multimedia applications), 5) it uses a scheduler based on dynamic deadline-based priority levels for an accurate performance estimation of a co-synthesis solution, 6) it uses a new dynamic...
Stochastic Scheduling
1999
Cited by 77 (12 self)
There is a current need for scheduling policies that can leverage the performance variability of resources on multiuser clusters. We develop one solution to this problem called stochastic scheduling that utilizes a distribution of application execution performance on the target resources to determine a performance-efficient schedule. In this paper, we define a stochastic scheduling policy based on time-balancing for data parallel applications whose execution behavior can be represented as a normal distribution. Using three distributed applications on two contended platforms, we demonstrate that a stochastic scheduling policy can achieve good and predictable performance for the application as evaluated by several performance measures.
Scheduling Strategies for Master-Slave Tasking on Heterogeneous Processor Grids
2002
Cited by 72 (34 self)
In this paper, we consider the problem of allocating a large number of independent, equal-sized tasks to a heterogeneous "grid" computing platform. We use a non-oriented graph to model a grid, where resources can have different speeds of computation and communication, as well as different overlap capabilities. We show how to determine the optimal steady-state scheduling strategy for each processor (the fraction of time spent computing and the fraction of time spent communicating with each neighbor). This result holds for a quite general framework, allowing for cycles and multiple paths in the interconnection graph, and allowing for several masters. Because

