Hypertool: A Programming Aid for Message-Passing Systems
- IEEE Trans. on Parallel and Distributed Systems
, 1990
Cited by 146 (17 self)
As both the number of processors and the complexity of problems to be solved increase, programming multiprocessing systems becomes more difficult and error-prone. This paper discusses programming assistance and automation concepts and their application to a program development tool for message-passing systems called Hypertool. It performs scheduling and handles the insertion of communication primitives automatically. Two algorithms, based on the critical-path method, are presented for scheduling processes statically. Hypertool also generates performance estimates and other program quality measures to help programmers improve their algorithms and programs.
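The abstract does not spell out Hypertool's two scheduling algorithms, so the following is only a generic sketch of the critical-path idea they build on: computing, for each task in a weighted task graph, the length of the longest path from that task to an exit node. The function name `critical_path_length`, the dictionary-based graph encoding, and the four-task example are assumptions made for illustration, not taken from the paper.

```python
from functools import lru_cache

def critical_path_length(weights, edges):
    """Return (critical-path length, per-node longest-path-to-exit).

    weights: dict mapping node -> computation cost (assumed encoding)
    edges:   dict mapping (u, v) -> communication cost on edge u -> v
    """
    successors = {n: [] for n in weights}
    for (u, v), comm in edges.items():
        successors[u].append((v, comm))

    @lru_cache(maxsize=None)
    def longest_to_exit(n):
        # Cost of n itself plus the most expensive continuation, if any.
        tail = max((comm + longest_to_exit(v) for v, comm in successors[n]),
                   default=0)
        return weights[n] + tail

    levels = {n: longest_to_exit(n) for n in weights}
    return max(levels.values()), levels

# Hypothetical four-task graph: a feeds b and c, which both feed d.
weights = {"a": 2, "b": 3, "c": 1, "d": 2}
edges = {("a", "b"): 1, ("a", "c"): 4, ("b", "d"): 1, ("c", "d"): 1}
cp, levels = critical_path_length(weights, edges)
print(cp, levels)
```

A static scheduler could use such per-node values as priorities, placing tasks with longer remaining paths first; whether Hypertool does exactly this is not stated in the abstract.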
Static Scheduling Algorithms for Allocating Directed Task Graphs to Multiprocessors
, 1999
Cited by 142 (4 self)
Devices]: Modes of Computation---Parallelism and concurrency
General Terms: Algorithms, Design, Performance, Theory
Additional Key Words and Phrases: Automatic parallelization, DAG, multiprocessors, parallel processing, software tools, static scheduling, task graphs
This research was supported by the Hong Kong Research Grants Council under contract numbers HKUST 734/96E, HKUST 6076/97E, and HKU 7124/99E.
Authors' addresses: Y.-K. Kwok, Department of Electrical and Electronic Engineering, The University of Hong Kong, Pokfulam Road, Hong Kong; email: ykwok@eee.hku.hk; I. Ahmad, Department of Computer Science, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong.
ACM Computing Surveys, Vol. 31, No. 4, December 1999
Dynamic Critical-Path Scheduling: An Effective Technique for Allocating Task Graphs to Multiprocessors
- IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS
, 1996
Cited by 100 (17 self)
In this paper, we propose a static scheduling algorithm for allocating task graphs to fully connected multiprocessors. We discuss six recently reported scheduling algorithms and show that each has some drawback that can lead to poor performance. The proposed algorithm, which is called the Dynamic Critical-Path (DCP) scheduling algorithm, differs from the previously proposed algorithms in a number of ways. First, it determines the critical path of the task graph and selects the next node to be scheduled in a dynamic fashion. Second, it rearranges the schedule on each processor dynamically, in the sense that the positions of the nodes in the partial schedules are not fixed until all nodes have been considered. Third, it selects a suitable processor for a node by looking ahead at the potential start times of the remaining nodes on that processor, and schedules relatively less important nodes to the processors already in use. A global as well as a pair-wise comparison is c...
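To make the notion of "determining the critical path" concrete, here is a minimal sketch that finds the nodes with zero slack in a task graph, that is, nodes whose earliest and latest possible start times coincide. It is not the DCP algorithm itself: it ignores processor assignment and the dynamic re-evaluation the abstract describes. The names `critical_nodes`, `est`, and `lst`, and the example graph, are illustrative assumptions.

```python
def critical_nodes(weights, edges):
    """Return (zero-slack nodes, earliest start times, latest start times).

    weights: dict node -> computation cost (assumed encoding)
    edges:   dict (u, v) -> communication cost on edge u -> v
    """
    preds = {n: [] for n in weights}
    succs = {n: [] for n in weights}
    for (u, v), comm in edges.items():
        preds[v].append((u, comm))
        succs[u].append((v, comm))

    # Topological order via Kahn's algorithm.
    indegree = {n: len(preds[n]) for n in weights}
    ready = [n for n in weights if indegree[n] == 0]
    order = []
    while ready:
        n = ready.pop()
        order.append(n)
        for v, _ in succs[n]:
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)

    # Forward pass: earliest start once all predecessors and messages finish.
    est = {}
    for n in order:
        est[n] = max((est[u] + weights[u] + comm for u, comm in preds[n]),
                     default=0)
    length = max(est[n] + weights[n] for n in weights)

    # Backward pass: latest start that does not lengthen the schedule.
    lst = {}
    for n in reversed(order):
        latest_finish = min((lst[v] - comm for v, comm in succs[n]),
                            default=length)
        lst[n] = latest_finish - weights[n]

    return [n for n in order if est[n] == lst[n]], est, lst

# Hypothetical four-task graph used only for illustration.
weights = {"a": 2, "b": 3, "c": 1, "d": 2}
edges = {("a", "b"): 1, ("a", "c"): 4, ("b", "d"): 1, ("c", "d"): 1}
crit, est, lst = critical_nodes(weights, edges)
print(sorted(crit), est, lst)
```

In a dynamic scheme, these start times would have to be recomputed as nodes are placed, because communication between nodes on the same processor is typically treated as free and the critical path can therefore change.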
On the allocation of documents in multiprocessor information retrieval systems
- In Proceedings of the Fourteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
, 1991
Cited by 16 (0 self)
Information retrieval is the selection of documents that are potentially relevant to a user’s information need. Given the vast volume of data stored in modern information retrieval systems, searching the document database requires vast computational resources. To meet these computational demands, various researchers have developed parallel information retrieval systems. As efficient exploitation of parallelism demands fast access to the documents, data organization and placement significantly affect the total processing time. We describe and evaluate a data placement strategy for distributed-memory, distributed-I/O multicomputers. Initially, a formal description of the Multiprocessor Document Allocation Problem (MDAP) and a proof that MDAP is NP-complete are presented. A document allocation
Graph contraction and physical optimization methods: a quality-cost tradeoff for mapping data on parallel computers
- in International Conference of Supercomputing
, 1993
Compile-Time Scheduling of Dataflow Program graphs with Dynamic Constructs
- University of California, Berkeley
, 1992
Task Assignment on Distributed-Memory Systems with Adaptive Wormhole Routing
- Proc. Interact 2001
, 1994
Cited by 7 (3 self)
Assigning the tasks of a parallel program to the processors of a distributed-memory system is critical to obtaining minimal program completion time by minimizing communication overhead. The wormhole-routing switching technique, with various adaptive routing strategies, is increasingly used to build scalable distributed-memory systems. This paper presents task assignment heuristics for such wormhole-routed systems and analyzes the effect of adaptive routing. A Temporal Communication Graph (TCG) is used to model task graphs and to identify communication steps that conflict both temporally and spatially. Heuristics are proposed to capture temporal link contention and derive optimal assignment in an iterative manner by pairwise exchanging of processors, associated with the critical communication edges, within d hops. The interplay between degree of routing adaptivity, topology, application characteristics, and optimal task assignment is studied through simulation experiments using ran...
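The iterative pairwise-exchange idea can be illustrated with a much simpler cost model than the one in the paper. The sketch below assumes that cost is just traffic volume times hop distance, and it ignores temporal link contention, the TCG, adaptive routing, and the d-hop restriction. The names `comm_cost` and `pairwise_exchange`, and the four-processor line topology, are hypothetical.

```python
from itertools import combinations

def comm_cost(assign, traffic, hops):
    """Total traffic-weighted hop count for a task -> processor assignment."""
    return sum(w * hops[assign[u]][assign[v]] for (u, v), w in traffic.items())

def pairwise_exchange(assign, traffic, hops):
    """Greedy local search: swap two tasks' processors whenever it lowers cost."""
    assign = dict(assign)  # work on a copy
    best = comm_cost(assign, traffic, hops)
    improved = True
    while improved:
        improved = False
        for s, t in combinations(sorted(assign), 2):
            assign[s], assign[t] = assign[t], assign[s]
            cost = comm_cost(assign, traffic, hops)
            if cost < best:
                best, improved = cost, True
            else:
                # Undo a swap that did not help.
                assign[s], assign[t] = assign[t], assign[s]
    return assign, best

# Hypothetical setup: four processors in a line, hop distance = |i - j|.
hops = [[abs(i - j) for j in range(4)] for i in range(4)]
traffic = {("t0", "t1"): 5, ("t2", "t3"): 5, ("t0", "t2"): 1}
initial = {"t0": 0, "t1": 3, "t2": 1, "t3": 2}
final, final_cost = pairwise_exchange(initial, traffic, hops)
print(comm_cost(initial, traffic, hops), final_cost, final)
```

Like any greedy exchange scheme, this stops at a local minimum of the assumed cost function; it does not guarantee a globally optimal assignment in general.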
Clustering and Intra-Processor Scheduling for Explicitly-Parallel Programs on Distributed-Memory Systems
- In International Parallel Processing Symposium
, 1994
Cited by 5 (2 self)
Programs for distributed-memory systems are explicitly parallel and consist of a set of sequential tasks or processes that communicate via message-passing. The sequence of computation in each task, together with the intermediate send and receive communication steps, exhibits the temporal behavior of the program. In this paper, we show that the two common models of program representation, the precedence graph and the interaction graph models, are insufficient to capture this temporal behavior and hence are not ideal for solving the clustering and scheduling problems. We use a new Temporal Communication Graph (TCG) model to represent such explicitly parallel programs. This model captures communication dependency and the overlap of communication with computation, which provides the flexibility to obtain a better estimate of the program completion time. New measures are developed for quantifying critical communication and inter-task parallelism on this model. We analyze the importance of intra-processor...
Mapping Arbitrary Non-Uniform Task Graphs onto Arbitrary Non-Uniform System Graphs
- In Proc. International Conference on Parallel Processing
, 1995
Cited by 5 (2 self)
In this paper, a generic technique for clustering and mapping arbitrary task graphs onto arbitrary system graphs is presented. The task and system graphs studied in this paper have non-uniform computation and communication weights associated with the nodes and edges. The task graphs are directed graphs, while the system graphs are undirected. Using the two clustering algorithms presented, a multi-level clustered graph called the Spec graph can be obtained from a given task graph, and a multi-level clustered graph called the Rep graph can be obtained from a given system graph. We present a mapping algorithm which produces a sub-optimal matching of a given Spec graph containing M task modules onto a Rep graph of N processors in O(MP) time, where P = max(M, N). This algorithm is the first technique which can map arbitrary task graphs with non-uniform nodes and edges onto arbitrary system graphs with non-uniform nodes and edges. A number of algorithms exist which can map an arbitrary non-uniform task graph onto a specific uniform system graph. Even though our algorithm is more generic, we still compare it with these specialized techniques and show that our technique produces similar results with lower time complexity.
1 Introduction
The mapping problem is one of the most challenging problems in parallel and distributed computing [9, 16]. It is known to be NP-complete in its general form as well as in several restricted forms [16]. The mapping problem has been studied in a number of different ways in the literature. Mapping can be either static or dynamic. In static mapping, the assignments of the nodes

