Adaptive data placement for distributed-memory machines (1995)

by David K Lowenthal, Gregory R Andrews
Results 1 - 6 of 6

Using fine-grain threads and run-time decision making in parallel computing

by David K. Lowenthal, Vincent W. Freeh, Gregory R. Andrews - Journal of Parallel and Distributed Computing , 1996
"... Programming distributed-memory multiprocessors and networks of workstations requires deciding what can execute concurrently, how processes communicate, and where data is placed. These decisions can be made statically by a programmer or compiler, or they can be made dynamically at run time. Using run ..."
Abstract - Cited by 33 (14 self) - Add to MetaCart
Programming distributed-memory multiprocessors and networks of workstations requires deciding what can execute concurrently, how processes communicate, and where data is placed. These decisions can be made statically by a programmer or compiler, or they can be made dynamically at run time. Using run-time decisions leads to a simpler interface—because decisions are implicit—and it can lead to better decisions—because more information is available. This paper examines the costs, benefits, and details of making decisions at run time. The starting point is explicit fine-grain parallelism with any number (even thousands) of threads. Five specific techniques are considered: (1) implicitly coarsening the granularity of parallelism, (2) using implicit communication implemented by a distributed shared memory, (3) overlapping computation and communication, (4) adaptively moving threads and data between nodes to minimize communication and balance load, and (5) dynamically remapping data to pages to avoid false sharing. Details are given on the performance of each of these techniques as well as their overall performance on several scientific applications. 1

Citation Context

...hout requiring programmers or compilers to make such decisions. (Different dynamic approaches are discussed in [Who91] and [HMS+95].) This approach is implemented in a prototype system called Adapt [LA95], which is a subsystem of the Filaments package. [Figure 4: Different data placements in Adapt for the case of 16 rows and 4 nodes: (a) BLOCK, (b) VARIABLE BLOCK, (c) CYCLIC.] Three different placeme...
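The three placements named in the figure caption can be sketched as follows. This is a toy illustration of the general idea, not Adapt code; the function names and the per-row cost model for the variable-block case are invented for the example:

```python
def block(n_rows, n_nodes):
    """BLOCK: each node owns one contiguous chunk of rows."""
    size = n_rows // n_nodes
    return [min(i // size, n_nodes - 1) for i in range(n_rows)]

def variable_block(costs, n_nodes):
    """VARIABLE BLOCK: contiguous chunks sized so each node gets
    roughly an equal share of the total per-row cost."""
    target = sum(costs) / n_nodes
    owner, node, acc = [], 0, 0.0
    for c in costs:
        owner.append(node)
        acc += c
        # move to the next node once its cumulative share is filled
        if acc >= target * (node + 1) and node < n_nodes - 1:
            node += 1
    return owner

def cyclic(n_rows, n_nodes):
    """CYCLIC: rows dealt out round-robin across the nodes."""
    return [i % n_nodes for i in range(n_rows)]
```

For the figure's 16 rows and 4 nodes, `block` gives rows 0-3 to node 0, rows 4-7 to node 1, and so on, while `cyclic` interleaves them; `variable_block` reduces to `block` when every row costs the same and shifts the chunk boundaries when it does not.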

A Sisal Compiler for Both Distributed- and Shared-Memory Machines

by Vincent W. Freeh, Gregory R. Andrews - In High Performance Functional Computing, 1995
"... This paper describes a prototype Sisal compiler that supports distributed- as well as shared-memory machines. The compiler, fsc, modifies the code-generation phase of the optimizing Sisal compiler, osc, to use the Filaments library as a run-time system. Filaments efficiently supports fine-grain para ..."
Abstract - Cited by 6 (1 self) - Add to MetaCart
This paper describes a prototype Sisal compiler that supports distributed- as well as shared-memory machines. The compiler, fsc, modifies the code-generation phase of the optimizing Sisal compiler, osc, to use the Filaments library as a run-time system. Filaments efficiently supports fine-grain parallelism and a shared-memory programming model. Using fine-grain threads makes it possible to implement recursive as well as loop parallelism; it also facilitates dynamic load balancing. Using a distributed implementation of shared memory (a DSM) simplifies the compiler by obviating the need for explicit message passing. (February 21, 1995, Department of Computer Science, The University of Arizona, Tucson, AZ 85721. First published in the High-Performance Functional Computing Conference, April 1995. This work was supported by NSF grants CCR-9108412 and CDA-8822652.) 1 Introduction: It is difficult to create a correct and efficient parallel program; this difficulty is compounded because e...

Reducing File-related Network Traffic in TreadMarks via Parallel File Input/Output

by Ce-kuen Shieh, Su-cheong Mac, Bor-jyh Shieh
"... In this paper, we describe the implementation of a parallel file I/O system on TreadMarks, a page-based software Distributed Shared Memory (DSM) system built on a network of workstations. The main goal of our parallel file I/O system is to reduce filerelated network traffic in TreadMarks. This proto ..."
Abstract - Add to MetaCart
In this paper, we describe the implementation of a parallel file I/O system on TreadMarks, a page-based software Distributed Shared Memory (DSM) system built on a network of workstations. The main goal of our parallel file I/O system is to reduce file-related network traffic in TreadMarks. This prototype employs our previously proposed variable data distribution scheme, which distributes the file blocks among the nodes according to the application's access pattern, and delayed file access mechanism, which delays the transfer of a requested file block across the network until the block is actually used during computation. Currently, our parallel file I/O system is integrated into the user-level library of TreadMarks, with minor modification of TreadMarks' code. Because of our UNIX-like interface, existing TreadMarks programs require very little modification. The performance improvement of our prototype on Successive Over-Relaxation is quite satisfactory, while that on Matrix Multiplication is less significant. Keywords: distributed shared memory, parallel file I/O, network of workstations, variable data distribution scheme, delayed file access mechanism.
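The core of the variable data distribution scheme described above can be pictured with a small sketch. This is not TreadMarks code; it assumes a per-block access profile is available from the application's access pattern, and the function name is invented:

```python
def distribute_file_blocks(access_counts):
    """Variable data distribution: assign each file block to the node
    that touches it most often, so most file reads become local.

    access_counts[b][n] = how often node n accesses block b
    (an assumed, pre-measured profile).
    Returns the owning node index for each block."""
    return [max(range(len(row)), key=row.__getitem__) for row in access_counts]
```

For example, with three blocks and three nodes, a profile where each node dominates one block yields one block per node, and the file-handling node no longer forwards every block over the network.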

Citation Context

... of our parallel file I/O system [16]. The results show that our design is quite efficient for some DSM applications, and that the implementation requires little modification of the DSM system. Adapt [17] is a subsystem of the Distributed Filaments package [7], which is another DSM system. Adapt tries to minimize the overall completion time of an iterative parallel program by determining the best data...

Locality-Preserving Dynamic Load Balancing for Data-Parallel Applications on Distributed-Memory Multiprocessors

by Pangfeng Liu, Jan-jan Wu, Chih-hsuae Yang
"... Load balancing and data locality are the two most important factors affecting the performance of parallel programs running on distributed-memory multiprocessors. A good balancing scheme should evenly distribute the workload among the available processors, and locate the tasks close to their data to ..."
Abstract
Load balancing and data locality are the two most important factors affecting the performance of parallel programs running on distributed-memory multiprocessors. A good balancing scheme should evenly distribute the workload among the available processors, and locate the tasks close to their data to reduce communication and idle time. In this paper, we study the load balancing problem of data-parallel loops with predictable neighborhood data references. The loops are characterized by variable and unpredictable execution time due to dynamic external workload. Nevertheless the data referenced by each loop iteration exploits spatial locality of stencil references. We combine an initial static BLOCK scheduling and a dynamic scheduling based on work stealing. Data locality is preserved by careful restrictions on the tasks that can be migrated. Experimental results on a network of workstations are reported.
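The combination described in the abstract, a static BLOCK partition plus work stealing restricted to preserve locality, can be sketched as follows. This is a minimal illustration under the assumption that stealing is limited to boundary iterations of neighbouring nodes; the function names are invented, not the authors' API:

```python
from collections import deque

def block_queues(n_iters, n_nodes):
    """Static BLOCK partition: node k starts with a deque of its own iterations."""
    size = n_iters // n_nodes
    return [deque(range(k * size, (k + 1) * size)) for k in range(n_nodes)]

def locality_steal(queues, thief):
    """Locality-preserving steal: take only the boundary iteration of a
    neighbouring node's queue, so the stolen work still references data
    adjacent to the thief's own block (stencil locality)."""
    for victim in (thief - 1, thief + 1):      # neighbours only
        if 0 <= victim < len(queues) and queues[victim]:
            # take the victim's iteration closest to the thief's block
            if victim < thief:
                return queues[victim].pop()
            return queues[victim].popleft()
    return None                                # nothing stealable nearby
```

With 16 iterations and 4 nodes, an idle node 1 first steals iteration 3 from node 0 (the row bordering its own block), rather than an arbitrary iteration whose stencil data lives on a distant node.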

Citation Context

...ta and Rivera [5] proposed a two-level scheme (SDD) in which static scheduling and dynamic scheduling overlap. A similar approach focusing on adaptive data placement for load balancing is reported in [8]. The difficulty of load balancing lies in deciding whether work migration is beneficial; none of the above balancing strategies, however, has addressed this issue. The SUPPLE system [9] is a run-t...

Improving the Performance of Distributed Shared Memory Systems via Parallel File Input/Output

by Tp Ut
"... File accesses in page-based software Distributed Shared Memory (DSM) systems are usually performed by a single node, which may lead to a poor overall performance because a large amount of network traffic is generated to transfer data between this file handling node and the other nodes. To reduce the ..."
Abstract
File accesses in page-based software Distributed Shared Memory (DSM) systems are usually performed by a single node, which may lead to poor overall performance because a large amount of network traffic is generated to transfer data between this file-handling node and the other nodes. To reduce file-related network traffic in DSM systems, we have designed a parallel file I/O system, independent of the memory consistency model, for page-based software DSM systems built on a network of workstations. The two main features of our design are the adaptive data distribution scheme and the delayed file access mechanism. The former distributes file blocks among the nodes according to the access pattern of the application, while the latter ensures that data are transferred to the consumer node instead of the requesting node by exploiting the memory-mapping features of the virtual shared address space of DSM systems. Our first prototype is built on Cohesion, a page-base ...

Fine-Grain Parallelism and Run-Time Decision Making

by David K. Lowenthal
"... Rights Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the aut ..."
Abstract - Add to MetaCart
Rights: Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.