Mapping Parallel Iterative Algorithms onto Workstation Networks
, 1994
- Cited by 11 (6 self)
For communication-intensive parallel applications, the maximum degree of concurrency achievable is limited by the communication throughput made available by the network. In previous work [HPS94], we showed experimentally that the performance of certain parallel applications running on a workstation network can be improved significantly if a congestion control protocol is used to enhance network performance. In this
Towards performance-driven system support for distributed computing in clustered environments
- Journal of Parallel and Distributed Computing
, 1999
- Cited by 5 (2 self)
With the proliferation of workstation clusters connected by high-speed networks, providing efficient system support for concurrent applications engaging in nontrivial interaction has become an important problem. Two principal barriers to harnessing parallelism are: one, efficient mechanisms that achieve transparent dependency maintenance while preserving semantic correctness, and two, scheduling algorithms that match coupled processes to distributed resources while explicitly incorporating their communication costs. This paper describes a set of performance features, their properties, and implementation in a system support environment called DUNES that achieves transparent dependency maintenance (IPC, file access, memory access, process creation/termination, process relationships) under dynamic load balancing. The two principal performance features are push/pull-based active and passive end-point caching and communication-sensitive load balancing. Collectively, they mitigate the overhead introduced by the transparent dependency maintenance mechanisms. Communication-sensitive load balancing, in addition, affects the scheduling of distributed resources to application processes where both communication and computation costs are explicitly taken into account. DUNES' architecture endows commodity operating systems with distributed operating system functionality while achieving transparency with respect to their existing application base. DUNES also preserves semantic correctness with respect to single-processor semantics. We show performance measurements of a UNIX-based implementation on SPARC and x86 architectures over high-speed LAN environments. We show that significant performance gains in terms of system throughput and parallel application speed-up are achievable.
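The idea of weighing communication cost alongside computation cost when placing a process can be illustrated with a minimal sketch. This is not DUNES' algorithm, which the abstract does not specify; the function `choose_host`, the cost model (host load plus cross-host traffic volume), and all parameter names are hypothetical and chosen only for illustration.

```python
def choose_host(process, hosts, load, placement, traffic, remote_cost=1.0):
    """Pick the host minimizing load plus cross-host communication cost.

    Hypothetical cost model (an assumption, not taken from the paper):
      load:      host -> current computational load
      placement: already-placed process -> host
      traffic:   (process, peer) -> message volume between the two
    """
    def cost(host):
        # Only traffic to peers placed on a *different* host is charged.
        comm = sum(
            volume
            for (p, peer), volume in traffic.items()
            if p == process and placement.get(peer) not in (None, host)
        )
        return load[host] + remote_cost * comm

    return min(hosts, key=cost)


# Host B is less loaded, but the peer "q" already runs on host A.
hosts = ["A", "B"]
load = {"A": 2.0, "B": 1.0}
placement = {"q": "A"}
traffic = {("p", "q"): 5.0}

best = choose_host("p", hosts, load, placement, traffic)
```

With `remote_cost=0` the same call degenerates to plain load balancing and picks the less loaded host, which shows what the communication term contributes.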
Congestion Control for Asynchronous Parallel Computing on Workstation Networks
- Parallel Computing
, 1997
- Cited by 4 (2 self)
Asynchronous parallel computing can result in high message generation rates, thus triggering network congestion. This paper investigates the network congestion problem that can result from asynchronous parallel programs' high message generation rates. First, we characterize the communication requirements of a large class of supercomputing applications falling under the category of fixed-point problems amenable to solution by parallel iterative methods. In particular, we concentrate on asynchronous iterative algorithms whose communication/computation ratio is especially high, resulting in degraded effective throughput if communication is not managed properly. Second, we show the effects of network contention and asynchrony on application performance in a local-area network environment and investigate methods of solution. Our approach is based on a congestion control algorithm called Warp Control whose adaptive properties are exploited to yield significant performance enhancements when ne...
Distributed Parallel Computing in Mermera: Mixing Noncoherent Shared Memories
, 1996
- Cited by 3 (1 self)
Programmers of parallel processes that communicate through shared globally distributed data structures (DDS) face a difficult choice. Either they must explicitly program DDS management, by partitioning or replicating it over multiple distributed memory modules, or be content with a high latency coherent (sequentially consistent) memory abstraction that hides the DDS' distribution.
Program-Level Control of Network Delay for Parallel Asynchronous Iterative Applications
- In Proc. 3rd Int’l Conference on High Performance Computing
, 1996
- Cited by 2 (2 self)
Software distributed shared memory (DSM) platforms on networks of workstations tolerate large network latencies by employing one of several weak memory consistency models. Fully asynchronous parallel iterative algorithms offer an additional degree of freedom to tolerate network latency: they behave correctly when supplied outdated shared data. However, these algorithms can flood the network with messages in the presence of large delays. We propose a method of controlling asynchronous iterative methods wherein the reader of a shared datum imposes an upper bound on its age via use of a blocking GlobalRead primitive. This reduces the overall number of iterations executed by the reader, thus controlling the amount of shared updates generated. Experiments for a fully asynchronous linear equation solver running on a network of 10 IBM RS/6000 workstations show that the proposed GlobalRead primitive provides significant performance improvement.
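A blocking read with an age bound can be sketched in a few lines. The following is a single-machine illustration using threads, not the paper's DSM implementation; the class `SharedDatum`, the method name `global_read`, and the definition of age as the reader's iteration number minus the writer's iteration stamp are all assumptions made for this example.

```python
import threading


class SharedDatum:
    """A shared value stamped with the writer's iteration number."""

    def __init__(self, value, iteration=0):
        self._value = value
        self._iteration = iteration
        self._cond = threading.Condition()

    def write(self, value, iteration):
        """Publish a new value and wake any blocked readers."""
        with self._cond:
            self._value = value
            self._iteration = iteration
            self._cond.notify_all()

    def global_read(self, reader_iteration, max_age, timeout=None):
        """Block until the value is at most max_age iterations old.

        Age is measured (an assumption) as reader_iteration minus the
        iteration at which the current value was written.
        """
        with self._cond:
            fresh = self._cond.wait_for(
                lambda: reader_iteration - self._iteration <= max_age,
                timeout=timeout,
            )
            if not fresh:
                raise TimeoutError("shared datum is still too old")
            return self._value, self._iteration
```

A reader that runs ahead of the writer blocks inside `global_read` instead of iterating on stale data, so it generates fewer updates of its own; this is the throttling effect the abstract describes. The `timeout` parameter is an addition for this sketch so that a caller cannot block forever.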
Approaches to Support Parallel Programming on Workstation Clusters: A Survey
- Informatik Berichte, Fachgruppe Informatik, Universität-GH Siegen
, 1995
- Cited by 2 (0 self)
The goal of this report is to survey the state of the art and existing approaches for parallel programming on workstation clusters, with special emphasis on object-oriented programming. First, workstation clusters as parallel computing platforms are characterized and fundamental concepts for parallel programming are discussed. Then, an overview of existing tools, systems, languages, and environments is given. The report concludes by identifying features of software systems suitable for parallel object-oriented programming on top of workstation clusters.
Non-strict cache coherence: Exploiting data-race tolerance in emerging applications
- In Proc. of the 2000 Intl. Conf. on Parallel Processing
, 2000
"... Software distributed shared memory (DSM) ..."

