Efficient load balancing for wide-area divide-and-conquer applications
- In: Proc. PPoPP’01, Snowbird, UT (2001)
Cited by 46 (16 self)
Divide-and-conquer programs are easily parallelized by letting the programmer annotate potential parallelism in the form of spawn and sync constructs. To achieve efficient program execution, the generated work load has to be balanced evenly among the available CPUs. For single cluster systems, Random Stealing (RS) is known to achieve optimal load balancing. However, RS is inefficient when applied to hierarchical wide-area systems where multiple clusters are connected via wide-area networks (WANs) with high latency and low bandwidth. In this paper, we experimentally compare RS with existing load-balancing strategies that are believed to be efficient for multi-cluster systems, Random Pushing and two variants of Hierarchical Stealing. We demonstrate that, in practice, they obtain less than optimal results. We introduce a novel load-balancing algorithm, Cluster-aware Random Stealing (CRS), which is highly efficient and easy to implement. CRS adapts itself to network conditions and job granularities, and does not require manually-tuned parameters. Although CRS sends more data across the WANs, it is faster than its competitors for 11 out of 12 test applications with various WAN configurations. It has at most 4% overhead in run time compared to RS on a single, large cluster, even with high wide-area latencies and low wide-area bandwidths. These strong results suggest that divide-and-conquer parallelism is a useful model for writing distributed supercomputing applications on hierarchical wide-area systems.
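The Random Stealing baseline named in this abstract can be illustrated with a toy, round-based simulation. This is only a sketch under stated assumptions, not the paper's implementation and not CRS: the function names `pick_victim` and `run_random_stealing` are hypothetical, all jobs are assumed to start on worker 0, and the choice that owners take the newest job while thieves take the oldest is a common work-stealing convention that the abstract itself does not state.

```python
import random
from collections import deque


def pick_victim(thief, n_workers, rng):
    """Choose a uniformly random worker other than the thief."""
    victim = rng.randrange(n_workers - 1)
    return victim if victim < thief else victim + 1


def run_random_stealing(jobs, n_workers, seed=0):
    """Toy round-based model of random stealing.

    Returns the number of jobs each worker executed.
    """
    rng = random.Random(seed)
    queues = [deque() for _ in range(n_workers)]
    queues[0].extend(jobs)  # assumption: all work starts on worker 0
    done = [0] * n_workers
    while any(queues):
        for w in range(n_workers):
            if queues[w]:
                queues[w].pop()  # owner runs its newest job
                done[w] += 1
            else:
                v = pick_victim(w, n_workers, rng)
                if queues[v]:
                    queues[v].popleft()  # thief takes the victim's oldest job
                    done[w] += 1         # and runs it immediately
    return done
```

In this model a failed steal attempt simply wastes a round for the idle worker. On a wide-area system each such attempt would additionally pay WAN latency, which is the inefficiency the abstract attributes to RS.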
Fault Tolerance in Parallel Implementations of Functional Languages
- IN PROCEEDINGS OF THE 21ST SYMPOSIUM ON FAULT TOLERANT COMPUTING
, 1991
Cited by 18 (0 self)
Computing models for functional language programs not only facilitate automatic exploitation of inherent parallelism, but they also provide for implicit tolerance to hardware faults through temporal and spatial redundancy. In this paper, we argue that fault tolerance can be achieved more efficiently by using intensional computing models (eduction) rather than extensional computing models (reduction). While intensional computing models can be implemented by using either data-driven execution or demand-driven execution, we show that the latter is naturally suited.
Satin: Efficient Parallel Divide-and-Conquer in Java
- In: Euro-Par 2000, no. 1900 in Lecture Notes in Computer Science
, 2000
Cited by 13 (3 self)
Satin is a system for running divide and conquer programs on distributed memory systems (and ultimately on wide-area metacomputing systems). Satin extends Java with three simple Cilk-like primitives for divide and conquer programming. The Satin compiler and runtime system cooperate to implement these primitives efficiently on a distributed system, using work stealing to distribute the jobs. Satin optimizes the overhead of local jobs using on-demand serialization, which avoids copying and serialization of parameters for jobs that are not stolen. This optimization is implemented using explicit invocation records. We have implemented Satin by extending the Manta compiler. We discuss the performance of ten applications on a Myrinet-based cluster.
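The idea of on-demand serialization described in this abstract can be sketched in a few lines. This is an illustrative sketch only, written in Python rather than Satin's Java, and it does not reproduce Satin's actual invocation records: `InvocationRecord`, `run_local`, `to_wire`, `run_stolen`, and `JOB_TABLE` are all hypothetical names, and `pickle` stands in for whatever serialization the real system uses.

```python
import pickle


class InvocationRecord:
    """Hypothetical record for one spawned job: a function plus its arguments."""

    def __init__(self, func, *args):
        self.func = func
        self.args = args
        self.serialize_count = 0  # how often this job paid for serialization

    def run_local(self):
        # Local execution uses the arguments by reference:
        # nothing is copied and nothing is serialized.
        return self.func(*self.args)

    def to_wire(self):
        # Only a job that is actually stolen is serialized.
        self.serialize_count += 1
        return pickle.dumps((self.func.__name__, self.args))


def total(values):
    """Example job body: sum a list of numbers."""
    return sum(values)


# Assumption: thieves look up job code by name in a shared table.
JOB_TABLE = {"total": total}


def run_stolen(wire_bytes):
    """Runs on the thief: rebuild the job from bytes and execute it."""
    name, args = pickle.loads(wire_bytes)
    return JOB_TABLE[name](*args)
```

A job created with `InvocationRecord(total, [1, 2, 3])` and executed through `run_local()` never touches `pickle`; the serialization cost appears only when `to_wire()` is called for a steal.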
Structured Parallel Programming: Theory meets Practice
- Research Directions in Computer Science
, 1995
Cited by 9 (2 self)
We address the issue of what should be the proper relationship between theoretical computer science and practical computing. Starting from an analysis of what we perceive as the failure of formally based research to have as much impact on practical computing as is merited, we propose a diagnosis based on the way formally based research is conducted and the way it is envisaged that results from these areas will be translated into practice. We suggest that it is the responsibility of practitioners of theoretical computer science to work more closely with the practical areas in order to identify ways in which their ideas can be used to augment current practice rather than seeking to replace it. As a case in point, we examine functional programming and its relationship to programming parallel machines. We introduce a development, structured parallel programming, that seeks to combine the theoretical advantages of functional programming with established practice in these area...
Models for Persistence in Lazy Functional Programming Systems
, 1993
Cited by 8 (0 self)
Research into providing support for long term data in lazy functional programming systems is presented in this thesis. The motivation for this work has been to reap the benefits of integrating lazy functional programming languages and persistence. The benefits are: (i) the programmer need not write code to support long term data, since this is provided as part of the programming system; (ii) persistent data can be used in a type safe way, since the programming language type system applies to data with the whole range of persistence; (iii) the benefits of lazy evaluation are extended to the full lifetime of a data value: whilst data is reachable, any evaluation performed on the data persists, and a data value changes monotonically from an unevaluated state towards a completely evaluated state over time; (iv) interactive data intensive applications such as functional databases can be developed. These benefits are realised by the development of models for persistence in lazy functional programming systems. Tw...
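The monotonic move from an unevaluated to an evaluated state mentioned in this abstract can be illustrated with a memoizing thunk. This is a minimal in-memory sketch, assuming a hypothetical `Thunk` class; it shows only that an evaluation, once performed, is kept for as long as the value is reachable, and it does not model the thesis's persistent store or its type system.

```python
class Thunk:
    """Hypothetical lazy value: evaluated at most once, never un-evaluated."""

    def __init__(self, compute):
        self._compute = compute   # suspended computation
        self._evaluated = False
        self._value = None

    @property
    def evaluated(self):
        return self._evaluated

    def force(self):
        """Evaluate on first demand; later demands reuse the stored result."""
        if not self._evaluated:
            self._value = self._compute()
            self._evaluated = True
            self._compute = None  # the suspension is no longer needed
        return self._value


calls = []


def expensive():
    # Records each real evaluation so the reuse is observable.
    calls.append(1)
    return 6 * 7


answer = Thunk(expensive)
first = answer.force()
second = answer.force()
```

After both `force()` calls, `expensive` has run exactly once. A persistent system would extend this guarantee across program runs by storing the evaluated state along with the data.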
Fundamental issues and the design of MONSTR
- Journal of Universal Computer Science
, 1996
Cited by 6 (5 self)
Abstract: This is the first in a series of papers dealing with the implementation of an extended term graph rewriting model of computation (described by the DACTL language) on a distributed store architecture. In this paper we set out the high level model, and under some simple packet store model is compared to a more realistic and fine-grained packet store model, more closely related to the properties of a genuine distributed store architecture, and the differences are used to inspire the definition of the MONSTR sublanguage of DACTL, intended for direct execution on the machine. Various alternative operational semantics for MONSTR are proposed to reflect more closely the fine-grained packet store model, and the prospects for establishing correctness are discussed. The detailed treatment of the alternative models, in the context of suitable sublanguages of MONSTR where appropriate, are subjects for subsequent papers.
Computers for Symbolic Processing
- Proceedings of the IEEE
, 1989
Cited by 4 (1 self)
In this paper, we provide a detailed survey on the motivations, design, applications, current status, and limitations of computers designed for symbolic processing. Symbolic processing applications are computations that are performed at the word, relation, or meaning levels. A major difference between symbolic and conventional numeric applications is that the knowledge used in symbolic applications may be fuzzy, uncertain, indeterminate, and ill represented. As a result, the collection, representation, and management of knowledge is more difficult in symbolic applications than in conventional numeric applications. We survey various techniques for knowledge representation and processing, from both the designers' and users' points of view. The design and choice of a suitable language for symbolic processing and the mapping of applications into a software architecture are then presented. We examine the design process of refining the application requirements into hardware and software architectures and discuss state-of-the-art sequential and parallel computers designed for symbolic processing.
Object-Oriented Term Graph Rewriting
- International Journal of Computer Systems Science and Engineering, CRL Publs., (in print)
, 1997
Cited by 3 (3 self)
The relationship between the generalised computational model of Term Graph Rewriting Systems (TGRS) and Object-Oriented Programming (OOP) is explored and exploited by extending the TGRS model with records where access to parameters is done by naming rather than position. Records are then used as the basis for expressing object-oriented techniques such as object encapsulation and (various forms of) inheritance. The effect is that TGRS with records can now be used as an implementation model for a variety of (concurrent) object-oriented (functional, logic or otherwise) languages but also as a common formalism for comparing various related techniques (such as different forms of inheritance or approaches for providing solutions to problems caused by the combination of concurrency and interaction between objects). Keywords: Object-Oriented Programming; Concurrency and Parallelism; Programming Language Extensions; Rewriting Systems. Object-Oriented Term Graph Rewriting George A. Papadopoulos...
Experiments with Parallel Algorithms for Combinatorial Problems
- EUROPEAN JOURNAL OF OPERATIONAL RESEARCH
, 1985
Cited by 3 (1 self)
In the last decade many models for parallel computation have been proposed and many parallel algorithms have been developed. However, few of these models have been realized and most of these algorithms are supposed to run on idealized, unrealistic parallel machines. The parallel machines constructed so far all use a simple model of parallel computation. Therefore, not every existing parallel machine is equally well suited for each type of algorithm. The adaptation of a certain algorithm to a specific parallel architecture may severely increase the complexity of the algorithm or severely obscure its essence. Little is known about the performance of some standard combinatorial algorithms on existing parallel machines. In this paper we present computational results concerning the solution of knapsack, shortest paths and change-making problems by branch and bound, dynamic programming, and divide and conquer algorithms on the ICL-DAP (an SIMD computer), the Manchester dataflow machine and t...

