A Memory Approach to Consistent, Reliable Distributed Shared Memory
1995
Cited by 9 (5 self)
Fault-tolerant distributed shared memory systems do not always need to support a complete and consistent recovery after a failure. We describe a framework within which different approaches to, and different degrees of, consistency and recoverability can be understood. The addition of consistent failure recovery may be approached from two different viewpoints: an application-oriented view or a memory-oriented view. The major characteristics used in our framework are variations of availability, consistency, and application support. This paper explains the basic model, which is used in Reliable Mirage+, and describes how the framework can be used by other researchers to understand and classify solutions to the reliable DSM problem. The model distinguishes a recoverable system, which must be able to survive any single-site failure, from a reliable system, which also ensures consistency after the recovery. Since consistency requirements may impose a high penalty on standard op...
Data Management in Networks: Experimental Evaluation of a Provably Good Strategy
In Proc. of the 11th ACM Symp. on Parallel Algorithms and Architectures (SPAA), 1999
Cited by 6 (3 self)
This paper deals with data management for parallel and distributed systems in which the computing nodes are connected by a relatively sparse network. We present the DIVA (Distributed Variables) library that provides fully transparent access to global variables, i.e., shared data objects, from the individual nodes in the network. The current implementations are based on mesh-connected massively parallel computers. The data management strategies implemented in the library use a non-standard approach based on a randomized but locality-preserving embedding of “access trees” into the physical network. The access tree strategy was previously analyzed only in a theoretical model using competitive analysis, where it was shown that the strategy produces minimal network congestion up to small factors. In this paper, the access tree strategy is evaluated experimentally. We test several variations of this strategy on three different
Fault Tolerance and Configurability in DSM Coherence Protocols
IEEE Concurrency, 2000
Cited by 2 (0 self)
With the advent of large networks and the demand to have uninterrupted service, computer systems need to be more robust and fault tolerant. There are numerous ways to implement fault tolerance and recovery. A central concept in all these methods is the requirement for replicated data for high data availability. We believe that a protocol must not only provide replication, but do so at low operation overhead. Further, the protocol must provide configurable mechanisms for varying the level of replication, so that the system may be operated at the desired overhead cost. We have developed several Distributed Shared Memory (DSM) protocols and use them with a program-driven simulation to examine their robustness, fault tolerance, and configurability. Our investigation compares the Write-Invalidate, Write-Invalidate with Downgrading, Write-Broadcast and several instances of the Boundary-Restricted coherence protocol class. The DSM application suite contains programs representative of...
High Performance Distributed Objects Using Distributed Shared Memory and Remote Method Invocation
In Proc. of the 31st Hawaii Int'l Conf. on System Sciences (HICSS-31), volume VII, 1998
Cited by 2 (0 self)
There are two emerging trends in distributed computing: the evolution of client/server architectures into multitiered systems and advances in distributed shared memory (DSM). The convergence of these two trends yields a new structure we call virtual distributed objects (VDOs).
Approaches to Support Parallel Programming on Workstation Clusters: A Survey
Informatik Berichte, Fachgruppe Informatik, Universität-GH Siegen, 1995
Cited by 2 (0 self)
The goal of this report is to survey the state of the art and existing approaches for parallel programming on workstation clusters, with special emphasis on object-oriented programming. First, workstation clusters as parallel computing platforms are characterized and fundamental concepts for parallel programming are discussed. Then, an overview of existing tools, systems, languages, and environments is given. The report concludes by identifying features of software systems suitable for parallel object-oriented programming on top of workstation clusters.
A Case For Virtual Distributed Objects
International Journal on Parallel and Distributed Computing, 1998
Cited by 2 (0 self)
There are two emerging trends in distributed computing. The first trend is evolving because programming high-performance client-server applications is a challenge. Client-server architectures must be designed from the ground up for good performance. Increasingly, we are seeing client-server models evolving away from traditional client-server structures into new structures such as three-tier systems and distributed object systems. Consequently, as the typical architecture used for distributed systems evolves to increase performance, we must also recognize that the complexity of developing software for cost-effective distributed systems increases as we distribute functionality. Error handling must be more robust and messaging more efficient as we move away from centralized server models. As the pressure to decentralize for better performance rises, the increasing number of decentralized servers increases management/administrative complexity. The second converging trend associated with ...
On the Synchronization Mechanisms in Distributed Shared Memory Systems
1994
Cited by 1 (0 self)
Distributed Shared Memory (DSM) is the implementation of the shared memory programming paradigm on a distributed memory (or multicomputer) system. Programming multicomputer systems using Distributed Shared Memory as the programming model is appealing because it combines the performance advantage of distributed memory systems and the ease of programming of shared memory systems. In DSM systems, cooperating tasks communicate with one another through shared variables. Thus, DSM systems must provide synchronization mechanisms to coordinate concurrent access to these shared variables. In this paper we describe and classify the synchronization mechanisms supported by several Distributed Shared Memory systems. We classify these systems according to whether they are hardware or software based, whether the mechanism is integrated into the system or not, and whether implementation of the synchronization mechanism is centralized or distributed. Key phrases: Distributed Shared Memory, Distributed ...
Fault Tolerance and Scalability in DSM Coherence Protocols - A Simulation Approach
IEEE Transactions on Computers, 1997
Cited by 1 (0 self)
With the advent of large networks and the demand to have uninterrupted service, there is a pressing need for computer systems to be more robust and fault tolerant. There are numerous ways to implement fault tolerance and recovery [5, 50]. Yet, a central concept in all these methods is the requirement for replicated data leading to high data availability. We believe that a protocol must not only provide data replication, but also that it should do so at low operational overhead. Further, the protocol must provide mechanisms for varying the level of replication (so that the system may be operated at a desired overhead cost), and must scale well. At the University of California, Riverside, we have developed a program-driven ...
A Memory Approach to Consistent, Reliable DSM
Cited by 1 (1 self)
Fault-tolerant distributed shared memory systems do not always need to support a complete and consistent recovery after a failure. We describe a framework within which different approaches to, and different degrees of, consistency and recoverability can be understood. The addition of consistent failure recovery may be approached from two different viewpoints: an application-oriented view or a memory-oriented view. The major characteristics used in our framework are variations of availability, consistency, and application support. This paper explains the basic model, which is used in Reliable Mirage+, and describes how the framework can be used by other researchers to understand and classify solutions to the reliable DSM problem. The model distinguishes a recoverable system, which must be able to survive any single-site failure, from a reliable system, which also ensures consistency after the recovery. Since consistency requirements may impose a high penalty on standard ...
AM: A Framework for Consistency and Recoverability in Distributed Shared Memory
In Proceedings of the 14th Symposium on Reliable Distributed Systems, 1995
Fault-tolerant distributed shared memory systems do not always need to support complete and consistent recovery. We describe a new framework within which different approaches to, and different degrees of, consistency and recoverability can be understood. Consistent failure recovery may be approached from two different viewpoints: an application-oriented view or a kernel-oriented view. The major characteristics used in our framework are variations of availability, consistency, and application support. The degree to which a DSM system supports reliability and consistency is described in a multi-level model. A kernel-oriented approach constitutes a bottom-up approach, whereas an application-oriented approach constitutes a top-down approach. This paper explains the basic framework and shows how the framework can be used to understand and classify solutions to robust DSM. The model distinguishes a recoverable system, which must be able to survive any single-site failure, fro...

