Building Secure and Reliable Network Applications, 1996
Cited by 209 (16 self)
ly, the remote procedure call problem, which an RPC protocol undertakes to solve, consists of emulating LPC using message passing. LPC has a number of "properties" -- a single procedure invocation results in exactly one execution of the procedure body, the result returned is reliably delivered to the invoker, and exceptions are raised if (and only if) an error occurs. Given a completely reliable communication environment, which never loses, duplicates, or reorders messages, and given client and server processes that never fail, RPC would be trivial to solve. The sender would merely package the invocation into one or more messages, and transmit these to the server. The server would unpack the data into local variables, perform the desired operation, and send back the result (or an indication of any exception that occurred) in a reply message. The challenge, then, is created by failures. Were it not for the possibility of process and machine crashes, an RPC protocol capable of overcomi...
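The "trivial" case described in this abstract (a perfectly reliable channel and processes that never fail) can be sketched in a few lines. This is only an illustration under those stated assumptions, not the book's protocol: the procedure table, the JSON message layout, and all function names are hypothetical, and the "network" is a direct function call that never loses, duplicates, or reorders messages.

```python
import json

# Hypothetical procedure table; the names are illustrative only.
PROCEDURES = {
    "add": lambda a, b: a + b,
    "div": lambda a, b: a / b,
}


def client_marshal(name, *args):
    """Package the invocation into a single request message."""
    return json.dumps({"proc": name, "args": list(args)})


def server_handle(request):
    """Unpack the request, run the procedure once, and build the reply."""
    call = json.loads(request)
    try:
        result = PROCEDURES[call["proc"]](*call["args"])
        return json.dumps({"ok": True, "result": result})
    except Exception as exc:
        # Send back an indication of the exception instead of a result.
        return json.dumps({"ok": False, "error": type(exc).__name__})


def client_unmarshal(reply):
    """Return the result, or raise if the server reported an exception."""
    msg = json.loads(reply)
    if not msg["ok"]:
        raise RuntimeError(msg["error"])
    return msg["result"]


def rpc(name, *args):
    # Assumed perfect channel: request and reply are handed over directly.
    return client_unmarshal(server_handle(client_marshal(name, *args)))
```

Nothing here retries, detects duplicates, or times out. Those are the mechanisms a real RPC protocol needs once messages can be lost or processes can crash, which is where the abstract locates the actual difficulty.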
The Totem Single-Ring Ordering and Membership Protocol, 1995
Cited by 166 (30 self)
Operating Systems]: Organization and Design---distributed systems General Terms: Protocols, Performance, Reliability Additional Key Words and Phrases: Flow control, membership, reliable delivery, token passing, total ordering, virtual synchrony Earlier versions of the Totem single-ring protocol appeared in the Proceedings of the IEE International Conference on Information Engineering, Singapore (December 1991) and in the Proceedings of the IEEE 13th International Conference on Distributed Computing Systems, Pittsburgh, PA (May 1993). This research was supported by NSF Grant No. NCR-9016361, ARPA Contract No. N00174-93-K-0097, and Rockwell CMC/State of California MICRO Grant No. 92-101. Authors' Addresses: Y. Amir, Computer Science Department, The Hebrew University of Jerusalem, Israel; L. E. Moser and P. M. Melliar-Smith, Department of Electrical and Computer Engineering, University of California, Santa Barbara, CA 93106; D. A. Agarwal, Lawrence Berkele
The SecureRing Protocols for Securing Group Communication - In Hawaii International Conference on System Sciences, 1998
Cited by 117 (2 self)
The SecureRing group communication protocols provide reliable ordered message delivery and group membership services despite Byzantine faults such as might be caused by modifications to the programs of a group member following illicit access to, or capture of, a group member. The protocols multicast messages to groups of processors within an asynchronous distributed system and deliver messages in a consistent total order to all members of the group. They ensure that correct members agree on changes to the membership, that correct processors are eventually included in the membership, and that processors that exhibit detectable Byzantine faults are eventually excluded from the membership. To provide these message delivery and group membership services, the protocols make use of an unreliable Byzantine fault detector.
Replication Using Group Communication Over a Partitioned Network, 1995
Cited by 81 (19 self)
In systems based on the client-server model, a single server may serve many clients and the heavy load on the server may cause the response time to be adversely affected. In such circumstances, replicating data or servers may improve performance. Replication may also improve the availability of information when processors crash or the network partitions. Existing replication methods are often needlessly expensive. They sometimes use point-to-point communication when multicast communication is available; they typically pay the full price of end-to-end acknowledgments for all of the participants for every update; they may claim locks, and therefore, may be vulnerable to faults that can unnecessarily block the system for long periods of time. This thesis presents a new architecture and algorithms for replication over a partitioned network. The architecture is structured into two layers: a replication server and a group communication layer. Each of the replication servers maintains a priva...
A Configurable Membership Service - IEEE Transactions on Computers, 1994
Cited by 45 (10 self)
A membership service is used to maintain information about which sites are functioning in a distributed system at any given time. Many such services have been defined, with each implementing a unique combination of properties that simplify the construction of higher levels of the system. Despite this wealth of possibilities, however, any given service only realizes one set of properties, which makes it difficult to tailor the service provided to the specific needs of the application. Here, a configurable membership service that addresses this problem is described. This service is based on decomposing membership into its constituent abstract properties, and then implementing these properties as separate software modules called micro-protocols that can be configured together to produce a customized membership service. A prototype C++ implementation of the membership service for a simulated distributed environment is also described. December 19, 1994 Revised January 9, 1996 Department of ...
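The idea of assembling a membership service from separately implemented properties can be sketched as follows. This is a minimal illustration, not the paper's C++ prototype: the class, the event names, and the three micro-protocol functions are all hypothetical, and each micro-protocol is modeled simply as a callback that reacts to the events it cares about.

```python
class Membership:
    """Holds the current view and runs the configured micro-protocols."""

    def __init__(self, sites, micro_protocols):
        self.view = set(sites)
        self.micro_protocols = list(micro_protocols)
        self.log = []

    def handle(self, event, site):
        # Every configured micro-protocol sees every event, in order.
        for mp in self.micro_protocols:
            mp(self, event, site)


def remove_on_failure(m, event, site):
    """Property: suspected sites leave the view."""
    if event == "suspect_failure":
        m.view.discard(site)


def add_on_recovery(m, event, site):
    """Property: recovered sites rejoin the view."""
    if event == "recovered":
        m.view.add(site)


def log_changes(m, event, site):
    """Property: record the view as it stands after each event."""
    m.log.append((event, site, frozenset(m.view)))
```

Two differently configured services then differ only in the list passed to the constructor: `Membership(sites, [remove_on_failure])` never readmits a site, while `Membership(sites, [remove_on_failure, add_on_recovery, log_changes])` readmits recovered sites and keeps a history.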
Fail-Awareness in Timed Asynchronous Systems, 2003
Cited by 43 (15 self)
We address the problem of the impossibility of implementing synchronous fault-tolerant service specifications in asynchronous distributed systems. We introduce a method for weakening a synchronous service specification so that it becomes implementable in "timed" asynchronous systems, that is, asynchronous systems in which processes have access to local hardware clocks. The method (1) adds to a service interface an exception indicator so that a client knows at any time if a server is currently providing its standard "synchronous" semantics or some other specified exceptional semantics, (2) requires that the standard behavior provided when the exception indicator does not signal an exception be "similar" to the original synchronous service behavior, and (3) requires a server to provide its standard semantics whenever the underlying communication and process services exhibit "synchronous behavior". To illustrate our method, we show how the specification of a synchronous datagram service and an internal clock synchronization service can be transformed into a fail-aware service specification. Further illustrations of the usefulness of fail-aware services are provided by describing a railway crossing service and a fail-aware weak group membership service.
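The exception indicator in step (1) can be pictured with a toy datagram interface. This is a sketch under loud assumptions rather than the paper's specification: the `Delivery` type, the `deliver` function, and its parameters are hypothetical, and the upper bound on a message's transmission delay is taken as already computed from local clock readings (how that bound is obtained is not described in the abstract).

```python
from dataclasses import dataclass


@dataclass
class Delivery:
    payload: str
    fast: bool  # False plays the role of the exception indicator


def deliver(payload, delay_bound, threshold):
    """Deliver a message together with a timeliness flag.

    delay_bound: assumed upper bound on the transmission delay, in seconds.
    threshold:   the delay the "synchronous" specification would promise.
    """
    # The message is always delivered; only the flag changes, so the client
    # knows whether the standard semantics held for this message.
    return Delivery(payload, fast=delay_bound <= threshold)
```

A client that receives `fast=False` knows it is seeing the exceptional semantics and can react, for example by discarding stale data, instead of silently assuming a bound that did not hold.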
The Totem System - In Proc. of the 25th Annual International Symposium on Fault-Tolerant Computing, 1995
Cited by 37 (4 self)
The Totem system supports fault-tolerant applications in which distributed processes cooperate to perform a common task and in which replicated data must be updated consistently in the presence of asynchrony and faults. Reliable totally ordered delivery of messages to processes within process groups is provided on a single local-area network or over multiple local-area networks interconnected by gateways. Message ordering is consistent across the entire network, despite processor and communication faults, without requiring all processes to deliver all messages. The Totem system handles processor failure and recovery, as well as network partitioning and remerging, and provides membership and topology maintenance services. 1 Introduction The Totem system, developed at the University of California, Santa Barbara, supports applications in which information must be replicated to guard against faults and in which the consistency of information must be maintained as it is updated in the pres...
Agreeing on Processor Group Membership in Timed Asynchronous Distributed Systems, 1995
Cited by 17 (0 self)
We introduce the timed asynchronous distributed system model to describe existing asynchronous distributed systems subject to unbounded processing and communication delays, failures and recoveries. We then describe five increasingly strong specifications for processor-group membership services in timed asynchronous systems subject to partitioning. We also propose five distributed protocols that implement these specifications despite arbitrary numbers of crash/performance processor failures and omission/performance communication failures, and prove their correctness. Finally, we show how two of the protocols can be adapted to implement a highly available processor leadership service that ensures the existence of at most one leader at any point in real-time.
A Fail-Aware Membership Service, 1996
Cited by 14 (10 self)
We propose a new protocol that can be used to implement majority-partition and partitionable membership services for timed asynchronous systems. The protocol is fail-aware in the sense that a process knows at all times if its approximation of the set of processes in its partition is up-to-date or out-of-date. The protocol minimizes wrong suspicions of processes by giving processes a second chance to stay in the membership before they are removed. Our measurements show that the exclusion of alive processes is rare and the crash detection times are good. The protocol guarantees that the memberships of two partitions never overlap.
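The "second chance" idea can be illustrated with a simple round-based sketch. The abstract does not say how the protocol implements it, so everything below is an assumption made for illustration: the class name, the notion of a round, and the rule that a process is removed only after being silent in two consecutive rounds. The sketch does not attempt fail-awareness or the non-overlap guarantee.

```python
class SecondChanceMembership:
    """Toy membership that removes a process only after two silent rounds."""

    def __init__(self, members):
        self.members = set(members)
        self.suspected = set()  # members currently using their second chance

    def end_of_round(self, heard_from):
        """Update the membership given the processes heard from this round."""
        # sorted() copies the set, so it is safe to modify it in the loop.
        for p in sorted(self.members):
            if p in heard_from:
                self.suspected.discard(p)  # suspicion cleared
            elif p in self.suspected:
                self.members.discard(p)    # silent twice in a row: remove
                self.suspected.discard(p)
            else:
                self.suspected.add(p)      # first silent round: second chance
        return set(self.members)
```

A process that is merely slow for one round stays in the membership, which is the effect the abstract attributes to the second chance; a crashed process is removed one round later than it would be without it.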
A Reliable Ordered Delivery Protocol for Interconnected Local-Area Networks, 1995
Cited by 13 (7 self)
We present the Totem multiple-ring protocol, a novel reliable ordered multicast protocol for multiple interconnected local-area networks. The protocol exhibits excellent performance and maintains a consistent network-wide total order of messages despite network partitioning and remerging, or processor failure and recovery with stable storage intact. The Totem protocol is designed for fault-tolerant distributed systems, which replicate data to guard against failures and must ensure that replicated data remain consistent despite failures. The network-wide total order of messages provided by Totem simplifies the maintenance of consistency of replicated data and, thus, eases the development of fault-tolerant distributed systems. 1 Introduction The Totem protocol has been designed to support fault-tolerant distributed systems. In such systems, both the processes executing application tasks and the data must be replicated to protect against failures. Inconsistencies in the replicated data ...

