Group Communication Specifications: A Comprehensive Study
ACM Computing Surveys, 1999
Cited by 284 (12 self)
View-oriented group communication is an important and widely used building block for many distributed applications. Much current research has been dedicated to specifying the semantics and services of view-oriented Group Communication Systems (GCSs). However, the guarantees of different GCSs are formulated using varying terminologies and modeling techniques, and the specifications vary in their rigor. This makes it difficult to analyze and compare the different systems. This paper provides a comprehensive set of clear and rigorous specifications, which may be combined to represent the guarantees of most existing GCSs. In the light of these specifications, over thirty published GCS specifications are surveyed. Thus, the specifications serve as a unifying framework for the classification, analysis and comparison of group communication systems. The survey also discusses over a dozen different applications of group communication systems, shedding light on the usefulness of the p...
On the Impossibility of Group Membership
1996
Cited by 146 (5 self)
We prove that the primary-partition group membership problem cannot be solved in asynchronous systems with crash failures, even if one allows the removal or killing of non-faulty processes that are erroneously suspected to have crashed.
1 Introduction
The problem of group membership has been the focus of much theoretical and experimental work on fault-tolerant distributed systems. A group membership protocol manages the formation and maintenance of a set of processes called a group. For example, a group may be a set of processes that are cooperating towards a common task (e.g., the primary and backup servers of a database), a set of processes that share a common interest (e.g., clients that subscribe to a particular newsgroup), or the set of all processes in the system that are currently deemed to be operational. In general, a process may leave a group because it failed, it voluntarily requested to leave, or it is forcibly expelled by other members of the group. Similarly, a proces...
The Timely Computing Base Model and Architecture
IEEE Transactions on Computers, Special Issue on Asynchronous Real-Time Distributed Systems, 2002
Cited by 55 (20 self)
Current systems are very often based on large-scale, unpredictable and unreliable infrastructures. However, users of these systems increasingly require services with timeliness properties. This creates a difficult-to-solve contradiction with regard to the adequate time model: synchronous, or asynchronous? In this paper, we propose an architectural construct and programming model that address this problem. We assume the existence of a component that is capable of executing timely functions, however asynchronous the rest of the system may be. We call this component the Timely Computing Base, and it can be used by the other components to execute a set of simple but crucial time-related services. We also show how to use it to build dependable and timely applications exhibiting varying degrees of timeliness assurance, under several synchrony models.
Group communication in partitionable systems: Specification and algorithms
IEEE Transactions on Software Engineering, 1998
Cited by 51 (9 self)
We give a formal specification and an implementation for a partitionable group communication service in asynchronous distributed systems. Our specification is motivated by the requirements for building "partition-aware" applications that can continue operating without blocking in multiple concurrent partitions and reconfigure themselves dynamically when partitions merge. The specified service guarantees liveness and excludes trivial solutions; it constitutes a useful basis for building realistic partition-aware applications; and it is implementable in practical asynchronous distributed systems where certain stability conditions hold.
Specifications and proofs for Ensemble Layers
TACAS '99, 1999
Cited by 47 (10 self)
Ensemble is a widely used group communication system that supports distributed programming by providing precise guarantees for synchronization, message ordering, and message delivery. Ensemble eases the task of distributed-application programming, but as a result, ensuring the correctness of Ensemble itself is a difficult problem. In this paper we use I/O automata for formalizing, specifying, and verifying the Ensemble implementation. We focus specifically on message total ordering, a property that is commonly used to guarantee consistency within a process group. The systematic verification of this protocol led to the discovery of an error in the implementation.
Fail-Awareness in Timed Asynchronous Systems
2003
Cited by 43 (15 self)
We address the problem of the impossibility of implementing synchronous fault-tolerant service specifications in asynchronous distributed systems. We introduce a method for weakening a synchronous service specification so that it becomes implementable in "timed" asynchronous systems, that is, asynchronous systems in which processes have access to local hardware clocks. The method (1) adds to a service interface an exception indicator so that a client knows at any time whether a server is currently providing its standard "synchronous" semantics or some other specified exceptional semantics, (2) ensures that the standard behavior provided when the exception indicator does not signal an exception is "similar" to the original synchronous service behavior, and (3) requires a server to provide its standard semantics whenever the underlying communication and process services exhibit "synchronous behavior". To illustrate our method, we show how the specification of a synchronous datagram service and an internal clock synchronization service can be transformed into a fail-aware service specification. Further illustrations of the usefulness of fail-aware services are provided by describing a railway crossing service and a fail-aware weak group membership service.
Abbadi. Using broadcast primitives in replicated databases
1998
Cited by 38 (3 self)
We explore the use of different variants of broadcast protocols for managing replicated databases. Starting with the simplest broadcast primitive, the reliable broadcast protocol, we show how it can be used to ensure correct transaction execution. The protocol is simple, and has several advantages, including prevention of deadlocks. However, it requires a two-phase commitment protocol for ensuring correctness. We then develop a second protocol that uses causal broadcast and avoids the overhead of two-phase commit by exploiting the causal delivery properties of the broadcast primitives to implicitly collect the relevant information used in two-phase commit. Finally, we present a protocol that employs atomic broadcast and completely eliminates the need for acknowledgements during transaction commitment.
System Support for Partition-Aware Network Applications
Cited by 23 (6 self)
Network applications and services need to be environment-aware in order to meet non-functional requirements in increasingly dynamic contexts. In this paper we consider partition awareness as an instance of environment awareness in network applications that need to be reliable and self-managing. Partition-aware applications dynamically reconfigure themselves and adjust the quality of their services in response to partitioning and merging of networks. As such, they can automatically adapt to changes in the environment so as to remain available in multiple partitions without blocking, albeit with reduced or degraded functionality. We propose a system layer consisting of group membership and reliable multicast services that provides systematic support for partition-aware application development. We illustrate the effectiveness of the proposed interface by solving several problems that represent different classes of realistic network applications.
Group Membership and View Synchrony in Partitionable Asynchronous Distributed Systems: Specifications
1996
Cited by 19 (2 self)
Agreeing on Processor Group Membership in Timed Asynchronous Distributed Systems
1995
Cited by 17 (0 self)
We introduce the timed asynchronous distributed system model to describe existing asynchronous distributed systems subject to unbounded processing and communication delays, failures and recoveries. We then describe five increasingly strong specifications for processor-group membership services in timed asynchronous systems subject to partitioning. We also propose five distributed protocols that implement these specifications despite arbitrary numbers of crash/performance processor failures and omission/performance communication failures, and prove their correctness. Finally, we show how two of the protocols can be adapted to implement a highly available processor leadership service that ensures the existence of at most one leader at any point in real time.

