Results 1 - 5 of 5
On the Formal Specification of Group Membership Services
, 1995
Cited by 47 (2 self)
The problem of group membership has been the focus of much theoretical and experimental work on fault-tolerant distributed systems. This has resulted in a voluminous literature, and several formal specifications of this problem have been given. In this paper, we examine the two most referenced formal specifications of group membership and show that they are unsatisfactory: one has flaws in the formalism and allows undesirable executions, and the other can be satisfied by useless protocols.
1 Introduction
Group membership is an important component of several experimental or commercial fault-tolerant distributed systems such as the Highly Available System [Cri87], Isis [Bir93], Horus [vRBC+93], Transis [ADKM92a], Amoeba [KT91], Newtop [EMS95], and Relacs [BDGB94]. Roughly speaking, a group membership protocol manages the formation and maintenance of a set of processes called a group. For example, a group may be a set of processes that are cooperating towards a common task (e.g., th...
Revisiting the Paxos algorithm
- In Marios Mavronicolas and Philippas Tsigas, editors, Proceedings of the 11th International Workshop on Distributed Algorithms (WDAG 97), volume 1320 of Lecture Notes in Computer Science
, 1997
Cited by 46 (3 self)
This paper develops a new I/O automaton model called the Clock General Timed Automaton (Clock GTA) model. The Clock GTA is based on the General Timed Automaton (GTA) of Lynch and Vaandrager. The Clock GTA provides a systematic way of describing timing-based systems in which there is a notion of "normal" timing behavior, but that do not necessarily always exhibit this "normal" behavior. It can be used for practical time performance analysis based on the stabilization of the physical system. We use the Clock GTA automaton to model, verify and analyze the Paxos algorithm. The Paxos algorithm is an efficient and highly fault-tolerant algorithm, devised by Lamport, for reaching consensus in a distributed system. Although it appears to be practical, it is not widely known or understood. This paper contains a new presentation of the Paxos algorithm, based on a formal decomposition into several interacting components. It also contains a correctness proof and a time performance and fault-tole...
Dynamic and Adaptive Replication for Large-Scale Reliable Multi-Agent Systems
- In Proceedings of the SELMAS’02
, 2003
Cited by 10 (6 self)
In order to make large-scale multi-agent systems reliable, we propose an adaptive application of replication strategies. Critical agents are replicated to avoid failures. As the criticality of agents may evolve during the course of computation and problem solving, we need to dynamically and automatically adapt the number of replicas of agents, in order to maximize their reliability and availability based on available resources. We are studying an approach and mechanisms for evaluating the criticality of a given agent (based on application-level semantic information, e.g., message intention, and also system-level statistical information, e.g., communication load) and for deciding what strategy to apply (e.g., active or passive replication) and how to parameterize it (e.g., number of replicas).
Fundamental Study Revisiting the PAXOS algorithm
, 2000
Cited by 6 (1 self)
The PAXOS algorithm is an efficient and highly fault-tolerant algorithm, devised by Lamport, for reaching consensus in a distributed system. Although it appears to be practical, it seems not to be widely known or understood. This paper contains a new presentation of the PAXOS algorithm, based on a formal decomposition into several interacting components. It also contains a correctness proof and a time performance and fault-tolerance analysis. The formal framework used for the presentation of the algorithm is provided by the Clock General Timed Automaton (Clock GTA) model. The Clock GTA provides a systematic way of describing timing-based systems in which there is a notion of "normal" timing behavior, but that do not necessarily always exhibit this "normal" timing behavior. © 2000 Elsevier Science B.V. All rights reserved.
Keywords: I/O automata models; Formal verification; Distributed consensus; Partially synchronous systems; Fault-tolerance
Malicious Fault Tolerance: From Theoretical Algorithms to an Efficient Application Development Process
, 2002
In many situations, fault tolerance needs to be provided not only in the presence of [...] failures, but also in case of malicious misbehaviour. Recent research has provided several theoretically well-founded algorithms that are feasible in practice. Most work, however, focuses only on single algorithms and gives only little attention to [...] different quality-of-service requirements and the whole software development process. This thesis outline [...] three major contributions: First, it specifies a modular architecture for malicious fault-tolerant consensus algorithms, providing a generic interface to upper layers, including recovery mechanisms and supporting switching between different consensus strategies depending on QoS requirements. Second, it presents different abstractions for the application developer, [...] which abstraction fits best for which developer requirements, and how they can be realized using the low-level modules. Third, it discusses how the application development process for malicious fault-tolerant applications may benefit from a [...] approach, using a flexible, evolvable software generation and transformation process.
1 Introduction
Tolerating benign ([...]) faults in distributed systems is a well-understood matter [1]. In many situations, however, the common [...] is not [...] may fail in a non-predictable way [...] before the system is [...]. Software is hardly ever free of bugs and may exhibit unspecified behaviour in an unpredictable way. Last but not least, attacks to exploit security holes of current systems, especially of Internet-based systems, are so widespread that they are a significant source of malicious corruption of nodes in distributed systems. The goal of this thesis outline is to propose a novel architecture and software development pr...

