Group Communication Specifications: A Comprehensive Study
- ACM COMPUTING SURVEYS, 1999
Cited by 370 (15 self)
View-oriented group communication is an important and widely used building block for many distributed applications. Much current research has been dedicated to specifying the semantics and services of view-oriented Group Communication Systems (GCSs). However, the guarantees of different GCSs are formulated using varying terminologies and modeling techniques, and the specifications vary in their rigor. This makes it difficult to analyze and compare the different systems. This paper provides a comprehensive set of clear and rigorous specifications, which may be combined to represent the guarantees of most existing GCSs. In the light of these specifications, over thirty published GCS specifications are surveyed. Thus, the specifications serve as a unifying framework for the classification, analysis and comparison of group communication systems. The survey also discusses over a dozen different applications of group communication systems, shedding light on the usefulness of the p...
Total order broadcast and multicast algorithms: Taxonomy and survey
- ACM COMPUTING SURVEYS, 2004
QoS negotiation in real-time systems and its application to automated flight control
- IEEE Transactions on Computers, 2000
Coyote: A System for Constructing Fine-Grain Configurable Communication Services
- ACM TRANSACTIONS ON COMPUTER SYSTEMS, 1998
Cited by 107 (15 self)
Communication-oriented abstractions such as atomic multicast, group RPC, and protocols for location-independent mobile computing can simplify the development of complex applications built on distributed systems. This paper describes Coyote, a system that supports the construction of highly modular and configurable versions of such abstractions. Coyote extends the notion of protocol objects and hierarchical composition found in existing systems with support for finer-grain objects called micro-protocols that implement individual semantic properties of the target service. A customized service is constructed by selecting micro-protocols based on their semantic guarantees and configuring them together with a standard runtime system to form a composite protocol implementing the service. Micro-protocols within a composite protocol can share data and are executed using an event-driven paradigm that enhances configurability. The overall approach is described and illustrated with exampl...
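The composition idea in this abstract (micro-protocols selected for their semantic guarantees, sharing data, and run by an event-driven runtime) can be illustrated with a minimal sketch. This is not Coyote's actual API: the class `CompositeProtocol`, the installers `add_sequencing`, `add_duplicate_filter`, and `add_delivery`, and the event names `send` and `receive` are all hypothetical names chosen for illustration.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List


class CompositeProtocol:
    """Hypothetical runtime: holds shared data and dispatches events to handlers."""

    def __init__(self) -> None:
        self.shared: Dict[str, Any] = {}  # data visible to every micro-protocol
        self._handlers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def register(self, event: str, handler: Callable[[dict], None]) -> None:
        self._handlers[event].append(handler)

    def raise_event(self, event: str, msg: dict) -> None:
        # Handlers run in registration order; setting msg["drop"] stops the chain.
        for handler in self._handlers[event]:
            if msg.get("drop"):
                break
            handler(msg)


def add_sequencing(proto: CompositeProtocol) -> None:
    """Micro-protocol (assumed): stamp outgoing messages with a sequence number."""
    proto.shared["next_seq"] = 0

    def on_send(msg: dict) -> None:
        msg["seq"] = proto.shared["next_seq"]
        proto.shared["next_seq"] += 1

    proto.register("send", on_send)


def add_duplicate_filter(proto: CompositeProtocol) -> None:
    """Micro-protocol (assumed): drop messages already seen from the same sender."""
    proto.shared["seen"] = set()

    def on_receive(msg: dict) -> None:
        key = (msg["sender"], msg["seq"])
        if key in proto.shared["seen"]:
            msg["drop"] = True
        else:
            proto.shared["seen"].add(key)

    proto.register("receive", on_receive)


def add_delivery(proto: CompositeProtocol) -> None:
    """Micro-protocol (assumed): hand surviving messages to the application."""
    proto.shared["delivered"] = []

    def on_receive(msg: dict) -> None:
        proto.shared["delivered"].append(msg["body"])

    proto.register("receive", on_receive)


def build_service(installers) -> CompositeProtocol:
    """Configure a service by selecting which micro-protocols to install."""
    proto = CompositeProtocol()
    for install in installers:
        install(proto)
    return proto
```

In this sketch, leaving `add_duplicate_filter` out of the list passed to `build_service` yields a service that delivers duplicates, which is the kind of per-property customization the abstract describes.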
ControlWare: A Middleware Architecture for Feedback Control of Software
- In Proceedings of the 2002 International Conference on Distributed Computing Systems, 2002
Cited by 73 (13 self)
Attainment of software performance assurances in open, largely unpredictable environments has recently become an important focus for real-time research. Unlike closed embedded systems, many contemporary distributed real-time applications operate in environments where offered load and available resources suffer considerable random fluctuations, thereby complicating the performance assurance problem. Feedback control theory has recently been identified as a promising analytic foundation for controlling performance of such unpredictable, poorly modeled software systems, the same way other engineering disciplines have used this theory for physical process control.
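As a rough illustration of what applying feedback control to a software metric can look like, the sketch below runs a discrete integral controller against a toy plant. It is not taken from ControlWare: the function `integral_controller`, the linear plant model `y = plant_gain * u`, and the parameter names are assumptions made for this example only.

```python
def integral_controller(setpoint: float, plant_gain: float, ki: float, steps: int) -> list:
    """Drive a toy plant y = plant_gain * u toward setpoint with an integral law.

    u could stand for an actuator such as an admission rate and y for a measured
    metric such as utilization; both interpretations are hypothetical.
    """
    u = 0.0
    history = []
    for _ in range(steps):
        y = plant_gain * u          # measure the (idealized) plant output
        history.append(y)
        error = setpoint - y        # deviation from the performance target
        u += ki * error             # integral action: accumulate the correction
    return history
```

For this toy plant the error shrinks by a factor of `1 - ki * plant_gain` per step, so the loop converges only when that factor has magnitude below 1. For example, `integral_controller(0.7, 2.0, 0.25, 50)` halves the error on every step.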
Gossip versus Deterministic Flooding: Low Message Overhead and High Reliability for Broadcasting on Small Networks
Cited by 49 (0 self)
Rumor mongering (also known as gossip) is an epidemiological protocol that implements broadcasting with a reliability that can be very high. Rumor mongering is attractive because it is generic, scalable, adapts well to failures and recoveries, and has a reliability that gracefully degrades with the number of failures in a run. However, rumor mongering uses random selection for communications. We study the impact of using random selection in this paper. We present a protocol that superficially resembles rumor mongering but is deterministic. We show that this new protocol has most of the same attractions as rumor mongering. The one attraction that only rumor mongering has, namely graceful degradation, comes at a high cost in terms of the number of messages sent. We compare the two approaches both at an abstract level and in terms of how they perform on an Ethernet and on a small wide-area network of Ethernets.
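The contrast between random peer selection and a deterministic forwarding rule can be made concrete with a small simulation. This is a simplified sketch, not the protocol from the paper: `gossip_broadcast` keeps every informed node gossiping in every round, and `deterministic_flood` uses an assumed fixed rule (each node forwards once to its next `fanout` successors on a ring). All function and parameter names are hypothetical, and failures are not modeled.

```python
import random


def gossip_broadcast(n: int, fanout: int, rounds: int, seed: int = 0):
    """Random selection: each informed node sends to `fanout` random peers per round."""
    rng = random.Random(seed)
    informed = {0}                      # node 0 originates the broadcast
    messages = 0
    for _ in range(rounds):
        newly = set()
        for node in informed:
            peers = rng.sample([p for p in range(n) if p != node], fanout)
            messages += fanout
            newly.update(peers)
        informed |= newly
    return len(informed), messages


def deterministic_flood(n: int, fanout: int):
    """Fixed selection: on first receipt, forward once to the next `fanout` ring successors."""
    informed = {0}
    frontier = [0]
    messages = 0
    while frontier:
        next_frontier = []
        for node in frontier:
            for k in range(1, fanout + 1):
                peer = (node + k) % n
                messages += 1
                if peer not in informed:
                    informed.add(peer)
                    next_frontier.append(peer)
        frontier = next_frontier
    return len(informed), messages
```

Without failures, the deterministic variant in this sketch reaches all `n` nodes with exactly `n * fanout` messages, while the message count of the gossip variant grows with the number of rounds and its coverage depends on the random choices.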
Real-Time Dependable Channels: Customizing QoS Attributes for Distributed Systems
- IEEE Transactions on Parallel and Distributed Systems, 1998
Cited by 26 (12 self)
Communication services that provide enhanced Quality of Service (QoS) guarantees related to dependability and real time are important for many applications in distributed systems. This paper presents real-time dependable (RTD) channels, a communication-oriented abstraction that can be configured to meet the QoS requirements of a variety of distributed applications. This customization ability is based on using CactusRT, a system that supports the construction of middleware services out of software modules called micro-protocols. Each micro-protocol implements a different semantic property or property variant, and interacts with other micro-protocols using an event-driven model supported by the CactusRT runtime system. In addition to RTD channels, CactusRT and its implementation are described. This prototype executes on a cluster of Pentium PCs running the OpenGroup/RI MK 7.3 Mach real-time operating system and CORDS, a system for building network protocols based on the x-kernel. July 9...
Planning and Resource Allocation for Hard Real-time, Fault-Tolerant Plan Execution
- Journal of Autonomous Agents and Multi-Agent Systems, 1999
Cited by 19 (7 self)
We describe the interface between a real-time resource allocation system and an AI planner in order to create fault-tolerant plans that are guaranteed to execute in hard real-time. The planner specifies the task set and all execution deadlines required to ensure system safety, then the resource allocator schedules these plans off-line to analyze execution platform resource utilization. A new interface module combines information from planning and resource allocation to enforce development of plans feasible for execution during a variety of internal system faults. Plans that over-utilize any system resource trigger feedback to the planner, which then searches for an alternate plan. A valid plan for each specified fault, including the nominal no-fault situation, is stored in a plan cache for subsequent real-time execution. We situate this work in the context of CIRCA, the Cooperative Intelligent Real-time Control Architecture, which focuses on developing and scheduling plans that make hard real-time safety guarantees, and provide an example of an autonomous aircraft agent to illustrate how our planner-resource allocation interface improves CIRCA performance. Keywords: AI architectures, planning, real-time scheduling, fault-tolerance
Early-Delivery Dynamic Atomic Broadcast (Extended Abstract)
- IN PROC. 16TH INTL. SYMP. ON DISTRIBUTED COMPUTING (DISC’02), D. MALKHI, ED. LNCS, 2002
Cited by 19 (3 self)
We consider a problem of atomic broadcast in a dynamic setting where processes may join, leave voluntarily, or fail (by stopping) during the course of computation. We provide a formal definition of the Dynamic Atomic Broadcast problem and present and analyze a new algorithm for its solution in a variant of a synchronous model, where processes have approximately synchronized clocks. Our
Optimization of a Real-Time Primary-Backup Replication Service
- Parallel and Distributed Systems, IEEE Transactions on, 1998
Cited by 16 (0 self)
The primary-backup replication model is one of the commonly adopted approaches to providing fault tolerant data services. Its extension to the real-time environment, however, imposes the additional constraint of timing predictability, which requires a bounded overhead for managing redundancy. This paper discusses the trade-off between reducing system overhead and increasing (temporal) consistency between the primary and backup, and explores ways to optimize such a system to minimize either the inconsistency or the system overhead while maintaining the temporal consistency guarantees of the system. An implementation built on top of the existing RTPB model [20] was developed within the x-kernel architecture on the Mach OSF platform running MK 7.2. Results of an experimental evaluation of the proposed optimization techniques are discussed.
1 Introduction
A common approach to building fault-tolerant distributed systems is to replicate servers that fail independently. The main approaches ...