Intrusion-Tolerant Architectures: Concepts and Design
Cited by 51 (32 self)
... methodologies and algorithms, both in the fields of fault tolerance and security. Whilst they have taken separate paths until recently, the problems to be solved are of a similar nature. In classical dependability, fault tolerance has been the workhorse of many solutions. Classical security-related work has, on the other hand, privileged intrusion prevention, with few exceptions.
Uncertainty and Predictability: Can they be reconciled?
Cited by 38 (17 self)
In this paper, we discuss a novel design philosophy for distributed systems with uncertain or unknown attributes, such as synchrony or failure modes. This philosophy is based on the existence of architectural constructs with privileged properties, which endow systems with the capability of evading the uncertainty of the environment for certain crucial steps of their operation where predictability is required. It may open new research avenues that allow uncertainty to be reconciled with predictability.
How to Tolerate Half Less One Byzantine Nodes in Practical Distributed Systems
- In Proceedings of the 23rd IEEE Symposium on Reliable Distributed Systems, 2004
Cited by 33 (20 self)
The application of dependability concepts and techniques to the design of secure distributed systems is raising a considerable amount of interest in both communities under the designation of intrusion tolerance. However, practical intrusion-tolerant replicated systems based on the state machine approach (SMA) can handle at most f Byzantine components out of a total of n = 3f + 1, which is the maximum resilience in asynchronous systems. This paper
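The bound n = 3f + 1 quoted in this abstract can be inverted to ask how many Byzantine components a group of a given size tolerates. The following is a minimal sketch of that arithmetic only; the function name is hypothetical and is not taken from the paper.

```python
def max_byzantine_faults(n: int) -> int:
    """Largest f such that n >= 3f + 1 (the classical asynchronous bound)."""
    if n < 1:
        raise ValueError("a group needs at least one component")
    return (n - 1) // 3


if __name__ == "__main__":
    # Print the tolerated fault count for a few group sizes.
    for n in (3, 4, 6, 7, 10):
        print(f"n = {n:2d} -> tolerates f = {max_byzantine_faults(n)}")
```

Under this bound, adding components only raises the tolerated f at every third step, so groups of 4, 5 and 6 all tolerate a single Byzantine component.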
Efficient Byzantine-Resilient Reliable Multicast on a Hybrid Failure Model
- In Proceedings of the 21st IEEE Symposium on Reliable Distributed Systems, 2002
Cited by 30 (11 self)
The paper presents a new reliable multicast protocol that tolerates arbitrary faults, including Byzantine faults. This protocol is developed using a novel way of designing secure protocols which is based on a well-founded hybrid failure model. Despite our claim of arbitrary failure resilience, the protocol need not necessarily incur the cost of “Byzantine agreement”, in number of participants and round/message complexity. It can rely on the existence of a simple distributed security kernel – the TTCB – where the participants only execute crucial parts of the protocol operation, under the protection of a crash failure model. Otherwise, participants follow an arbitrary failure model. The TTCB provides only a few basic services, which allow our protocol to have an efficiency similar to that of accidental fault-tolerant protocols: for f faults, our protocol requires f+2 processes, instead of 3f+1 in Byzantine systems. Besides, the TTCB (which is synchronous) allows secure operation of timed protocols, despite the unpredictable time behavior of the environment (possibly due to attacks on timing assumptions).
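To make the resilience figures in this abstract concrete, the sketch below tabulates the number of processes required for f faults under the two formulas it quotes: f+2 with the TTCB and 3f+1 in Byzantine systems. The function names are hypothetical; only the two formulas come from the abstract.

```python
def processes_with_ttcb(f: int) -> int:
    """Processes required for f faults by the TTCB-based protocol (f + 2)."""
    return f + 2


def processes_byzantine(f: int) -> int:
    """Processes required for f faults in classical Byzantine systems (3f + 1)."""
    return 3 * f + 1


if __name__ == "__main__":
    print(" f | f+2 | 3f+1 | saved")
    for f in range(1, 6):
        hybrid, classic = processes_with_ttcb(f), processes_byzantine(f)
        print(f"{f:2d} | {hybrid:3d} | {classic:4d} | {classic - hybrid:5d}")
```

The difference between the two is 2f - 1 processes, so the saving grows linearly with the number of faults to be tolerated.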
How resilient are distributed f fault/intrusion-tolerant systems
- In Proc. of the Int. Conf. on Dependable Systems and Networks, 2005
Cited by 26 (16 self)
Fault-tolerant protocols, asynchronous and synchronous alike, make stationary fault assumptions: only a fraction f of the total n nodes may fail. Whilst a synchronous protocol is expected to have a bounded execution time, an asynchronous one may execute for an arbitrary amount of time, possibly sufficient for f + 1 nodes to fail. This can compromise the safety of the protocol and ultimately the safety of the system. Recent papers propose asynchronous protocols that can tolerate any number of faults over the lifetime of the system, provided that at most f nodes become faulty during a given interval. This is achieved through so-called proactive recovery, which consists of periodically rejuvenating the system. Proactive recovery in asynchronous systems, though a major breakthrough, has some limitations which had not been identified before. In this paper, we introduce a system model expressive enough to represent these problems, which remained hidden under the classical models. We introduce the predicate exhaustion-safe, meaning freedom from exhaustion-failures. Based on it, we predict the extent to which fault/intrusion-tolerant distributed systems (synchronous and asynchronous) can be made to work correctly. Namely, our model predicts the impossibility of guaranteeing correct behavior of asynchronous proactive recovery systems as they exist today. To prove our point, we give an example of how these problems impact an existing fault/intrusion-tolerant distributed system, the CODEX system, and having identified the problem, we suggest one (certainly not the only) way to tackle it.
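The assumption that "at most f nodes become faulty during a given interval" can be illustrated with a small check over a timeline of fault events. This is only a sketch of that one assumption, not the system model or the exhaustion-safe predicate defined in the paper; the function names, the half-open window convention and the sample timeline are all assumptions made for illustration.

```python
from typing import Sequence


def max_faults_in_any_window(fault_times: Sequence[float], window: float) -> int:
    """Largest number of fault events inside any half-open interval [t, t + window)."""
    times = sorted(fault_times)
    best = 0
    start = 0
    for end in range(len(times)):
        # Slide the left edge until the span is strictly shorter than the window.
        while times[end] - times[start] >= window:
            start += 1
        best = max(best, end - start + 1)
    return best


def fault_assumption_holds(fault_times: Sequence[float], window: float, f: int) -> bool:
    """True if no interval of length `window` contains more than f fault events."""
    return max_faults_in_any_window(fault_times, window) <= f


if __name__ == "__main__":
    timeline = [0, 5, 12, 13, 14]  # hypothetical instants at which nodes become faulty
    for window in (2, 10):
        worst = max_faults_in_any_window(timeline, window)
        print(f"window={window}: worst case {worst} faults, f=2 ok: {worst <= 2}")
```

With the same timeline, the check passes for a short interval and fails for a longer one, which mirrors the concern raised in the abstract: if the interval between rejuvenations cannot be bounded in real time, there may be enough time for f + 1 nodes to fail.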
Low Complexity Byzantine-Resilient Consensus
- Distrib. Comput., 2003
Cited by 20 (11 self)
The application of the tolerance paradigm to security -- intrusion tolerance -- has been raising a good deal of attention in the dependability and security communities. This paper is concerned with a novel approach to intrusion tolerance. The idea is to use privileged distributed components -- generically designated by wormholes -- to support the execution of intrusion-tolerant protocols, often called Byzantine-resilient protocols in the literature. The paper
Proactive resilience through architectural hybridization
- In Proceedings of the 2006 ACM Symposium on Applied Computing (SAC, 2006
Solving vector consensus with a wormhole
- IEEE Transactions on Parallel and Distributed Systems, 2005
Cited by 12 (9 self)
This paper presents a solution to the vector consensus problem for Byzantine asynchronous systems augmented with wormholes. Wormholes prefigure a hybrid distributed system model, embodying the notion of an enhanced part of the system with “good” properties otherwise not guaranteed by the “normal” weak environment. A protocol built for this type of system runs in the asynchronous part, where f out of n ≥ 3f + 1 processes might be corrupted by malicious adversaries. However, sporadically, processes can rely on the services provided by the wormhole for the correct execution of simple operations. One of the nice features of this setting is that it is possible to keep the protocol completely time-free and, in addition, to circumvent the FLP impossibility result by hiding all time-related assumptions in the wormhole. Furthermore, from a performance perspective, it leads to the design of a protocol with a good time complexity. Index Terms: Distributed systems, Byzantine asynchronous protocols, consensus.
Automated Rule-Based Diagnosis through a Distributed Monitor System
Cited by 6 (0 self)
In today’s world, where distributed systems form many of our critical infrastructures, dependability outages are becoming increasingly common. In many situations, it is necessary not only to detect a failure but also to diagnose it, that is, to identify the source of the failure. Diagnosis is challenging, since high-throughput applications with frequent interactions between the different components allow fast error propagation. It is desirable to consider applications as black boxes for the diagnostic process. In this paper, we propose a Monitor architecture for diagnosing failures in large-scale network protocols. The Monitor only observes the message exchanges between the protocol entities (PEs) remotely and does not access the internal protocol state. At runtime, it builds a causal graph between the PEs based on their communication and uses this together with a rule base of allowed state-transition paths to diagnose the failure. The tests used for the diagnosis are based on the rule base and are assumed to have imperfect coverage. The hierarchical Monitor framework allows distributed diagnosis handling failures at individual Monitors. The framework is implemented and applied to a reliable multicast protocol executing on our campus-wide network. Fault injection experiments are carried out to evaluate the accuracy and latency of the diagnosis. Index Terms: Distributed system diagnosis, runtime monitoring, hierarchical Monitor system, fault-injection-based evaluation.
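The idea of checking observed message exchanges against a rule base of allowed state transitions can be sketched as follows. This is a toy illustration under stated assumptions: the states, message names and rule table are hypothetical, the causal graph between PEs is not modelled, and nothing here reproduces the Monitor's actual rules or tests.

```python
from typing import Dict, Optional, Sequence, Tuple

# Hypothetical rule base: (current state, observed message) -> next state.
RULES: Dict[Tuple[str, str], str] = {
    ("IDLE", "DATA"): "WAIT_ACK",
    ("WAIT_ACK", "ACK"): "IDLE",
    ("WAIT_ACK", "NACK"): "RETRANSMIT",
    ("RETRANSMIT", "DATA"): "WAIT_ACK",
}


def first_violation(
    messages: Sequence[str],
    start: str = "IDLE",
    rules: Dict[Tuple[str, str], str] = RULES,
) -> Optional[Tuple[int, str, str]]:
    """Replay externally observed messages of one protocol entity.

    Returns (index, state, message) for the first message with no allowed
    transition, or None if the whole trace follows the rule base.
    """
    state = start
    for index, message in enumerate(messages):
        next_state = rules.get((state, message))
        if next_state is None:
            return index, state, message
        state = next_state
    return None


if __name__ == "__main__":
    print(first_violation(["DATA", "NACK", "DATA", "ACK"]))  # conforming trace
    print(first_violation(["DATA", "ACK", "ACK"]))           # unexpected second ACK
```

The checker only sees messages, never the entity's internal state, which matches the black-box stance described in the abstract; a fuller version would combine such per-entity checks with the causal ordering of messages across entities.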

