Results 1 - 10
of
30
Zyzzyva: Speculative byzantine fault tolerance
- In Symposium on Operating Systems Principles (SOSP
, 2007
"... We present Zyzzyva, a protocol that uses speculation to reduce the cost and simplify the design of Byzantine fault tolerant state machine replication. In Zyzzyva, replicas respond to a client’s request without first running an expensive three-phase commit protocol to reach agreement on the order in ..."
Abstract
-
Cited by 78 (10 self)
- Add to MetaCart
We present Zyzzyva, a protocol that uses speculation to reduce the cost and simplify the design of Byzantine fault tolerant state machine replication. In Zyzzyva, replicas respond to a client’s request without first running an expensive three-phase commit protocol to reach agreement on the order in which the request must be processed. Instead, they optimistically adopt the order proposed by the primary and respond immediately to the client. Replicas can thus become temporarily inconsistent with one another, but clients detect inconsistencies, help correct replicas converge on a single total ordering of requests, and only rely on responses that are consistent with this total order. This approach allows Zyzzyva to reduce replication overheads to near their theoretical minima.
Attested append-only memory: Making adversaries stick to their word
- In Proc. of SOSP
, 2007
"... Researchers have made great strides in improving the fault tolerance of both centralized and replicated systems against arbitrary (Byzantine) faults. However, there are hard limits to how much can be done with entirely untrusted components; for example, replicated state machines cannot tolerate more ..."
Abstract
-
Cited by 45 (7 self)
- Add to MetaCart
Researchers have made great strides in improving the fault tolerance of both centralized and replicated systems against arbitrary (Byzantine) faults. However, there are hard limits to how much can be done with entirely untrusted components; for example, replicated state machines cannot tolerate more than a third of their replica population being Byzantine. In this paper, we investigate how minimal trusted abstractions can push through these hard limits in practical ways. We propose Attested Append-Only Memory (A2M), a trusted system facility that is small, easy to implement and easy to verify formally. A2M provides the programming abstraction of a trusted log, which leads to protocol designs immune to equivocation – the ability of a faulty host to lie in different ways to different clients or servers – which is a common source of Byzantine headaches. Using A2M, we improve upon the state of the art in Byzantine-fault tolerant replicated state machines, producing A2M-enabled protocols (variants of Castro and Liskov’s PBFT) that remain correct (linearizable) and keep making progress (live) even when half the replicas are faulty, in contrast to the previous upper bound. We also present an A2M-enabled single-server shared storage protocol that guarantees linearizability despite server faults. We implement A2M and our protocols, evaluate them experimentally through micro- and macro-benchmarks, and argue that the improved fault tolerance is cost-effective for a broad range of uses, opening up new avenues for practical, more reliable services.
Byzantine replication under attack
- In Proceedings of the 38th IEEE/IFIP International Conference on Depe ndable Systems and Networks (DSN ’08
, 2008
"... Existing Byzantine-resilient replication protocols satisfy two standard correctness criteria, safety and liveness, in the presence of Byzantine faults. In practice, however, faulty processors can, in some protocols, significantly degrade performance by causing the system to make progress at an extre ..."
Abstract
-
Cited by 16 (5 self)
- Add to MetaCart
Existing Byzantine-resilient replication protocols satisfy two standard correctness criteria, safety and liveness, in the presence of Byzantine faults. In practice, however, faulty processors can, in some protocols, significantly degrade performance by causing the system to make progress at an extremely slow rate. While “correct ” in the traditional sense, systems vulnerable to such performance degradation are of limited practical use in adversarial environments. This paper argues that techniques for mitigating such performance attacks are needed to bridge this “practicality gap ” for intrusion-tolerant replication systems. We propose a new performance-oriented correctness criterion, and we show how failure to meet this criterion can lead to performance degradation. We present a new Byzantine replication protocol that achieves the criterion and evaluate its performance in fault-free configurations and when under attack.
Fail-aware untrusted storage
, 2008
"... We consider a set of clients collaborating through an online service provider that is subject to attacks, and hence not fully trusted by the clients. We introduce the abstraction of a fail-aware untrusted service, with meaningful semantics even when the provider is faulty. In the common case, when t ..."
Abstract
-
Cited by 12 (6 self)
- Add to MetaCart
We consider a set of clients collaborating through an online service provider that is subject to attacks, and hence not fully trusted by the clients. We introduce the abstraction of a fail-aware untrusted service, with meaningful semantics even when the provider is faulty. In the common case, when the provider is correct, such a service guarantees consistency (linearizability) and liveness (wait-freedom) of all operations. In addition, the service always provides accurate and complete consistency and failure detection. We illustrate our new abstraction by presenting a Fail-Aware Untrusted STorage service (FAUST). Existing storage protocols in this model guarantee so-called forking semantics. We observe, however, that none of the previously suggested protocols suffice for implementing fail-aware untrusted storage with the desired liveness and consistency properties (at least wait-freedom and linearizability when the server is correct). We present a new storage protocol, which does not suffer from this limitation, and implements a new consistency notion, called weak fork-linearizability. We show how to extend this protocol to provide eventual consistency and failure awareness in FAUST. 1
Trusting the Cloud
"... More and more users store data in “clouds ” that are accessed remotely over the Internet. We survey well-known cryptographic tools for providing integrity and consistency for data stored in clouds and discuss recent research in cryptography and distributed computing addressing these problems. Storin ..."
Abstract
-
Cited by 10 (1 self)
- Add to MetaCart
More and more users store data in “clouds ” that are accessed remotely over the Internet. We survey well-known cryptographic tools for providing integrity and consistency for data stored in clouds and discuss recent research in cryptography and distributed computing addressing these problems. Storing data in clouds Many providers now offer a wide variety of flexible online data storage services, ranging from passive ones, such as online archiving, to active ones, such as collaboration and social networking. They have become known as computing and storage “clouds. ” Such clouds allow users to abandon local storage and use online alternatives, such as Amazon S3, Nirvanix CloudNAS, or Microsoft SkyDrive. Some cloud providers utilize the fact that online storage can be accessed from any location connected to the Internet, and offer additional functionality; for example, Apple MobileMe allows users to synchronize common applications that run on multiples devices. Clouds also offer computation resources, such as Amazon EC2, which can significantly reduce the cost of maintaining such resources locally. Finally, online collaboration tools, such as Google Apps or versioning repositories for source code, make it easy to collaborate with colleagues across organizations and countries, as practiced by the authors of this paper. What can go wrong? Although the advantages of using clouds are unarguable, there are many risks involved with releasing control over your data. One concern that many users are aware of is loss of privacy. Nevertheless, the popularity of social networks and online data sharing repositories suggests that many users are willing to forfeit privacy,
Large-scale Byzantine fault tolerance: Safe but not always live
- In Proceedings of the 3 rd Workshop on Hot Topics in System Dependability. USENIX Association
, 2007
"... The overall correctness of large-scale systems composed of many groups of replicas executing BFT protocols scales poorly with the number of groups. This is because the probability of at least one group being compromised (more than 1/3 faulty replicas) increases rapidly as the number of groups increa ..."
Abstract
-
Cited by 10 (0 self)
- Add to MetaCart
The overall correctness of large-scale systems composed of many groups of replicas executing BFT protocols scales poorly with the number of groups. This is because the probability of at least one group being compromised (more than 1/3 faulty replicas) increases rapidly as the number of groups increases. In this paper we address this problem with a simple modification to Castro and Liskov’s BFT replication that allows for arbitrary choice of n (number of replicas) and f (failure threshold). The price to pay is a more restrictive liveness requirement, and we present the design of a large-scale BFT replicated system that obviates this problem. 1
Zeno: Eventually Consistent Byzantine Fault Tolerance
"... Many distributed services are hosted at large, shared, geographically diverse data centers, and they use replication to achieve high availability despite the failure of an entire data center. Recent events show that non-crash faults occur in these services and may lead to long outages. While Byzanti ..."
Abstract
-
Cited by 10 (2 self)
- Add to MetaCart
Many distributed services are hosted at large, shared, geographically diverse data centers, and they use replication to achieve high availability despite the failure of an entire data center. Recent events show that non-crash faults occur in these services and may lead to long outages. While Byzantine Fault Tolerance (BFT) could be used to withstand these faults, current BFT protocols can become unavailable if a small fraction of their replicas are unreachable. This is because existing BFT protocols favor strong safety guarantees (consistency) over liveness (availability). This paper presents a novel BFT state machine replication protocol called Zeno, that trades consistency for higher availability. In particular, Zeno replaces linearizability with eventual consistency, where clients can temporarily miss each other’s updates but when the network is stable the states from the individual partitions are merged by having the replicas agree on a total order for the requests. We have built a prototype of Zeno and our evaluation using micro-benchmarks shows that Zeno provides better availability than traditional BFT protocols, and that its impact on performance is low, even when partitions occur or heal. 1
Zeno: Eventually Consistent Byzantine-Fault Tolerance
, 2009
"... Many distributed services are hosted at large, shared, geographically diverse data centers, and they use replication to achieve high availability despite the unreachability of an entire data center. Recent events show that non-crash faults occur in these services and may lead to long outages. While ..."
Abstract
-
Cited by 8 (0 self)
- Add to MetaCart
Many distributed services are hosted at large, shared, geographically diverse data centers, and they use replication to achieve high availability despite the unreachability of an entire data center. Recent events show that non-crash faults occur in these services and may lead to long outages. While Byzantine-Fault Tolerance (BFT) could be used to withstand these faults, current BFT protocols can become unavailable if a small fraction of their replicas are unreachable. This is because existing BFT protocols favor strong safety guarantees (consistency) over liveness (availability). This paper presents a novel BFT state machine replication protocol called Zeno that trades consistency for higher availability. In particular, Zeno replaces strong consistency (linearizability) with a weaker guarantee (eventual consistency): clients can temporarily miss each other’s updates but when the network is stable the states from the individual partitions are merged by having the replicas agree on a total order for all requests. We have built a prototype of Zeno and our evaluation using micro-benchmarks shows that Zeno provides better availability than traditional BFT protocols.
SPORC: Group Collaboration using Untrusted Cloud Resources
- 9TH USENIX SYMPOSIUM ON OPERATING SYSTEMS SYSTEMS DESIGN AND IMPLEMENTATION (OSDI ’10)
, 2010
"... Cloud-based services are an attractive deployment model for user-facing applications like word processing and calendaring. Unlike desktop applications, cloud services allow multiple users to edit shared state concurrently and in real-time, while being scalable, highly available, and globally accessi ..."
Abstract
-
Cited by 8 (3 self)
- Add to MetaCart
Cloud-based services are an attractive deployment model for user-facing applications like word processing and calendaring. Unlike desktop applications, cloud services allow multiple users to edit shared state concurrently and in real-time, while being scalable, highly available, and globally accessible. Unfortunately, these benefits come at the cost of fully trusting cloud providers with potentially sensitive and important data. To overcome this strict tradeoff, we present SPORC, a generic framework for building a wide variety of collaborative applications with untrusted servers. In SPORC, a server observes only encrypted data and cannot deviate from correct execution without being detected. SPORC allows concurrent, low-latency editing of shared state, permits disconnected operation, and supports dynamic access control even in the presence of concurrency. We demonstrate SPORC’s flexibility through two prototype applications: a causally-consistent key-value store and a browser-based collaborative text editor. Conceptually, SPORC illustrates the complementary benefits of operational transformation (OT) and fork* consistency. The former allows SPORC clients to execute concurrent operations without locking and to resolve any resulting conflicts automatically. The latter prevents a misbehaving server from equivocating about the order of operations unless it is willing to fork clients into disjoint sets. Notably, unlike previous systems, SPORC can automatically recover from such malicious forks by leveraging OT’s conflict resolution mechanism.
Depot: Cloud storage with minimal trust
"... Abstract: We describe the design, implementation, and evaluation of Depot, a cloud storage system that minimizes trust assumptions. Depot assumes less than any prior system about the correct operation of participating hosts—Depot tolerates Byzantine failures, including malicious or buggy behavior, b ..."
Abstract
-
Cited by 7 (0 self)
- Add to MetaCart
Abstract: We describe the design, implementation, and evaluation of Depot, a cloud storage system that minimizes trust assumptions. Depot assumes less than any prior system about the correct operation of participating hosts—Depot tolerates Byzantine failures, including malicious or buggy behavior, by any number of clients or servers—yet provides safety and availability guarantees (on consistency, staleness, durability, and recovery) that are useful. The key to safeguarding safety without sacrificing availability (and vice versa) in this environment is to join forks: participants (clients and servers) that observe inconsistent behaviors by other participants can join their forked view into a single view that is consistent with what each individually observed. Our experimental evaluation suggests that the costs of protecting the system are modest. Depot adds a few hundred bytes of metadata to each update and each stored object, and requires hashing and signing each update. 1

