Database Replication: a Tale of Research across Communities
2010
Cited by 2 (0 self)
Replication is a key mechanism to achieve scalability and fault-tolerance in databases. Its importance has recently been further increased because of the role it plays in achieving elasticity at the database layer. In database replication, the biggest challenge lies in the trade-off between performance and consistency. A decade ago, performance could only be achieved through lazy replication at the expense of transactional guarantees. The strong consistency of eager approaches came with a high cost in terms of reduced performance and limited scalability. Postgres-R combined results from distributed systems and databases to develop a replication solution that provided both scalability and strong consistency. The use of group communication primitives with strong ordering and delivery guarantees together with optimized transaction handling (tailored locking, transferring logs instead of re-executing updates, keeping the message overhead per transaction constant) were a drastic departure from the state-of-the-art at the time. Ten years later, these techniques are widely used in a variety of contexts but particularly in cloud computing scenarios. In this paper we review the original motivation for Postgres-R and discuss how the ideas behind the design have evolved over the years.
Dynamically Scaling Applications in the Cloud
Cited by 2 (0 self)
Scalability is said to be one of the major advantages brought by the cloud paradigm and, more specifically, the one that makes it different from an “advanced outsourcing” solution. However, there are some important pending issues before the dream of automated scaling for applications can come true. In this paper, the most notable initiatives towards whole application scalability in cloud environments are presented. We present relevant efforts at the edge of state-of-the-art technology, providing an encompassing overview of the trends they each follow. We also highlight pending challenges that will likely be addressed in new research efforts and present an ideal scalable cloud system. Categories and Subject Descriptors: C.4 [Performance of Systems]: reliability, availability, and serviceability; design studies
Efficient Middleware for Byzantine Fault Tolerant Database Replication
Cited by 1 (0 self)
Byzantine fault tolerance (BFT) enhances the reliability and availability of replicated systems subject to software bugs, malicious attacks, or other unexpected events. This paper presents Byzantium, a BFT database replication middleware that provides snapshot isolation semantics. It is the first BFT database system that allows for concurrent transaction execution without relying on a centralized component, which is essential for having both performance and robustness. Byzantium builds on an existing BFT library but extends it with a set of techniques for increasing concurrency in the execution of operations, for optimistically executing operations in a single replica, and for striping and load-balancing read operations across replicas. Experimental results show that our replication protocols introduce only a modest performance overhead for read-write dominated workloads and perform better than a non-replicated database system for read-only workloads.
Database Replication in Large Scale Systems: Optimizing the Number of Replicas
In distributed systems, replication is used for ensuring availability and increasing performance. However, the heavy workload of distributed systems such as Web 2.0 applications or Global Distribution Systems limits the benefit of replication if its degree (i.e., the number of replicas) is not controlled. Since every replica must eventually perform all updates, there is a point beyond which adding more replicas does not increase the throughput, because every replica is saturated by applying updates. Moreover, if the replication degree exceeds the optimal threshold, the surplus replicas generate overhead due to extra communication messages. In this paper, we propose a suitable replication management solution in order to reduce useless replicas. To this end, we define two mathematical models which approximate the appropriate number of replicas to achieve a given level of performance. Moreover, we demonstrate the feasibility of our replication management model through simulation. The results show the effectiveness of our models and their accuracy.
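The saturation effect described in this abstract can be illustrated with a simple capacity model. This is only a sketch under stated assumptions, not either of the paper's two models (the abstract does not give them): every replica is assumed to have the same capacity, reads are spread evenly, and each write is applied on every replica at the same cost as a read. The function names `max_throughput` and `replicas_needed` are hypothetical.

```python
from typing import Optional


def max_throughput(n_replicas: int, capacity: float, write_fraction: float) -> float:
    """Upper bound on total throughput T (operations per second).

    Assumed model: each replica applies all writes (write_fraction * T) plus
    an equal share of reads ((1 - write_fraction) * T / n_replicas), and it
    saturates when that load reaches `capacity`.
    """
    if n_replicas < 1:
        raise ValueError("n_replicas must be at least 1")
    if not 0.0 <= write_fraction <= 1.0:
        raise ValueError("write_fraction must be between 0 and 1")
    # Load placed on one replica per unit of total throughput.
    load_per_unit = write_fraction + (1.0 - write_fraction) / n_replicas
    return capacity / load_per_unit


def replicas_needed(target: float, capacity: float, write_fraction: float,
                    limit: int = 1000) -> Optional[int]:
    """Smallest replica count whose bound reaches `target`, or None if no
    count up to `limit` does (the target lies beyond the saturation point)."""
    for n in range(1, limit + 1):
        if max_throughput(n, capacity, write_fraction) >= target:
            return n
    return None


if __name__ == "__main__":
    for n in (1, 2, 4, 8, 16, 64):
        print(n, round(max_throughput(n, 1000.0, 0.2), 1))
```

Under these assumptions the bound can never exceed `capacity / write_fraction`, however many replicas are added, which is the point beyond which extra replicas only add communication overhead.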
Pangea: An Eager Database Replication Middleware guaranteeing Snapshot Isolation without Modification of Database Servers
2009
Recently, several middleware-based approaches have been proposed. If we implement all functionalities of database replication only in a middleware layer, we can avoid the high cost of modifying existing database servers or building one from scratch. However, it is a big challenge to propose middleware which can enhance performance and scalability without modification of database servers because the restriction may cause extra overhead. Unfortunately, many existing middleware-based approaches suffer from several shortcomings, i.e., some cause a hidden deadlock, some provide only table-level locking, some rely on total order communication tools, and others need to modify existing database servers. In this paper, we propose Pangea, a new eager database replication middleware guaranteeing snapshot isolation that solves the drawbacks of existing middleware by exploiting the property of the first-updater-wins rule. We have implemented the prototype of Pangea on top of unmodified PostgreSQL servers. An advantage of Pangea is that it uses less than 2000 lines of C code. Our experimental results with the TPC-W benchmark reveal that, compared to an existing middleware guaranteeing snapshot isolation without modification of database servers, Pangea provides better performance in terms of throughput and scalability.
Consistent Replication in Distributed Multi-Tier Architectures
Replication is commonly used to address the scalability and availability requirements of collaborative web applications in domains such as computer supported cooperative work, social networking, e-commerce and e-banking. While providing substantial benefits, replication also introduces the overhead of keeping data consistent among the replicated servers. In this work we study the performance of common replication approaches with various consistency guarantees and argue for the feasibility of strong consistency. We propose an efficient, distributed, strong consistency protocol and reveal experimentally that its overhead is not prohibitive. We have implemented a replication middleware that offers different consistency protocols, including our strong consistency protocol. We use the TPC-W transactional web commerce benchmark to provide a comprehensive performance comparison of the different replication approaches under a variety of workload mixes. Keywords: Replication, Consistency, Multi-Tier Architectures.
Resiliency-Aware Data Management
Computing architectures are changing towards massively parallel environments with increasing numbers of heterogeneous components. The large scale in combination with decreasing feature sizes leads to dramatically increasing error rates. The heterogeneity further leads to new error types. Techniques for ensuring resiliency in terms of robustness regarding these errors are typically applied at hardware abstraction and operating system levels. However, as errors become the normal case, we observe increasing costs in terms of computation overhead for ensuring robustness. In this paper, we argue that ensuring resiliency at the data management level can reduce the required overhead by exploiting context knowledge of query processing and data storage. Apart from reacting to already detected errors, this has been mostly neglected in database research so far. We therefore give a broad overview of the background of resilient computing and existing techniques from the database perspective. Based on the lack of existing techniques at the data management level, we raise three fundamental challenges of resiliency-aware data management and present example use cases. Finally, our vision of resiliency-aware data management opens many directions of future work. Fundamental research, including the partial reuse of underlying mechanisms, would allow data management systems to cope with future hardware characteristics by effectively and efficiently ensuring resiliency.
Serializable Snapshot Isolation for Replicated Databases in High-Update Scenarios
Many proposals for managing replicated data use sites running the Snapshot Isolation (SI) concurrency control mechanism, and provide 1-copy SI, or something similar, as the global isolation level. This allows good scalability, since only ww-conflicts need to be managed globally. However, 1-copy SI can lead to data corruption and violation of integrity constraints [5]. 1-copy serializability is the global correctness condition that prevents data corruption. We propose a new algorithm, Replicated Serializable Snapshot Isolation (RSSI), that uses SI at each site and combines this with a certification algorithm to guarantee 1-copy serializable global execution. Management of ww-conflicts is similar to what is done in 1-copy SI. But unlike previous designs for 1-copy serializable systems, we do not need to prevent all rw-conflicts among concurrent transactions. We formalize this in a theorem that shows that many rw-conflicts are indeed false positives that do not risk non-serializable behavior. Our proposed RSSI algorithm will only abort a transaction when it detects a well-defined pattern of two consecutive rw-edges in the serialization graph. We have built a prototype that integrates our RSSI with the existing open-source Postgres-R(SI) system. Our performance evaluation shows that there is a worst-case overhead of about 15% for getting full 1-copy serializability as compared to 1-copy SI in a cluster of 8 nodes, with our proposed RSSI clearly outperforming the previous work [6] for update-intensive workloads.
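The "two consecutive rw-edges" pattern mentioned in this abstract can be sketched as a small graph check. This is a deliberately simplified illustration and not the RSSI certification algorithm: it assumes the rw-edges are already known as (reader, writer) pairs, and it ignores whether the transactions are concurrent and in which order they commit. The function name `find_pivots` and the transaction labels are hypothetical.

```python
from typing import Iterable, List, Tuple


def find_pivots(rw_edges: Iterable[Tuple[str, str]]) -> List[str]:
    """Return transactions with both an incoming and an outgoing rw-edge.

    An edge (reader, writer) means `reader` read a version of an item that
    `writer` later overwrote. A transaction in the middle of two consecutive
    rw-edges (T_in -> pivot -> T_out) is a candidate for abort.
    """
    has_incoming = set()
    has_outgoing = set()
    for reader, writer in rw_edges:
        has_outgoing.add(reader)  # the edge leaves the reader
        has_incoming.add(writer)  # the edge enters the writer
    # Sorted so the result is deterministic.
    return sorted(has_incoming & has_outgoing)


if __name__ == "__main__":
    edges = [("T1", "T2"), ("T2", "T3"), ("T4", "T5")]
    print(find_pivots(edges))
```

A single rw-edge on its own, such as the T4 to T5 edge in the example, flags no transaction here, which mirrors the abstract's point that many rw-conflicts do not need to be prevented.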

