Results 1 - 10
of
28
Ganymed: Scalable Replication for Transactional Web Applications
- In Proceedings of the 5th ACM/IFIP/Usenix International Middleware Conference
, 2004
"... Data grids, large scale web applications generating dynamic content and database service providing pose significant scalability challenges to database engines. Replication is the most common solution but it involves di#cult trade-o#s. The most di#cult one is the choice between scalability and co ..."
Abstract
-
Cited by 93 (8 self)
- Add to MetaCart
Data grids, large scale web applications generating dynamic content and database service providing pose significant scalability challenges to database engines. Replication is the most common solution but it involves di#cult trade-o#s. The most di#cult one is the choice between scalability and consistency. Commercial systems give up consistency. Research solutions typically either o#er a compromise (limited scalability in exchange for consistency) or impose limitations on the data schema and the workload. In this paper we introduce Ganymed, a database replication middleware intended to provide scalability without sacrificing consistency and avoiding the limitations of existing approaches. The main idea is to use a novel transaction scheduling algorithm that separates update and read-only transactions. Transactions can be submitted to Ganymed through a special JDBC driver. Ganymed then routes updates to a main server and queries to a potentially unlimited number of readonly copies. The system guarantees that all transactions see a consistent data state (snapshot isolation). In the paper we describe the scheduling algorithm, the architecture of Ganymed, and present an extensive performance evaluation that proves the potential of the system.
Fault-scalable Byzantine fault-tolerant services
- In Proceedings of the 20th ACM Symposium on Operating Systems Principles
, 2005
"... A fault-scalable service can be configured to tolerate increasing numbers of faults without significant decreases in performance. The Query/Update (Q/U) protocol is a new tool that enables construction of fault-scalable Byzantine faulttolerant services. The optimistic quorum-based nature of the Q/U ..."
Abstract
-
Cited by 92 (6 self)
- Add to MetaCart
A fault-scalable service can be configured to tolerate increasing numbers of faults without significant decreases in performance. The Query/Update (Q/U) protocol is a new tool that enables construction of fault-scalable Byzantine faulttolerant services. The optimistic quorum-based nature of the Q/U protocol allows it to provide better throughput and fault-scalability than replicated state machines using agreement-based protocols. A prototype service built using the Q/U protocol outperforms the same service built using a popular replicated state machine implementation at all system sizes in experiments that permit an optimistic execution. Moreover, the performance of the Q/U protocol decreases by only 36 % as the number of Byzantine faults tolerated increases from one to five, whereas the performance of the replicated state machine decreases by 83%.
Middle-R: Consistent Database Replication at the Middleware Level
- ACM Trans. Comput. Syst
, 2005
"... The widespread use of clusters and web farms has increased the importance of data replication. In this paper, we show how to implement consistent and scalable data replication at the middleware level. We do this by combining transactional concurrency control with group communication primitives. The ..."
Abstract
-
Cited by 59 (7 self)
- Add to MetaCart
The widespread use of clusters and web farms has increased the importance of data replication. In this paper, we show how to implement consistent and scalable data replication at the middleware level. We do this by combining transactional concurrency control with group communication primitives. The paper presents different replication protocols, argues their correctness, describes their implementation as part of a generic middleware tool, and proves their feasibility with an extensive performance evaluation. The solution proposed is well suited for a variety of applications including web farms and distributed object platforms.
Chain Replication for Supporting High Throughput and Availability
"... Chain replication is a new approach to coordinating clusters of fail-stop storage servers. The approach is intended for supporting large-scale storage services that exhibit high throughput and availability without sacrificing strong consistency guarantees. Besides outlining the chain replication pro ..."
Abstract
-
Cited by 56 (3 self)
- Add to MetaCart
Chain replication is a new approach to coordinating clusters of fail-stop storage servers. The approach is intended for supporting large-scale storage services that exhibit high throughput and availability without sacrificing strong consistency guarantees. Besides outlining the chain replication protocols themselves, simulation experiments explore the performance characteristics of a prototype implementation. Throughput, availability, and several objectplacement strategies (including schemes based on distributed hash table routing) are discussed.
Adaptive middleware for data replication
- In Middleware
, 2004
"... Abstract. Dynamically adaptive systems sense their environment and adjust themselves to accommodate to changes in order to maximize performance. Depending on the type of change (e.g., modifications of the load, the type of workload, the available resources, the client distribution, etc.), different ..."
Abstract
-
Cited by 20 (4 self)
- Add to MetaCart
Abstract. Dynamically adaptive systems sense their environment and adjust themselves to accommodate to changes in order to maximize performance. Depending on the type of change (e.g., modifications of the load, the type of workload, the available resources, the client distribution, etc.), different adjustments have to be made. Coordinating them is already difficult in a centralized system. Doing so in the currently prevalent component-based distributed systems is even more challenging. In this paper, we present an adaptive distributed middleware for data replication that is able to adjust to changes in the amount of load submitted to the different replicas and to the type of workload submitted. Its novelty lies in combining load-balancing techniques with feedback driven adjustments of multiprogramming levels (number of transactions that are allowed to execute concurrently). An extensive performance analysis shows that the proposed adaptive replication solution can provide high throughput, good scalability, and low response times for changing loads and workloads with little overhead. 1
Fine-Grained Replication and Scheduling with Freshness and Correctness Guarantees
- In Proceedings of the 31st International Conference on Very Large Data Bases
, 2005
"... Lazy replication protocols provide good scalability properties by decoupling transaction execution from the propagation of new values to replica sites while guaranteeing a correct and more efficient transaction processing and replica maintenance. ..."
Abstract
-
Cited by 10 (0 self)
- Add to MetaCart
Lazy replication protocols provide good scalability properties by decoupling transaction execution from the propagation of new values to replica sites while guaranteeing a correct and more efficient transaction processing and replica maintenance.
An integrated approach to recovery and high availability in an updatable, distributed data warehouse
- In Proc. VLDB
, 2006
"... Any highly available data warehouse will use some form of data replication to tolerate machine failures. In this paper, we demonstrate that we can leverage this data redundancy to build an integrated approach to recovery and high availability. Our approach, called HARBOR, revives a crashed site by q ..."
Abstract
-
Cited by 10 (2 self)
- Add to MetaCart
Any highly available data warehouse will use some form of data replication to tolerate machine failures. In this paper, we demonstrate that we can leverage this data redundancy to build an integrated approach to recovery and high availability. Our approach, called HARBOR, revives a crashed site by querying remote, online sites for missing updates and uses timestamps to determine which tuples need to be copied or updated. HARBOR does not require a stable log, recovers without quiescing the system, allows replicated data to be stored non-identically, and is simpler than a log-based recovery algorithm. We compare the runtime overhead and recovery performance of HARBOR to those of two-phase commit and AR-IES, the gold standard for log-based recovery, on a threenode distributed database system. Our experiments demonstrate that HARBOR suffers lower runtime overhead, has recovery performance comparable to ARIES’s, and can tolerate the fault of a worker and efficiently bring it back online. 1.
Boosting database replication scalability through partial replication and 1-copy-snapshotisolation
- In PRDC’07
"... Databases have become a crucial component in modern information systems. At the same time, they have become the main bottleneck in most systems. Database replication protocols have been proposed to solve the scalability problem by scaling out in a cluster of sites. Current techniques have attained s ..."
Abstract
-
Cited by 8 (1 self)
- Add to MetaCart
Databases have become a crucial component in modern information systems. At the same time, they have become the main bottleneck in most systems. Database replication protocols have been proposed to solve the scalability problem by scaling out in a cluster of sites. Current techniques have attained some degree of scalability, however there are two main limitations to existing approaches. Firstly, most solutions adopt a full replication model where all sites store a full copy of the database. The coordination overhead imposed by keeping all replicas consistent allows such approaches to achieve only medium scalabilitiy. Secondly, most replication protocols rely on the traditional consistency criterion, 1-copy-serializability, which limits concurrency, and thus scalability of the system. In this paper, we first analyze analytically the performance gains that can be achieved by various partial replication configurations, i.e., configurations where not all sites store all data. From there, we derive a partial replication protocol that provides 1-copy-snapshot isolation as correctness criterion. We have evaluated the protocol with TPC-W and the results show better scalability than full replication.
Middleware-based Database Replication: The Gaps Between Theory and Practice
, 2008
"... The need for high availability and performance in data management systems has been fueling a long running interest in database replication from both academia and industry. However, academic groups often attack replication problems in isolation, overlooking the need for completeness in their solution ..."
Abstract
-
Cited by 8 (0 self)
- Add to MetaCart
The need for high availability and performance in data management systems has been fueling a long running interest in database replication from both academia and industry. However, academic groups often attack replication problems in isolation, overlooking the need for completeness in their solutions, while commercial teams take a holistic approach that often misses opportunities for fundamental innovation. This has created over time a gap between academic research and industrial practice. This paper aims to characterize the gap along three axes: performance, availability, and administration. We build on our own experience developing and deploying replication systems in commercial and academic settings, as well as on a large body of prior related work. We sift through representative examples from the last decade of open-source, academic, and commercial database replication systems and combine this material with case studies from real systems deployed at Fortune 500 customers. We propose two agendas, one for academic research and one for industrial R&D, which we believe can bridge the gap within 5-10 years. This way, we hope to both motivate and help researchers in making the theory and practice of middleware-based database replication more relevant to each other.
Lightweight reflection for middleware-based database replication
- In SRDS’06: Proceedings of the 25th IEEE Symposium on Reliable Distributed Systems (SRDS’06
, 2006
"... Middleware-based database replication approaches have emerged in the last few years as an alternative to traditional database replication implemented within the database kernel. A middleware approach enables third party vendors to provide high availability solutions, a growing practice nowadays in t ..."
Abstract
-
Cited by 6 (0 self)
- Add to MetaCart
Middleware-based database replication approaches have emerged in the last few years as an alternative to traditional database replication implemented within the database kernel. A middleware approach enables third party vendors to provide high availability solutions, a growing practice nowadays in the software industry. However, middleware solutions often lack scalability and exhibit a number of consistency and performance issues. The reason is that in most cases the middleware has to handle the database as a black box, and hence, cannot take advantage of the many optimizations implemented in the database kernel. Thus, middleware solutions often reimplement key functionality but cannot achieve the same efficiency as a kernel implementation. Reflection has been proposed during the last decade as a fruitful paradigm to separate non-functional aspects from functional ones, simplifying software development and maintenance whilst fostering reuse. However, fully reflective databases are not feasible due to the high cost of reflection. Our claim is that by exposing some minimal database functionality through a lightweight reflective interface, efficient and scalable middleware database replication can be attained. In this paper we explore a wide variety of such lightweight reflective interfaces and discuss what kind of replication algorithms they enable. We also discuss implementation alternatives for some of these interfaces and evaluate their performance.

