Tashkent+: Memory-aware load balancing and update filtering in replicated databases
- In EuroSys 2007: Proceedings of the 2nd European Conference on Computer Systems
, 2007
Cited by 19 (5 self)
We present a memory-aware load balancing (MALB) technique to dispatch transactions to replicas in a replicated database. Our MALB algorithm exploits knowledge of the working sets of transactions to assign them to replicas in such a way that they execute in main memory, thereby reducing disk I/O. In support of MALB, we introduce a method to estimate the size and the contents of transaction working sets. We also present an optimization called update filtering that reduces the overhead of update propagation between replicas. We show that MALB greatly improves performance over other load balancing techniques – such as round robin, least connections, and locality-aware request distribution (LARD) – that do not use explicit information on how transactions use memory. In particular, LARD demonstrates good performance for read-only static content Web workloads, but it gives performance inferior to MALB for database workloads as it does not efficiently handle large requests. MALB combined with update filtering further boosts performance over LARD. We build a prototype replicated system, called Tashkent+, with which we demonstrate that MALB and update filtering techniques improve performance of the TPC-W and RUBiS benchmarks. In particular, in a 16-replica cluster and using the ordering mix of TPC-W, MALB doubles the throughput over least connections and improves throughput by 52% over LARD. MALB with update filtering further improves throughput to triple that of least connections and more than double that of LARD. Our techniques exhibit super-linear speedup; the throughput of the 16-replica cluster is 37 times the peak throughput of a standalone database due to better use of the cluster’s memory.
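To make the idea in this abstract concrete, here is a minimal sketch of grouping transaction types so that each group's estimated working set fits in one replica's memory. It uses first-fit-decreasing bin packing, which is an assumption chosen for illustration and not necessarily the algorithm used in Tashkent+. The function name, transaction type names, sizes, and memory capacity are all hypothetical.

```python
def pack_by_working_set(working_sets_mb, replica_memory_mb):
    """Group transaction types so each group's estimated working set
    fits in one replica's memory (first-fit decreasing; illustrative only)."""
    groups = []  # each group: {"types": [...], "used_mb": int}
    # Place the largest working sets first; ties are broken by name.
    for tx_type, size in sorted(working_sets_mb.items(),
                                key=lambda kv: (-kv[1], kv[0])):
        for group in groups:
            if group["used_mb"] + size <= replica_memory_mb:
                group["types"].append(tx_type)
                group["used_mb"] += size
                break
        else:
            # No existing group has room (or the type alone exceeds memory):
            # start a new group for it.
            groups.append({"types": [tx_type], "used_mb": size})
    return groups


# Hypothetical working-set estimates (MB) for four transaction types.
estimates = {"browse": 300, "search": 500, "order": 700, "admin": 200}
groups = pack_by_working_set(estimates, replica_memory_mb=1024)
for g in groups:
    print(g["types"], g["used_mb"])
```

A dispatcher could then route each transaction type to the replicas serving its group. The sketch treats working sets as disjoint; accounting for overlap between the working sets of different transaction types would need the content estimates the abstract mentions.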
Tolerating Byzantine Faults in Transaction Processing Systems using Commit Barrier Scheduling
Cited by 17 (0 self)
This paper describes the design, implementation, and evaluation of a replication scheme to handle Byzantine faults in transaction processing database systems. The scheme compares answers from queries and updates on multiple replicas which are unmodified, off-the-shelf systems, to provide a single database that is Byzantine fault tolerant. The scheme works when the replicas are homogeneous, but it also allows heterogeneous replication in which replicas come from different vendors. Heterogeneous replicas reduce the impact of bugs and security compromises because they are implemented independently and are thus less likely to suffer correlated failures. The main challenge in designing a replication scheme for transaction processing systems is ensuring that the different replicas execute transactions in equivalent serial orders while allowing a high degree of concurrency. Our scheme meets this goal using a novel concurrency control protocol, commit barrier scheduling (CBS). We have implemented CBS in the context of a replicated SQL database, HRDB (Heterogeneous Replicated DB), which has been tested with unmodified production versions of several commercial and open source databases as replicas. Our experiments show that an HRDB configuration that can tolerate one faulty replica has only a modest performance overhead (about 17% for the TPC-C benchmark). HRDB successfully masks several Byzantine faults observed in practice and we have used it to find a new bug in MySQL.
Ws-replication: a framework for highly available web services
- In WWW
, 2006
Cited by 15 (1 self)
Due to the rapid acceptance and spread of web services, a number of mission-critical systems will be deployed as web services in the coming years. The availability of those systems must be guaranteed in case of failures and network disconnections. Examples of web services for which availability will be crucial are those belonging to the coordination web service infrastructure, such as web services for transactional coordination (e.g., WS-CAF and WS-Transaction). These services should remain available despite site and connectivity failures to enable business interactions on a 24x7 basis. Some common techniques for attaining availability rely on clustering. However, in an Internet setting a domain can get partitioned from the network due to link overload or other connectivity problems. The unavailability of a coordination service impacts the availability of all
Sprint: a middleware for high-performance transaction processing
- In EuroSys ’07: Proceedings of the ACM SIGOPS/EuroSys European Conference on Computer Systems 2007
, 2007
Cited by 10 (0 self)
Sprint is a middleware infrastructure for high performance and high availability data management. It extends the functionality of a standalone in-memory database (IMDB) server to a cluster of commodity shared-nothing servers. Applications accessing an IMDB are typically limited by the memory capacity of the machine running the IMDB. Sprint partitions and replicates the database into segments and stores them in several data servers. Applications are then limited by the aggregated memory of the machines in the cluster. Transaction synchronization and commitment rely on total-order multicast. Unlike previous approaches, Sprint does not require accurate failure detection to ensure strong consistency, allowing fast reaction to failures. Experiments conducted on a cluster with 32 data servers using TPC-C and a micro-benchmark showed that Sprint can provide very good performance and scalability.
Conflict-Aware Load-Balancing Techniques for Database Replication
, 2006
Cited by 9 (2 self)
Middleware-based database replication protocols require few or no changes in the database engine. Thus, they are more portable and flexible than kernel-based protocols, but have coarser-grain information about transaction access data, resulting in reduced concurrency and increased aborts. This paper proposes conflict-aware load-balancing techniques to increase the concurrency and reduce the abort rate of middleware-based replication protocols. Our algorithms assign transactions to replicas so that the number of conflicting transactions executing on distinct servers is reduced and the processing load is equitably distributed over the servers. Experimental evaluation using a prototype of our system running the TPC-C benchmark showed that aborts can be reduced with no penalty in response time.
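The two goals stated in this abstract, co-locating conflicting transactions and balancing load, can be illustrated with a small greedy placement. This is a sketch under assumptions, not the paper's algorithm: the load cap, the tie-breaking rule, and the conflict relation and load figures in the example are all hypothetical.

```python
def assign(tx_types, conflicts, loads, n_replicas, max_load):
    """Greedy placement: prefer the replica already hosting the most
    conflicting types, subject to a load cap; break ties by lowest load."""
    placement = {}
    replica_load = [0] * n_replicas
    # Place the heaviest transaction types first; ties are broken by name.
    for t in sorted(tx_types, key=lambda name: (-loads[name], name)):
        def score(r):
            colocated = sum(1 for u, ru in placement.items()
                            if ru == r and frozenset((t, u)) in conflicts)
            return (-colocated, replica_load[r], r)

        candidates = [r for r in range(n_replicas)
                      if replica_load[r] + loads[t] <= max_load]
        # If no replica can stay under the cap, consider all replicas.
        target = min(candidates or range(n_replicas), key=score)
        placement[t] = target
        replica_load[target] += loads[t]
    return placement, replica_load


# Hypothetical loads (percent of total work) and conflict pairs.
loads = {"new_order": 40, "payment": 40, "order_status": 10, "stock_level": 10}
conflicts = {frozenset(("new_order", "payment")),
             frozenset(("new_order", "stock_level"))}
placement, replica_load = assign(list(loads), conflicts, loads,
                                 n_replicas=2, max_load=60)
print(placement, replica_load)
```

In this example the cap forces the two heavy conflicting types onto different replicas, while the light conflicting type joins its conflict partner. That trade-off between conflict avoidance and balance is the tension the abstract describes.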
Boosting database replication scalability through partial replication and 1-copy-snapshot-isolation
- In PRDC’07
Cited by 8 (1 self)
Databases have become a crucial component in modern information systems. At the same time, they have become the main bottleneck in most systems. Database replication protocols have been proposed to solve the scalability problem by scaling out in a cluster of sites. Current techniques have attained some degree of scalability; however, there are two main limitations to existing approaches. Firstly, most solutions adopt a full replication model where all sites store a full copy of the database. The coordination overhead imposed by keeping all replicas consistent allows such approaches to achieve only medium scalability. Secondly, most replication protocols rely on the traditional consistency criterion, 1-copy-serializability, which limits concurrency, and thus scalability of the system. In this paper, we first study analytically the performance gains that can be achieved by various partial replication configurations, i.e., configurations where not all sites store all data. From there, we derive a partial replication protocol that provides 1-copy-snapshot isolation as its correctness criterion. We have evaluated the protocol with TPC-W and the results show better scalability than full replication.
Middleware-based Database Replication: The Gaps Between Theory and Practice
, 2008
Cited by 8 (0 self)
The need for high availability and performance in data management systems has been fueling a long running interest in database replication from both academia and industry. However, academic groups often attack replication problems in isolation, overlooking the need for completeness in their solutions, while commercial teams take a holistic approach that often misses opportunities for fundamental innovation. This has created over time a gap between academic research and industrial practice. This paper aims to characterize the gap along three axes: performance, availability, and administration. We build on our own experience developing and deploying replication systems in commercial and academic settings, as well as on a large body of prior related work. We sift through representative examples from the last decade of open-source, academic, and commercial database replication systems and combine this material with case studies from real systems deployed at Fortune 500 customers. We propose two agendas, one for academic research and one for industrial R&D, which we believe can bridge the gap within 5-10 years. This way, we hope to both motivate and help researchers in making the theory and practice of middleware-based database replication more relevant to each other.
Osprey: Implementing MapReduce-Style Fault Tolerance in a Shared-Nothing Distributed Database
Cited by 8 (0 self)
In this paper, we describe a scheme for tolerating and recovering from mid-query faults in a distributed shared-nothing database. Rather than aborting and restarting queries, our system, Osprey, divides running queries into subqueries, and replicates data such that each subquery can be rerun on a different node if the node initially responsible fails or returns too slowly. Our approach is inspired by the fault tolerance properties of MapReduce, in which map or reduce jobs are greedily assigned to workers, and failed jobs are rerun on other workers. Osprey is implemented using a middleware approach, with only a small amount of custom code to handle cluster coordination. Each node in the system is a discrete database system running on a separate machine. Data, in the form of tables, is partitioned amongst database nodes and each partition is replicated on several nodes, using a technique called chained declustering [1]. A coordinator machine acts as a standard SQL interface to users; it transforms an input SQL query into a set of subqueries that are then executed on the nodes. Each subquery represents only a small fraction of the total execution of the query; worker nodes are assigned a new subquery as they finish their current one. In this greedy approach, the amount of work lost due to node failure is small (at most one subquery’s work), and the system is automatically load balanced, because slow nodes will be assigned fewer subqueries. We demonstrate Osprey’s viability as a distributed system for a small data warehouse data set and workload. Our experiments show that the overhead introduced by the middleware is small compared to the workload, and that the system shows promising load balancing and fault tolerance properties.
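The greedy assignment described in this abstract can be sketched as a small simulation. It assumes a simple form of chained declustering in which partition i is stored on node i and on the next node in the ring. The round-based scheduling loop, the function names, and the cluster size are hypothetical and are not taken from Osprey's implementation.

```python
from collections import deque


def chained_replicas(partition, n_nodes, copies=2):
    """Nodes holding a partition under chained declustering:
    the primary node plus the next nodes in the ring (assumed layout)."""
    return [(partition + k) % n_nodes for k in range(copies)]


def run(n_nodes, subqueries_per_partition, alive):
    """Greedily hand out subqueries: in each round, every live worker takes
    one pending subquery from a partition it stores a copy of."""
    pending = {p: deque(range(subqueries_per_partition))
               for p in range(n_nodes)}
    done = {w: [] for w in alive}
    progress = True
    while progress:
        progress = False
        for w in alive:
            # Non-empty partitions this worker holds a copy of.
            mine = [p for p in pending
                    if pending[p] and w in chained_replicas(p, n_nodes)]
            if mine:
                # Take work from the partition with the longest queue.
                p = max(mine, key=lambda q: (len(pending[q]), -q))
                done[w].append((p, pending[p].popleft()))
                progress = True
    return done, pending


# Four nodes, three subqueries per partition, node 2 has failed.
done, pending = run(n_nodes=4, subqueries_per_partition=3, alive=[0, 1, 3])
for worker, work in done.items():
    print(worker, len(work))
```

Even with node 2 down, every partition still has a live replica, so all subqueries finish on the remaining nodes. The sketch gives every live worker one subquery per round, so it does not model slow nodes receiving fewer subqueries.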
Lightweight reflection for middleware-based database replication
- In SRDS’06: Proceedings of the 25th IEEE Symposium on Reliable Distributed Systems (SRDS’06)
, 2006
Cited by 6 (0 self)
Middleware-based database replication approaches have emerged in the last few years as an alternative to traditional database replication implemented within the database kernel. A middleware approach enables third party vendors to provide high availability solutions, a growing practice nowadays in the software industry. However, middleware solutions often lack scalability and exhibit a number of consistency and performance issues. The reason is that in most cases the middleware has to handle the database as a black box, and hence, cannot take advantage of the many optimizations implemented in the database kernel. Thus, middleware solutions often reimplement key functionality but cannot achieve the same efficiency as a kernel implementation. Reflection has been proposed during the last decade as a fruitful paradigm to separate non-functional aspects from functional ones, simplifying software development and maintenance whilst fostering reuse. However, fully reflective databases are not feasible due to the high cost of reflection. Our claim is that by exposing some minimal database functionality through a lightweight reflective interface, efficient and scalable middleware database replication can be attained. In this paper we explore a wide variety of such lightweight reflective interfaces and discuss what kind of replication algorithms they enable. We also discuss implementation alternatives for some of these interfaces and evaluate their performance.
Transactional storage for geo-replicated systems
- In SOSP
, 2011
Cited by 5 (0 self)
We describe the design and implementation of Walter, a key-value store that supports transactions and replicates data across distant sites. A key feature behind Walter is a new property called Parallel Snapshot Isolation (PSI). PSI allows Walter to replicate data asynchronously, while providing strong guarantees within each site. PSI precludes write-write conflicts, so that developers need not worry about conflict-resolution logic. To prevent write-write conflicts and implement PSI, Walter uses two new and simple techniques: preferred sites and counting sets. We use Walter to build a social networking application and port a Twitter-like application.
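The abstract names counting sets without defining them. The sketch below assumes a counting set maps each element to an integer count, with add incrementing and remove decrementing, so that operations commute and replicas applying the same operations in different orders converge. The class and method names are hypothetical and are not Walter's API.

```python
class CountingSet:
    """Set-like structure keeping an integer count per element (assumed
    semantics); add and remove commute because they only adjust counts."""

    def __init__(self):
        self.counts = {}

    def add(self, x):
        self.counts[x] = self.counts.get(x, 0) + 1

    def remove(self, x):
        # The count may drop below zero if a remove arrives before its add.
        self.counts[x] = self.counts.get(x, 0) - 1

    def apply(self, ops):
        for op, x in ops:
            if op == "add":
                self.add(x)
            else:
                self.remove(x)

    def members(self):
        """Elements currently present, i.e. with a positive count."""
        return {x for x, c in self.counts.items() if c > 0}


# The same operations applied in different orders at two sites.
ops = [("add", "alice"), ("remove", "alice"), ("add", "bob")]
site_a, site_b = CountingSet(), CountingSet()
site_a.apply(ops)
site_b.apply(list(reversed(ops)))
print(site_a.members(), site_b.members())
```

Because both sites end with identical counts regardless of order, concurrent updates to such a structure need no conflict-resolution logic, which matches the motivation given in the abstract.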

