Results 1 - 10
of
125
A new approach to developing and implementing eager database replication protocols
- ACM TODS
"... Database replication is traditionally seen as a way to increase the availability and performance of distributed databases. Although a large number of protocols providing data consistency and fault-tolerance have been proposed, few of these ideas have ever been used in commercial products due to thei ..."
Abstract
-
Cited by 101 (12 self)
- Add to MetaCart
Database replication is traditionally seen as a way to increase the availability and performance of distributed databases. Although a large number of protocols providing data consistency and fault-tolerance have been proposed, few of these ideas have ever been used in commercial products due to their complexity and performance implications. Instead, current products allow inconsistencies and often resort to centralized approaches which eliminates some of the advantages of replication. As an alternative, we propose a suite of replication protocols that addresses the main problems related to database replication. On the one hand, our protocols maintain data consistency and the same transactional semantics found in centralized systems. On the other hand, they provide flexibility and reasonable performance. To do so, our protocols take advantage of the rich semantics of group communication primitives and the relaxed isolation guarantees provided by most databases. This allows us to eliminate the possibility of deadlocks, reduce the message overhead and increase performance. A detailed simulation study shows the feasibility of the approach and the flexibility with which different types of bottlenecks can be circumvented.
Ganymed: Scalable Replication for Transactional Web Applications
- In Proceedings of the 5th ACM/IFIP/Usenix International Middleware Conference
, 2004
"... Data grids, large scale web applications generating dynamic content and database service providing pose significant scalability challenges to database engines. Replication is the most common solution but it involves di#cult trade-o#s. The most di#cult one is the choice between scalability and co ..."
Abstract
-
Cited by 93 (8 self)
- Add to MetaCart
Data grids, large scale web applications generating dynamic content and database service providing pose significant scalability challenges to database engines. Replication is the most common solution but it involves di#cult trade-o#s. The most di#cult one is the choice between scalability and consistency. Commercial systems give up consistency. Research solutions typically either o#er a compromise (limited scalability in exchange for consistency) or impose limitations on the data schema and the workload. In this paper we introduce Ganymed, a database replication middleware intended to provide scalability without sacrificing consistency and avoiding the limitations of existing approaches. The main idea is to use a novel transaction scheduling algorithm that separates update and read-only transactions. Transactions can be submitted to Ganymed through a special JDBC driver. Ganymed then routes updates to a main server and queries to a potentially unlimited number of readonly copies. The system guarantees that all transactions see a consistent data state (snapshot isolation). In the paper we describe the scheduling algorithm, the architecture of Ganymed, and present an extensive performance evaluation that proves the potential of the system.
On the correctness of transactional memory
- In PPoPP
, 2008
"... Transactional memory (TM) is an appealing abstraction for programming multi-core systems. Potential target applications for TM, such as business software and video games, are likely to involve complex data structures and large transactions, requiring specific software solutions (STM). So far, howeve ..."
Abstract
-
Cited by 76 (17 self)
- Add to MetaCart
Transactional memory (TM) is an appealing abstraction for programming multi-core systems. Potential target applications for TM, such as business software and video games, are likely to involve complex data structures and large transactions, requiring specific software solutions (STM). So far, however, STMs have been mainly evaluated and optimized for smaller scale benchmarks. We revisit the main STM design choices from the perspective of complex workloads and propose a new STM, which we call SwissTM. In short, SwissTM is lock- and word-based and uses (1) optimistic (commit-time) conflict detection for read/write conflicts and pessimistic (encounter-time) conflict detection for write/write conflicts, as well as (2) a new two-phase contention manager that ensures the progress of long transactions while inducing no overhead on short ones. SwissTM outperforms state-of-theart STM implementations, namely RSTM, TL2, and TinySTM, in our experiments on STMBench7, STAMP, Lee-TM and red-black tree benchmarks. Beyond SwissTM, we present the most complete evaluation to date of the individual impact of various STM design choices on the ability to support the mixed workloads of large applications.
Associating synchronization constraints with data in an object-oriented language
- In Proceedings of the ACM Symposium on the Principles of Programming Languages
, 2006
"... Concurrency-related bugs may happen when multiple threads access shared data and interleave in ways that do not correspond to any sequential execution. Their absence is not guaranteed by the traditional notion of “data race ” freedom. We present a new definition of data races in terms of 11 problema ..."
Abstract
-
Cited by 71 (3 self)
- Add to MetaCart
Concurrency-related bugs may happen when multiple threads access shared data and interleave in ways that do not correspond to any sequential execution. Their absence is not guaranteed by the traditional notion of “data race ” freedom. We present a new definition of data races in terms of 11 problematic interleaving scenarios, and prove that it is complete by showing that any execution not exhibiting these scenarios is serializable for a chosen set of locations. Our definition subsumes the traditional definition of a data race as well as high-level data races such as stale-value errors and inconsistent views. We also propose a language feature called atomic sets of locations, which lets programmers specify the existence of consistency properties between fields in objects, without specifying the properties themselves. We use static analysis to automatically infer those points in the code where synchronization is needed to avoid data races under our new definition. An important benefit of this approach is that, in general, far fewer annotations are required than is the case with existing approaches such as synchronized blocks or atomic sections. Our implementation successfully inferred the appropriate synchronization for a significant subset of Java’s Standard Collections framework.
Database replication using generalized snapshot isolation
- In SRDS
, 2005
"... Generalized snapshot isolation extends snapshot isolation as used in Oracle and other databases in a manner suitable for replicated databases. While (conventional) snapshot isolation requires that transactions observe the “latest” snapshot of the database, generalized snapshot isolation allows the u ..."
Abstract
-
Cited by 55 (6 self)
- Add to MetaCart
Generalized snapshot isolation extends snapshot isolation as used in Oracle and other databases in a manner suitable for replicated databases. While (conventional) snapshot isolation requires that transactions observe the “latest” snapshot of the database, generalized snapshot isolation allows the use of “older ” snapshots, facilitating a replicated implementation. We show that many of the desirable properties of snapshot isolation remain. In particular, read-only transactions never block or abort and they do not cause update transactions to block or abort. Moreover, under certain assumptions on the transaction workload the execution is serializable. An implementation of generalized snapshot isolation can choose which past snapshot it uses. An interesting choice for a replicated database is prefix-consistent snapshot isolation, in which the snapshot contains at least all the writes of locally committed transactions. We present two implementations of prefix-consistent snapshot isolation. We conclude with an analytical performance model of one implementation, demonstrating the benefits, in particular reduced latency for read-only transactions, and showing that the potential downsides, in particular change in abort rate of update transactions, are limited. 1.
Making Snapshots Isolation Serializable
, 2000
"... Snapshot Isolation (SI) is a multiversion concurrency control algorithm, first described in Berenson et al. [1995]. SI is attractive because it provides an isolation level that avoids many of the common concurrency anomalies, and has been implemented by Oracle and Microsoft SQL Server (with certain ..."
Abstract
-
Cited by 45 (2 self)
- Add to MetaCart
Snapshot Isolation (SI) is a multiversion concurrency control algorithm, first described in Berenson et al. [1995]. SI is attractive because it provides an isolation level that avoids many of the common concurrency anomalies, and has been implemented by Oracle and Microsoft SQL Server (with certain minor variations). SI does not guarantee serializability in all cases, but the TPC-C benchmark application [TPC-C], for example, executes under SI without serialization anomalies. All major database system products are delivered with default nonserializable isolation levels, often ones that encounter serialization anomalies more commonly than SI, and we suspect that numerous isolation errors occur each day at many large sites because of this, leading to corrupt data sometimes noted in data warehouse applications. The classical justification for lower isolation levels is that applications can be run under such levels to improve efficiency when they can be shown not to result in serious errors, but little or no guidance has been offered to application programmers and DBAs by vendors as to how to avoid such errors. This article develops a theory that characterizes when nonserializable executions of applications can occur under SI. Near the end of the article, we apply this theory to demonstrate that the TPC-C benchmark application has no serialization anomalies under SI, and then discuss how this demonstration can be generalized to other applications. We also present a discussion on how to modify the program logic of applications that are nonserializable under SI so that serializability will be guaranteed.
Improving the Scalability of Fault-Tolerant Database Clusters
"... Replication has become a central element in modern information systems playing a dual role: increase availability and enhance scalability. Unfortunately, ..."
Abstract
-
Cited by 42 (4 self)
- Add to MetaCart
Replication has become a central element in modern information systems playing a dual role: increase availability and enhance scalability. Unfortunately,
A lazy snapshot algorithm with eager validation
- In Proceedings of the 20th International Symposium on Distributed Computing (DISC’06
, 2006
"... Abstract. Most high-performance software transactional memories (STM) use optimistic invisible reads. Consequently, a transaction might have an inconsistent view of the objects it accesses unless the consistency of the view is validated whenever the view changes. Although all STMs usually detect inc ..."
Abstract
-
Cited by 40 (10 self)
- Add to MetaCart
Abstract. Most high-performance software transactional memories (STM) use optimistic invisible reads. Consequently, a transaction might have an inconsistent view of the objects it accesses unless the consistency of the view is validated whenever the view changes. Although all STMs usually detect inconsistencies at commit time, a transaction might never reach this point because an inconsistent view can provoke arbitrary behavior in the application (e.g., enter an infinite loop). In this paper, we formally introduce a lazy snapshot algorithm that verifies at each object access that the view observed by a transaction is consistent. Validating previously accessed objects is not necessary for that, however, it can be used on-demand to prolong the view’s validity. We demonstrate both formally and by measurements that the performance of our approach is quite competitive by comparing other STMs with an STM that uses our algorithm. 1
Generalized Isolation Level Definitions
- IN ICDE
, 2000
"... Commercial databases support different isolation levels to allow programmers to trade off consistency for a potential gain in performance. The isolation levels are defined in the current ANSI standard, but the definitions are ambiguous and revised definitions proposed to correct the problem are too ..."
Abstract
-
Cited by 31 (0 self)
- Add to MetaCart
Commercial databases support different isolation levels to allow programmers to trade off consistency for a potential gain in performance. The isolation levels are defined in the current ANSI standard, but the definitions are ambiguous and revised definitions proposed to correct the problem are too constrained since they allow only pessimistic (locking) implementations. This paper presents new specifications for the ANSI levels. Our specifications are portable; they apply not only to locking implementations, but also to optimistic and multi-version concurrency control schemes. Furthermore, unlike earlier definitions, our new specifications handle predicates in a correct and flexible manner at all levels.
Tashkent: Uniting Durability with Transaction Ordering for High-Performance Scalable Database Replication
- In EuroSys
, 2006
"... In stand-alone databases, the functions of ordering the transaction commits and making the effects of transactions durable are performed in one single action, namely the writing of the commit record to disk. For efficiency many of these writes are grouped into a single disk operation. In replicated ..."
Abstract
-
Cited by 31 (4 self)
- Add to MetaCart
In stand-alone databases, the functions of ordering the transaction commits and making the effects of transactions durable are performed in one single action, namely the writing of the commit record to disk. For efficiency many of these writes are grouped into a single disk operation. In replicated databases in which all replicas agree on the commit order of update transactions, these two functions are typically separated. Specifically, the replication middleware determines the global commit order, while the database replicas make the transactions durable. The contribution of this paper is to demonstrate that this separation causes a significant scalability bottleneck. It forces some of the commit records to be written to disk serially, where in a standalone system they could have been grouped together in a single disk write. Two solutions are possible: (1) move durability from the database to the replication middleware, or (2) keep durability in the database and pass the global commit order from the replication middleware to the database. We implement these two solutions. Tashkent-MW is a pure middleware solution that combines durability and ordering in the middleware, and treats an unmodified database as a black box. In Tashkent-API, we modify the database API so that the middleware can specify the commit order to the database, thus, combining ordering and durability inside the database. We compare both Tashkent systems to an otherwise identical replicated system, called Base, in which ordering and durability remain separated. Under high update transaction loads both Tashkent systems greatly outperform Base in throughput and response time.

