Results 1 - 10
of
32
Privatization techniques for software transactional memory
- In Proceedings of the twenty-sixth annual ACM symposium on Principles of distributed computing (PODC’07
"... Early implementations of software transactional memory (STM) assumed that sharable data would be accessed only within transactions. Memory may appear inconsistent in programs that violate this assumption, even when program logic would seem to make extra-transactional accesses safe. Designing STM sys ..."
Abstract
-
Cited by 50 (17 self)
- Add to MetaCart
Early implementations of software transactional memory (STM) assumed that sharable data would be accessed only within transactions. Memory may appear inconsistent in programs that violate this assumption, even when program logic would seem to make extra-transactional accesses safe. Designing STM systems that avoid such inconsistency has been dubbed the privatization problem. We argue that privatization comprises a pair of symmetric subproblems: private operations may fail to see updates made by transactions that have committed but not yet completed; conversely, transactions that are doomed but have not yet aborted may see updates made by private code, causing them to perform erroneous, externally visible operations. We explain how these problems arise in different styles of STM, present strategies to address them, and discuss their implementation tradeoffs. We also propose a taxonomy of contracts between the system and the user, analogous to programmer-centric memory consistency models, which allow us to classify programs based on their privatization requirements. Finally, we present empirical comparisons of several privatization strategies. Our results suggest that the best strategy may depend on application characteristics.
Conflict detection and validation strategies for software transactional memory
- In Proc. of the 20th Intl. Symp. on Distributed Computing
, 2006
"... Abstract. In a software transactional memory (STM) system, conflict detection is the problem of determining when two transactions cannot both safely commit. Validation is the related problem of ensuring that a transaction never views inconsistent data, which might potentially cause a doomed transact ..."
Abstract
-
Cited by 46 (14 self)
- Add to MetaCart
Abstract. In a software transactional memory (STM) system, conflict detection is the problem of determining when two transactions cannot both safely commit. Validation is the related problem of ensuring that a transaction never views inconsistent data, which might potentially cause a doomed transaction to exhibit irreversible, externally visible side effects. Existing mechanisms for conflict detection vary greatly in their degree of speculation and their relative treatment of read-write and write-write conflicts. Validation, for its part, appears to be a dominant factor—perhaps the dominant factor—in the cost of complex transactions. We present the most comprehensive study to date of conflict detection strategies, characterizing the tradeoffs among them and identifying the ones that perform the best for various types of workload. In the process we introduce a lightweight heuristic mechanism—the global commit counter—that can greatly reduce the cost of validation and of single-threaded execution. The heuristic also allows us to experiment with mixed invalidation, a more opportunistic interleaving of reading and writing transactions. Experimental results on a 16-processor SunFire machine running our RSTM system indicate that the choice of conflict detection strategy can have a dramatic impact on performance, and that the best choice is workload dependent. In workloads whose transactions rarely conflict, the commit counter does little to help (and can even hurt) performance. For less scalable applications, however—those in which STM performance has traditionally been most problematic—it can improve transaction throughput many fold. 1
Enforcing isolation and ordering in STM
- In the Proceedings of the Conf. on Programming Language Design and Implementation
, 2007
"... Transactional memory provides a new concurrency control mechanism that avoids many of the pitfalls of lock-based synchronization. High-performance software transactional memory (STM) implementations thus far provide weak atomicity: Accessing shared data both inside and outside a transaction can resu ..."
Abstract
-
Cited by 41 (6 self)
- Add to MetaCart
Transactional memory provides a new concurrency control mechanism that avoids many of the pitfalls of lock-based synchronization. High-performance software transactional memory (STM) implementations thus far provide weak atomicity: Accessing shared data both inside and outside a transaction can result in unexpected, implementation-dependent behavior. To guarantee isolation and consistent ordering in such a system, programmers are expected to enclose all shared-memory accesses inside transactions. A system that provides strong atomicity guarantees isolation even in the presence of threads that access shared data outside transactions. A strongly-atomic system also orders transactions with conflicting non-transactional memory operations in a consistent manner. In this paper, we discuss some surprising pitfalls of weak atomicity, and we present an STM system that avoids these problems
Lowering the overhead of nonblocking software transactional memory
- DEPT. OF COMPUTER SCIENCE, UNIV. OF ROCHESTER
, 2006
"... Recent years have seen the development of several different systems for software transactional memory (STM). Most either employ locks in the underlying implementation or depend on thread-safe general-purpose garbage collection to collect stale data and metadata. We consider the design of low-overhea ..."
Abstract
-
Cited by 38 (8 self)
- Add to MetaCart
Recent years have seen the development of several different systems for software transactional memory (STM). Most either employ locks in the underlying implementation or depend on thread-safe general-purpose garbage collection to collect stale data and metadata. We consider the design of low-overhead, obstruction-free software transactional memory for non-garbage-collected languages. Our design eliminates dynamic allocation of transactional metadata and co-locates data that are separate in other systems, thereby reducing the expected number of cache misses on the common-case code path, while preserving nonblocking progress and requiring no atomic instructions other than single-word load, store, and compare-and-swap (or load-linked/store-conditional). We also employ a simple, epoch-based storage management system and introduce a novel conservative mechanism to make reader transactions visible to writers without inducing additional metadata copying or dynamic allocation. Experimental results show throughput significantly higher than that of existing nonblocking STM systems, and highlight significant, application-specific differences among conflict detection and validation strategies.
Practical weak-atomicity semantics for Java STM
- In ACM Symposium on Parallellism in Algorithms and Architectures
, 2008
"... As memory transactions have been proposed as a languagelevel replacement for locks, there is growing need for welldefined semantics. In contrast to database transactions, transaction memory (TM) semantics are complicated by the fact that programs may access the same memory locations both inside and ..."
Abstract
-
Cited by 30 (3 self)
- Add to MetaCart
As memory transactions have been proposed as a languagelevel replacement for locks, there is growing need for welldefined semantics. In contrast to database transactions, transaction memory (TM) semantics are complicated by the fact that programs may access the same memory locations both inside and outside transactions. Strongly atomic semantics, where non-transactional accesses are treated as implicit single-operation transactions, remain difficult to provide without specialized hardware support or significant performance overhead. As an alternative, many in the community have informally proposed that a single global lock semantics [18, 10], where transaction semantics are mapped to those of regions protected by a single global lock, provide an intuitive and efficiently implementable model for programmers. In this paper, we explore the implementation and performance implications of single global lock semantics in a weakly atomic STM from the perspective of Java, and we discuss why even recent STM implementations fall short of these semantics. We describe a new weakly atomic Java STM implementation that provides single global lock semantics while permitting concurrent execution, but we show that this comes at a significant performance cost. We also propose and implement various alternative semantics that loosen single lock requirements while still providing strong guarantees. We compare our new implementations to previous ones, including a strongly atomic STM. [24]
JudoSTM: A Dynamic Binary-Rewriting Approach to Software Transactional Memory
- In PACT ’07: Proc. of the 16th Intl. Conf. on Parallel Architecture and Compilation Techniques (PACT 2007), Brasov
, 2007
"... With the advent of chip-multiprocessors, we are faced with the challenge of parallelizing performance-critical software. Transactional memory (TM) has emerged as a promising programming model allowing programmers to focus on parallelism rather than maintaining correctness and avoiding deadlock. Many ..."
Abstract
-
Cited by 29 (1 self)
- Add to MetaCart
With the advent of chip-multiprocessors, we are faced with the challenge of parallelizing performance-critical software. Transactional memory (TM) has emerged as a promising programming model allowing programmers to focus on parallelism rather than maintaining correctness and avoiding deadlock. Many implementations of hardware, software, and hybrid support for TM have been proposed; of these, software-only implementations (STMs) are especially compelling since they can be used with current commodity hardware. However, in addition to higher overheads, many existing STM systems are limited to either managed languages or intrusive APIs. Furthermore, transactions in STMs cannot normally contain calls to unobservable code such as shared libraries or system calls. In this paper we present JudoSTM, a novel dynamic binary-rewriting approach to implementing STM that supports C and C++ code. Furthermore, by using value-based conflict detection, JudoSTM additionally supports the transactional execution of both (i) irreversible system calls and (ii) library functions that may contain locks. We significantly lower overhead through several novel optimizations that improve the quality of rewritten code and reduce the cost of conflict detection and buffering. We show that our approach performs comparably to Rochester’s RSTM library-based implementation—demonstrating that a dynamic binary-rewriting approach to implementing STM is an interesting alternative. 1.
Understanding tradeoffs in software transactional memory
- IN PROCEEDINGS OF THE INTERNATIONAL SYMPOSIUM ON CODE GENERATION AND OPTIMIZATION
, 2007
"... There has been a flurry of recent work on the design of high performance software and hybrid hardware/software transactional memories (STMs and HyTMs). This paper reexamines the design decisions behind several of these stateof-the-art algorithms, adopting some ideas, rejecting others, all in an atte ..."
Abstract
-
Cited by 19 (1 self)
- Add to MetaCart
There has been a flurry of recent work on the design of high performance software and hybrid hardware/software transactional memories (STMs and HyTMs). This paper reexamines the design decisions behind several of these stateof-the-art algorithms, adopting some ideas, rejecting others, all in an attempt to make STMs faster. We created the transactional locking (TL) framework of STM algorithms and used it to conduct a range of comparisons of the performance of non-blocking, lock-based, and Hybrid STM algorithms versus fine-grained hand-crafted ones. We were able to make several illuminating observations regarding lock acquisition order, the interaction of STMs with memory management schemes, and the role of overheads and abort rates in STM performance.
Anatomy of a scalable software transactional memory
- 2009, 4th ACM SIGPLAN Workshop on Transactional Computing (TRANSACT’09
, 2009
"... Existing software transactional memory (STM) implementations often exhibit poor scalability, usually because of nonscalable mechanisms for read sharing, transactional consistency, and privatization; some STMs also have nonscalable centralized commit mechanisms. We describe novel techniques to elimin ..."
Abstract
-
Cited by 18 (7 self)
- Add to MetaCart
Existing software transactional memory (STM) implementations often exhibit poor scalability, usually because of nonscalable mechanisms for read sharing, transactional consistency, and privatization; some STMs also have nonscalable centralized commit mechanisms. We describe novel techniques to eliminate bottlenecks from all of these mechanisms, and present SkySTM, which employs these techniques. SkySTM is the first STM that supports privatization and scales on modern multicore multiprocessors with hundreds of hardware threads on multiple chips. A central theme in this work is avoiding frequent updates to centralized metadata, especially for multi-chip systems, in which the cost of accessing centralized metadata increases dramatically. A key mechanism we use to do so is a scalable nonzero indicator (SNZI), which was designed for this purpose. A secondary contribution of the paper is a new and simplified SNZI algorithm. Our scalable privatization mechanism imposes only about 4% overhead in low-contention experiments; when contention is higher, the overhead still reaches only 35 % with over 250 threads. In contrast, prior approaches have been reported as imposing over 100 % overhead in some cases, even with only 8 threads. 1.
Kicking the tires of software transactional memory: Why the going gets tough
- In ACM Symp. on Parallelism in Algorithms and Arch. (SPAA
, 2008
"... Transactional Memory (TM) promises to simplify concurrent programming, which has been notoriously difficult but crucial in realizing the performance benefit of multi-core processors. Software Transaction Memory (STM), in particular, represents a body of important TM technologies since it provides a ..."
Abstract
-
Cited by 17 (0 self)
- Add to MetaCart
Transactional Memory (TM) promises to simplify concurrent programming, which has been notoriously difficult but crucial in realizing the performance benefit of multi-core processors. Software Transaction Memory (STM), in particular, represents a body of important TM technologies since it provides a mechanism to run transactional programs when hardware TM support is not available, or when hardware TM resources are exhausted. Nonetheless, most previous studies on STMs were constrained to executing trivial, small-scale workloads. The assumption was that the same techniques applied to small-scale workloads could readily be applied to real-life, large-scale workloads. However, by executing several nontrivial workloads such as particle dynamics simulation and game physics engine on a state of the art STM, we noticed that this assumption does not hold. Specifically, we identified four major performance bottlenecks that were unique to the case of executing large-scale workloads on an STM: false conflicts, overinstrumentation, privatization-safety cost, and poor amortization. We believe that these bottlenecks would be common for any STM targeting real-world applications. In this paper, we describe those identified bottlenecks in detail, and we propose novel solutions to alleviate the issues. We also thoroughly validate these approaches with experimental results on real machines.
PhTM: Phased transactional memory
- In Workshop on Transactional Computing (Transact), 2007. research.sun.com/scalable/pubs/ TRANSACT2007PhTM.pdf
"... Hybrid transactional memory (HyTM) [3] works in today’s systems, and can use future “best effort ” hardware transactional memory (HTM) support to improve performance. Best effort HTM can be substantially simpler than alternative “unbounded ” HTM designs being proposed in the literature, so HyTM both ..."
Abstract
-
Cited by 15 (5 self)
- Add to MetaCart
Hybrid transactional memory (HyTM) [3] works in today’s systems, and can use future “best effort ” hardware transactional memory (HTM) support to improve performance. Best effort HTM can be substantially simpler than alternative “unbounded ” HTM designs being proposed in the literature, so HyTM both supports and encourages an incremental approach to adopting HTM. We introduce Phased Transactional Memory (PhTM), which supports switching between different “phases”, each implemented by a different form of transactional memory support. This allows us to adapt between a variety of different transactional memory implementations according to the current environment and workload. We describe a simple PhTM prototype, and present experimental results showing that PhTM can match the performance and scalability of unbounded HTM implementations better than our previous HyTM prototype when best effort HTM support is available and effective, and is more competitive with state-of-the-art software transactional memory implementations when it is not. 1.

