Results 1 - 10
of
32
TxLinux: Using and Managing Hardware Transactional Memory in an Operating System
- In Proceedings of the 21st ACM SIGOPS Symposium on Operating Systems Principles
, 2007
"... TxLinux is a variant of Linux that is the first operating system to use hardware transactional memory (HTM) as a synchronization primitive, and the first to manage HTM in the scheduler. This paper describes and measures TxLinux and discusses two innovations in detail: cooperation between locks and t ..."
Abstract
-
Cited by 27 (2 self)
- Add to MetaCart
TxLinux is a variant of Linux that is the first operating system to use hardware transactional memory (HTM) as a synchronization primitive, and the first to manage HTM in the scheduler. This paper describes and measures TxLinux and discusses two innovations in detail: cooperation between locks and transactions, and the integration of transactions with the OS scheduler. Mixing locks and transactions requires a new primitive, cooperative transactional spinlocks (cxspinlocks) that allow locks and transactions to protect the same data while maintaining the advantages of both synchronization primitives. Cxspinlocks allow the system to attempt execution of critical regions with transactions and automatically roll back to use locking if the region performs I/O. Integrating the scheduler with HTM eliminates priority inversion. On a series of real-world benchmarks TxLinux has similar performance to Linux, exposing concurrency with as many as 32 concurrent threads on 32 CPUs in the same critical region.
Transactions with isolation and cooperation
- In ACM Symposium on Object Oriented Programming: Systems, Languages, and Applications (OOPSLA
, 2007
"... We present the TIC (Transactions with Isolation and Cooperation) model for concurrent programming. TIC adds to standard transactional memory the ability for a transaction to observe the effects of other threads at selected points. This allows transactions to cooperate, as well as to invoke nonrepeat ..."
Abstract
-
Cited by 24 (3 self)
- Add to MetaCart
We present the TIC (Transactions with Isolation and Cooperation) model for concurrent programming. TIC adds to standard transactional memory the ability for a transaction to observe the effects of other threads at selected points. This allows transactions to cooperate, as well as to invoke nonrepeatable or irreversible operations, such as I/O. Cooperating transactions run the danger of exposing intermediate state and of having other threads change the transaction’s state. The TIC model protects against unanticipated interference by having the type system keep track of all operations that may (transitively) violate the atomicity of a transaction and require the programmer to establish consistency at appropriate points. The result is a programming model that is both general and simple. We have used the TIC model to re-engineer existing lock-based applications including a substantial multi-threaded web mail server and a memory allocator with coarse-grained locking. Our experience confirms the features of the TIC model: It is convenient for the programmer, while maintaining the benefits of transactional memory.
Implementing signatures for transactional memory
- 40th Intl. Symp. on Microarchitecture
, 2007
"... Transactional Memory (TM) systems must track the read and write sets—items read and written during a transaction—to detect conflicts among concurrent transactions. Several TMs use signatures, which summarize unbounded read/write sets in bounded hardware at a performance cost of false positives (conf ..."
Abstract
-
Cited by 23 (4 self)
- Add to MetaCart
Transactional Memory (TM) systems must track the read and write sets—items read and written during a transaction—to detect conflicts among concurrent transactions. Several TMs use signatures, which summarize unbounded read/write sets in bounded hardware at a performance cost of false positives (conflicts detected when none exists). This paper examines different organizations to achieve hardware-efficient and accurate TM signatures. First, we find that implementing each signature with a single k-hashfunction Bloom filter (True Bloom signature) is inefficient, as it requires multi-ported SRAMs. Instead, we advocate using k single-hash-function Bloom filters in parallel (Parallel Bloom signature), using area-efficient single-ported SRAMs. Our formal analysis shows that both organizations perform equally well in theory and our simulationbased evaluation shows this to hold approximately in practice. We also show that by choosing high-quality hash functions we can achieve signature designs noticeably more accurate than the previously proposed implementations. Finally, we adapt Pagh and Rodler’s cuckoo hashing to implement Cuckoo-Bloom signatures. While this representation does not support set intersection, it mitigates false positives for the common case of small read/write sets and performs like a Bloom filter for large sets. 1.
Software Transactional Memory for Large Scale Clusters
- PPOPP'08
, 2008
"... While there has been extensive work on the design of software transactional memory (STM) for cache coherent shared memory systems, there has been no work on the design of an STM system for very large scale platforms containing potentially thousands of nodes. In this work, we present Cluster-STM, an ..."
Abstract
-
Cited by 18 (0 self)
- Add to MetaCart
While there has been extensive work on the design of software transactional memory (STM) for cache coherent shared memory systems, there has been no work on the design of an STM system for very large scale platforms containing potentially thousands of nodes. In this work, we present Cluster-STM, an STM designed for high performance on large-scale commodity clusters. Our design addresses several novel issues posed by this domain, including aggregating communication, managing locality, and distributing transactional metadata onto the nodes. We also re-evaluate several STM design choices previously studied for cache-coherent machines and conclude that, in some cases, different choices are appropriate on clusters. Finally, we show that our design scales well up to 512 processors. This is because on a cluster, the main barrier to STM scalability is the remote communication overhead imposed by the STM operations, and our design aggregates most of that communication with the communication of the underlying data.
Tokentm: Efficient execution of large transactions with hardware transactional memory
- In Proceedings of the 35th Annual International Symposium on Computer Architecture
, 2008
"... Current hardware transactional memory systems seek to simplify parallel programming, but assume that large transactions are rare, so it is acceptable to penalize their performance or concurrency. However, future programmers may wish to use large transactions more often in order to integrate with hig ..."
Abstract
-
Cited by 17 (1 self)
- Add to MetaCart
Current hardware transactional memory systems seek to simplify parallel programming, but assume that large transactions are rare, so it is acceptable to penalize their performance or concurrency. However, future programmers may wish to use large transactions more often in order to integrate with higher-level programming models (e.g., database transactions) or perform selected I/O operations. To prevent the “small transactions are common” assumption from becoming self-fulfilling, this paper contributes TokenTM—an unbounded HTM that uses the abstraction of tokens to precisely track conflicts on an unbounded number of memory blocks. TokenTM implements tokens with new mechanisms, including
Pathological Interaction of Locks with Transactional Memory
- In Proceedings of the Third ACM SIGPLAN Workshop on Languages, Compilers, and Hardware Support for Transactional Computing
, 2008
"... Transactional memory (TM) promises to simplify multithreaded programming. Transactions provide mutual exclusion without the possibility of deadlock and the need to assign locks to data structures. To date, most investigations of transactional memory have looked at purely transactional systems that d ..."
Abstract
-
Cited by 16 (3 self)
- Add to MetaCart
Transactional memory (TM) promises to simplify multithreaded programming. Transactions provide mutual exclusion without the possibility of deadlock and the need to assign locks to data structures. To date, most investigations of transactional memory have looked at purely transactional systems that do not interact with legacy code using locks. Unfortunately, the reality of software engineering is that such interaction is likely. We investigate the interaction of transactional memory implementations and lock-based code. We identify and discuss five pathologies that arise with different systems when a lock is accessed both within and outside a transaction: Blocking, Deadlock, Livelock, Early Release, and Invisible Locking. To address these pathologies we designed and implemented transaction-safe locks (TxLocks) by modifying the existing lock implementation of the OpenSolaris C Library and extending the conflict resolution policy of a hardware transactional memory system. 1.
Is Transactional Programming Actually Easier?
"... Chip multi-processors (CMPs) have become ubiquitous, while tools that ease concurrent programming have not. The promise of increased performance for all applications through ever more parallel hardware requires good tools for concurrent programming, especially for average programmers. Transactional ..."
Abstract
-
Cited by 13 (1 self)
- Add to MetaCart
Chip multi-processors (CMPs) have become ubiquitous, while tools that ease concurrent programming have not. The promise of increased performance for all applications through ever more parallel hardware requires good tools for concurrent programming, especially for average programmers. Transactional memory (TM) has enjoyed recent interest as a tool that can help programmers program concurrently. The TM research community claims that programming with transactional memory is easier than alternatives (like locks), but evidence is scant. In this paper, we describe a user-study in which 147 undergraduate students in an operating systems course implemented the same programs using coarse and fine-grain locks, monitors, and transactions. We surveyed the students after the assignment, and examined their code to determine the types and frequency of programming errors for each synchronization technique. Inexperienced programmers found baroque syntax a barrier to entry for transactional programming. On average, subjective evaluation showed that students found transactions harder to use than coarse-grain locks, but slightly easier to use than fine-grained locks. Detailed examination of synchronization errors in the students’ code tells a rather different story. Overwhelmingly, the number and types of programming errors the students made was much lower for transactions than for locks. On a similar programming problem, over 70 % of students made errors with fine-grained locking, while less than 10 % made errors with transactions. 1
Dependence-Aware Transactional Memory for Increased Concurrency
, 2008
"... Abstract—Transactional memory (TM) is a promising paradigm for helping programmers take advantage of emerging multicore platforms. Though they perform well under low contention, hardware TM systems have a reputation of not performing well under high contention, as compared to locks. This paper prese ..."
Abstract
-
Cited by 11 (3 self)
- Add to MetaCart
Abstract—Transactional memory (TM) is a promising paradigm for helping programmers take advantage of emerging multicore platforms. Though they perform well under low contention, hardware TM systems have a reputation of not performing well under high contention, as compared to locks. This paper presents a model and implementation of dependence-aware transactional memory (DATM), a novel solution to the problem of scaling under contention. Unlike many proposals to deal with write-shared data (which arise in common data structures like counters and linked lists), DATM operates transparently to the programmer. The main idea in DATM is to accept any transaction execution interleaving that is conflict serializable, including interleavings that contain simple conflicts. Current TM systems reduce useful concurrency by restarting conflicting transactions, even if the execution interleaving is conflict serializable. DATM manages dependences between uncommitted transactions, sometimes forwarding data between them to safely commit conflicting transactions. The evaluation of our prototype shows that DATM increases concurrency, for example by reducing the runtime of STAMP benchmarks by up to 39 % and reducing transaction restarts by up to 94%. I.
NV-Heaps: Making Persistent Objects Fast and Safe with Next-Generation, Non-Volatile Memories
"... Persistent, user-defined objects present an attractive abstraction for working with non-volatile program state. However, the slow speed of persistent storage (i.e., disk) has restricted their design and limited their performance. Fast, byte-addressable, non-volatile technologies, such as phase chang ..."
Abstract
-
Cited by 10 (1 self)
- Add to MetaCart
Persistent, user-defined objects present an attractive abstraction for working with non-volatile program state. However, the slow speed of persistent storage (i.e., disk) has restricted their design and limited their performance. Fast, byte-addressable, non-volatile technologies, such as phase change memory, will remove this constraint and allow programmers to build high-performance, persistent data structures in non-volatile storage that is almost as fast as DRAM. Creating these data structures requires a system that is lightweight enough to expose the performance of the underlying memories but also ensures safety in the presence of application and system failures by avoiding familiar bugs such as dangling pointers, multiple free()s, and locking errors. In addition, the system must prevent new types of hard-to-find pointer safety bugs that only arise with persistent objects. These bugs are especially dangerous since any corruption they cause will be permanent. We have implemented a lightweight, high-performance persistent object system called NV-heaps that provides transactional semantics while preventing these errors and providing a model for persistence that is easy to use and reason about. We implement search trees, hash tables, sparse graphs, and arrays using NV-heaps, BerkeleyDB, and Stasis. Our results show that NV-heap performance scales with thread count and that data structures implemented using NV-heaps out-perform BerkeleyDB and Stasis implementations by 32 × and 244×, respectively, by avoiding the operating system and minimizing other software overheads. We also quantify the cost of enforcing the safety guarantees that NV-heaps provide and measure the costs of NV-heap primitive operations.
Solving Difficult HTM Problems Without Difficult Hardware
- In ACM TRANSACT Workshop
, 2007
"... There are several classes of operations, including I/O and memory allocation, that are considered difficult to perform as part of a transaction. To allow such operations inside of transactions, previous hardware transactional memory systems have proposed additional mechanisms such as opennested tran ..."
Abstract
-
Cited by 6 (5 self)
- Add to MetaCart
There are several classes of operations, including I/O and memory allocation, that are considered difficult to perform as part of a transaction. To allow such operations inside of transactions, previous hardware transactional memory systems have proposed additional mechanisms such as opennested transactions that use hardware management of software handlers. Open-nested transactions are not necessary, and add significant complexity to both HTM systems and the software written to take advantage of them. MetaTM is an HTM system designed to run TxLinux, an operating system that uses transactions for some synchronization. Inside the operating system, it is necessary to efficiently handle I/O and memory allocation. MetaTM and TxLinux handle both of these without requiring the significant extra hardware or overhead associated with open-nested transactions. The TxLinux kernel uses cooperative transactional spinlocks which provide the concurrency of transactions with the mutual exclusion needed to perform I/O. Through explicit management of transactional system calls, TxLinux ensures strong atomicity for system calls within a transaction, providing user-level transactions with a powerful and simple transactional programming model. 1.

