Results 1 - 10
of
22
Moneta: A High-performance Storage Array Architecture for Next-generation, Non-volatile Memories
"... Abstract—Emerging non-volatile memory technologies such as phase change memory (PCM) promise to increase storage system performance by a wide margin relative to both conventional disks and flash-based SSDs. Realizing this potential will require significant changes to the way systems interact with st ..."
Abstract
-
Cited by 10 (4 self)
- Add to MetaCart
Abstract—Emerging non-volatile memory technologies such as phase change memory (PCM) promise to increase storage system performance by a wide margin relative to both conventional disks and flash-based SSDs. Realizing this potential will require significant changes to the way systems interact with storage devices as well as a rethinking of the storage devices themselves. This paper describes the architecture of a prototype PCIeattached storage array built from emulated PCM storage called Moneta. Moneta provides a carefully designed hardware/software interface that makes issuing and completing accesses atomic. The atomic management interface, combined with hardware scheduling optimizations, and an optimized storage stack increases performance for small, random accesses by 18x and reduces software overheads by 60%. Moneta array sustain 2.8 GB/s for sequential transfers and 541K random 4 KB IO operations per second (8x higher than a state-of-the-art flash-based SSD). Moneta can perform a 512-byte write in 9 us (5.6x faster than the SSD). Moneta provides a harmonic mean speedup of 2.1x and a maximum speed up of 9x across a range of file system, paging, and database workloads. We also explore trade-offs in Moneta’s architecture between performance, power, memory organization, and memory latency. Keywords-software IO optimizations; non-volatile memories; phase change memories; storage systems I.
NV-Heaps: Making Persistent Objects Fast and Safe with Next-Generation, Non-Volatile Memories
"... Persistent, user-defined objects present an attractive abstraction for working with non-volatile program state. However, the slow speed of persistent storage (i.e., disk) has restricted their design and limited their performance. Fast, byte-addressable, non-volatile technologies, such as phase chang ..."
Abstract
-
Cited by 10 (1 self)
- Add to MetaCart
Persistent, user-defined objects present an attractive abstraction for working with non-volatile program state. However, the slow speed of persistent storage (i.e., disk) has restricted their design and limited their performance. Fast, byte-addressable, non-volatile technologies, such as phase change memory, will remove this constraint and allow programmers to build high-performance, persistent data structures in non-volatile storage that is almost as fast as DRAM. Creating these data structures requires a system that is lightweight enough to expose the performance of the underlying memories but also ensures safety in the presence of application and system failures by avoiding familiar bugs such as dangling pointers, multiple free()s, and locking errors. In addition, the system must prevent new types of hard-to-find pointer safety bugs that only arise with persistent objects. These bugs are especially dangerous since any corruption they cause will be permanent. We have implemented a lightweight, high-performance persistent object system called NV-heaps that provides transactional semantics while preventing these errors and providing a model for persistence that is easy to use and reason about. We implement search trees, hash tables, sparse graphs, and arrays using NV-heaps, BerkeleyDB, and Stasis. Our results show that NV-heap performance scales with thread count and that data structures implemented using NV-heaps out-perform BerkeleyDB and Stasis implementations by 32 × and 244×, respectively, by avoiding the operating system and minimizing other software overheads. We also quantify the cost of enforcing the safety guarantees that NV-heaps provide and measure the costs of NV-heap primitive operations.
Onyx: A Protoype Phase Change Memory Storage Array
"... We describe a prototype high-performance solid-state drive based on first-generation phase-change memory (PCM) devices called Onyx. Onyx has a capacity of 10 GB and connects to the host system via PCIe. We describe the internal architecture of Onyx including the PCM memory modules we constructed and ..."
Abstract
-
Cited by 4 (0 self)
- Add to MetaCart
We describe a prototype high-performance solid-state drive based on first-generation phase-change memory (PCM) devices called Onyx. Onyx has a capacity of 10 GB and connects to the host system via PCIe. We describe the internal architecture of Onyx including the PCM memory modules we constructed and the FPGAbased controller that manages them. Onyx can perform a 4 KB random read in 38 µs and sustain 191K 4 KB read IO operations per second. A 4 KB write requires 179 µs. We describe our experience tuning the Onyx system to reduce the cost of wear-leveling and increase performance. We find that Onyx out-performs a stateof-the-art flash-based SSD for small writes (< 2 KB) by between 72 and 120 % and for reads of all sizes. In addition, Onyx incurs 20-51 % less CPU overhead per IOP for small requests. Combined, our results demonstrate that even first-generation PCM SSDs can out-perform flash-based arrays for the irregular (and frequently readdominated) access patterns that define many of today’s “killer ” storage applications. Next generation PCM devices will widen the performance gap further and set the stage for PCM becoming a serious flash competitor in many applications. 1
Rethinking Database Algorithms for Phase Change Memory
"... Phase change memory (PCM) is an emerging memory technology with many attractive features: it is non-volatile, byte-addressable, 2–4X denser than DRAM, and orders of magnitude better than NAND Flash in read latency, write latency, and write endurance. In the near future, PCM is expected to become a c ..."
Abstract
-
Cited by 3 (1 self)
- Add to MetaCart
Phase change memory (PCM) is an emerging memory technology with many attractive features: it is non-volatile, byte-addressable, 2–4X denser than DRAM, and orders of magnitude better than NAND Flash in read latency, write latency, and write endurance. In the near future, PCM is expected to become a common component of the memory/storage hierarchy for a wide range of computer systems. In this paper, we describe the unique characteristics of PCM, and their potential impact on database system design. In particular, we present analytic metrics for PCM endurance, energy, and latency, and illustrate that current approaches for common database algorithms such as B +-trees and Hash Joins are suboptimal for PCM. We present improved algorithms that reduce both execution time and energy on PCM while increasing write endurance. 1.
Understanding the Impact of Emerging Non-Volatile Memories on High-Performance, IO-Intensive Computing
"... Abstract—Emerging storage technologies such as flash memories, phase-change memories, and spin-transfer torque memories are poised to close the enormous performance gap between disk-based storage and main memory. We evaluate several approaches to integrating these memories into computer systems by m ..."
Abstract
-
Cited by 2 (1 self)
- Add to MetaCart
Abstract—Emerging storage technologies such as flash memories, phase-change memories, and spin-transfer torque memories are poised to close the enormous performance gap between disk-based storage and main memory. We evaluate several approaches to integrating these memories into computer systems by measuring their impact on IO-intensive, database, and memory-intensive applications. We explore several options for connecting solid-state storage to the host system and find that the memories deliver large gains in sequential and random access performance, but that different system organizations lead to different performance trade-offs. The memories provide substantial application-level gains as well, but overheads in the OS, file system, and application can limit performance. As a result, fully exploiting these memories ’ potential will require substantial changes to application and system software. Finally, paging to fast non-volatile memories is a viable option for some applications, providing an alternative to expensive, powerhungry DRAM for supporting scientific applications with large memory footprints. Index Terms—storage systems, non-volatile memory, IO performance, flash memory, phase change memory, spin-torque
Providing Safe, User Space Access to Fast, Solid State Disks
"... Emerging fast, non-volatile memories (e.g., phase change memories, spin-torque MRAMs, and the memristor) reduce storage access latencies by an order of magnitude compared to state-of-the-art flash-based SSDs. This improved performance means that software overheads that had little impact on the perfo ..."
Abstract
-
Cited by 2 (1 self)
- Add to MetaCart
Emerging fast, non-volatile memories (e.g., phase change memories, spin-torque MRAMs, and the memristor) reduce storage access latencies by an order of magnitude compared to state-of-the-art flash-based SSDs. This improved performance means that software overheads that had little impact on the performance of flash-based systems can present serious bottlenecks in systems that incorporate these new technologies. We describe a novel storage hardware and software architecture that nearly eliminates two sources of this overhead: Entering the kernel and performing file system permission checks. The new architecture provides a private, virtualized interface for each process and moves file system protection checks into hardware. As a result, applications can access file data without operating system intervention, eliminating OS and file system costs entirely for most accesses. We describe the support the system provides for fast permission checks in hardware, our approach to notifying applications when requests complete, and the small, easily portable changes required in the file system to support the new access model. Existing applications require no modification to use the new interface. We evaluate the performance of the system using a suite of microbenchmarks and database workloads and show that the new interface improves latency and bandwidth for 4 KB writes by 60 % and 7.2×, respectively, OLTP database transaction throughput by up to 2.0×, and Berkeley-DB throughput by up to 5.7×. A streamlined asynchronous file IO interface built to fully utilize the new interface enables an additional 5.5 × increase in throughput with 1 thread and 2.8 × increase in efficiency for 512 B transfers.
A Content-Aware Block Placement Algorithm for Reducing PRAM Storage Bit Writes
"... Phase-change random access memory (PRAM) is a promising storage-class memory technology that has the potential to replace flash memory and DRAM in many applications. Because individual cells in a PRAM can be written independently, only data cells whose current values differ from the corresponding bi ..."
Abstract
- Add to MetaCart
Phase-change random access memory (PRAM) is a promising storage-class memory technology that has the potential to replace flash memory and DRAM in many applications. Because individual cells in a PRAM can be written independently, only data cells whose current values differ from the corresponding bits in a write request need to be updated. Furthermore, when a block write request is received, the PRAM may contain many free blocks that are available for overwriting, and these free blocks will generally have different contents. For this reason, the number of bit programming operations required to write new data to the PRAM (and consequently power consumption and write bandwidth) depends on the location that is chosen to be overwritten. This paper describes a block placement algorithm for reducing PRAM bit writes based on the idea of indexing free blocks using a content-based signature; computing the signature value of a new block of data to be written allows a free block with similar contents to be located quickly. While the benefit that can be realized by the use of any block placement algorithm is heavily dependent on the workload, our evaluation results show that block placement using content-based signatures is able to reduce the number of PRAM bit programming operations by as much as an order of magnitude. 1.
1 A Phase Change Memory as a Secure Main Memory
"... Abstract—Phase change memory (PCM) technology appears as more scalable than DRAM technology. As PCM exhibits access time slightly longer but in the same range as DRAMs, several recent studies have proposed to use PCMs for designing main memory systems. Unfortunately PCM technology suffers from a lim ..."
Abstract
- Add to MetaCart
Abstract—Phase change memory (PCM) technology appears as more scalable than DRAM technology. As PCM exhibits access time slightly longer but in the same range as DRAMs, several recent studies have proposed to use PCMs for designing main memory systems. Unfortunately PCM technology suffers from a limited write endurance; typically each memory cell can be only be written a large but still limited number of times (10 7 to 10 9 writes are reported for current technology). Till now, research proposals have essentially focused their attention on designing memory systems that will survive to the average behavior of conventional applications. However PCM memory systems should be designed to survive worst-case applications, i.e., malicious attacks targeting the physical destruction of the memory through overwriting a limited number of memory cells. In this paper, we propose the design of a secure PCM-based main memory that would by construction survive to overwrite attacks. 1
Impact of Process Variation on Endurance Algorithms for Wear-Prone Memories
"... Non-volatile memories, such as Flash and Phase-Change Memory, are replacing other memory and storage technologies. Although these new technologies have desirable energy and scalability properties, they are prone to wear-out due to excessive write operations. Because wearout is an important phenomeno ..."
Abstract
- Add to MetaCart
Non-volatile memories, such as Flash and Phase-Change Memory, are replacing other memory and storage technologies. Although these new technologies have desirable energy and scalability properties, they are prone to wear-out due to excessive write operations. Because wearout is an important phenomenon, a number of endurance management schemes have been proposed. There is a trade-off between what techniques to use, depending on the range of bit cell lifetime within a device. This range in cell durability arises from effects due to process variation. In this paper, we describe modeling techniques to analyze trade-offs for endurance management based on the anticipated distribution of cell lifetime. This analysis considers two general endurance strategies (physical capacity degradation and physical sparing) under four distributions of cell lifetime (constant, linear, normal, and bimodal). The modeling techniques can be used to determine how much redundancy is needed when a sparing endurance strategy is adopted. With the correct choice of technique, the device lifetime can be doubled. I.

