Understanding Intrinsic Characteristics and System Implications of Flash Memory based Solid State Drives
Cited by 27 (4 self)
Flash Memory based Solid State Drive (SSD) has been called a “pivotal technology” that could revolutionize data storage systems. Since SSD shares a common interface with the traditional hard disk drive (HDD), both physically and logically, an effective integration of SSD into the storage hierarchy is very important. However, details of SSD hardware implementations tend to be hidden behind such narrow interfaces. In fact, since sophisticated algorithms are usually, of necessity, adopted in SSD controller firmware, more complex performance dynamics are to be expected in SSD than in HDD systems. Most existing literature or product specifications on SSD just provide high-level descriptions and standard performance data, such as bandwidth and latency. In order to gain insight into the unique performance characteristics
Gordon: Using Flash Memory to Build Fast, Power-efficient Clusters for Data-intensive Applications
Cited by 20 (1 self)
As our society becomes more information-driven, we have begun to amass data at an astounding and accelerating rate. At the same time, power concerns have made it difficult to bring the necessary processing power to bear on querying, processing, and understanding this data. We describe Gordon, a system architecture for data-centric applications that combines low-power processors, flash memory, and data-centric programming systems to improve performance for data-centric applications while reducing power consumption. The paper presents an exhaustive analysis of the design space of Gordon systems, focusing on the trade-offs between power, energy, and performance that Gordon must make. It analyzes the impact of flash storage and the Gordon architecture on the performance and power efficiency of data-centric applications. It also describes a novel flash translation layer tailored to data-intensive workloads and large flash storage arrays. Our data show that, using technologies available in the near future, Gordon systems can outperform disk-based clusters by 1.5× and deliver up to 2.5× more performance per watt.
Online Maintenance of Very Large Random Samples on Flash Storage
Cited by 17 (3 self)
Recent advances in flash media have made it an attractive alternative for data storage in a wide spectrum of computing devices, such as embedded sensors, mobile phones, PDAs, laptops, and even servers. However, flash media has many unique characteristics that make existing data management/analytics algorithms designed for magnetic disks perform poorly with flash storage. For example, while random (page) reads are as fast as sequential reads, random (page) writes and in-place data updates are orders of magnitude slower than sequential writes. In this paper, we consider an important fundamental problem that would seem to be particularly challenging for flash storage: efficiently maintaining a very large (100 MBs or more) random sample of a data stream (e.g., of sensor readings). First, we show that previous algorithms such as reservoir sampling and geometric file are not readily adapted to flash. Second, we propose B-FILE, an energy-efficient abstraction for flash media to store self-expiring items, and show how a B-FILE can be used to efficiently maintain a large sample in flash. Our solution is simple, has a small (RAM) memory footprint, and is designed to cope with flash constraints in order to reduce latency and energy consumption. Third, we provide techniques to maintain biased samples with a B-FILE and to query the large sample stored in a B-FILE for a subsample of an arbitrary size. Finally, we present an evaluation with flash media that shows our techniques are several orders of magnitude faster and more energy-efficient than (flash-friendly versions of) reservoir sampling and geometric file. A key finding of our study, of potential use to many flash algorithms beyond sampling, is that “semi-random” writes (as defined in the paper) on flash cards are over two orders of magnitude faster and more energy-efficient than random writes.
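For readers unfamiliar with the baseline the abstract above mentions, the following is a minimal sketch of textbook reservoir sampling (often called Algorithm R). It is not the B-FILE and is not taken from the paper; the function name and parameters are illustrative. It may help show why the abstract considers the algorithm a poor fit for flash: once the reservoir is full, each accepted item overwrites a uniformly random slot in place, which is the random-write, in-place-update pattern the abstract describes as slow.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Return a uniform random sample of up to k items from an iterable.

    Textbook reservoir sampling (Algorithm R); names are illustrative.
    """
    rng = rng or random.Random()
    sample = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            sample.append(item)
        else:
            # Keep the new item with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                # In-place overwrite of a random slot: on flash this
                # would be a random write.
                sample[j] = item
    return sample


if __name__ == "__main__":
    print(reservoir_sample(range(1_000_000), 5))
```

When the reservoir lives in RAM the random overwrite is cheap; the difficulty the abstract points to arises when the sample is too large for RAM and must be kept on flash.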
Migrating server storage to SSDs: Analysis of tradeoffs
- In EuroSys, 2009
Cited by 17 (0 self)
Recently, flash-based solid-state drives (SSDs) have become standard options for laptop and desktop storage, but their impact on enterprise server storage has not been studied. Provisioning server storage is challenging. It requires optimizing for the performance, capacity, power and reliability needs of the expected workload, all while minimizing financial costs. In this paper we analyze a number of workload traces from servers in both large and small data centers, to decide whether and how SSDs should be used to support each. We analyze both complete replacement of disks by SSDs, as well as use of SSDs as an intermediate tier between disks and DRAM. We describe an automated tool that, given device models and a block-level trace of a workload, determines the least-cost storage configuration that will support
Transactional flash
- In Proc. Symposium on Operating Systems Design and Implementation (OSDI), 2008
Cited by 17 (0 self)
Transactional flash (TxFlash) is a novel solid-state drive (SSD) that uses flash memory and exports a transactional interface (WriteAtomic) to the higher-level software. The copy-on-write nature of the flash translation layer and the fast random access make flash memory the right medium to support such an interface. We further develop a novel commit protocol called cyclic commit for TxFlash; the protocol has been specified formally and model checked. Our evaluation, both on a simulator and an emulator on top of a real SSD, shows that TxFlash does not increase the flash firmware complexity significantly and provides transactional features with very small overheads (less than 1%), thereby making file systems easier to build. It further shows that the new cyclic commit protocol significantly outperforms traditional commit for small transactions (95% improvement in transaction throughput) and completely eliminates the space overhead due to commit records.
Enabling enterprise solid state disks performance
- In Workshop on Integrating Solid-state Memory into the Storage Hierarchy, 2009
Cited by 17 (5 self)
In this paper, we examine two modern enterprise Flash-based solid state devices and how varying usage patterns influence the performance one observes from the device. We observe that in order to achieve peak sequential and random performance of an SSD, a workload needs to meet certain criteria, such as a high degree of concurrency. We measure the performance effects of intermediate operating system software layers between the application and device, varying the filesystem, the I/O scheduler, and whether or not the device is accessed in direct mode. Finally, we measure and discuss how device performance may degrade under sustained random write access across an SSD’s full address space.
DFS: A file system for virtualized flash storage
- In FAST’10: Proc. of the Eighth USENIX Conf. on File and Storage Technologies (2010), USENIX Association
Cited by 14 (0 self)
This paper presents the design, implementation and evaluation of Direct File System (DFS) for virtualized flash storage. Instead of using traditional layers of abstraction, our layers of abstraction are designed for directly accessing flash memory devices. DFS has two main novel features. First, it lays out its files directly in a very large virtual storage address space provided by FusionIO’s virtual flash storage layer. Second, it leverages the virtual flash storage layer to perform block allocations and atomic updates. As a result, DFS performs better and is much simpler than a traditional Unix file system with similar functionalities. Our microbenchmark results show that DFS can deliver 94,000 I/O operations per second (IOPS) for direct reads and 71,000 IOPS for direct writes with the virtualized flash storage layer on FusionIO’s ioDrive. For direct access performance, DFS is consistently better than ext3 on the same platform, sometimes by 20%. For buffered access performance, DFS is also consistently better than ext3, sometimes by over 149%. Our application benchmarks show that DFS outperforms ext3 by 7% to 250% while requiring less CPU power.
Everest: Scaling down peak loads through I/O off-loading
- In Proceedings of OSDI, 2008
Cited by 13 (3 self)
Bursts in data center workloads are a real problem for storage subsystems. Data volumes can experience peak I/O request rates that are over an order of magnitude higher than average load. This requires significant overprovisioning, and often still results in significant I/O request latency during peaks. In order to address this problem we propose Everest, which allows data written to an overloaded volume to be temporarily off-loaded into a short-term virtual store. Everest creates the short-term store by opportunistically pooling underutilized storage resources either on a server or across servers within the data center. Writes are temporarily off-loaded from overloaded volumes to lightly loaded volumes, thereby reducing the I/O load on the former. Everest is transparent to and usable by unmodified applications, and does not change the persistence or consistency of the storage system. We evaluate Everest using traces from a production Exchange mail server as well as other benchmarks; our results show a 1.4–70 times reduction in mean response times during peaks.
FlashStore: High Throughput Persistent Key-Value Store
Cited by 10 (0 self)
We present FlashStore, a high throughput persistent key-value store that uses flash memory as a non-volatile cache between RAM and hard disk. FlashStore is designed to store the working set of key-value pairs on flash and use one flash read per key lookup. As the working set changes over time, space is made for the current working set by destaging recently unused key-value pairs to hard disk and recycling pages in the flash store. FlashStore organizes key-value pairs in a log-structure on flash to exploit faster sequential write performance. It uses an in-memory hash table to index them, with hash collisions resolved by a variant of cuckoo hashing. The in-memory hash table stores compact key signatures instead of full keys so as to strike tradeoffs between RAM usage and false flash read operations. FlashStore can be used as a high throughput persistent key-value storage layer for a broad range of server class applications. We compare FlashStore with BerkeleyDB, an embedded key-value store application, running on hard disk and flash separately, so as to bring out the performance gain of FlashStore in not only using flash as a cache above hard disk but also in its use of flash-aware algorithms. We use real-world data traces from two data center applications, namely, Xbox LIVE Primetime online multi-player game and inline storage deduplication, to drive and evaluate the design of FlashStore on traditional and low power server platforms. FlashStore outperforms BerkeleyDB by up to 60x on throughput (ops/sec), up to 50x on energy efficiency (ops/Joule), and up to 85x on cost efficiency (ops/sec/dollar) on the evaluated datasets.
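The indexing idea in the abstract above (an append-only log plus an in-memory table of compact key signatures) can be illustrated with a small sketch. This is not FlashStore's implementation: the class and method names are hypothetical, a Python list stands in for the flash log, and a plain dictionary of per-signature offset lists replaces the cuckoo-hashing variant the abstract mentions. The sketch only shows the tradeoff the abstract names: shorter signatures use less RAM per key but cause more false flash reads.

```python
import hashlib


class SignatureIndexedLog:
    """Toy key-value store: append-only log + signature index (illustrative)."""

    def __init__(self, signature_bits=16):
        if not 1 <= signature_bits <= 64:
            raise ValueError("signature_bits must be between 1 and 64")
        self.signature_bits = signature_bits
        self.log = []         # stand-in for the sequentially written flash log
        self.index = {}       # signature -> list of log offsets, oldest first
        self.flash_reads = 0  # counts simulated reads of the log

    def _signature(self, key: bytes) -> int:
        # Keep only the top signature_bits bits of a 64-bit hash prefix.
        digest = hashlib.sha256(key).digest()
        return int.from_bytes(digest[:8], "big") >> (64 - self.signature_bits)

    def put(self, key: bytes, value: bytes) -> None:
        # Writes always append, so the log is written sequentially.
        offset = len(self.log)
        self.log.append((key, value))
        self.index.setdefault(self._signature(key), []).append(offset)

    def get(self, key: bytes):
        # Check the newest candidates first; a candidate whose stored key
        # differs from the requested key is a false flash read.
        for offset in reversed(self.index.get(self._signature(key), [])):
            self.flash_reads += 1
            stored_key, value = self.log[offset]
            if stored_key == key:
                return value
        return None


if __name__ == "__main__":
    store = SignatureIndexedLog(signature_bits=16)
    store.put(b"player:42", b"score=10")
    store.put(b"player:42", b"score=25")  # the newer record shadows the older
    print(store.get(b"player:42"), store.flash_reads)
```

Because only the signature is held in memory, the full key has to be read back from the log to confirm a match. Destaging to hard disk and page recycling, which the abstract also describes, are left out of this sketch.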
A performance evaluation of scientific I/O workloads on flash-based SSDs
- In IASDS at CLUSTER, 2009
Cited by 8 (0 self)
Flash-based solid state disks (SSDs) are an alternative form of storage device that promises to deliver higher performance than traditional mechanically rotating hard drives. While SSDs have seen utilization in embedded, consumer, and server computer systems, there has been little understanding of their performance effects with scientific I/O workloads. This paper provides a trace-driven performance evaluation of scientific I/O workloads on SSDs. We find that SSDs only provide modest performance gains over mechanical hard drives due to the write-intensive nature of many scientific workloads. Other workloads (like read-mostly web servers) would likely see much larger gains. Additionally, we observe that concurrent I/O (when multiple parallel processes simultaneously access a single storage device) may significantly affect SSD performance. However, such effects appear to be dependent on specific SSD implementation features and they are hard to predict in a general fashion. These results suggest that abundant caution is needed when supporting high-performance scientific I/O workloads on Flash-based SSDs.

