Thesis Committee:
, 2011
0716287, and CCF-0964474, Intel, by gifts from Network Appliance and Google, and through fellowships
Statement of Research Directions
Advancements in computer, electrical and material sciences are taking us closer every day to a computer that has a single, large, persistent, fast and directly addressable memory unit. Today, however, one could develop novel software techniques to bridge the performance and structural gap between the memory and storage layers. This will help us move one step closer to such a memory unit. Conceptually, my research aims to use non-volatile memory technologies like NAND-flash memory (flash) for this purpose. Today, a combination of virtual memory and local and network-based filesystems is used to build large data-centric systems [3, 13, 22, 33, 34]. These systems expose a single convenient namespace that can be used across an entire cluster for the applications' convenience. However, each of them has to deal with the overhead of tiering data between memory and storage in a custom manner. New technologies like flash can help narrow the gap between memory and storage, and help reduce the overhead of developing such applications. These new technologies can be thought of as slower but persistent random access memories or as incredibly fast block storage devices. Today, flash can scale to 20TB within a single rack unit (RU) server [19]. Additionally, state-of-the-art flash consumes ten times less power than high-density DRAM. This allows for fundamentally more energy-efficient datacenters that can process larger amounts of data faster than datacenters that are purely DRAM and disk based.
TABLEFS: Embedding a NoSQL Database Inside the Local File System
, 2012
Conventional file systems are optimized for large file transfers instead of workloads that are dominated by metadata and small file accesses. This paper examines using techniques adopted from NoSQL databases to manage file system metadata and small files, which feature a high rate of change and efficient out-of-core data representation. A FUSE file system prototype was built by storing file system metadata and small files in a modern key-value store, LevelDB. We demonstrate that such techniques can improve the performance of modern local file systems in Linux by as much as an order of magnitude for workloads dominated by metadata and tiny files. Acknowledgements: We thank the members and companies of the PDL Consortium (including APC, EMC, Facebook, Fusion-IO, Google,
Using Vector Interfaces to Deliver Millions of IOPS from a Networked Key-value Storage Server
The performance of non-volatile memories (NVM) has grown by a factor of 100 during the last several years: Flash devices today are capable of over 1 million I/Os per second. Unfortunately, this incredible growth has put strain on software storage systems looking to extract their full potential. To address this increasing software-I/O gap, we propose using vector interfaces in high-performance networked systems. Vector interfaces organize requests and computation in a distributed system into collections of similar but independent units of work, thereby providing opportunities to amortize and eliminate the redundant work common in many high-performance systems. By integrating vector interfaces into storage and RPC components, we demonstrate that a single key-value storage server can provide 1.6 million requests per second with a median latency below one millisecond, over fourteen times greater than the same software absent the use of vector interfaces. We show that pervasively applying vector interfaces is necessary to achieve this potential and describe how to compose these interfaces together to ensure that vectors of work are propagated throughout a distributed system.
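The amortization idea in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: `CountingStore`, `get`, `multiget`, and the `calls` counter are hypothetical names, and the counter only stands in for whatever fixed per-request cost (system call, RPC, lock acquisition) a real system would pay.

```python
from typing import Dict, Iterable, List, Optional


class CountingStore:
    """Toy in-memory key-value store (hypothetical, for illustration only)."""

    def __init__(self, data: Dict[str, str]) -> None:
        self._data = dict(data)
        # Stand-in for a fixed per-request cost such as a syscall or an RPC.
        self.calls = 0

    def get(self, key: str) -> Optional[str]:
        """Scalar interface: the fixed cost is paid once per key."""
        self.calls += 1
        return self._data.get(key)

    def multiget(self, keys: Iterable[str]) -> List[Optional[str]]:
        """Vector interface: the fixed cost is paid once per batch of keys."""
        self.calls += 1
        return [self._data.get(k) for k in keys]


store = CountingStore({"a": "1", "b": "2", "c": "3"})
keys = ["a", "b", "c", "missing"]

# One request per key.
scalar_results = [store.get(k) for k in keys]
scalar_calls = store.calls

# The same lookups expressed as a single vector of independent work items.
store.calls = 0
vector_results = store.multiget(keys)
vector_calls = store.calls
```

Both paths return the same values; only the number of fixed-cost events differs, which is the effect a vector interface is meant to exploit.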
Memory-Efficient GroupBy-Aggregate using Compressed Buffer Trees
Memory is rapidly becoming a precious resource in many data processing environments. This paper introduces a new data structure called a Compressed Buffer Tree (CBT). Using a combination of buffering, compression, and lazy aggregation, CBTs can improve the memory efficiency of the GroupBy-Aggregate abstraction which forms the basis of many data processing models like MapReduce and databases. We evaluate CBTs in the context of MapReduce aggregation, and show that CBTs can provide significant advantages over existing hash-based aggregation techniques: up to 2× less memory and 1.5× the throughput, at the cost of 2.5× CPU.
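As a rough illustration of combining buffering, compression, and lazy aggregation for GroupBy-Aggregate, the sketch below keeps partially aggregated runs in compressed form and merges them only when the result is requested. It is not a Compressed Buffer Tree: there is no tree, and the class name, `buffer_limit` parameter, and the choice of JSON plus zlib are assumptions made purely for the example.

```python
import json
import zlib
from collections import Counter
from typing import Dict, List, Tuple


class CompressedBufferAggregator:
    """Sum-by-key aggregator that spills compressed partial results (toy sketch)."""

    def __init__(self, buffer_limit: int = 1000) -> None:
        self.buffer_limit = buffer_limit
        self._buffer: List[Tuple[str, int]] = []  # uncompressed recent records
        self._chunks: List[bytes] = []            # compressed, partially aggregated runs

    def add(self, key: str, value: int) -> None:
        """Buffer a record; spill when the buffer reaches its limit."""
        self._buffer.append((key, value))
        if len(self._buffer) >= self.buffer_limit:
            self._spill()

    def _spill(self) -> None:
        """Aggregate the buffer locally, then store it compressed."""
        partial: Counter = Counter()
        for key, value in self._buffer:
            partial[key] += value
        payload = json.dumps(sorted(partial.items())).encode("utf-8")
        self._chunks.append(zlib.compress(payload))
        self._buffer.clear()

    def result(self) -> Dict[str, int]:
        """Lazily merge all compressed runs into the final aggregate."""
        if self._buffer:
            self._spill()
        totals: Counter = Counter()
        for chunk in self._chunks:
            for key, value in json.loads(zlib.decompress(chunk)):
                totals[key] += value
        return dict(totals)


agg = CompressedBufferAggregator(buffer_limit=2)
for word in ["a", "b", "a", "c", "a", "b"]:
    agg.add(word, 1)
word_counts = agg.result()
```

The final merge is deferred until `result()` is called, so memory between spills holds only the small buffer and the compressed runs; the decompression work at merge time is the kind of extra CPU cost the abstract trades for memory.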
Practical Batch-Updatable External Hashing with Sorting
This paper presents a practical external hashing scheme that supports fast lookup (7 microseconds) for large datasets (millions to billions of items) with a small memory footprint (2.5 bits/item) and fast index construction (151 K items/s for 1-KiB key-value pairs). Our scheme combines three key techniques: (1) a new index data structure (Entropy-Coded Tries); (2) the use of sorting as the main data manipulation method; and (3) support for incremental index construction for dynamic datasets. We evaluate our scheme by building an external dictionary on flash-based drives and demonstrate our scheme's high performance, compactness, and practicality.
TABLEFS: Enhancing Metadata Efficiency in the Local File System
, 2013
File systems that manage magnetic disks have long recognized the importance of sequential allocation and large transfer sizes for file data. Fast random access has dominated metadata lookup data structures, with increasing use of on-disk B-trees. Yet our experiments with workloads dominated by metadata and small file access indicate that even sophisticated local disk file systems like Ext4, XFS and Btrfs leave a lot of opportunity for performance improvement in such workloads. In this paper we present a stacked file system, TABLEFS, which uses another local file system as an object store. TABLEFS organizes all metadata into a single sparse table backed on disk using a Log-Structured Merge (LSM) tree, LevelDB in our experiments. By stacking, TABLEFS asks only for efficient large file allocation and access from the local file system. By using an LSM tree, TABLEFS ensures metadata is written to disk in large, non-overwrite, sorted and indexed logs. Even an inefficient FUSE-based user-level implementation of TABLEFS can perform comparably to Ext4, XFS and Btrfs on data-intensive benchmarks, and can outperform them by 50% to as much as 1000% for metadata-intensive workloads. Such promising performance results from TABLEFS suggest that local disk file systems can be significantly improved by more aggressive aggregation and batching of metadata updates.
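The idea of keeping all metadata in one sorted table can be sketched as follows. This is only an illustration: a sorted in-memory list stands in for the LSM tree, and the key layout `(parent directory id, entry name)`, the class `MetadataTable`, and its methods are assumptions for the example, not details stated in the abstract.

```python
import bisect
from typing import Dict, List, Tuple

# Assumed key layout: (parent directory id, entry name).
Key = Tuple[int, str]


class MetadataTable:
    """Single sorted table of file system metadata (toy stand-in for an LSM tree)."""

    def __init__(self) -> None:
        self._keys: List[Key] = []          # kept sorted, like keys in a sorted store
        self._values: Dict[Key, dict] = {}  # attributes for each entry
        self._next_id = 1                   # id 0 is reserved for the root directory

    def create(self, parent: int, name: str, is_dir: bool = False) -> int:
        """Insert one metadata row and return the new entry's id."""
        key = (parent, name)
        if key in self._values:
            raise FileExistsError(name)
        entry_id = self._next_id
        self._next_id += 1
        bisect.insort(self._keys, key)
        self._values[key] = {"id": entry_id, "is_dir": is_dir}
        return entry_id

    def lookup(self, parent: int, name: str) -> dict:
        """Point lookup of a single entry."""
        return self._values[(parent, name)]

    def readdir(self, parent: int) -> List[str]:
        """List a directory with one range scan over adjacent keys."""
        start = bisect.bisect_left(self._keys, (parent, ""))
        names = []
        for key_parent, name in self._keys[start:]:
            if key_parent != parent:
                break
            names.append(name)
        return names


table = MetadataTable()
home = table.create(0, "home", is_dir=True)
table.create(home, "notes.txt")
table.create(home, "a.txt")
table.create(0, "etc", is_dir=True)
home_listing = table.readdir(home)
root_listing = table.readdir(0)
```

Because entries of the same directory sort next to each other under this assumed key layout, a directory listing becomes a contiguous range scan, which is the access pattern a sorted, log-structured store handles well.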

