Results 1 - 10 of 92
The Hadoop Distributed File System
"... Abstract—The Hadoop Distributed File System (HDFS) is designed to store very large data sets reliably, and to stream those data sets at high bandwidth to user applications. In a large cluster, thousands of servers both host directly attached storage and execute user application tasks. By distributin ..."
Cited by 317 (1 self)
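A minimal client-side sketch (not taken from the paper above) of how an application streams a file out of HDFS through Hadoop's Java FileSystem API; the NameNode address and file path are assumed values, not anything specified in the result.

// Minimal sketch, assuming a cluster at hdfs://namenode:9000 and a file at
// /data/input.txt (both hypothetical). Under the hood the client asks the
// single master (NameNode) for block locations and then streams bytes
// directly from the data servers (DataNodes).
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsReadSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://namenode:9000"); // hypothetical NameNode address
        try (FileSystem fs = FileSystem.get(conf);
             FSDataInputStream in = fs.open(new Path("/data/input.txt"))) { // hypothetical path
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) > 0) {
                System.out.write(buffer, 0, n); // stream the file contents to stdout
            }
        }
    }
}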
Distributed Metadata Management Scheme in HDFS
"... designed to store very large data sets reliably and to stream those data sets at high bandwidth to user applications. Metadata management is critical to distributed file system. In HDFS architecture, a single master server manages all metadata, while a number of data servers store file data. This ar ..."
Survey on Metadata Management Schemes in HDFS
"... Abstract- Hadoop provides a reliable shared storage and analysis system.The storage is provided by Hadoop Distributed File System(HDFS). Hadoop is a popular open source implementation of mapreduce, a powerful tool designed for deep analysis and transformation of very large data sets. Managing metada ..."
On the Duality of Data-intensive File System Design: Reconciling HDFS and PVFS
"... Data-intensive applications fall into two computing styles: Internet services (cloud computing) or high-performance computing (HPC). In both categories, the underlying file system is a key component for scalable application performance. In this paper, we explore the similarities and differences betw ..."
Cited by 9 (1 self)
between PVFS, a parallel file system used in HPC at large scale, and HDFS, the primary storage system used in cloud computing with Hadoop. We integrate PVFS into Hadoop and compare its performance to HDFS using a set of data-intensive computing benchmarks. We study how HDFS-specific optimizations can
Horus: Fine-Grained Encryption-Based Security for Large-Scale Storage
"... With the growing use of large-scale distributed systems, the likelihood that at least one node is compromised is increasing. Large-scale systems that process sensitive data such as geographic data with defense implications, drug modeling, nuclear explosion modeling, and private genomic data would be ..."
Specialization: Computer Science (Informatique), École doctorale MATISSE, presented by
, 2011
"... large-scale, distributed systems ..."
Making a Case for Distributed File Systems at Exascale
- Invited Paper, ACM Workshop on Large-scale System and Application Performance (LSAP), 2011
"... Exascale computers will enable the unraveling of significant scientific mysteries. Predictions are that 2019 will be the year of exascale, with millions of compute nodes and billions of threads of execution. The current architecture of high-end computing systems is decades-old and has persisted as w ..."
Cited by 24 (13 self)
petascale to exascale. At exascale, basic functionality at high concurrency levels will suffer poor performance, and combined with system mean-time-to-failure in hours, will lead to a performance collapse for large-scale heroic applications. Storage has the potential to be
Towards a High-Performance Scalable Storage System for Workflow Applications
, 2012
"... This thesis is motivated by the fact that there is an urgent need to run scientific many-task workflow applications efficiently and easily on large-scale machines. These applications run at large scale on supercomputers and perform large amount of storage I/O. The storage system is identified as the ..."
Storage Systems Group
, 2009
"... iii ACKNOWLEDGEMENTS “Oh, yes, the acknowledgements. I think not. I did it. I did it all, by myself.” – Olin Shivers, scsh reference manual My time at Georgia Tech has been long and challenging, but ultimately rewarding. I can credit my initial graduate school trajectory to several key experiences. ..."
Review of Limitations on Namespace Distribution for
"... Abstract. There are many challenges today for storing, processing and transferring intensive amounts of data in a distributed, large scale environment like cloud computing systems, where Apache Hadoop is a recent, well-known platform to provide such services. Such platforms use HDSF File System orga ..."
organized on two key components: the Hadoop Distributed File System (HDFS) for file storage and MapReduce, a distributed processing system for intensive cloud data applications. The main features of this structure are scalability, increased fault tolerance, efficiency, and high performance for the whole