Results 1 - 3 of 3
Ceph: A Scalable Object-Based Storage System
, 2006
Cited by 2 (0 self)
The data storage needs of large high-performance and general-purpose computing environments are generally best served by distributed storage systems. Traditional solutions, exemplified by NFS, provide a simple distributed storage system model, but cannot meet the demands of high-performance computing environments where a single server may become a bottleneck, nor do they scale well due to the need to manually partition (or repartition) the data among the servers. Object-based storage promises to address these needs through a simple networked data storage unit, the Object Storage Device (OSD), which manages all local storage issues and exports a simple read/write data interface. Despite this simple concept, many challenges remain, including efficient object storage, centralized metadata management, data and metadata replication, and data and metadata reliability. We describe Ceph, a distributed object-based storage system that meets these challenges, providing high-performance file storage that scales directly with the number of OSDs and metadata servers.
Hybrid Host/Network Topologies for Massive Storage Clusters
The high demand for large-scale storage capacity calls for massive storage solutions with high-performance interconnects. Although cluster file systems are rapidly improving and have the potential to allow extremely large numbers of commodity storage nodes to be pooled into a single large file system, the number of ports on individual switches has not been increasing as quickly: the largest switches available today support fewer than 2,000 Gigabit Ethernet ports. Our goal, therefore, is to develop a new interconnect topology that can connect hundreds of thousands of nodes and achieve performance comparable to a single switch of equivalent size. At the same time, such a topology should be readily buildable using inexpensive components. Our proposed architecture exploits the multiple Ethernet ports that are now standard on servers and combines host-based routing and forwarding with network-based switching to allow massively large storage clusters to be built. Simulation results show that our proposed design achieves 72% to 90% of the performance of a single switch capable of accommodating all storage nodes, while scaling to hundreds of thousands of nodes. Furthermore, we use common off-the-shelf layer-2 switches rather than more expensive models that support layer-3 routing. Finally, our approach is resilient to network faults because it maintains multiple paths between storage nodes.
Adaptive Replica Management for Large-scale Object-based Storage Devices
Replica management is a basic and challenging issue for designers of distributed storage systems. The objective of this paper is to dynamically create, migrate, and delete replicas among nodes in response to changes in access patterns. This paper presents an Adaptive Replica Management Model for large-scale Object-based Storage Devices (OSDs). The model expresses availability and consistency maintenance cost as functions of the number of replicas and suggests lower and upper bounds on the replica reference number based on the file availability requirement and the available network bandwidth. The model adapts to changes in the environment and maintains a reasonable number of replicas, which not only satisfies object availability requirements, improves access efficiency, and balances load, but also reduces bandwidth requirements and keeps the whole storage system stable. Our experimental evaluation demonstrates that the model performs well in terms of system reliability and performance.

