Results 11 - 20 of 122
Antiquity: Exploiting a secure log for wide-area distributed storage
In EuroSys, 2007
"... Antiquity is a wide-area distributed storage system designed to provide a simple storage service for applications like file systems and back-up. The design assumes that all servers eventually fail and attempts to maintain data despite those failures. Antiquity uses a secure log to maintain data inte ..."
Abstract - Cited by 19 (5 self)
Antiquity is a wide-area distributed storage system designed to provide a simple storage service for applications like file systems and back-up. The design assumes that all servers eventually fail and attempts to maintain data despite those failures. Antiquity uses a secure log to maintain data integrity, replicates each log on multiple servers for durability, and uses dynamic Byzantine fault-tolerant quorum protocols to ensure consistency among replicas. We present Antiquity’s design and an experimental evaluation with global and local testbeds. Antiquity has been running for over two months on 400+ PlanetLab servers storing nearly 20,000 logs totaling more than 84 GB of data. Despite constant server churn, all logs remain durable.
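To make the secure-log idea concrete, below is a minimal Python sketch of a hash-chained, append-only log in which each record binds the digest of its predecessor, so tampering with any earlier entry is detectable. The SecureLog class and its methods are illustrative only and are not Antiquity's actual API, which also covers certification, replication, and the quorum protocols described above.

```python
# Minimal sketch of a hash-chained, append-only log in the spirit of a
# "secure log": each record binds the previous record's digest, so any
# change to earlier data breaks verification of the whole chain.
# Class and method names are illustrative, not Antiquity's actual API.
import hashlib
from dataclasses import dataclass
from typing import List

@dataclass
class LogRecord:
    prev_digest: bytes   # digest of the previous record (b"" for the head)
    payload: bytes       # application data appended to the log
    digest: bytes        # H(prev_digest || payload)

class SecureLog:
    def __init__(self) -> None:
        self.records: List[LogRecord] = []

    def append(self, payload: bytes) -> bytes:
        prev = self.records[-1].digest if self.records else b""
        digest = hashlib.sha256(prev + payload).digest()
        self.records.append(LogRecord(prev, payload, digest))
        return digest            # the head digest names the whole log state

    def verify(self) -> bool:
        prev = b""
        for rec in self.records:
            if rec.prev_digest != prev:
                return False
            if hashlib.sha256(prev + rec.payload).digest() != rec.digest:
                return False
            prev = rec.digest
        return True

log = SecureLog()
log.append(b"file-system block 1")
head = log.append(b"file-system block 2")
assert log.verify() and head == log.records[-1].digest
```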
AVMON: Optimal and scalable discovery of consistent availability monitoring overlays for distributed systems
In Proc. ICDCS, 2007
"... This paper addresses the problem of selection and discovery of a consistent availability monitoring overlay for computer hosts in a large-scale distributed application, where hosts may be selfish or colluding. We motivate six significant goals for the problem- consistency, verifiability, and randomn ..."
Abstract - Cited by 19 (4 self)
This paper addresses the problem of selection and discovery of a consistent availability monitoring overlay for computer hosts in a large-scale distributed application, where hosts may be selfish or colluding. We motivate six significant goals for the problem: consistency, verifiability, and randomness in selecting the availability monitors of nodes, as well as discoverability, load balancing, and scalability in finding these monitors. We then present a new system, called AVMON, that is the first to satisfy these six requirements. The core algorithmic contribution of this paper is a protocol for discovering the availability monitoring overlay in a scalable and efficient manner, given any arbitrary monitor selection scheme that is consistent and verifiable. We mathematically analyze the performance of AVMON’s discovery protocols, and derive an optimal variant that minimizes memory, bandwidth, computation, and discovery time of monitors. Our experimental evaluations of AVMON use three types of availability traces: synthetic, from PlanetLab, and from a peer-to-peer system (Overnet). These evaluations demonstrate that AVMON works well in a variety of distributed systems.
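One standard way to get a monitor-selection rule that is simultaneously consistent, verifiable, and random, as the abstract requires, is to make the decision a deterministic function of a cryptographic hash of the two node identifiers. The sketch below illustrates that idea; the is_monitor function, the k/N threshold, and the identifiers are assumptions for illustration and are not claimed to be AVMON's exact construction.

```python
# Sketch of a consistent, verifiable, and random monitor-selection rule:
# whether node m monitors node n is a pure function of the two
# identifiers, so any third party can recompute (verify) the selection.
# The threshold k/n is an assumed parameter, not AVMON's exact scheme.
import hashlib

def is_monitor(monitor_id: str, target_id: str, k: int, n: int) -> bool:
    """True if `monitor_id` should monitor `target_id`.

    k / n controls the expected number of monitors per node."""
    h = hashlib.sha256(f"{monitor_id}|{target_id}".encode()).digest()
    # Map the digest to [0, 1) and compare against the selection probability.
    value = int.from_bytes(h[:8], "big") / 2 ** 64
    return value < k / n

# Every node (and any verifier) evaluates the same deterministic rule.
monitors = [m for m in (f"node{i}" for i in range(1000))
            if is_monitor(m, "node42", k=5, n=1000)]
print(len(monitors), "nodes selected to monitor node42")
```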
Understanding the Dynamic of Peer-to-Peer Systems
In International Workshop on Peer-to-Peer Systems (IPTPS 2007). Available: http://www.iptps.org/papers-2007/TianDai.pdf
"... Though a few previous research efforts have inves-tigated the peer availability of P2P systems, the understanding of peer dynamic is far from adequate. Based on the running log of a file-sharing P2P sys-tem, we produced a more thorough measurement of the dynamic natures of a P2P system. We further s ..."
Abstract - Cited by 17 (4 self)
Though a few previous research efforts have investigated the peer availability of P2P systems, the understanding of peer dynamics is far from adequate. Based on the running log of a file-sharing P2P system, we produced a more thorough measurement of the dynamic nature of a P2P system. We further show that, due to methodological limitations, crawler-based measurement cannot precisely capture the system's dynamic nature as a whole. In this paper, we also emphasize some simple yet important dynamic metrics that were omitted or neglected by previous studies because of the state of the art in durability analysis at that time. By a fine-grained analysis of the preliminary findings, we reveal a series of useful implications for the design of P2P systems.
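As a rough illustration of the kind of dynamic metrics such a log-based measurement yields, the sketch below derives per-peer session lengths and availability from a sorted list of join/leave events; the event format and the session_metrics helper are hypothetical, not the paper's tooling.

```python
# Illustrative sketch (not the paper's code) of deriving two simple
# dynamic metrics from a join/leave event log: per-peer session lengths
# and overall availability over an observation window.
from collections import defaultdict

def session_metrics(events, window_start, window_end):
    """events: list of (timestamp, peer_id, 'join' | 'leave'), sorted by time."""
    online_since = {}
    sessions = defaultdict(list)
    for ts, peer, kind in events:
        if kind == "join":
            online_since[peer] = ts
        elif kind == "leave" and peer in online_since:
            sessions[peer].append(ts - online_since.pop(peer))
    # Peers still online at the end of the window contribute a truncated session.
    for peer, start in online_since.items():
        sessions[peer].append(window_end - start)
    availability = {peer: sum(s) / (window_end - window_start)
                    for peer, s in sessions.items()}
    return sessions, availability

events = [(0, "A", "join"), (30, "B", "join"), (50, "A", "leave"), (80, "B", "leave")]
sessions, availability = session_metrics(events, 0, 100)
print(sessions["A"], availability["B"])   # [50] 0.5
```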
Data Placement for Scientific Applications in Distributed Environments
"... Abstract — Scientific applications often perform complex computational analyses that consume and produce large data sets. We are concerned with data placement policies that distribute data in ways that are advantageous for application execution, for example, by placing data sets so that they may be ..."
Abstract - Cited by 17 (3 self)
Scientific applications often perform complex computational analyses that consume and produce large data sets. We are concerned with data placement policies that distribute data in ways that are advantageous for application execution, for example, by placing data sets so that they may be staged into or out of computations efficiently or by replicating them for improved performance and reliability. In particular, we propose to study the relationship between data placement services and workflow management systems. In this paper, we explore the interactions between two services used in large-scale science today. We evaluate the benefits of prestaging data using the Data Replication Service versus using the native data stage-in mechanisms of the Pegasus workflow management system. We use the astronomy application, Montage, for our experiments and modify it to study the effect of input data size on the benefits of data prestaging. As the size of input data sets increases, prestaging using a data placement service can significantly improve the performance of the overall analysis.
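Purely as an illustration of a data placement decision of the kind discussed above, the sketch below picks the replica site from which staging an input data set to a chosen compute site is estimated to be fastest. The site names, bandwidth table, and cost model are invented for the example and say nothing about how Pegasus or the Data Replication Service actually choose sources.

```python
# Illustrative sketch of one data placement decision: choose the storage
# site from which staging an input data set to the compute site is
# cheapest under a simple bandwidth-only cost model (an assumption).
def cheapest_source(replica_sites, compute_site, bandwidth_gbps, data_size_gb):
    """Return (site, estimated staging time in seconds) for the best replica."""
    def stage_time(site):
        gbps = bandwidth_gbps[(site, compute_site)]
        return data_size_gb * 8 / gbps
    best = min(replica_sites, key=stage_time)
    return best, stage_time(best)

bandwidth_gbps = {("siteA", "cluster1"): 1.0, ("siteB", "cluster1"): 10.0}
site, seconds = cheapest_source(["siteA", "siteB"], "cluster1", bandwidth_gbps, 50)
print(site, round(seconds))   # siteB 40
```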
Lithium: Virtual Machine Storage for the Cloud
"... To address the limitations of centralized shared storage for cloud computing, we are building Lithium, a distributed storage system designed specifically for virtualization workloads running in large-scale data centers and clouds. Lithium aims to be scalable, highly available, and compatible with co ..."
Abstract - Cited by 16 (0 self)
To address the limitations of centralized shared storage for cloud computing, we are building Lithium, a distributed storage system designed specifically for virtualization workloads running in large-scale data centers and clouds. Lithium aims to be scalable, highly available, and compatible with commodity hardware and existing application software. The design of Lithium borrows ideas and techniques originating from research into Byzantine Fault Tolerance systems and popularized by distributed version control software, and demonstrates their practical applicability to the performance-sensitive problem of VM hosting. To our initial surprise, we have found that seemingly expensive techniques such as versioned storage and incremental hashing can lead to a system that is not only more robust to data corruption and host failures, but also often faster than naïve approaches and, for a relatively small cluster of just eight hosts, performs well compared with an enterprise-class Fibre Channel disk array.
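The abstract mentions incremental hashing as one of the seemingly expensive techniques Lithium adopts. Below is a minimal sketch of one textbook incremental-hash construction, an additive hash over (index, block) pairs, where a single-block write updates the aggregate digest in constant time; the modulus and class names are assumptions, and the abstract does not say this is the exact scheme Lithium uses.

```python
# Minimal sketch of an additive incremental hash over a block store:
# the aggregate digest is the sum (mod M) of per-block hashes, so
# updating one block only requires subtracting its old hash and adding
# the new one, rather than rehashing the whole volume. This is one
# textbook construction, not necessarily the scheme Lithium uses.
import hashlib

M = 2 ** 256 - 189          # a large modulus (assumed parameter)

def block_hash(index: int, data: bytes) -> int:
    return int.from_bytes(hashlib.sha256(index.to_bytes(8, "big") + data).digest(), "big")

class IncrementalStore:
    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.digest = sum(block_hash(i, b) for i, b in enumerate(self.blocks)) % M

    def write(self, index: int, data: bytes) -> None:
        # O(1) digest maintenance on a single-block update.
        self.digest = (self.digest - block_hash(index, self.blocks[index])
                       + block_hash(index, data)) % M
        self.blocks[index] = data

store = IncrementalStore([b"a" * 4096, b"b" * 4096])
store.write(1, b"c" * 4096)
full = sum(block_hash(i, b) for i, b in enumerate(store.blocks)) % M
assert store.digest == full   # incremental digest matches full recomputation
```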
Ensuring content integrity for untrusted peer-to-peer content distribution networks
In Proc. 4th USENIX/ACM NSDI, 2007
"... Many existing peer-to-peer content distribution networks (CDNs) such as Na Kika, CoralCDN, and CoDeeN are deployed on PlanetLab, a relatively trusted environment. But scaling them beyond this trusted boundary requires protecting against content corruption by untrusted replicas. This paper presents R ..."
Abstract - Cited by 14 (1 self)
Many existing peer-to-peer content distribution networks (CDNs) such as Na Kika, CoralCDN, and CoDeeN are deployed on PlanetLab, a relatively trusted environment. But scaling them beyond this trusted boundary requires protecting against content corruption by untrusted replicas. This paper presents Repeat and Compare, a system for ensuring content integrity in untrusted peer-to-peer CDNs even when replicas dynamically generate content. Repeat and Compare detects misbehaving replicas through attestation records and sampled repeated execution. Attestation records, which are included in responses, cryptographically bind replicas to their code, inputs, and dynamically generated output. Clients then forward a fraction of these records to randomly selected replicas acting as verifiers. Verifiers, in turn, reliably identify misbehaving replicas by locally repeating response generation and comparing their results with the attestation records. We have implemented our system on top of Na Kika. We quantify its detection guarantees through probabilistic analysis and show through simulations that a small sample of forwarded records is sufficient to effectively and promptly cleanse a CDN, even if large fractions of replicas or verifiers are misbehaving.
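The attestation-plus-repetition mechanism described above can be sketched compactly: a record binds digests of the code, the request, and the response, and a verifier re-runs the handler on the same request and compares digests. In the sketch below the HMAC stands in for the replica's signature (which a real verifier would check against the replica's public key), and the record fields are illustrative rather than the system's actual wire format.

```python
# Sketch of the repeat-and-compare idea: a replica returns its response
# together with an attestation record binding code, input, and output;
# a verifier re-runs the same code on the same input and compares.
# The HMAC is a stand-in for the replica's cryptographic signature.
import hashlib, hmac, json

def make_attestation(replica_key: bytes, code: str, request: str, response: str) -> dict:
    record = {
        "code_digest": hashlib.sha256(code.encode()).hexdigest(),
        "request": request,
        "response_digest": hashlib.sha256(response.encode()).hexdigest(),
    }
    record["sig"] = hmac.new(replica_key, json.dumps(record, sort_keys=True).encode(),
                             hashlib.sha256).hexdigest()
    return record

def verify_by_repetition(record: dict, code: str, handler) -> bool:
    """Verifier locally repeats response generation and compares digests."""
    if hashlib.sha256(code.encode()).hexdigest() != record["code_digest"]:
        return False
    repeated = handler(record["request"])
    return hashlib.sha256(repeated.encode()).hexdigest() == record["response_digest"]

CODE = "def handler(req): return req.upper()"
def handler(req): return req.upper()

rec = make_attestation(b"replica-secret", CODE, "hello", handler("hello"))
print(verify_by_repetition(rec, CODE, handler))   # True: the replica behaved correctly
```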
Provable possession and replication of data over cloud servers
"... Abstract. Cloud Computing (CC) is an emerging computing paradigm that can potentially offer a number of important advantages. One of the fundamental advantages of CC is pay-as-you-go pricing model, where customers pay only according to their usage of the services. Currently, data generation is outpa ..."
Abstract - Cited by 14 (4 self)
Cloud Computing (CC) is an emerging computing paradigm that can potentially offer a number of important advantages. One of the fundamental advantages of CC is the pay-as-you-go pricing model, where customers pay only according to their usage of the services. Currently, data generation is outpacing users' storage availability; thus there is an increasing need to outsource such huge amounts of data. Outsourcing data to a remote Cloud Service Provider (CSP) is a growing trend for numerous customers and organizations, alleviating the burden of local data storage and maintenance. Moreover, customers rely on the data replication provided by the CSP to guarantee the availability and durability of their data. Therefore, Cloud Service Providers (CSPs) provide storage infrastructure and a web services interface that can be used to store and retrieve an unlimited amount of data with fees metered in GB/month. The mechanisms used for data replication vary according to the nature of the data; more copies are needed for critical data that cannot easily be reproduced. This critical data should be replicated on multiple servers across multiple data centers. On the other hand, non-critical, reproducible data are stored at reduced levels of redundancy. The pricing model is related to the replication strategy. Therefore, it is of crucial importance to customers to have strong evidence that they actually get the service they pay for. Moreover, they need to verify that all their data copies are not being tampered with or partially deleted over time. Consequently, the problem of Provable Data Possession (PDP) has been considered in many research papers. Unfortunately, previous PDP schemes focus on a single copy of the data and provide no guarantee that the CSP stores multiple copies of customers' data. In this paper we address this challenging issue and propose Efficient Multi-Copy Provable Data Possession (EMC-PDP) protocols. We prove the security of our protocols against colluding servers. Through extensive performance analysis and experimental results, we demonstrate the efficiency of our protocols.
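To illustrate the goal of multi-copy possession checking, the toy sketch below makes each stored copy distinct by mixing a copy index into every block and then audits a randomly chosen (copy, block) pair with a fresh nonce. It only conveys the idea; the EMC-PDP protocols in the paper use cryptographic tags and are far more communication-efficient, and every name and parameter here is assumed for the example.

```python
# Toy challenge-response sketch of checking that a provider really holds
# multiple *distinct* copies of a file. Each copy is made unique by
# mixing a copy index into every block, so the provider cannot answer
# challenges for copy 3 using copy 1. This is only an illustration of
# the goal, not the paper's EMC-PDP protocol.
import hashlib, os, random

def derive_copy_block(key: bytes, copy_idx: int, block_idx: int, block: bytes) -> bytes:
    """Deterministically derive block `block_idx` of copy `copy_idx`."""
    mask = hashlib.sha256(key + bytes([copy_idx]) + block_idx.to_bytes(8, "big")).digest()
    return bytes(a ^ b for a, b in zip(block, (mask * (len(block) // 32 + 1))[:len(block)]))

def challenge_response(stored_copies, copy_idx, block_idx, nonce):
    """What an honest provider returns for a challenge."""
    return hashlib.sha256(nonce + stored_copies[copy_idx][block_idx]).digest()

# Owner side: original file blocks plus a secret key.
key, blocks = os.urandom(32), [os.urandom(64) for _ in range(16)]
copies = [[derive_copy_block(key, c, i, b) for i, b in enumerate(blocks)] for c in range(3)]

# Audit: pick a random (copy, block), send a nonce, recompute the expected answer.
c, i, nonce = random.randrange(3), random.randrange(16), os.urandom(16)
expected = hashlib.sha256(nonce + derive_copy_block(key, c, i, blocks[i])).digest()
assert challenge_response(copies, c, i, nonce) == expected
```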
Analysis of failure correlation impact in peer-to-peer storage systems
2008
"... ..."
(Show Context)