Network Coding for Distributed Storage Systems
In Proc. of IEEE INFOCOM, 2007
Cited by 35 (3 self)
Distributed storage systems provide reliable access to data through redundancy spread over individually unreliable nodes. Application scenarios include data centers, peer-to-peer storage systems, and storage in wireless networks. Storing data using an erasure code, in fragments spread across nodes, requires less redundancy than simple replication for the same level of reliability. However, since fragments must be periodically replaced as nodes fail, a key question is how to generate encoded fragments in a distributed way while transferring as little data as possible across the network. For an erasure coded system, a common practice to repair from a node failure is for a new node to download subsets of data stored at a number of surviving nodes, reconstruct a lost coded block using the downloaded data, and store it at the new node. We show that this procedure is sub-optimal. We introduce the notion of regenerating codes, which allow a new node to download functions of the stored data from the surviving nodes. We show that regenerating codes can significantly reduce the repair bandwidth. Further, we show that there is a fundamental tradeoff between storage and repair bandwidth which we theoretically characterize using flow arguments on an appropriately constructed graph. By invoking constructive results in network coding, we introduce regenerating codes that can achieve any point in this optimal tradeoff.
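As a rough illustration of the conventional repair procedure this abstract describes, the sketch below compares total storage and per-failure repair traffic for plain replication and for an (n, k) MDS erasure code. The function names and the example parameters (a 1000 MB file, 3 replicas, n = 14, k = 10) are assumptions chosen for illustration and do not come from the paper. Regenerating codes themselves are not modelled here.

```python
from fractions import Fraction


def replication_costs(file_size, copies):
    """Return (total_storage, repair_traffic) when full replicas are kept."""
    total_storage = file_size * copies
    # A lost replica is rebuilt by copying the whole file from one survivor.
    repair_traffic = file_size
    return total_storage, repair_traffic


def mds_naive_repair_costs(file_size, n, k):
    """Return (total_storage, repair_traffic, fragment_size) for an (n, k) MDS code.

    The file is encoded into n fragments of size file_size / k, and any k
    fragments are enough to rebuild the original data.
    """
    fragment = Fraction(file_size, k)
    total_storage = fragment * n
    # Conventional repair: the new node downloads k whole fragments, enough to
    # reconstruct the data, only to regenerate a single lost fragment.
    repair_traffic = fragment * k
    return total_storage, repair_traffic, fragment


if __name__ == "__main__":
    print(replication_costs(1000, 3))           # (3000, 1000)
    storage, traffic, fragment = mds_naive_repair_costs(1000, 14, 10)
    print(storage, traffic, fragment)           # 1400 1000 100
```

With these assumed numbers the erasure code stores 1.4 times the file size instead of 3 times, but conventional repair moves 1000 MB across the network to replace a single 100 MB fragment. That gap between fragment size and repair traffic is the cost the abstract says regenerating codes reduce.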
Non-transitive connectivity and DHTs
In Proc. of the 2nd Workshop on Real Large Distributed Systems, 2005
Cited by 33 (3 self)
The most basic functionality of a distributed hash table, or DHT, is to partition a key space across the set of nodes in a distributed system such that all nodes agree on the partitioning. For example, the Chord DHT assigns each node
Colyseus: A Distributed Architecture for Online Multiplayer Games
In Proc. Symposium on Networked Systems Design and Implementation (NSDI), 2006
Cited by 31 (0 self)
This paper presents the design, implementation, and evaluation of Colyseus, a distributed architecture for interactive multiplayer games. Colyseus takes advantage of a game’s tolerance for weakly consistent state and predictable workload to meet the tight latency constraints of game-play and maintain scalable communication costs. In addition, it provides a rich distributed query interface and effective pre-fetching subsystem to help locate and replicate objects before they are accessed at a node. We have implemented Colyseus and modified Quake II, a popular first-person shooter game, to use it. Our measurements of Quake II and our own Colyseus-based game with hundreds of players show that Colyseus effectively distributes game traffic across the participating nodes, allowing Colyseus to support low-latency game-play for an order of magnitude more players than existing single-server designs, with similar per-node bandwidth costs.
Maintaining High Bandwidth Under Dynamic Network Conditions
In Proceedings of USENIX Annual Technical Conference, 2005
Cited by 30 (6 self)
The need to distribute large files across multiple wide-area sites is becoming increasingly common, for instance, in support of scientific computing, configuring distributed systems, distributing software updates such as open source ISOs or Windows patches, or disseminating multimedia content. Recently a number of techniques have been proposed for simultaneously retrieving portions of a file from multiple remote sites with the twin goals of filling the client’s pipe and overcoming any performance bottlenecks between the client and any individual server. While there are a number of interesting tradeoffs in locating appropriate download sites in the face of dynamically changing network conditions, to date there has been no systematic evaluation of the merits of different protocols. This paper explores the design space of file distribution protocols and conducts a detailed performance evaluation of a number of competing systems running in both controlled emulation environments and live across the Internet. Based on our experience with these systems under a variety of conditions, we propose, implement and evaluate Bullet′ (Bullet prime), a mesh-based, high-bandwidth data dissemination system that outperforms previous techniques under both static and dynamic conditions.
Proactive replication for data durability
In Proceedings of the 5th Int’l Workshop on Peer-to-Peer Systems (IPTPS), 2006
Cited by 28 (6 self)
Many wide-area storage systems replicate data for durability. A common way of maintaining the replicas is to detect node failures and respond by creating additional copies of objects that were stored on failed nodes and hence suffered a loss of redundancy. Reactive techniques can minimize total bytes sent since they only create replicas as needed; however, they can create spikes in network use after a failure. These spikes may overwhelm application traffic and can make it difficult to provision bandwidth. This paper explores a proactive approach that creates additional copies not in response to failures, but periodically at a fixed low rate. We introduce Tempo, a distributed hash table that allows each user to specify a maximum maintenance bandwidth and uses it to perform proactive replication. Results from a simulation study suggest that Tempo can deliver high durability despite using only several kilobytes per second of bandwidth, comparable to state-of-the-art reactive systems.
Heterogeneity and load balance in distributed hash tables
In Proc. of IEEE INFOCOM, 2005
Cited by 28 (0 self)
Existing solutions to achieve load balancing in DHTs incur a high overhead either in terms of routing state or in terms of load movement generated by nodes arriving or departing the system. In this paper, we propose a set of general techniques and use them to develop a protocol based on Chord, called Y0, that achieves load balancing with minimal overhead under the typical assumption that the load is uniformly distributed in the identifier space. In particular, we prove that Y0 can achieve near-optimal load balancing, while moving little load to maintain the balance, and increasing the size of the routing tables by at most a constant factor. Using extensive simulations based on real-world and synthetic capacity distributions, we show that Y0 reduces the load imbalance of Chord from O(log n) to less than 4 without increasing the number of links that a node needs to maintain. In addition, we study the effect of heterogeneity on both DHTs, demonstrating significantly reduced average route length as node capacities become increasingly heterogeneous. For a real-world distribution of node capacities, the route length in Y0 is asymptotically less than half the route length in the case of a homogeneous system.
Index Terms: System design, Simulations
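To make the notion of load imbalance concrete: if load is uniform over the identifier space, as this abstract assumes, a node's load is proportional to the share of the ring it owns. The sketch below measures that imbalance (largest share divided by average share) for a Chord-style placement with one hashed identifier per node. It does not implement Y0. The 32-bit ring, the use of truncated SHA-1, and all function names are assumptions made for illustration.

```python
import hashlib

RING_BITS = 32              # assumed ring size, chosen for illustration
RING_SIZE = 2 ** RING_BITS


def node_id(name):
    """Map a node name to a point on the identifier ring."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def arc_lengths(names):
    """Return the amount of identifier space owned by each node.

    A node owns the arc from its predecessor (exclusive) up to its own
    identifier (inclusive), as in successor-based key assignment.
    """
    ids = sorted({node_id(name) for name in names})
    arcs = []
    for i, ident in enumerate(ids):
        pred = ids[i - 1]  # index -1 wraps around to the last node
        # A lone node owns the whole ring rather than an empty arc.
        arcs.append((ident - pred) % RING_SIZE or RING_SIZE)
    return arcs


def load_imbalance(names):
    """Largest owned share divided by the average share (1.0 is perfect)."""
    arcs = arc_lengths(names)
    average = RING_SIZE / len(arcs)
    return max(arcs) / average


if __name__ == "__main__":
    nodes = [f"node-{i}" for i in range(1000)]
    print(f"imbalance with {len(nodes)} nodes: {load_imbalance(nodes):.2f}")
```

A value well above 1 means that one node is responsible for several times the average share of keys, which is the kind of skew the abstract reports for Chord.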
OverCite: A Cooperative Digital Research Library
2005
Cited by 24 (9 self)
CiteSeer is a well-known online resource for the computer science research community, allowing users to search and browse a large archive of research papers. Unfortunately, its current centralized incarnation is costly to run. Although members of the community would presumably be willing to donate hardware and bandwidth at their own sites to assist CiteSeer, the current architecture does not facilitate such distribution of resources. OverCite is a design for a new architecture for a distributed and cooperative research library based on a distributed hash table (DHT). The new architecture harnesses donated resources at many sites to provide document search and retrieval service to researchers worldwide. A preliminary evaluation of an initial OverCite prototype shows that it can service more queries per second than a centralized system, and that it increases total storage capacity by a factor of n/4 in a system of n nodes. OverCite can exploit these additional resources by supporting new features such as document alerts, and by scaling to larger data sets.
GoCast: Gossip-enhanced Overlay Multicast for Fast and Dependable Group Communication
In DSN, 2005
Cited by 21 (2 self)
We study dependable group communication for large-scale and delay-sensitive mission-critical applications. The goal is to design a protocol that imposes low loads on bottleneck network links and provides both stable throughput and fast delivery of multicast messages even in the presence of frequent node and link failures. To this end, we propose our GoCast protocol. GoCast builds a resilient overlay network that is proximity aware and has balanced node degrees. Multicast messages propagate rapidly through an efficient tree embedded in the overlay. In the background, nodes exchange message summaries (gossips) with their overlay neighbors and pick up missing messages due to disruptions in the tree-based multicast. Our simulation based on real Internet data shows that, compared with a traditional gossip-based multicast protocol, GoCast can reduce the delivery delay of multicast messages by a factor of 8.9 when no node fails or a factor of 2.3 when 20% of nodes fail.
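The combination of tree-based delivery and background gossip described in this abstract can be pictured with a toy model. The sketch below is not GoCast's actual protocol: the `Node` class, its methods, and the three-node topology are hypothetical, and it ignores proximity awareness and degree balancing. It only shows a message that is lost on a failed tree link being recovered when a neighbor sends a summary of the message IDs it holds.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.messages = {}   # message id -> payload
        self.children = []   # children in the multicast tree

    def deliver(self, msg_id, payload):
        """Store a message; return True if it was new to this node."""
        if msg_id in self.messages:
            return False
        self.messages[msg_id] = payload
        return True

    def multicast(self, msg_id, payload, failed_links=frozenset()):
        """Push a message down the tree, skipping links listed as failed."""
        self.deliver(msg_id, payload)
        for child in self.children:
            if (self.name, child.name) not in failed_links:
                child.multicast(msg_id, payload, failed_links)

    def gossip_to(self, peer):
        """Send a summary of held message IDs; the peer pulls what it lacks.

        Returns the set of message IDs the peer recovered.
        """
        summary = set(self.messages)
        missing = summary - set(peer.messages)
        for msg_id in sorted(missing):
            peer.deliver(msg_id, self.messages[msg_id])
        return missing


def demo():
    a, b, c = Node("a"), Node("b"), Node("c")
    a.children = [b]
    b.children = [c]
    a.multicast(1, "m1")
    # The tree link from b to c is down while message 2 is sent.
    a.multicast(2, "m2", failed_links={("b", "c")})
    before = set(c.messages)
    recovered = b.gossip_to(c)
    return before, recovered, set(c.messages)


if __name__ == "__main__":
    print(demo())  # ({1}, {2}, {1, 2})
```

In this model the tree provides the fast path and gossip acts only as a repair channel, which mirrors the division of labor the abstract describes.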
EpiChord: Parallelizing the Chord Lookup Algorithm with Reactive Routing State Management
In Proceedings of the 12th International Conference on Networks, 2004
Cited by 17 (0 self)
EpiChord is a DHT lookup algorithm that demonstrates that we can remove the O(log n)-state-per-node restriction on existing DHT topologies to achieve significantly better lookup performance and resilience using a novel reactive routing state maintenance strategy that amortizes network maintenance costs into existing lookups and by issuing parallel queries. Our technique allows us to design a new class of unlimited-state-per-node DHTs that is able to adapt naturally to a wide range of lookup workloads. EpiChord is able to achieve O(1)-hop lookup performance under lookup-intensive workloads, and at least O(log n)-hop lookup performance under churn-intensive workloads even in the worst case (though it is expected to perform better on average). Our reactive
A Distributed Hash Table
2005
Cited by 15 (3 self)
DHash is a new system that harnesses the storage and network resources of computers distributed across the Internet to provide a wide-area storage service. DHash frees applications from re-implementing mechanisms common to any system that stores data on a collection of machines: it maintains a mapping of objects to servers, replicates data for durability, and balances load across participating servers. Applications access data stored in DHash through a familiar hash-table interface: put stores data in the system under a key; get retrieves the data. DHash has proven useful to a number of application builders and has been used to build a content-distribution system [34], a Usenet replacement [118], and new Internet naming architectures [133, 132]. These applications demand low-latency, high-throughput access

