Rarest first and choke algorithms are enough
- version 3 - 6 September 2006), INRIA, Sophia Antipolis, 2006
Cited by 83 (15 self)
The performance of peer-to-peer file replication comes from its piece and peer selection strategies. Two such strategies have been introduced by the BitTorrent protocol: the rarest first and choke algorithms. Whereas it is commonly admitted that BitTorrent performs well, recent studies have proposed the replacement of the rarest first and choke algorithms in order to improve efficiency and fairness. In this paper, we use results from real experiments to advocate that the replacement of the rarest first and choke algorithms cannot be justified in the context of peer-to-peer file replication in the Internet. We instrumented a BitTorrent client and ran experiments on real torrents with different characteristics. Our experimental evaluation is peer oriented, instead of tracker oriented, which allows us to get detailed information on all exchanged messages and protocol events. We go beyond the mere observation of the good efficiency of both algorithms. We show that the rarest first algorithm guarantees close to ideal diversity of the pieces among peers. In particular, in our experiments, replacing the rarest first algorithm with source or network coding solutions cannot be justified. We also show that the choke algorithm in its latest version fosters reciprocation and is robust to free riders. In particular, the choke algorithm is fair and its replacement with a bit level tit-for-tat solution is not appropriate. Finally, we identify new areas of improvement for efficient peer-to-peer file replication protocols.
detecting the unexpected in distributed systems
- In NSDI’06: Proceedings of the 3rd Symposium on Networked Systems Design & Implementation
Cited by 75 (6 self)
Bugs in distributed systems are often hard to find. Many bugs reflect discrepancies between a system’s behavior and the programmer’s assumptions about that behavior. We present Pip, an infrastructure for comparing actual behavior and expected behavior to expose structural errors and performance problems in distributed systems. Pip allows programmers to express, in a declarative language, expectations about the system’s communications structure, timing, and resource consumption. Pip includes system instrumentation and annotation tools to log actual system behavior, and visualization and query tools for exploring expected and unexpected behavior. Pip allows a developer to quickly understand and debug both familiar and unfamiliar systems. We applied Pip to several applications, including FAB, SplitStream, Bullet, and RanSub. We generated most of the instrumentation for all four applications automatically. We found the needed expectations easy to write, starting in each case with automatically generated expectations. Pip found unexpected behavior in each application, and helped to isolate the causes of poor performance and incorrect behavior.
Scale and performance in the CoBlitz large-file distribution service
- In Proceedings of the 3rd USENIX/ACM Symposium on Networked Systems Design and Implementation (NSDI
Cited by 40 (4 self)
Scalable distribution of large files has been the area of much research and commercial interest in the past few years. In this paper, we describe the CoBlitz system, which efficiently distributes large files using a content distribution network (CDN) designed for HTTP. As a result, CoBlitz is able to serve large files without requiring any modifications to standard Web servers and clients, making it an interesting option for both end users and infrastructure services. Over the 18 months that CoBlitz and its partner service, CoDeploy, have been running on PlanetLab, we have had the opportunity to observe its algorithms in practice, and to evolve its design. These changes stem not only from observations on its use, but also from a better understanding of their behavior in real-world conditions. This utilitarian approach has led us to better understand the effects of scale, peering policies, replication behavior, and congestion, giving us new insights into how to better improve their performance. With these changes, CoBlitz is able to deliver in excess of 1 Gbps on PlanetLab, and to outperform a range of systems, including research systems as well as the widely-used BitTorrent.
Dandelion: Cooperative content distribution with robust incentives
- In USENIX, 2007
Cited by 32 (0 self)
Online content distribution has increasingly gained popularity among the entertainment industry and the consumers alike. A key challenge in online content distribution is a cost-efficient solution to handle demand peaks. To address this challenge, we propose Dandelion, a system for robust cooperative (peer-to-peer) content distribution. Dandelion explicitly addresses two crucial issues in cooperative content distribution. First, it provides robust incentives for clients who possess content to serve others. A client that honestly serves other clients is rewarded with credit that can be redeemed for future downloads at the content server. Second, Dandelion discourages unauthorized content distribution. A client that uploads to another client is rewarded for its service only after the server has verified the other client’s legitimacy. Our preliminary evaluation of a prototype system running on commodity hardware with 1 Mbps uplink and 1 Mbps downlink indicates that Dandelion can achieve aggregate client download throughput three orders of magnitude higher than the one achieved by an HTTP/FTP-like server.
Exploiting similarity for multi-source downloads using file handprints
- in Proc. 4th USENIX NSDI, 2007
Cited by 27 (5 self)
Many contemporary approaches for speeding up large file transfers attempt to download chunks of a data object from multiple sources. Systems such as BitTorrent quickly locate sources that have an exact copy of the desired object, but they are unable to use sources that serve similar but non-identical objects. Other systems automatically exploit cross-file similarity by identifying sources for each chunk of the object. These systems, however, require a number of lookups proportional to the number of chunks in the object and a mapping for each unique chunk in every identical and similar object to its corresponding sources. Thus, the lookups and mappings in such a system can be quite large, limiting its scalability. This paper presents a hybrid system that provides the best of both approaches, locating identical and similar sources for data objects using a constant number of lookups and inserting a constant number of mappings per object. We first demonstrate through extensive data analysis that similarity does exist among objects of popular file types, and that making use of it can sometimes substantially improve download times. Next, we describe handprinting, a technique that allows clients to locate similar sources using a constant number of lookups and mappings. Finally, we describe the design, implementation and evaluation of Similarity-Enhanced Transfer (SET), a system that uses this technique to download objects. Our experimental evaluation shows that by using sources of similar objects, SET is able to significantly outperform an equivalently configured BitTorrent.
CrystalBall: Predicting and Preventing Inconsistencies in Deployed Distributed Systems
Cited by 22 (3 self)
We propose a new approach for developing and deploying distributed systems, in which nodes predict distributed consequences of their actions, and use this information to detect and avoid errors. Each node continuously runs a state exploration algorithm on a recent consistent snapshot of its neighborhood and predicts possible future violations of specified safety properties. We describe a new state exploration algorithm, consequence prediction, which explores causally related chains of events that lead to property violation. This paper describes the design and implementation of this approach, termed CrystalBall. We evaluate CrystalBall on RandTree, BulletPrime, Paxos, and Chord distributed system implementations. We identified new bugs in mature Mace implementations of three systems. Furthermore, we show that if the bug is not corrected during system development, CrystalBall is effective in steering the execution away from inconsistent states at runtime.
Optimal scheduling of peer-to-peer file dissemination
- J. Scheduling, 2006
Cited by 19 (0 self)
Peer-to-peer (P2P) overlay networks such as BitTorrent and Avalanche are increasingly used for disseminating potentially large files from a server to many end users via the Internet. The key idea is to divide the file into many equally-sized parts and then let users download each part (or, for network coding based systems such as Avalanche, linear combinations of the parts) either from the server or from another user who has already downloaded it. However, their performance evaluation has typically been limited to comparing one system relative to another and typically been realized by means of simulation and measurements. In contrast, we provide an analytic performance analysis that is based on a new uplink-sharing version of the well-known broadcasting problem. Assuming equal upload capacities, we show that the minimal time to disseminate the file is the same as for the simultaneous send/receive version of the broadcasting problem. For general upload capacities, we provide a mixed integer linear program (MILP) solution and a complementary fluid limit solution. We thus provide a lower bound which can be used as a performance benchmark for any P2P file dissemination system. We also investigate the performance of a decentralized strategy, providing evidence that the performance of necessarily decentralized P2P file dissemination systems should be close to this bound and therefore that it is useful in practice.
Mace: Language support for building distributed systems
- In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, 2007
Cited by 8 (1 self)
Building distributed systems is particularly difficult because of the asynchronous, heterogeneous, and failure-prone environment where these systems must run. Tools for building distributed systems must strike a compromise between reducing programmer effort and increasing system efficiency. We present Mace, a C++ language extension and source-to-source compiler that translates a concise but expressive distributed system specification into a C++ implementation. Mace overcomes the limitations of low-level languages by providing a unified framework for networking and event handling, and the limitations of high-level languages by allowing programmers to write program components in a controlled and structured manner in C++. By imposing structure and restrictions on how applications can be written, Mace supports debugging at a higher level, including support for efficient model checking and causal-path debugging. Because Mace programs compile to C++, programmers can use existing C++ tools, including optimizers, profilers, and debuggers to analyze their systems.
Enabling DVD-like features in P2P video-on-demand systems
- in SIGCOMM Peer-to-Peer Streaming and IP-TV Workshop, 2007
Cited by 8 (1 self)
Peer-to-peer (p2p) video-on-demand (VoD) is increasingly popular with Internet users. Currently deployed pure p2p VoD systems provide poor general performance and they lack advanced features such as fast forward and seeking to arbitrary points. Peer-assisted VoD systems can provide such services, but they require very well provisioned source servers (or server farms). We propose BulletMedia, a system that uses proactive caching to attempt to provide advanced features without requiring a well provisioned server. In BulletMedia, blocks are altruistically replicated by peers not to aid immediate playback but to simply increase the number of replicas of each block. This helps ensure that blocks are available in-overlay and reduces dependence on the source. BulletMedia combines a traditional overlay mesh approach with a structured overlay. The overlay mesh is used to fetch blocks at a high rate, while the structured overlay is used to enable efficient block discovery and to control block replication. Initial experimental results from a prototype BulletMedia implementation demonstrate that it can both effectively control in-overlay block replication and can efficiently use these replicas to perform forward seeks.
Adaptive file transfers for diverse environments
- In Proc. USENIX Annual Technical Conference, 2008
Cited by 5 (2 self)
This paper presents dsync, a file transfer system that can dynamically adapt to a wide variety of environments. While many transfer systems work well in their specialized context, their performance comes at the cost of generality, and they perform poorly when used elsewhere. In contrast, dsync adapts to its environment by intelligently determining which of its available resources is the best to use at any given time. The resources dsync can draw from include the sender, the local disk, and network peers. While combining these resources may appear easy, in practice it is difficult because these resources may have widely different performance or contend with each other. In particular, the paper presents a novel mechanism that enables dsync to aggressively search the receiver’s local disk for useful data without interfering with concurrent network transfers. Our evaluation on several workloads in various network environments shows that dsync outperforms existing systems by a factor of 1.4 to 5 in one-to-one and one-to-many transfers.

