Results 1 - 10 of 97
iPlane: An Information Plane for Distributed Services
- In OSDI, 2006
- Cited by 297 (25 self)
In this paper, we present the design, implementation, and evaluation of the iPlane, a scalable service providing accurate predictions of Internet path performance for emerging overlay services. Unlike the more common black box latency prediction techniques in use today, the iPlane builds an explanatory model of the Internet. We predict end-to-end performance by composing measured performance of segments of known Internet paths. This method allows us to accurately and efficiently predict latency, bandwidth, capacity and loss rates between arbitrary Internet hosts. We demonstrate the feasibility and utility of the iPlane service by applying it to several representative overlay services in use today: content distribution, swarming peer-to-peer filesharing, and voice-over-IP. In each case, we observe that using iPlane's predictions leads to a significant improvement in end user performance.
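To illustrate the path-composition idea this abstract describes, here is a minimal sketch in Python: end-to-end latency is predicted by summing measured latencies of known path segments. The hosts, segments, and latency figures are hypothetical, and this is not iPlane's actual API.

    # Sketch of segment-composition latency prediction; illustrative only,
    # not iPlane's implementation. Latencies are in milliseconds.
    segment_latency = {
        ("A", "X"): 12.0,   # measured latency of each known path segment
        ("X", "Y"): 30.0,
        ("Y", "B"): 8.0,
    }

    def predict_latency(path):
        """Predict end-to-end latency by composing per-segment measurements."""
        return sum(segment_latency[(a, b)] for a, b in zip(path, path[1:]))

    print(predict_latency(["A", "X", "Y", "B"]))  # 50.0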
Operating System Support for Planetary-Scale Network Services
, 2004
"... PlanetLab is a geographically distributed overlay network designed to support the deployment and evaluation of planetary-scale network services. Two high-level goals shape its design. First, to enable a large research community to share the infrastructure, PlanetLab provides distributed virtualizati ..."
Abstract
-
Cited by 266 (20 self)
- Add to MetaCart
(Show Context)
PlanetLab is a geographically distributed overlay network designed to support the deployment and evaluation of planetary-scale network services. Two high-level goals shape its design. First, to enable a large research community to share the infrastructure, PlanetLab provides distributed virtualization, whereby each service runs in an isolated slice of PlanetLab's global resources. Second, to support competition among multiple network services, PlanetLab decouples the operating system running on each node from the network-wide services that define PlanetLab, a principle referred to as unbundled management. This paper describes how PlanetLab realizes the goals of distributed virtualization and unbundled management, with a focus on the OS running on each node.
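As a toy illustration of distributed virtualization: each service receives an isolated slice, that is, a share of resources on every node it spans. The node set, resource model, and function names below are hypothetical, not PlanetLab's node manager interface.

    # Toy model of slicing: reserve a CPU fraction for a service on every
    # node. Illustrative only; not PlanetLab's actual allocation mechanism.
    nodes = {"node-a": {"cpu_free": 1.0}, "node-b": {"cpu_free": 1.0}}

    def create_slice(name, cpu_fraction):
        """Reserve cpu_fraction of each node's CPU for slice `name`."""
        allocation = {}
        for node, res in nodes.items():
            if res["cpu_free"] < cpu_fraction:
                raise RuntimeError(f"{node} cannot honor the reservation")
            res["cpu_free"] -= cpu_fraction
            allocation[node] = cpu_fraction
        return allocation

    print(create_slice("service-x", 0.25))  # 25% of CPU on each node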
Correlating Instrumentation Data to System States: A Building Block for Automated Diagnosis and Control
- In OSDI
, 2004
"... ..."
A Scalable Distributed Information Management System
"... We present a Scalable Distributed Information Management System (SDIMS) that aggregates information about large-scale networked systems and that can serve as a basic building block for a broad range of large-scale distributed applications by providing detailed views of nearby information and summary ..."
Abstract
-
Cited by 192 (17 self)
- Add to MetaCart
We present a Scalable Distributed Information Management System (SDIMS) that aggregates information about large-scale networked systems and that can serve as a basic building block for a broad range of large-scale distributed applications by providing detailed views of nearby information and summary views of global information. To serve as a basic building block, a SDIMS should have four properties: scalability to many nodes and attributes, flexibility to accommodate a broad range of applications, administrative isolation for security and availability, and robustness to node and network failures. We design, implement, and evaluate a SDIMS that (1) leverages Distributed Hash Tables (DHTs) to create scalable aggregation trees, (2) provides flexibility through a simple API that lets applications control propagation of reads and writes, (3) provides administrative isolation through simple extensions to current DHT algorithms, and (4) achieves robustness to node and network reconfigurations through lazy reaggregation, on-demand reaggregation, and tunable spatial replication. Through extensive simulations and micro-benchmark experiments, we observe that our system is an order of magnitude more scalable than existing approaches, achieves isolation properties at the cost of modestly increased read latency in comparison to flat DHTs, and gracefully handles failures.
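The aggregation-tree idea lends itself to a short sketch: each node in the tree folds its local value together with its children's aggregates using an application-supplied function. This toy version uses a plain in-memory tree; the paper builds its trees from DHT routing structure, and the class and method names here are invented for illustration.

    # Minimal hierarchical aggregation in the spirit of SDIMS; the real
    # system derives these trees from a DHT, this is a plain tree.
    class AggNode:
        def __init__(self, local_value=0, children=()):
            self.local_value = local_value
            self.children = list(children)

        def aggregate(self, combine):
            """Fold the local value and children's aggregates with `combine`."""
            result = self.local_value
            for child in self.children:
                result = combine(result, child.aggregate(combine))
            return result

    # Example: count free CPUs across a small hierarchy (made-up values).
    root = AggNode(1, children=[AggNode(2), AggNode(0), AggNode(5)])
    print(root.aggregate(lambda a, b: a + b))  # 8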
Design and Implementation Tradeoffs for Wide-Area Resource Discovery
- In Proceedings of the 14th IEEE Symposium on High Performance Distributed Computing (HPDC), Research Triangle Park
, 2005
"... We describe the design and implementation of SWORD, a scalable resource discovery service for wide-area distributed systems. In contrast to previous systems, SWORD allows users to describe desired resources as a topology of interconnected groups with required intra-group, inter-group, and per-node c ..."
Abstract
-
Cited by 98 (13 self)
- Add to MetaCart
We describe the design and implementation of SWORD, a scalable resource discovery service for wide-area distributed systems. In contrast to previous systems, SWORD allows users to describe desired resources as a topology of interconnected groups with required intra-group, inter-group, and per-node characteristics, along with the utility that the application derives from specified ranges of metric values. This design gives users the flexibility to find geographically distributed resources for applications that are sensitive to both node and network characteristics, and allows the system to rank acceptable configurations based on their quality for that application. Rather than evaluating a single implementation of SWORD, we explore a variety of architectural designs that deliver the required functionality in a scalable and highly-available manner. We discuss the tradeoffs of using a centralized architecture as compared to a fully decentralized design to perform wide-area resource discovery. To summarize our results, we found that a centralized architecture based on 4-node server cluster sites at network peering facilities outperforms a decentralized DHT-based resource discovery infrastructure with respect to query latency for all but the smallest number of sites. However, although a centralized architecture shows significant promise in stable environments, we find that our decentralized implementation has acceptable performance and also benefits from the DHT’s self-healing properties in more volatile environments. We evaluate the advantages and disadvantages of centralized and distributed resource discovery architectures on 1000 hosts in emulation and on approximately 200 PlanetLab nodes spread across the Internet.
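To make the query model concrete, here is one hypothetical way to express "a topology of interconnected groups with required intra-group, inter-group, and per-node characteristics" as data, together with a trivial per-node matcher. The field names are invented; SWORD's actual query language differs.

    # Hypothetical SWORD-style query: node groups with per-node
    # constraints plus required inter-group network characteristics.
    query = {
        "groups": {
            "compute": {"size": 4, "load_max": 0.5, "free_mem_min_mb": 512},
            "storage": {"size": 2, "load_max": 0.8, "free_mem_min_mb": 256},
        },
        "links": [("compute", "storage", {"latency_max_ms": 50})],
    }

    def node_matches(node, constraints):
        """Check one candidate node against a group's per-node constraints."""
        return (node["load"] <= constraints["load_max"]
                and node["free_mem_mb"] >= constraints["free_mem_min_mb"])

    candidate = {"load": 0.3, "free_mem_mb": 1024}
    print(node_matches(candidate, query["groups"]["compute"]))  # True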
The Architecture of PIER: An Internet-Scale Query Processor
- In CIDR
, 2005
"... This paper presents the architecture of PIER , an Internetscale query engine we have been building over the last three years. PIER is the first general-purpose relational query processor targeted at a peer-to-peer (p2p) architecture of thousands or millions of participating nodes on the Internet. ..."
Abstract
-
Cited by 88 (8 self)
- Add to MetaCart
This paper presents the architecture of PIER, an Internet-scale query engine we have been building over the last three years. PIER is the first general-purpose relational query processor targeted at a peer-to-peer (p2p) architecture of thousands or millions of participating nodes on the Internet. It supports massively distributed, database-style dataflows for snapshot and continuous queries. It is intended to serve as a building block for a diverse set of Internet-scale information-centric applications, particularly those that tap into the standardized data readily available on networked machines, including packet headers, system logs, and file names.
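A toy sketch of the kind of database-style dataflow PIER targets: a relation horizontally partitioned across nodes, with a snapshot query answered by scanning every partition. In PIER the partitions live in a DHT and the scan is a distributed dataflow; here the "network" is just a dictionary, and all names and data are made up.

    # Relation partitioned across nodes; a snapshot query scans all parts.
    partitions = {
        "node1": [{"src": "10.0.0.1", "bytes": 1200},
                  {"src": "10.0.0.2", "bytes": 800}],
        "node2": [{"src": "10.0.0.1", "bytes": 400}],
    }

    def snapshot_query(predicate):
        """Return all tuples, from any partition, satisfying `predicate`."""
        return [row for rows in partitions.values()
                    for row in rows if predicate(row)]

    # Roughly: SELECT * FROM flows WHERE src = '10.0.0.1'
    print(snapshot_query(lambda r: r["src"] == "10.0.0.1"))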
Distributed Resource Discovery on PlanetLab with SWORD
- In WORLDS
, 2004
"... Large-scale distributed services such as content distribution networks, peer-to-peer storage, distributed games, and scientific applications, have recently received substantial interest from both researchers and industry. At ..."
Abstract
-
Cited by 58 (0 self)
- Add to MetaCart
(Show Context)
Large-scale distributed services such as content distribution networks, peer-to-peer storage, distributed games, and scientific applications have recently received substantial interest from both researchers and industry. At …
Providing Packet Obituaries
, 2004
"... The Internet is transparent to success but opaque to failure. This veil of ignorance prevents ISPs from detecting failures by peering partners, and hosts from intelligently adapting their routes to adverse network conditions. To rectify this, we propose an accountability framework that would tell ho ..."
Abstract
-
Cited by 38 (5 self)
- Add to MetaCart
The Internet is transparent to success but opaque to failure. This veil of ignorance prevents ISPs from detecting failures by peering partners, and hosts from intelligently adapting their routes to adverse network conditions. To rectify this, we propose an accountability framework that would tell hosts where their packets have died. We describe a preliminary version of this framework and discuss its viability.
Workload and Failure Characterization on a Large-Scale Federated Testbed
, 2003
"... Recently, a number of federated distributed computational and communication infrastructures have emerged, including the Grid, PlanetLab, and Content Distribution Networks. In these environments, mutually distrustful autonomous domains pool resources together for their mutual benefit, for instance to ..."
Abstract
-
Cited by 34 (6 self)
- Add to MetaCart
Recently, a number of federated distributed computational and communication infrastructures have emerged, including the Grid, PlanetLab, and Content Distribution Networks. In these environments, mutually distrustful autonomous domains pool resources together for their mutual benefit, for instance to gain access to unique computational resources, multiple vantage points on the network, or more computation than is available locally. Key challenges for such federated infrastructures include resource allocation, scheduling, and constructing highly available services in the face of faulty end hosts and unpredictable network behavior. Developing appropriate mechanisms and policies requires an understanding of the usage characteristics and operating conditions of the target environment. In this paper, we present a detailed characterization of the actual use of the PlanetLab network testbed. PlanetLab consists of 240 nodes spread across 100 autonomous domains with over 500 active users. Using a variety of measurement tools, we present a three-month study of the network, CPU, memory, and disk usage of individual PlanetLab nodes and sites. On the consumer side, we further characterize the consumption of individual users. Next, we present results on the availability and reliability of system nodes and the network interconnecting them. Finally, we discuss the implications of our measurements for emerging federated environments.
Loss and Delay Accountability for the Internet
- In Proc. IEEE International Conference on Network Protocols (ICNP)
, 2007
"... Abstract — The Internet provides no information on the fate of transmitted packets, and end systems cannot determine who is responsible for dropping or delaying their traffic. As a result, they cannot verify that their ISPs are honoring their service level agreements, nor can they react to adverse n ..."
Abstract
-
Cited by 33 (5 self)
- Add to MetaCart
(Show Context)
The Internet provides no information on the fate of transmitted packets, and end systems cannot determine who is responsible for dropping or delaying their traffic. As a result, they cannot verify that their ISPs are honoring their service level agreements, nor can they react to adverse network conditions appropriately. While current probing tools provide some assistance in this regard, they only give feedback on probes, not actual traffic. Moreover, service providers could, at any time, render their network opaque to such tools. We propose AudIt, an explicit accountability interface, through which ISPs can proactively supply feedback to traffic sources on loss and delay, at administrative-domain granularity. Notably, our interface is resistant to ISP lies and can be implemented with a modest NetFlow modification. On our Click-based prototype, playback of real traces from a Tier-1 ISP reveals less than 2% bandwidth overhead. Finally, our proposal benefits not only end systems, but also ISPs, who can now control the amount and quality of information revealed about their internals.
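One plausible shape for the per-domain feedback this abstract describes, loss and delay reported at administrative-domain granularity, is sketched below. The record layout and field names are hypothetical, not AudIt's wire format.

    # Hypothetical per-domain loss/delay report; illustrative only.
    from dataclasses import dataclass

    @dataclass
    class DomainReport:
        domain: str          # administrative domain, e.g. an AS
        packets_in: int      # packets that entered the domain
        packets_lost: int    # packets the domain dropped
        avg_delay_ms: float  # average transit delay through the domain

        def loss_rate(self):
            return self.packets_lost / self.packets_in if self.packets_in else 0.0

    r = DomainReport("AS7018", packets_in=10_000, packets_lost=150,
                     avg_delay_ms=12.5)
    print(f"{r.domain}: loss {r.loss_rate():.1%}, delay {r.avg_delay_ms} ms")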