Results 1 - 10 of 26
PAST: Scalable Ethernet for data centers
- in ACM SIGCOMM CoNEXT Conference, 2012
Abstract - Cited by 38 (6 self)
We present PAST, a novel network architecture for data center Ethernet networks that implements a Per-Address Spanning Tree routing algorithm. PAST preserves Ethernet’s self-configuration and mobility support while increasing its scalability and usable bandwidth. PAST is explicitly designed to accommodate unmodified commodity hosts and Ethernet switch chips. Surprisingly, we find that PAST can achieve performance comparable to or greater than Equal-Cost Multipath (ECMP) forwarding, which is currently limited to layer-3 IP networks, without any multipath hardware support. In other words, the hardware and firmware changes proposed by emerging standards like TRILL are not required for high-performance, scalable Ethernet networks. We evaluate PAST on Fat Tree, HyperX, and Jellyfish topologies, and show that it is able to capitalize on the advantages each offers.
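The core idea of PAST — one spanning tree per destination address — can be sketched as a breadth-first tree rooted at the destination, from which each switch derives its next hop. This is a toy illustration on a hypothetical 4-switch diamond topology, not the paper's actual implementation:

```python
from collections import deque

def per_address_spanning_tree(adj, dest):
    """BFS tree rooted at dest; parent[switch] = next hop toward dest."""
    parent = {dest: None}
    q = deque([dest])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in parent:
                parent[v] = u  # v forwards traffic for dest via u
                q.append(v)
    return parent

# Hypothetical diamond topology: s1 reaches s4 via s2 or s3
adj = {"s1": ["s2", "s3"], "s2": ["s1", "s4"],
       "s3": ["s1", "s4"], "s4": ["s2", "s3"]}
tree = per_address_spanning_tree(adj, "s4")
# tree maps each switch to its next hop toward s4, e.g. tree["s1"] == "s2"
```

Computing a separate tree per address (rather than one global spanning tree) is what lets PAST spread different destinations' traffic across different links.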
Scalable rule management for data centers
- in NSDI, 2013
Abstract - Cited by 14 (3 self)
Cloud operators increasingly need more and more fine-grained rules to better control individual network flows for various traffic management policies. In this paper, we explore automated rule management in the context of a system called vCRIB (a virtual Cloud Rule Information Base), which provides the abstraction of a centralized rule repository. The challenge in our approach is the design of algorithms that automatically offload rule processing to overcome resource constraints on hypervisors and/or switches, while minimizing redirection traffic overhead and responding to system dynamics. vCRIB contains novel algorithms for finding feasible rule placements and adapting traffic overhead induced by rule placement in the face of traffic changes and VM migration. We demonstrate that vCRIB can find feasible rule placements with less than 10% traffic overhead even in cases where the traffic-optimal rule placement may be infeasible with respect to hypervisor CPU or memory constraints.
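The rule-placement trade-off above can be illustrated with a minimal greedy sketch: keep each rule at its preferred device while capacity lasts, otherwise offload it and count the redirection. This is a simplification under assumed unit-cost rules, not vCRIB's actual placement algorithm:

```python
def place_rules(rules, capacities):
    """Greedy sketch: prefer each rule's home device; offload when full.

    rules: list of {"id": ..., "pref": preferred_device}
    capacities: {device: number of rule slots}
    Returns (placement, number of redirected rules).
    """
    free = dict(capacities)
    placement, redirected = {}, 0
    for rule in rules:
        pref = rule["pref"]
        if free.get(pref, 0) > 0:
            dev = pref
        else:
            dev = max(free, key=free.get)  # least-loaded fallback device
            redirected += 1                # its traffic must be redirected
        placement[rule["id"]] = dev
        free[dev] -= 1
    return placement, redirected

# Hypothetical example: three rules prefer hypervisor hv1, which has 2 slots
rules = [{"id": "r1", "pref": "hv1"}, {"id": "r2", "pref": "hv1"},
         {"id": "r3", "pref": "hv1"}]
placement, redirected = place_rules(rules, {"hv1": 2, "hv2": 2})
```

A real placer must also weigh per-rule traffic volumes and rule dependencies, which is where vCRIB's contribution lies.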
A Guided Tour of Data-Center Networking
Abstract - Cited by 11 (0 self)
queue.acm.org, doi:10.1145/2184319.2184335. A good user experience depends on predictable performance within the data-center network.
Practical DCB for Improved Data Center Networks
Abstract - Cited by 6 (0 self)
Storage area networking is driving commodity data center switches to support lossless Ethernet (DCB). Unfortunately, to enable DCB for all traffic on arbitrary network topologies, we must address several problems that can arise in lossless networks, e.g., large buffering delays, unfairness, head-of-line blocking, and deadlock. We propose TCP-Bolt, a TCP variant that not only addresses the first three problems but reduces flow completion times by as much as 70%. We also introduce a simple, practical deadlock-free routing scheme that eliminates deadlock while achieving aggregate network throughput within 15% of ECMP routing. This small compromise in potential routing capacity is well worth the gains in flow completion time. We note that our results on deadlock-free routing are also of independent interest to the storage area networking community. Further, as our hardware testbed illustrates, these gains are achievable today, without hardware changes to switches or NICs.
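The deadlock condition mentioned above is conventionally stated on the channel (buffer) dependency graph: routing is deadlock-free in a lossless network iff that graph is acyclic. The paper's actual routing scheme is not reproduced here; this sketch only shows the standard acyclicity test such a scheme must pass:

```python
def has_cycle(deps):
    """DFS cycle detection over a channel dependency graph {channel: [channels]}.

    An edge c1 -> c2 means a packet holding buffer space in c1 may wait
    for space in c2; any cycle permits deadlock in a lossless network.
    """
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {c: WHITE for c in deps}

    def dfs(c):
        color[c] = GRAY  # on the current DFS path
        for n in deps.get(c, []):
            state = color.get(n, WHITE)
            if state == GRAY or (state == WHITE and dfs(n)):
                return True
        color[c] = BLACK  # fully explored, no cycle through c
        return False

    return any(color[c] == WHITE and dfs(c) for c in deps)

# Two channels waiting on each other -> deadlock possible
cyclic = has_cycle({"a": ["b"], "b": ["a"]})
acyclic = has_cycle({"a": ["b"], "b": []})
```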
vCRIB: Virtualized Rule Management in the Cloud
, 2012
Abstract - Cited by 4 (2 self)
Cloud operators increasingly need many fine-grained rules to better control individual network flows for various management tasks. While previous approaches have advocated placing rules either on hypervisors or switches, we argue that future data centers would benefit from leveraging rule processing capabilities at both for better scalability and performance. In this paper, we propose vCRIB, a virtualized Cloud Rule Information Base that allows operators to freely define different management policies without the need to consider underlying resource constraints. The challenge in our approach is the design of a vCRIB manager that automatically partitions and places rules at both hypervisors and switches to achieve a good trade-off between resource usage and performance.
Panopticon: Reaping the Benefits of Incremental SDN Deployment in Enterprise Networks
Abstract - Cited by 3 (2 self)
The operational challenges posed in enterprise networks present an appealing opportunity for automated orchestration by way of Software-Defined Networking (SDN). The primary challenge to SDN adoption in the enterprise is the deployment problem: how to deploy and operate a network consisting of both legacy and SDN switches, while benefiting from the simplified management and enhanced flexibility of SDN. This paper presents the design and implementation of Panopticon, an architecture for operating networks that combine legacy and SDN switches. Panopticon exposes an abstraction of a logical SDN in a partially upgraded legacy network, where SDN benefits can extend over the entire network. We demonstrate the feasibility and evaluate the efficiency of our approach through both testbed experiments with hardware switches and through simulation on real enterprise campus network topologies entailing over 1500 switches and routers. Our results suggest that when as few as 10% of distribution switches support SDN, most of an enterprise network can be operated as a single SDN while meeting key resource constraints.
Presto: Edge-based Load Balancing for Fast Datacenter Networks
Abstract - Cited by 2 (0 self)
Datacenter networks deal with a variety of workloads, ranging from latency-sensitive small flows to bandwidth-hungry large flows. Load balancing schemes based on flow hashing, e.g., ECMP, cause congestion when hash collisions occur and can perform poorly in asymmetric topologies. Recent proposals to load balance the network require centralized traffic engineering, multipath-aware transport, or expensive specialized hardware. We propose a mechanism that avoids these limitations by (i) pushing load-balancing functionality into the soft network edge (e.g., virtual switches) such that no changes are required in the transport layer, customer VMs, or networking hardware, and (ii) load balancing on fine-grained, near-uniform units of data (flowcells) that fit within end-host segment offload optimizations used to support fast networking speeds. We design and implement such a soft-edge load balancing scheme, called Presto, and evaluate it on a 10 Gbps physical testbed. We demonstrate the computational impact of packet reordering on receivers and propose a mechanism to handle reordering in the TCP receive offload functionality. Presto’s performance closely tracks that of a single, non-blocking switch over many workloads and is adaptive to failures and topology asymmetry.
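The flowcell idea above — splitting flows into fixed-size units sized to fit end-host segmentation offload, then spreading them over paths — can be sketched as follows. The 64 KB size matches the TSO segment limit Presto targets; the round-robin path choice and path names are illustrative simplifications:

```python
FLOWCELL_BYTES = 64 * 1024  # fits within TSO/GSO segment-offload limits

def assign_flowcells(flow_bytes, paths):
    """Split a flow into fixed-size flowcells, round-robined over paths."""
    cells = (flow_bytes + FLOWCELL_BYTES - 1) // FLOWCELL_BYTES  # ceil division
    return [paths[i % len(paths)] for i in range(cells)]

# A 200 KB flow over two equal-cost paths becomes 4 flowcells,
# alternating between the paths instead of hashing the whole flow to one
cells = assign_flowcells(200 * 1024, ["path-A", "path-B"])
```

Because flowcells from one flow traverse different paths, the receiver may see reordering — which is why Presto pairs this with reordering handling in the receive-offload path.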
Plinko: Building Provably Resilient Forwarding Tables
Abstract - Cited by 2 (1 self)
This paper introduces Plinko, a network architecture that uses a novel forwarding model and routing algorithm to build networks with forwarding paths that, assuming arbitrarily large forwarding tables, are provably resilient against t link failures, ∀t ∈ N. However, in practice, there are clearly limits on the size of forwarding tables. Nonetheless, when constrained to hardware comparable to modern top-of-rack (TOR) switches, Plinko scales with high resilience to networks with up to ten thousand hosts. Thus, as long as t or fewer links have failed, the only reasons packets of any flow in a Plinko network will be dropped are congestion, packet corruption, and a partitioning of the network topology, and, even after t+1 failures, most, if not all, flows may be unaffected. In addition, Plinko is topology independent, supports arbitrary paths for routing, provably bounds stretch, and does not require any additional computation during forwarding. To the best of our knowledge, Plinko is the first network to have all of these properties.
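The flavor of pre-installed resilience can be sketched with a table that stores an ordered list of precomputed next-hop candidates per destination, consulted at line rate with no recomputation on failure. This is a deliberate simplification — Plinko's actual model keys backup routes on the path a bounced packet has taken, which this sketch omits:

```python
def resilient_next_hop(table, dst, failed_links):
    """Return the first precomputed candidate whose outgoing link is alive.

    table[dst] is an ordered list of (next_hop, outgoing_link) entries,
    primary route first, backups after. All entries are installed in
    advance, so failover needs no computation at forwarding time.
    """
    for next_hop, link in table[dst]:
        if link not in failed_links:
            return next_hop
    return None  # more failures than the resilience budget t covers

# Hypothetical switch table: primary via s2, backups via s3 and s4
table = {"d": [("s2", "l1"), ("s3", "l2"), ("s4", "l3")]}
```

The hard part Plinko solves is making such precomputed backup sets both loop-free and small enough to fit real TOR-class tables.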
Network Virtualization for QoS-Aware Resource Management in Cloud Data Centers: A Survey
Abstract - Cited by 2 (0 self)
The increasing popularity of Cloud Computing is leading to the emergence of large virtualized data centers hosting increasingly complex and dynamic IT systems and services. Over the past decade, the efficient sharing of computational resources through virtualization has been subject to intensive research, while network management in cloud data centers has received less attention. A variety of network-intensive applications require QoS (Quality-of-Service) provisioning, performance isolation and support for flexible and efficient migration of virtual machines. In this paper, we survey existing network virtualization approaches and evaluate the extent to which they can be used as a basis for realizing the mentioned requirements in a cloud data center. More specifically, we identify generic network virtualization techniques, characterize them according to their features related to QoS management and performance isolation, and show how they can be composed together and used as building blocks for complex network virtualization solutions. We then present an overview of selected representative cloud platforms and show how they leverage the generic techniques as a basis for network resource management. Finally, we outline open issues and research challenges in the area of performance modeling and proactive resource management of virtualized data center infrastructures.
SAL: Scaling Data Centers Using Smart Address Learning
Abstract - Cited by 1 (0 self)
Multi-tenant data centers provide a cost-effective many-server infrastructure for hosting large-scale applications. These data centers can run multiple virtual machines (VMs) for each tenant, and potentially place any of these VMs on any of the servers. Therefore, for inter-VM communication, they also need to provide a VM resolution method that can quickly determine the server location of any VM. Unfortunately, existing methods suffer from a scalability bottleneck in the network load of the address resolution messages and/or in the size of the resolution tables. In this paper, we propose Smart Address Learning (SAL), a novel approach that expands the scalability of both the network load and the resolution table sizes, making it implementable on faster memory devices. The key property of the approach is to selectively learn the addresses in the resolution tables, by using the fact that the VMs of different tenants do not communicate. We further compare the various resolution methods and analyze the tradeoff between network load and table sizes. We also evaluate our results using real-life trace simulations. Our analysis shows that SAL can reduce both the network load and the resolution table sizes by several orders of magnitude.
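The selective-learning property described above — only learn a VM's location if its tenant also has a VM on the local server, since VMs of different tenants never communicate — can be sketched as follows. The class and field names are illustrative, not from the paper:

```python
class SmartAddressLearner:
    """Sketch of tenant-filtered address learning.

    A resolution-table entry for a remote VM is useful only if some local
    VM could ever send to it, i.e. only if its tenant is present locally.
    Filtering on that keeps the table orders of magnitude smaller.
    """

    def __init__(self, local_tenants):
        self.local_tenants = set(local_tenants)
        self.table = {}  # vm_address -> hosting server

    def observe(self, vm_address, tenant, server):
        """Called on each address announcement; learn selectively."""
        if tenant in self.local_tenants:
            self.table[vm_address] = server  # learned: tenant is local
        # otherwise silently drop: no local VM will ever resolve this address

learner = SmartAddressLearner(local_tenants={"tenant-A"})
learner.observe("vm1", "tenant-A", "server-7")  # learned
learner.observe("vm2", "tenant-B", "server-9")  # skipped: tenant-B not local
```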