Results 1 - 10 of 141
Achieving 100% Throughput in an Input-Queued Switch
IEEE Transactions on Communications, 1996
Cited by 313 (22 self)
It is well known that head-of-line (HOL) blocking limits the throughput of an input-queued switch with FIFO queues. Under certain conditions, the throughput can be shown to be limited to approximately 58%. It is also known that if non-FIFO queueing policies are used, the throughput can be increased. However, it has not been previously shown that if a suitable queueing policy and scheduling algorithm are used then it is possible to achieve 100% throughput for all independent arrival processes. In this paper we prove this to be the case using a simple linear programming argument and quadratic Lyapunov function. In particular, we assume that each input maintains a separate FIFO queue for each output and that the switch is scheduled using a maximum weight bipartite matching algorithm. We introduce two maximum weight matching algorithms: LQF and OCF. Both
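The scheduling idea in this abstract (one FIFO queue per input/output pair, served by a maximum weight bipartite matching) can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes LQF stands for "longest queue first" with queue occupancy as the edge weight, it uses a brute-force search over permutations that is only practical for very small switches, and the names `lqf_match` and `occupancy` are hypothetical.

```python
from itertools import permutations


def lqf_match(occupancy):
    """Return (match, weight) for one time slot.

    occupancy[i][j] is the number of cells queued at input i for output j.
    The match maps each served input to its output. Brute force over all
    N! permutations: fine for a sketch, far too slow for a real switch.
    """
    n = len(occupancy)
    best_perm, best_weight = None, -1
    for perm in permutations(range(n)):
        # Weight of a matching = total occupancy of the queues it serves.
        weight = sum(occupancy[i][perm[i]] for i in range(n))
        if weight > best_weight:
            best_perm, best_weight = perm, weight
    # Pairs whose queue is empty carry no cell, so drop them from the result.
    match = {i: j for i, j in enumerate(best_perm) if occupancy[i][j] > 0}
    return match, best_weight


if __name__ == "__main__":
    # Input 1 holds a long queue for output 0, so it wins that output even
    # though serving inputs 0 and 1 on the diagonal would move two cells.
    print(lqf_match([[3, 0], [5, 1]]))
```

In the 2x2 example the maximum weight matching serves only one cell, which shows how a weight-based policy can differ from a maximum size matching.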
Scheduling Algorithms for Input-queued Cell Switches
1995
Cited by 109 (4 self)
The algorithms described in this thesis are designed to schedule cells in a very high-speed, parallel, input-queued crossbar switch. We present several novel scheduling algorithms that we have devised, each of which aims to match the set of inputs of an input-queued switch to the set of outputs more efficiently, fairly and quickly than existing techniques. In Chapter 2 we present the simplest and fastest of these algorithms: SLIP, a parallel algorithm that uses rotating priority ("round-robin") arbitration. SLIP is simple: it is readily implemented in hardware and can operate at high speed. SLIP has high performance: for uniform i.i.d. Bernoulli arrivals, SLIP is stable for any admissible load, because the arbiters tend to desynchronize. We present analytical results to model this behavior. However, SLIP is not always stable and is not always monotonic: adding more traffic can actually make the algorithm operate more efficiently. We present an approximate analytical model of this behavior. SLIP prevents starvation: all contending inputs are eventually served. We present simulation results indicating SLIP's performance. We argue that SLIP can be readily implemented for a 32x32 switch on a single chip. In Chapter 3 we present i-SLIP, an iterative algorithm that improves upon SLIP by converging on a maximal size match. The performance of i-SLIP improves with up to log_2 N iterations. We show that although it has a longer running time than SLIP, an i-SLIP scheduler is little more complex to implement. In Chapter 4 we describe maximum or maximal weight matching algorithms based on the occupancy of queues, or waiting times of cells. These algorithms are stabl...
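The rotating-priority arbitration mentioned for SLIP can be sketched as a single request/grant/accept round. The abstract does not spell out the pointer rules, so this sketch assumes the commonly described scheme in which each output grants the first requesting input at or after its pointer, each input accepts the first granting output at or after its pointer, and pointers advance only past accepted grants. The names `slip_iteration`, `grant_ptr` and `accept_ptr` are hypothetical.

```python
def slip_iteration(requests, grant_ptr, accept_ptr):
    """Run one request/grant/accept round and return {input: output}.

    requests[i][j] is True when input i has a cell queued for output j.
    grant_ptr[j] and accept_ptr[i] are round-robin pointers, updated in place.
    """
    n = len(requests)

    # Grant phase: each output picks the first requesting input at or after
    # its grant pointer, wrapping around.
    grants = {}  # output -> input
    for out in range(n):
        for k in range(n):
            inp = (grant_ptr[out] + k) % n
            if requests[inp][out]:
                grants[out] = inp
                break

    # Accept phase: each input picks the first granting output at or after
    # its accept pointer. Pointers move one past the matched port, and only
    # when the grant is accepted.
    match = {}  # input -> output
    for inp in range(n):
        for k in range(n):
            out = (accept_ptr[inp] + k) % n
            if grants.get(out) == inp:
                match[inp] = out
                accept_ptr[inp] = (out + 1) % n
                grant_ptr[out] = (inp + 1) % n
                break
    return match


if __name__ == "__main__":
    requests = [[True, True], [True, False]]
    grant_ptr, accept_ptr = [0, 0], [0, 0]
    print(slip_iteration(requests, grant_ptr, accept_ptr))  # first slot
    print(slip_iteration(requests, grant_ptr, accept_ptr))  # second slot
```

In the example both outputs initially grant input 0, so only one cell is switched in the first slot. Once the pointers have moved apart, the second slot matches both inputs, which is the desynchronization effect the abstract refers to.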
Load Balanced Birkhoff-von Neumann Switches, Part II: Multi-stage Buffering
2001
Cited by 89 (12 self)
The main objective of this sequel is to solve the out-of-sequence problem that occurs in the load balanced Birkhoff-von Neumann switch with one-stage buffering. We do this by adding a load-balancing buffer in front of the first stage and a resequencing-and-output buffer after the second stage. Moreover, packets are distributed at the first stage according to their flows, instead of their arrival times in Part I. In this paper, we consider multicasting flows with two types of scheduling policies: the First Come First Served (FCFS) policy and the Earliest Deadline First (EDF) policy. The FCFS policy requires a jitter control mechanism in front of the second stage to ensure proper ordering of the traffic entering the second stage. For the EDF scheme, there is no need for jitter control. It uses the departure times of the corresponding FCFS output-buffered switch as deadlines and schedules packets according to their deadlines. For both policies, we show that the end-to-end delay through our multistage switch is bounded above by the sum of the delay from the corresponding FCFS output-buffered switch and a constant that only depends on the size of the switch and the number of multicasting flows supported by the switch.
The Tiny Tera: A Packet Switch Core
1996
Cited by 83 (5 self)
In this paper, we present the Tiny Tera: a small packet switch with an aggregate bandwidth of 320Gb/s. The Tiny Tera is a CMOS-based input-queued, fixed-size packet switch suitable for a wide range of applications such as a high-performance ATM switch, the core of an Internet router or as a fast multiprocessor interconnect. Using off-the-shelf technology, we plan to demonstrate that a very high-bandwidth switch can be built without the need for esoteric optical switching technology. By employing novel scheduling algorithms for both unicast and multicast traffic, the switch will have a maximum throughput close to 100%. Using novel high-speed chip-to-chip serial link technology, we plan to reduce the physical size and complexity of the switch, as well as the system pin-count.
Beyond Best Effort: Router Architectures for the Differentiated Services of Tomorrow’s Internet
IEEE Communications Magazine, 1998
Cited by 63 (0 self)
With the transformation of the Internet into a commercial infrastructure, the ability to provide differentiated services to users with widely varying requirements is rapidly becoming as important as meeting the massive increases in bandwidth demand. Hence, while deploying routers, switches, and transmission systems of ever increasing capacity, Internet service providers would also like to provide customer-specific differentiated services using the same shared network infrastructure. In this article, we describe router architectures that can support the two trends of rising bandwidth demand and rising demand for differentiated services. We focus on router mechanisms that can support differentiated services at a level not contemplated in proposals currently under consideration due to concern regarding their implementability at high speeds. We consider the types of differentiated services that service providers may want to offer and then discuss the mechanisms needed in routers to support them. We describe plausible implementations of these mechanisms (the scalability and performance of which have been demonstrated by implementation in a prototype system) and argue that it is
Implementing Distributed Packet Fair Queueing in a Scalable Switch Architecture
1998
Cited by 54 (1 self)
To support the Internet's explosive growth and expansion into a true integrated services network, there is a need for cost-effective switching technologies that can simultaneously provide high capacity switching and advanced QoS. Unfortunately, these two goals are largely believed to be contradictory in nature. To support QoS, sophisticated packet scheduling algorithms, such as Fair Queueing, are needed to manage queueing points. However, the bulk of current research in packet scheduling algorithms assumes an output buffered switch architecture, whereas most high performance switches (both commercial and research) are input buffered. While output buffered systems may have the desired quality of service, they lack the necessary scalability. Input buffered systems, while scalable, lack the necessary quality of service features. In this paper, we propose the construction of switching systems that are both input and output buffered, with the scalability of input buffered switches and the r...
Reliable and Efficient Hop-by-Hop Flow Control
IEEE Journal on Selected Areas in Communications, 1995
Cited by 51 (0 self)
Hop-by-hop flow control can be used to fairly share the bandwidth of a network among competing flows. No data is lost even in overload conditions; yet each flow gets access to the maximum throughput when the network is lightly loaded. However, some schemes for hop-by-hop flow control require too much memory; some of them are not resilient to errors. We propose a scheme for making hop-by-hop flow control resilient and show that it has advantages over schemes proposed by Kung. We also describe a novel method for sharing the available buffers among the flows on a link; our scheme allows us to potentially reduce the memory requirement (or increase the number of flows that can be supported) by an order of magnitude. Most of the work is described in the context of an ATM network that ... (Author affiliations: Digital Equipment Corporation, Networks Engineering/Advanced Development, LKG1-2/E10, 550 King Street, Littleton, MA 01460; Digital Equipment Corporation, Networks Engineering/High Performance Networks, ...)
Scaling Internet Routers Using Optics
ACM SIGCOMM, 2003
Cited by 49 (15 self)
Routers built around a single-stage crossbar and a centralized scheduler do not scale, and (in practice) do not provide the throughput guarantees that network operators need to make efficient use of their expensive long-haul links. In this paper we consider how optics can be used to scale capacity and reduce power in a router. We start with the promising load-balanced switch architecture proposed by C.-S. Chang. This approach eliminates the scheduler, is scalable, and guarantees 100% throughput for a broad class of traffic. But several problems need to be solved to make this architecture practical: (1) Packets can be mis-sequenced, (2) Pathological periodic traffic patterns can make throughput arbitrarily small, (3) The architecture requires a rapidly configuring switch fabric, and (4) It does not work when linecards are missing or have failed. In this paper we solve each problem in turn, and describe new architectures that include our solutions. We motivate our work by designing a 100Tb/s packet-switched router arranged as 640 linecards, each operating at 160Gb/s. We describe two different implementations based on technology available within the next three years.
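As a quick check of the headline figure, assuming the aggregate capacity is simply the number of linecards times the per-linecard rate, the quoted 100Tb/s is a rounded value:

```latex
640 \times 160\,\mathrm{Gb/s} = 102{,}400\,\mathrm{Gb/s} = 102.4\,\mathrm{Tb/s} \approx 100\,\mathrm{Tb/s}
```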
Exact Emulation of an Output Queueing Switch by a Combined Input Output Queueing Switch
In Sixth IEEE/IFIP International Workshop on Quality of Service, 1998
Cited by 45 (1 self)
Combined input output queueing switches (CIOQ) have better scaling properties than output queueing (OQ) switches. However, a CIOQ switch may have lower switch throughput, and more importantly, it is difficult to control delay in a CIOQ switch due to the existence of multiple queueing points. In this paper, we study the following problem, originally formulated and studied by Prabhakar and McKeown [16]: Can a CIOQ switch be designed to behave identically to an OQ switch? In [16], an algorithm was proposed so that a CIOQ switch with an internal speedup of four can behave identically to an OQ switch with FIFO as the output queueing discipline. In this paper, we propose a new switch scheduling algorithm called Joined Preferred Matching (JPM) that improves Prabhakar and McKeown's results in two aspects. First, with JPM, the internal speedup needed for a CIOQ switch to achieve exact emulation of an OQ switch is only 2 instead of 4. Second, the result applies to OQ switches that employ a gener...

