Chernoff-Hoeffding Bounds for Applications with Limited Independence
SIAM J. Discrete Math, 1993
Cited by 88 (10 self)
Chernoff-Hoeffding bounds are fundamental tools used in bounding the tail probabilities of the sums of bounded and independent random variables. We present a simple technique which gives slightly better bounds than these, and which more importantly requires only limited independence among the random variables, thereby importing a variety of standard results to the case of limited independence for free. Additional methods are also presented, and the aggregate results are sharp and provide a better understanding of the proof techniques behind these bounds. They also yield improved bounds for various tail probability distributions and enable improved approximation algorithms for jobshop scheduling. The "limited independence" result implies that a reduced amount of randomness and weaker sources of randomness are sufficient for randomized algorithms whose analyses use the Chernoff-Hoeffding bounds, e.g., the analysis of randomized algorithms for random sampling and oblivious packet routi...
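As a rough illustration of the kind of tail bound this abstract refers to, the sketch below compares the classical Hoeffding bound exp(-2t^2/n) for a sum of n independent [0, 1]-valued variables against an empirical estimate for fair coin flips. It shows the standard fully independent bound only, not the paper's sharper or limited-independence results; the function names `hoeffding_bound` and `empirical_tail` are illustrative assumptions.

```python
import math
import random


def hoeffding_bound(n: int, t: float) -> float:
    """Classical Hoeffding upper bound on P(S - E[S] >= t), where S is a sum
    of n independent random variables that each take values in [0, 1]."""
    return math.exp(-2.0 * t * t / n)


def empirical_tail(n: int, t: float, trials: int, seed: int = 0) -> float:
    """Monte Carlo estimate of P(S - E[S] >= t) for S = sum of n fair coin flips."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    mean = n / 2
    hits = 0
    for _ in range(trials):
        s = sum(rng.random() < 0.5 for _ in range(n))
        if s - mean >= t:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    n, t = 100, 10
    print("Hoeffding bound:", hoeffding_bound(n, t))
    print("Empirical tail: ", empirical_tail(n, t, trials=2000))
```

The empirical tail should sit well below the bound, since Hoeffding's inequality is not tight for this distribution.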
Randomized routing and sorting on fixed-connection networks
Journal of Algorithms, 1994
Cited by 84 (13 self)
This paper presents a general paradigm for the design of packet routing algorithms for fixed-connection networks. Its basis is a randomized on-line algorithm for scheduling any set of N packets whose paths have congestion c on any bounded-degree leveled network with depth L in O(c + L + log N) steps, using constant-size queues. In this paradigm, the design of a routing algorithm is broken into three parts: (1) showing that the underlying network can emulate a leveled network, (2) designing a path selection strategy for the leveled network, and (3) applying the scheduling algorithm. This strategy yields randomized algorithms for routing and sorting in time proportional to the diameter for meshes, butterflies, shuffle-exchange graphs, multidimensional arrays, and hypercubes. It also leads to the construction of an area-universal network: an N-node network with area Θ(N) that can simulate any other network of area O(N) with slowdown O(log N).
Minimizing Congestion in General Networks
In Proceedings of the 43rd IEEE Symposium on Foundations of Computer Science (FOCS), 2002
Cited by 83 (11 self)
A principal task in parallel and distributed systems is to reduce the communication load in the interconnection network, as this is usually the major bottleneck for the performance of distributed applications. In this paper we introduce a framework for solving on-line problems that aim to minimize the congestion (i.e., the maximum load of a network link) in general topology networks. We apply this
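The definition of congestion used in this abstract (the maximum load of a network link) can be made concrete with a small sketch. It assumes undirected links, unit demand per path, and paths given as node sequences; the function name `congestion` and the example topology are hypothetical and not taken from the paper.

```python
from collections import Counter
from typing import Hashable, Iterable, Sequence


def congestion(paths: Iterable[Sequence[Hashable]]) -> int:
    """Return the maximum number of paths that share any single link.

    Each path is a sequence of nodes; consecutive nodes form an undirected
    link, and every path contributes one unit of load to each link it uses.
    """
    load: Counter = Counter()
    for path in paths:
        for u, v in zip(path, path[1:]):
            load[frozenset((u, v))] += 1  # frozenset makes (u, v) == (v, u)
    return max(load.values(), default=0)


if __name__ == "__main__":
    routes = [["a", "b", "c"], ["b", "c", "d"], ["c", "b"]]
    print(congestion(routes))  # the link b-c carries all three paths
```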
The Power of Two Random Choices: A Survey of Techniques and Results
In Handbook of Randomized Computing, 2000
Cited by 79 (2 self)
To motivate this survey, we begin with a simple problem that demonstrates a powerful fundamental idea. Suppose that n balls are thrown into n bins, with each ball choosing a bin independently and uniformly at random. Then the maximum load, or the largest number of balls in any bin, is approximately log n / log log n with high probability. Now suppose instead that the balls are placed sequentially, and each ball is placed in the least loaded of d ≥ 2 bins chosen independently and uniformly at random. Azar, Broder, Karlin, and Upfal showed that in this case, the maximum load is log log n / log d + Θ(1) with high probability [ABKU99]. The important implication of this result is that even a small amount of choice can lead to drastically different results in load balancing. Indeed, having just two random choices (i.e.,...
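The balls-and-bins process described in this abstract is easy to simulate. The following is a minimal sketch, assuming candidate bins are sampled with replacement and ties go to the first candidate sampled; the function name `max_load` and its parameters are illustrative, not from the survey.

```python
import random


def max_load(n_balls: int, n_bins: int, d: int, seed: int = 0) -> int:
    """Place balls sequentially; each ball samples d bins independently and
    uniformly at random and goes into the least loaded of them.

    d = 1 is the classical single-choice process; d = 2 is the
    "two random choices" process.
    """
    rng = random.Random(seed)  # fixed seed so runs are reproducible
    loads = [0] * n_bins
    for _ in range(n_balls):
        candidates = [rng.randrange(n_bins) for _ in range(d)]
        target = min(candidates, key=loads.__getitem__)
        loads[target] += 1
    return max(loads)


if __name__ == "__main__":
    n = 10_000
    print("d = 1 maximum load:", max_load(n, n, d=1))
    print("d = 2 maximum load:", max_load(n, n, d=2))
```

For n in the thousands, the d = 2 run should report a visibly smaller maximum load than the d = 1 run, in line with the gap between log n / log log n and log log n / log d.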
Models of Machines and Computation for Mapping in Multicomputers
1993
Cited by 76 (1 self)
It is now more than a quarter of a century since researchers started publishing papers on mapping strategies for distributing computation across the computation resource of multiprocessor systems. There exists a large body of literature on the subject, but there is no commonly-accepted framework whereby results in the field can be compared. Nor is it always easy to assess the relevance of a new result to a particular problem. Furthermore, changes in parallel computing technology have made some of the earlier work of less relevance to current multiprocessor systems. Versions of the mapping problem are classified, and research in the field is considered in terms of its relevance to the problem of programming currently available hardware in the form of a distributed memory multiple instruction stream multiple data stream computer: a multicomputer.
Optimal Oblivious Routing in Polynomial Time
2003
Cited by 55 (8 self)
A recent seminal result of Räcke is that for any network there is an oblivious routing algorithm with a polylog competitive ratio with respect to congestion. Unfortunately, Räcke's construction is not polynomial time. We give a polynomial time construction that guarantees Räcke's bounds, and more generally gives the true optimal ratio for any network.
Scheduling Nonuniform Traffic In A Packet Switching System With Small Propagation Delay
IEEE/ACM Transactions on Networking, 1994
Cited by 54 (1 self)
A new model of nonuniform traffic is introduced for a single-hop packet switching system. This traffic model allows arbitrary traffic streams subject only to a constraint on the number of data packets which can arrive at any individual source in the system or for any individual destination in the system over time periods of specified length. The nonuniform traffic model is flexible enough to cover integrated data networks carrying diverse classes of data. The system model is rather general and includes passive optical star wavelength division networks. Transmission algorithms are introduced for a single-hop packet switching system with such nonuniform traffic and with propagation delay that is negligible relative to the packet length. The algorithms are based on collision-free scheduling of packets using graph matching algorithms, since the global state of the system is known to all stations at any time. A companion paper introduces transmission algorithms for the same network and traf...
Scaling Internet Routers Using Optics
ACM SIGCOMM, 2003
Cited by 49 (15 self)
Routers built around a single-stage crossbar and a centralized scheduler do not scale, and (in practice) do not provide the throughput guarantees that network operators need to make efficient use of their expensive long-haul links. In this paper we consider how optics can be used to scale capacity and reduce power in a router. We start with the promising load-balanced switch architecture proposed by C-S. Chang. This approach eliminates the scheduler, is scalable, and guarantees 100% throughput for a broad class of traffic. But several problems need to be solved to make this architecture practical: (1) Packets can be mis-sequenced, (2) Pathological periodic traffic patterns can make throughput arbitrarily small, (3) The architecture requires a rapidly configuring switch fabric, and (4) It does not work when linecards are missing or have failed. In this paper we solve each problem in turn, and describe new architectures that include our solutions. We motivate our work by designing a 100Tb/s packet-switched router arranged as 640 linecards, each operating at 160Gb/s. We describe two different implementations based on technology available within the next three years.
RouteBricks: Exploiting Parallelism to Scale Software Routers
In Proceedings of the 22nd ACM Symposium on Operating Systems Principles, 2009
Cited by 49 (8 self)
We revisit the problem of scaling software routers, motivated by recent advances in server technology that enable high-speed parallel processing, a feature router workloads appear ideally suited to exploit. We propose a software router architecture that parallelizes router functionality both across multiple servers and across multiple cores within a single server. By carefully exploiting parallelism at every opportunity, we demonstrate a 35Gbps parallel router prototype; this router capacity can be linearly scaled through the use of additional servers. Our prototype router is fully programmable using the familiar Click/Linux environment and is built entirely from off-the-shelf, general-purpose server hardware.
A polynomial-time tree decomposition to minimize congestion
In Proceedings of the 15th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), 2003
Cited by 47 (0 self)
Räcke recently gave a remarkable proof showing that any undirected multicommodity flow problem can be routed in an oblivious fashion with congestion that is within a factor of O(log^3 n) of the best off-line solution to the problem. He also presented interesting applications of this result to distributed computing. Maggs, Miller, Parekh, Ravi and Wu have shown that such a decomposition also has an application to speeding up iterative solvers of linear systems. Räcke's construction finds a decomposition tree of the underlying graph, along with a method to obliviously route in a hierarchical fashion on the tree. The construction, however, uses exponential-time procedures to build the decomposition. The non-constructive nature of his result was remedied, in part, by Azar, Cohen, Fiat, Kaplan, and Räcke, who gave a polynomial time method for building an oblivious routing strategy. Their construction was not based on finding a hierarchical decomposition, and this precludes its application to iterative methods for solving linear systems. In this paper, we show how to compute a hierarchical decomposition and a corresponding oblivious routing strategy in polynomial time. In addition, our decomposition gives an improved competitive ratio for congestion of O(log^2 n log log n). In an independent result in this conference, Bienkowski, Korzeniowski, and Räcke give a polynomial-time method for constructing a decomposition tree with competitive ratio O(log^4 n). We note that our original submission used essentially the same algorithm, and we appreciate them allowing us to present this improved version.

