A new approach to the maximum flow problem
 Journal of the ACM
, 1988
Cited by 506 (31 self)
Abstract. All previously known efficient maximum-flow algorithms work by finding augmenting paths, either one path at a time (as in the original Ford and Fulkerson algorithm) or all shortest-length augmenting paths at once (using the layered network approach of Dinic). An alternative method based on the preflow concept of Karzanov is introduced. A preflow is like a flow, except that the total amount flowing into a vertex is allowed to exceed the total amount flowing out. The method maintains a preflow in the original network and pushes local flow excess toward the sink along what are estimated to be shortest paths. The algorithm and its analysis are simple and intuitive, yet the algorithm runs as fast as any other known method on dense graphs, achieving an $O(n^3)$ time bound on an $n$-vertex graph. By incorporating the dynamic tree data structure of Sleator and Tarjan, we obtain a version of the algorithm running in $O(nm \log(n^2/m))$ time on an $n$-vertex, $m$-edge graph. This is as fast as any known method for any graph density and faster on graphs of moderate density. The algorithm also admits efficient distributed and parallel implementations. A parallel implementation running in $O(n^2 \log n)$ time using $n$ processors and $O(m)$ space is obtained. This time bound matches that of the Shiloach-Vishkin algorithm, which also uses $n$ processors but requires $O(n^2)$ space.
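The preflow idea in this abstract can be illustrated with a minimal sketch. It is a generic FIFO push-relabel routine, not the paper's own implementation: it keeps residual capacities in a dense matrix, uses no dynamic trees, and does not attempt to match the time bounds quoted above. The function name `max_flow_push_relabel` and the edge-list input format are assumptions made for illustration.

```python
from collections import deque


def max_flow_push_relabel(n, edges, s, t):
    """Return the maximum flow value from s to t.

    n: number of vertices, labelled 0..n-1 (hypothetical input format).
    edges: list of (u, v, capacity) triples with non-negative capacities.
    """
    # Residual capacities in a dense matrix; self-loops carry no flow.
    cap = [[0] * n for _ in range(n)]
    for u, v, c in edges:
        if u != v:
            cap[u][v] += c

    height = [0] * n  # labels estimating each vertex's distance to the sink
    excess = [0] * n  # inflow minus outflow at each vertex
    height[s] = n

    # Initial preflow: saturate every edge leaving the source.
    active = deque()
    for v in range(n):
        c = cap[s][v]
        if c > 0:
            cap[s][v] = 0
            cap[v][s] += c
            excess[v] += c
            excess[s] -= c
            if v != t:
                active.append(v)

    while active:
        u = active.popleft()
        # Discharge u: push along admissible edges, relabel when stuck.
        while excess[u] > 0:
            pushed = False
            for v in range(n):
                if cap[u][v] > 0 and height[u] == height[v] + 1:
                    d = min(excess[u], cap[u][v])
                    cap[u][v] -= d
                    cap[v][u] += d
                    excess[u] -= d
                    excess[v] += d
                    if v != s and v != t and excess[v] == d:
                        active.append(v)  # v had no excess before this push
                    pushed = True
                    if excess[u] == 0:
                        break
            if not pushed:
                # Relabel: lift u just above its lowest residual neighbour.
                height[u] = 1 + min(
                    height[v] for v in range(n) if cap[u][v] > 0
                )
    return excess[t]
```

A call such as `max_flow_push_relabel(2, [(0, 1, 5)], 0, 1)` returns the flow value only; the flow on each edge could be recovered by comparing the residual matrix with the original capacities.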
Applying parallel computation algorithms in the design of serial algorithms
 J. ACM
, 1983
Cited by 232 (7 self)
Abstract. The goal of this paper is to point out that analyses of parallelism in computational problems have practical implications even when multiprocessor machines are not available. This is true because, in many cases, a good parallel algorithm for one problem may turn out to be useful for designing an efficient serial algorithm for another problem. A unified framework for cases like this is presented. Particular cases, which are discussed in this paper, provide motivation for examining parallelism in sorting, selection, minimum-spanning-tree, shortest route, max-flow, and matrix multiplication problems, as well as in scheduling and locational problems.
Explicit Multi-Threading (XMT) Bridging Models for Instruction Parallelism
 Proc. 10th ACM Symposium on Parallel Algorithms and Architectures (SPAA)
, 1998
Cited by 29 (12 self)
The paper envisions an extension to a standard instruction set which efficiently implements PRAM algorithms using explicit multi-threaded instruction-level parallelism (ILP). That is, Explicit Multi-Threading (XMT), a fine-grained computational paradigm covering the spectrum from algorithms through architecture to implementation, is introduced; new elements are added where needed. The more detailed presentation is by way of a bridging model. Among other things, a bridging model provides a design space for algorithm designers and programmers, as well as a design space for computer architects. It is convenient to describe our wider vision regarding "parallel-computing-on-a-chip" as a two-stage development, and therefore two bridging models are presented: Spawn-based multi-threading (Spawn-MT) and Elastic multi-threading (EMT). The case for Spawn-MT (or, alternatively, EMT) as a bridging model relies on the following evidence. (1) Spawn-MT comprises an "instruction set level", wh...
Tradeoffs Between Communication Throughput and Parallel Time
, 1994
Cited by 18 (1 self)
We study the effect of limited communication throughput on parallel computation in a setting where the number of processors is much smaller than the length of the input. Our model has $p$ processors that communicate through a shared memory of size $m$. The input has size $n$, and can be read directly by all the processors. We will be primarily interested in studying cases where $n \gg p \gg m$. As a test case we study the list reversal problem. For this problem we prove a time lower bound of $\Omega(n/\sqrt{mp})$. (A similar lower bound holds also for the problems of sorting, finding all unique elements, convolution, and universal hashing.) This result shows that limiting the communication (i.e., small $m$) has a significant effect on parallel computation. We show an almost matching upper bound of $O((n/\sqrt{mp}) \log^{O(1)} n)$. The upper bound requires the development of a few interesting techniques which can alleviate the limited communication in some
Can Parallel Algorithms Enhance Serial Implementation? (Extended Abstract)
, 1996
Cited by 14 (4 self)
The broad thesis presented in this paper suggests that the serial emulation of a parallel algorithm has the potential advantage of running on a serial machine faster than a standard serial algorithm for the same problem. It is too early to reach definite conclusions regarding the significance of this thesis. However, using some imagination, validity of the thesis and some arguments supporting it may lead to several far-reaching outcomes: (1) Reliance on "predictability of reference" in the design of computer systems will increase. (2) Parallel algorithms will be taught as part of the standard computer science and engineering undergraduate curriculum irrespective of whether (or when) parallel processing will become ubiquitous in the general-purpose computing world. (3) A strategic agenda for high-performance parallel computing: A multi-stage agenda, which in no stage compromises user-friendliness of the programmer's...
A Self-Stabilizing Algorithm for the Maximum Flow Problem
 Distributed Computing
, 1995
Cited by 12 (2 self)
The maximum flow problem is a fundamental problem in graph theory and combinatorial optimization with a variety of important applications. Known distributed algorithms for this problem do not tolerate faults or adjust to dynamic changes in network topology. This paper presents the first distributed self-stabilizing algorithm for the maximum flow problem. Starting from an arbitrary state, the algorithm computes the maximum flow in an acyclic network in finitely many steps. Since the algorithm is self-stabilizing, it is inherently tolerant to transient faults and can automatically adjust to topology changes and to changes in other parameters of the problem. The paper presents extensive experimental results to indicate that the algorithm requires $n^2$ moves in an average-case setting. A slight modification of the original algorithm is also presented and it is conjectured that the new algorithm computes a maximum flow in arbitrary networks. Key words. distributed algorithms, fault-toler...
The Maximum Flow Problem: A Real-Time Approach
 Proceedings of the Thirteenth Conference on Parallel and Distributed Computing and Systems
, 2001
Cited by 11 (5 self)
The dynamic version of the maximum flow problem allows the graph underlying the flow network to change over time. The graph receives corrections to its structure or capacities and consequently the value of the maximum flow is modified. These corrections arrive in real time. In this paper, parallel and sequential solutions to the real-time maximum flow problem are developed on the Reconfigurable Multiple Bus Machine (RMBM) model and on the Random Access Machine (RAM) model, respectively. The parallel solution successfully meets the deadlines imposed in real time, while the sequential one fails to do so. The two solutions are then applied to a real-time process scheduler, an extension of Stone's static two-processor allocation problem. The scheduler allows processes to be created and destroyed, the amount of communication between two processes to change with time, and so on. The parallel algorithm is always able to compute the optimal schedule, while the solution obtained sequentially is only an approximation. The improvement provided by the parallel approach over the sequential one is superlinear in the number of processors used by the parallel model. Key words and phrases: maximum flow, parallelism, real-time computation, module allocation.
An Effective Load Balancing Policy for Geometric Decaying Algorithms
Cited by 3 (3 self)
Parallel algorithms are often first designed as a sequence of rounds, where each round includes any number of independent constant time operations. This so-called work-time presentation is then followed by a processor scheduling implementation on a more concrete computational model. Many parallel algorithms are geometric-decaying in the sense that the sequence of work loads is upper bounded by a decreasing geometric series. A standard scheduling implementation of such algorithms consists of a repeated application of load balancing. We present a more effective, yet as simple, policy for the utilization of load balancing in geometric decaying algorithms. By making a more careful choice of when and how often load balancing should be employed, and by using a simple amortization argument, we show that the number of required applications of load balancing should be nearly constant. The policy is not restricted to any particular model of parallel computation, and, up to a constant factor, it is the best possible.
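The geometric-decaying condition in this abstract can be written out as a short worked bound. The symbols $W_i$ (work load of round $i$), $W_0$ (initial work load), and $r$ (decay ratio) are assumptions introduced here for illustration and do not come from the abstract.

```latex
% Assumed notation: W_i is the work load of round i, with decay ratio 0 < r < 1.
W_i \le W_0 \, r^{i} \quad (i = 0, 1, 2, \dots)
\qquad \Longrightarrow \qquad
\sum_{i \ge 0} W_i \;\le\; W_0 \sum_{i \ge 0} r^{i} \;=\; \frac{W_0}{1 - r}.
```

Under this assumption the total work over all rounds stays within a constant factor of the first round's work, for example at most $2 W_0$ when $r = 1/2$.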
The Parallel Maxflow Problem is easy for almost all Graphs
, 1997
Cited by 2 (0 self)
We present a novel algorithm for the integer maxflow problem. The algorithm is designed for a PRAM CREW with $n^2$ processing units. For almost all (in a strong sense) graphs of a number of common classes of random graphs, it finds a maxflow in $O((\log C + \log^4 n) \cdot \log n / \log(m/n))$ time. Here $C$ is the average edge capacity and $m$ the (expected) number of edges. The same time bound holds for the average-case time of the algorithm. The algorithm exploits the regularity of random graphs. It saturates all edges and subsequently balances the resulting preflow.
Keywords: Algorithm, PRAM CREW, almost always, average-case analysis, pseudopolylog, graphs, maxflow, preflow, balancing.
1 Introduction
In this paper we present an algorithm for finding the maxflow of a directed graph with integer capacities. The problem of finding a maxflow (the MFP) has attracted much attention. To mention only some important results (table columns: algorithm, number of PUs, time order, remark): Sleator, 1, n \Delta m \Delta lo...
FPGA-based Prototype of a . . .
 CF'08
, 2008
PRAM (Parallel Random Access Model) has been widely regarded as a desirable parallel machine model for many years, but it is also believed to be “impossible in reality.” As the new billion-transistor processor era begins, the eXplicit Multi-Threading (XMT) PRAM-On-Chip project is attempting to design an on-chip parallel processor that efficiently supports PRAM algorithms. This paper presents the first prototype of the XMT architecture that incorporates 64 simple in-order processors operating at 75MHz. The microarchitecture of the prototype is described and the performance is studied with respect to some microbenchmarks. Using cycle-accurate emulation, the projected performance of an 800MHz XMT ASIC processor is compared with AMD Opteron 2.6GHz, which uses similar area as would a 64-processor