Provably efficient scheduling for languages with fine-grained parallelism
 IN PROC. SYMPOSIUM ON PARALLEL ALGORITHMS AND ARCHITECTURES
, 1995
Cited by 82 (25 self)
Many high-level parallel programming languages allow for fine-grained parallelism. As in the popular work-time framework for parallel algorithm design, programs written in such languages can express the full parallelism in the program without specifying the mapping of program tasks to processors. A common concern in executing such programs is to schedule tasks to processors dynamically so as to minimize not only the execution time, but also the amount of space (memory) needed. Without careful scheduling, the parallel execution on p processors can use a factor of p or larger more space than a sequential implementation of the same program. This paper first identifies a class of parallel schedules that are provably efficient in both time and space. For any ...
Better Bounds For Online Scheduling
 SIAM JOURNAL ON COMPUTING
, 1997
Cited by 75 (3 self)
We study a classical problem in online scheduling. A sequence of jobs must be scheduled on m identical parallel machines. As each job arrives, its processing time is known. The goal is to minimize the makespan. Bartal, Fiat, Karloff and Vohra [3] gave a deterministic online algorithm that is 1.986-competitive. Karger, Phillips and Torng [11] generalized the algorithm and proved an upper bound of 1.945. The best lower bound currently known on the competitive ratio that can be achieved by deterministic online algorithms is equal to 1.837. In this paper we present an improved deterministic online scheduling algorithm that is 1.923-competitive, for all $m \geq 2$. The algorithm is based on a new scheduling strategy, i.e., it is not a generalization of the approach by Bartal et al. Also, the algorithm has a simple structure. Furthermore, we develop a better lower bound. We prove that, for general m, no deterministic online scheduling algorithm can be better than 1.852-competitive.
On-Line Routing of Virtual Circuits with Applications to Load Balancing and Machine Scheduling
, 1993
Cited by 72 (7 self)
In this paper we study the problem of online allocation of routes to virtual circuits (both point-to-point and multicast) where the goal is to minimize the required bandwidth. We concentrate on the case of permanent virtual circuits (i.e., once a circuit is established, it exists forever), and describe an algorithm that achieves an O(log n) competitive ratio with respect to maximum congestion, where n is the number of nodes in the network. Informally, our results show that instead of knowing all of the future requests, it is sufficient to increase the bandwidth of the communication links by an O(log n) factor. We also show that this result is tight, i.e., for any online algorithm there exists a scenario in which an $\Omega(\log n)$ increase in bandwidth is necessary. We view virtual circuit routing as a generalization of an online load balancing problem, defined as follows: jobs arrive online and each job must be assigned to one of the machines immediately upon arrival. Assigning a job to a machine increases this machine’s load by an amount that depends both on the job and on the machine. The goal is to minimize the maximum load. For the related machines case, we describe the first algorithm that achieves constant competitive ratio. For the unrelated case (with n machines), we describe a new method that yields O(log n)-competitive ...
Data Partitioning and Load Balancing in Parallel Disk Systems
, 1994
Cited by 62 (8 self)
Parallel disk systems provide opportunities for exploiting I/O parallelism in two possible ways, namely via inter-request and intra-request parallelism. In this paper we discuss the main issues in performance tuning of such systems, namely striping and load balancing, and show their relationship to response time and throughput. We outline the main components of an intelligent file system that optimizes striping by taking into account the requirements of the applications, and performs load balancing by judicious file allocation and dynamic redistributions of the data when access patterns change. Our system uses simple but effective heuristics that incur only little overhead. We present performance experiments based on synthetic workloads and real-life traces.
Competitive Routing of Virtual Circuits with Unknown Duration
 In Proc. 5th ACM-SIAM Symposium on Discrete Algorithms
, 1994
Cited by 60 (16 self)
In this paper we present a strategy to route unknown duration virtual circuits in a high-speed communication network. Previous work on virtual circuit routing concentrated on the case where the call duration is known in advance. We show that by allowing O(log n) reroutes per call, we can achieve O(log n) competitive ratio with respect to the maximum load (congestion) for the unknown duration case, where n is the number of nodes in the network. This is in contrast to the $\Omega(\sqrt[4]{n})$ lower bound on the competitive ratio for this case if no rerouting is allowed [3]. Our routing algorithm can also be applied in the context of machine load balancing of tasks with unknown duration. We present an algorithm that makes O(log n) reassignments per task and achieves O(log n) competitive ratio with respect to the load, where n is the number of parallel machines. For the special case of unit-load tasks we design a constant competitive algorithm. The previously known algorithms that achieve up to polylogarithmic competitive ratio for load balancing of tasks with unknown duration dealt only with the special cases of related machines and of unit-load tasks with restricted assignment [4,11].
Online Load Balancing of Temporary Tasks
, 1993
Cited by 60 (12 self)
This paper considers the non-preemptive online load balancing problem where tasks have limited duration in time. Upon arrival, each task has to be immediately assigned to one of the machines, increasing the load on this machine for the duration of the task by an amount that depends on both the machine and the task. The goal is to minimize the maximum load. Azar, Broder and Karlin studied the unknown duration case where the duration of a task is not known upon its arrival [4]. They focused on the special case in which for each task there is a subset of machines capable of executing it, and the increase in load due to assigning the task to one of these machines depends only on the task and not on the machine. For this case, they showed an $O(n^{2/3})$-competitive algorithm, and an $\Omega(\sqrt{n})$ lower bound on the competitive ratio, where n is the number of the machines. This paper closes the gap by giving an $O(\sqrt{n})$-competitive algorithm. In addition, trying to overco...
A Better Algorithm For an Ancient Scheduling Problem
 Journal of Algorithms
, 1996
Cited by 60 (2 self)
One of the oldest and simplest variants of multiprocessor scheduling is the online scheduling problem studied by Graham in 1966. In this problem, the jobs arrive online and must be scheduled non-preemptively on m identical machines so as to minimize the makespan. The size of a job is known on arrival. Graham proved that the List Processing Algorithm, which assigns each job to the currently least loaded machine, has competitive ratio $(2 - 1/m)$. Recently algorithms with smaller competitive ratios than List Processing have been discovered, culminating in Bartal, Fiat, Karloff, and Vohra's construction of an algorithm with competitive ratio bounded away from 2. Their algorithm has a competitive ratio of at most $(2 - 1/70) \approx 1.986$ for all m; hence for $m > 70$, their algorithm is provably better than List Processing. We present a more natural algorithm that outperforms List Processing for any $m \geq 6$ and has a competitive ratio of at most 1.945 for all m, which is significantly closer ...
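The List Processing rule described in this abstract (each arriving job goes to the currently least loaded machine) can be sketched as follows. This is only an illustrative sketch, not code from the paper: the function name `list_scheduling`, the tie-breaking by lowest machine index, and the example instance are assumptions chosen for the illustration.

```python
import heapq


def list_scheduling(job_sizes, m):
    """Greedy list scheduling: place each job, in arrival order,
    on the machine whose current load is smallest.

    Returns (assignment, makespan), where assignment[i] is the machine
    index chosen for job i. Ties go to the lowest machine index
    (an assumption of this sketch).
    """
    # Min-heap of (current load, machine index); all machines start empty.
    heap = [(0, i) for i in range(m)]
    assignment = []
    for size in job_sizes:
        load, machine = heapq.heappop(heap)   # least loaded machine
        assignment.append(machine)
        heapq.heappush(heap, (load + size, machine))
    makespan = max(load for load, _ in heap)
    return assignment, makespan


# Hypothetical instance: m(m-1) unit jobs followed by one job of size m.
m = 3
jobs = [1] * (m * (m - 1)) + [m]
assignment, makespan = list_scheduling(jobs, m)
print(assignment, makespan)
```

On this instance the greedy rule spreads the six unit jobs evenly (load 2 per machine) and then has to stack the size-3 job on top, giving makespan 5. An offline schedule that reserves one machine for the large job reaches makespan 3, so the ratio is 5/3 = 2 - 1/3, matching the $(2 - 1/m)$ bound quoted above for m = 3.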
Allocating Bandwidth for Bursty Connections
 SIAM J. Comput
, 1997
Cited by 44 (0 self)
In this paper, we undertake the first study of statistical multiplexing from the perspective of approximation algorithms. The basic issue underlying statistical multiplexing is the following: in high-speed networks, individual connections (i.e., communication sessions) are very bursty, with transmission rates that vary greatly over time. As such, the problem of packing multiple connections together on a link becomes more subtle than in the case when each connection is assumed to have a fixed demand. We consider one of the most commonly studied models in this domain: that of two communicating nodes connected by a set of parallel edges, where the rate of each connection between them is a random variable. We consider three related problems: (1) stochastic load balancing, (2) stochastic bin-packing, and (3) stochastic knapsack. In the first problem the number of links is given and we want to minimize the expected value of the maximum load. In the other two problems the link capacity and an allowed overflow probability p are given, and the objective is to assign connections to links, so that the probability that the load of a link exceeds the link capacity is at most p. In bin-packing we need to assign each connection to a link using as few links as possible. In the knapsack problem each connection has a value, and we have only one link. The problem is to accept as many ...
The Cilk System for Parallel Multithreaded Computing
, 1996
Cited by 42 (2 self)
Although cost-effective parallel machines are now commercially available, the widespread use of parallel processing is still being held back, due mainly to the troublesome nature of parallel programming. In particular, it is still difficult to build efficient implementations of parallel applications whose communication patterns are either highly irregular or dependent upon dynamic information. Multithreading has become an increasingly popular way to implement these dynamic, asynchronous, concurrent programs. Cilk (pronounced "silk") is our C-based multithreaded computing system that provides provably good performance guarantees. This thesis describes the evolution of the Cilk language and runtime system, and describes applications which affected the evolution of the system.
Coordination mechanisms
 PROCEEDINGS OF THE 31ST INTERNATIONAL COLLOQUIUM ON AUTOMATA, LANGUAGES AND PROGRAMMING, IN: LECTURE NOTES IN COMPUTER SCIENCE
, 2004
Cited by 42 (5 self)
We introduce the notion of coordination mechanisms to improve the performance in systems with independent selfish and non-colluding agents. The quality of a coordination mechanism is measured by its price of anarchy, the worst-case performance of a Nash equilibrium over the (centrally controlled) social optimum. We give upper and lower bounds for the price of anarchy for selfish task allocation and congestion games.
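The ratio described in this abstract can be written out explicitly. The notation is an assumption made here for illustration and does not come from the abstract: $C(s)$ is the social cost of an outcome $s$ (to be minimized), $S$ is the set of all outcomes, and $\mathrm{NE} \subseteq S$ is the set of Nash equilibria.

```latex
\[
  \mathrm{PoA}
  \;=\;
  \frac{\max_{s \in \mathrm{NE}} C(s)}{\min_{s \in S} C(s)}
\]
```

Under this reading the numerator is the cost of the worst Nash equilibrium and the denominator is the centrally controlled social optimum, so the ratio is at least 1, and a value closer to 1 means the selfish outcome loses less relative to central control.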