Results 1 - 10 of 37
Effective Distributed Scheduling of Parallel Workloads, 1996
Cited by 127 (5 self)
We present a distributed algorithm for time-sharing parallel workloads that is competitive with coscheduling. Implicit scheduling allows each local scheduler in the system to make independent decisions that dynamically coordinate the scheduling of cooperating processes across processors. Of particular importance is the blocking algorithm which decides the action of a process waiting for a communication or synchronization event to complete. Through simulation of bulk-synchronous parallel applications, we find that a simple two-phase fixed-spin blocking algorithm performs well; a two-phase adaptive algorithm that gathers run-time data on barrier wait-times performs slightly better. Our results hold for a range of machine parameters and parallel program characteristics. These findings are in direct contrast to the literature that states explicit coscheduling is necessary for fine-grained programs. We show that the choice of the local scheduler is crucial, with a priority-based scheduler p...
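The two-phase blocking idea described above can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the function name, the use of a `threading.Event` to stand in for a communication or synchronization event, and the fixed spin budget are all our assumptions.

```python
import time
import threading

def two_phase_block(event: threading.Event, spin_time: float) -> bool:
    """Two-phase fixed-spin blocking (hypothetical sketch).

    Phase 1: spin-wait for up to `spin_time` seconds, betting that the
    communication partner is coscheduled and will respond quickly.
    Phase 2: if the event still has not fired, block and yield the CPU.
    Returns True if the event completed during the spin phase.
    """
    deadline = time.monotonic() + spin_time
    # Phase 1: busy-wait, keeping the process scheduled.
    while not event.is_set():
        if time.monotonic() >= deadline:
            # Phase 2: block, releasing the processor to another job.
            event.wait()
            return False
    return True
```

An adaptive variant, as the abstract suggests, would replace the fixed `spin_time` with an estimate gathered from observed barrier wait times.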
Improved Utilization and Responsiveness with Gang Scheduling. Scheduling Strategies for Parallel Processing, 1997
Cited by 119 (20 self)
Most commercial multicomputers use space-slicing schemes in which each scheduling decision has an unknown impact on the future: should a job be scheduled, risking that it will block other larger jobs later, or should the processors be left idle for now in anticipation of future arrivals? This dilemma is solved by using gang scheduling, because then the impact of each decision is limited to its time slice, and future arrivals can be accommodated in other time slices. This added flexibility is shown to improve overall system utilization and responsiveness. Empirical evidence from using gang scheduling on a Cray T3D installed at Lawrence Livermore National Lab corroborates these results, and shows conclusively that gang scheduling can be very effective with current technology.
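The core of gang scheduling is that each job's processes occupy a single time slice together, so all of them run simultaneously and future arrivals go into other slices. A minimal first-fit packing sketch (our own simplification, not the LLNL Cray T3D implementation; job names and the fixed slot count are assumptions):

```python
def gang_schedule(jobs, num_procs, num_slots):
    """Pack each job into exactly one time slice (gang scheduling sketch).

    `jobs` maps job name -> number of processes; each slice (row) has
    `num_procs` processors. Larger jobs are placed first, first-fit.
    Returns a list of slices, each a list of (job, procs) pairs.
    """
    slots = [{"free": num_procs, "jobs": []} for _ in range(num_slots)]
    for name, procs in sorted(jobs.items(), key=lambda kv: -kv[1]):
        if procs > num_procs:
            raise ValueError(f"{name} needs more processors than exist")
        # First-fit: find a time slice with enough free processors.
        for slot in slots:
            if slot["free"] >= procs:
                slot["jobs"].append((name, procs))
                slot["free"] -= procs
                break
        else:
            raise ValueError(f"no time slice can hold {name}")
    return [s["jobs"] for s in slots]
```

The flexibility the abstract describes shows up here directly: a new arrival only has to fit into some slice, rather than into the single space-sliced partition.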
Scheduling in the Dark, 1999
Cited by 94 (13 self)
We consider non-clairvoyant multiprocessor scheduling of jobs with arbitrary arrival times and changing execution characteristics. The problem has been studied extensively when the jobs all arrive at time zero, when all the jobs are fully parallelizable, or when the scheduler has considerable knowledge about the jobs. This paper considers for the first time this problem without any of these three restrictions and provides new upper and lower bound techniques applicable in this more difficult scenario. The results are of both theoretical and practical interest. In our model, a job can arrive at any arbitrary time and its execution characteristics can change through the life of the job, from being fully parallelizable to completely sequential. We assume that the scheduler has no knowledge about the jobs except for knowing when a job arrives and knowing when it completes. (This is why we say that the scheduler is completely in the dark.) Given all this, we prove t...
Implicit Coscheduling: Coordinated Scheduling with Implicit Information in Distributed Systems. ACM Transactions on Computer Systems, 1998
Cited by 54 (2 self)
In this thesis, we formalize the concept of an implicitly-controlled system, also referred to as an implicit system. In an implicit system, cooperating components do not explicitly contact other components for control or state information; instead, components infer remote state by observing naturally-occurring local events and their corresponding implicit information, i.e., information available outside of a defined interface. Many systems, particularly in distributed and networked environments, have leveraged implicit control to simplify the implementation of services with autonomous components. To concretely demonstrate the advantages of implicit control, we propose and implement implicit coscheduling, an algorithm for dynamically coordinating the time...
Non-clairvoyant Multiprocessor Scheduling of Jobs with Changing Execution Characteristics. Journal of Scheduling, 1997
Cited by 45 (4 self)
This work theoretically proves that Equi-partition efficiently schedules multiprocessor batch jobs with different execution characteristics. Motwani et al. show that the mean response time of jobs is within a factor of two of optimal for fully parallelizable jobs. We extend this result by considering jobs with multiple phases of arbitrary nondecreasing and sublinear speedup functions. Having no knowledge of the jobs being scheduled (non-clairvoyant), one would not expect it to perform well. However, our main result shows that the mean response time obtained with Equi-partition is no more than 2 + √3 ≈ 3.73 times the optimal. The paper also considers schedulers with different numbers of preemptions and jobs with more general classes of speedup functions. Matching lower bounds are also proved.
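Equi-partition itself is a very simple policy: divide the processors as evenly as possible among the active jobs. A sketch under our own assumptions (the analyzed policy re-partitions on every arrival and completion; here we only compute one allocation, capping each job at its maximum usable parallelism):

```python
def equipartition(num_procs, jobs):
    """One round of Equi-partition (illustrative sketch).

    `jobs` maps job name -> maximum usable parallelism. Processors are
    shared as evenly as possible; processors a job cannot use are
    redistributed among the remaining jobs.
    """
    alloc = {j: 0 for j in jobs}
    remaining = num_procs
    active = dict(jobs)  # jobs that can still absorb processors
    while remaining and active:
        share = max(1, remaining // len(active))
        progressed = False
        for j in list(active):
            give = min(share, active[j] - alloc[j], remaining)
            if give > 0:
                alloc[j] += give
                remaining -= give
                progressed = True
            if alloc[j] == active[j]:
                del active[j]  # job saturated; stop giving it processors
        if not progressed:
            break
    return alloc
```

The appeal, as the abstract notes, is that this needs no knowledge of the jobs at all, yet still achieves a constant-factor bound on mean response time.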
Preemptive Scheduling of Parallel Jobs on Multiprocessors. In SODA, 1996
Cited by 44 (3 self)
We study the problem of processor scheduling for n parallel jobs applying the method of competitive analysis. We prove that for jobs with a single phase of parallelism, a preemptive scheduling algorithm without information about job execution time can achieve a mean completion time within 2 − 2/(n+1) times the optimum; in other words, we prove a competitive ratio of 2 − 2/(n+1). The result is extended to jobs with multiple phases of parallelism (which can be used to model jobs with sublinear speedup) and to interactive jobs (with phases during which the job has no CPU requirements) to derive solutions guaranteed to be within 4 − 4/(n+1) times the optimum. In comparison with previous work, our assumption that job execution times are unknown prior to their completion is more realistic, our multiphased job model is more general, and our approximation ratio (for jobs with a single phase of parallelism) is tighter and cannot be improved. While this work presents theoretical results obtained using competitive analysis, we believe that the results provide insight into the performance of practical multiprocessor scheduling algorithms that operate in the absence of complete information.
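The two bounds, 2 − 2/(n+1) for single-phase jobs and 4 − 4/(n+1) for multi-phase jobs, can be tabulated exactly; as the number of jobs n grows they approach 2 and 4 respectively. The helper names below are ours:

```python
from fractions import Fraction

def single_phase_ratio(n):
    """Competitive ratio 2 - 2/(n+1) for n single-phase parallel jobs."""
    return 2 - Fraction(2, n + 1)

def multi_phase_ratio(n):
    """Bound 4 - 4/(n+1) for jobs with multiple phases of parallelism."""
    return 4 - Fraction(4, n + 1)
```

Note that with a single job (n = 1) the single-phase ratio is exactly 1, i.e. the schedule is optimal, which matches the intuition that there is nothing to share.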
Using Parallel Program Characteristics in Dynamic Processor Allocation Policies. Performance Evaluation, 1996
Cited by 28 (2 self)
In multiprocessors a parallel program's execution time is directly influenced by the number of processors it is allocated. The problem of scheduling parallel programs in a multiprogrammed environment becomes one of determining how to best allocate processors to the different simultaneously executing programs in order to minimize mean response time. In this paper we address the problem of how many processors to allocate to each of the executing parallel jobs by examining the following questions:
1. Is allocating processors equally among all jobs (equipartitioning) a desirable property of a scheduling algorithm?
2. Does using information about the service demand of parallel jobs significantly reduce mean response time?
3. Does using information about the efficiency with which parallel jobs execute significantly reduce mean response time?
4. Does allocating each job a number of processors corresponding to the knee of the execution time vs. efficiency curve significantly reduc...
The Interaction between Memory Allocation and Adaptive Partitioning in Message-Passing Multicomputers. Scheduling Strategies for Parallel Processing, Lecture Notes in Computer Science, 1995
Cited by 26 (2 self)
Most studies on adaptive partitioning policies for scheduling parallel jobs on distributed memory parallel computers ignore the constraints imposed by the memory requirements of the jobs. In this paper, we first show that these constraints can have a negative impact on the performance of adaptive partitioning policies. We then evaluate the performance of adaptive partitioning in a system where these minimum processor constraints are eased due to the provision of support for virtual memory. Our primary conclusion is that any performance benefits resulting from the easing of minimum processor constraints imposed by the memory requirements of jobs will be negated by the overhead due to paging. In recent years, several adaptive partitioning strategies [6, 19, 5, 18, 17, 3, 14] have been proposed for scheduling parallel jobs on message-passing multicomputers. A key characteristic of these policies is that they reduce the number of processors allocated to individual jobs as...
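The "minimum processor constraint" the paper studies arises because, without virtual memory, a job's data must fit in the combined memory of the nodes it runs on. A hedged sketch of how that floor interacts with adaptive shrinking (helper names and the uniform per-node memory are our assumptions):

```python
import math

def min_processors(job_memory_mb, memory_per_node_mb):
    """Fewest nodes on which the job's data is memory-resident,
    assuming every node contributes the same fixed amount of memory."""
    return math.ceil(job_memory_mb / memory_per_node_mb)

def adaptive_allocation(fair_share, job_memory_mb, memory_per_node_mb):
    """Adaptive partitioning would like to shrink a job to its fair
    share of processors, but cannot go below the memory-imposed floor."""
    return max(fair_share, min_processors(job_memory_mb, memory_per_node_mb))
```

When the floor exceeds the fair share, the scheduler loses exactly the flexibility the paper shows hurts adaptive policies; easing the floor via paging trades that rigidity for paging overhead.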
Memory Usage in the LANL CM-5 Workload. In Job Scheduling Strategies for Parallel Processing, 1997
Cited by 25 (8 self)
It is generally agreed that memory requirements should be taken into account in the scheduling of parallel jobs. However, so far the work on combined processor and memory scheduling has not been based on detailed information and measurements. To rectify this problem, we present an analysis of memory usage by a production workload on a large parallel machine, the 1024-node CM-5 installed at Los Alamos National Lab. Our main observations are:
- The distribution of memory requests has strong discrete components, i.e. some sizes are much more popular than others.
- Many jobs use a relatively small fraction of the memory available on each node, so there is some room for time slicing among several memory-resident jobs.
- Larger jobs (using more nodes) tend to use more memory, but it is difficult to characterize the scaling of per-processor memory usage.
Resource management includes a number of distinct topics, such as scheduling and memory management. Howeve...
Implementing Multiprocessor Scheduling Disciplines. In Proceedings of IPPS/SPDP ’97 Workshop, Lecture Notes in Computer Science
Cited by 24 (0 self)
An important issue in multiprogrammed multiprocessor systems is the scheduling of parallel jobs. Consequently, there has been a considerable amount of analytic research in this area recently. A frequent criticism, however, is that proposed disciplines that are studied analytically are rarely ever implemented and even more rarely incorporated into commercial scheduling software. In this paper, we seek to bridge this gap by describing how at least one commercial scheduling system, namely Platform Computing's Load Sharing Facility, can be extended to support a wide variety of new scheduling disciplines.