Results 1 - 10
of
78
Using Processor Affinity in Loop Scheduling on Shared-Memory Multiprocessors
- IEEE Transactions on Parallel and Distributed Systems
, 1994
"... Loops are the single largest source of parallelism in many applications. One way to exploit this parallelism is to execute loop iterations in parallel on different processors. Previous approaches to loop scheduling attempt to achieve the minimum completion time by distributing the workload as evenly ..."
Abstract
-
Cited by 133 (2 self)
- Add to MetaCart
Loops are the single largest source of parallelism in many applications. One way to exploit this parallelism is to execute loop iterations in parallel on different processors. Previous approaches to loop scheduling attempt to achieve the minimum completion time by distributing the workload as evenly as possible, while minimizing the number of synchronization operations required. In this paper we consider a third dimension to the problem of loop scheduling on shared-memory multiprocessors: communication overhead caused by accesses to non-local data. We show that traditional algorithms for loop scheduling, which ignore the location of data when assigning iterations to processors, incur a significant performance penalty on modern shared-memory multiprocessors. We propose a new loop scheduling algorithm that attempts to simultaneously balance the workload, minimize synchronization, and co-locate loop iterations with the necessary data. We compare the performance of this new algorithm to ot...
Anticipatory scheduling: A disk scheduling framework to overcome deceptive idleness in synchronous I/O
, 2001
"... Disk schedulers in current operating systems are generally work-conserving, i.e., they schedule a request as son as the previous request has finished. Such schedulers often require multiple outstanding requests from each process to meet system-level goals of performance and quality of service. U ..."
Abstract
-
Cited by 94 (2 self)
- Add to MetaCart
Disk schedulers in current operating systems are generally work-conserving, i.e., they schedule a request as son as the previous request has finished. Such schedulers often require multiple outstanding requests from each process to meet system-level goals of performance and quality of service. Unfortunately, many common applications issue disk read requests in a synchronous manna% interspersing successive requests with shor periods of computation. The scheduler chooses the next request too early; this induces deceptive idleness, a condition where the scheduler incorrectly assumes that the test request issuing process has no further requests, and becomes forced to switch to a toques? from another pro- Ce3S.
Scheduling and Page Migration for Multiprocessor Compute Servers
- INTERNATIONAL CONFERENCE ON ARCHITECTURAL SUPPORT FOR PROGRAMMING LANGUAGES AND OPERATING SYSTEMS
, 1994
"... Several cache-coherent shared-memory multiprocessors have been developed that are scalable and offer a very tight coupling between the processing resources. They are therefore quite attractive for use as compute servers for multiprogramming and parallel application workloads. Process scheduling and ..."
Abstract
-
Cited by 81 (4 self)
- Add to MetaCart
Several cache-coherent shared-memory multiprocessors have been developed that are scalable and offer a very tight coupling between the processing resources. They are therefore quite attractive for use as compute servers for multiprogramming and parallel application workloads. Process scheduling and memory management, however, remain challenging due to the distributed main memory found on such machines. This paper examines the effects of OS scheduling and page migration policies on the performance of such compute servers. Our experiments are done on the Stanford DASH, a distributed-memory cache-coherent multiprocessor. We show that for our multiprogramming workloads consisting of sequential jobs, the traditional Unix scheduling policy does very poorly. In contrast, a policy incorporating cluster and cache affinity along with a simple page-migration algorithm offers up to two-fold performance improvement. For our workloads consisting of multiple parallel applications, we compare space-sharing policies that divide the processors among the applications to time-slicing policies such as standard Unix or gang scheduling. We show that space-sharing policies can achieve better processor utilization due to the operating point effect, but time-slicing policies benefit strongly from user-level data distribution. Our initial experience with automatic page migration suggests that policies based only on TLB miss information can be quite effective, and useful for addressing the data distribution problems of space-sharing schedulers.
CPU Inheritance Scheduling
- IN PROCEEDINGS OF THE SECOND SYMPOSIUM ON OPERATING SYSTEMS DESIGN AND IMPLEMENTATION
, 1996
"... Traditional processor scheduling mechanisms in operating systems are fairly rigid, often supportingonly one fixed scheduling policy, or, at most, a few "scheduling classes" whose implementations are closely tied together in the OS kernel. This paper presents CPU inheritance scheduling, a novel proce ..."
Abstract
-
Cited by 79 (1 self)
- Add to MetaCart
Traditional processor scheduling mechanisms in operating systems are fairly rigid, often supportingonly one fixed scheduling policy, or, at most, a few "scheduling classes" whose implementations are closely tied together in the OS kernel. This paper presents CPU inheritance scheduling, a novel processor scheduling framework in which arbitrary threads can act as schedulers for other threads. Widely different scheduling policies can be implemented under the framework, and many different policies can coexist in a single system, providing much greater scheduling flexibility. Modular, hierarchical control can be provided over the processor utilization of arbitrary administrative domains, such as processes, jobs, users, and groups, and the CPU resources consumed can be accounted for and attributed accurately. Applications, as well as the OS, can implement customized local scheduling policies; the framework ensures that all the different policies work together logically and predictably. As a ...
Surplus Fair Scheduling: A Proportional-Share CPU Scheduling Algorithm for Symmetric Multiprocessors
, 2000
"... In this paper, we present surplus fair scheduling (SFS), a proportional-share CPU scheduler designed for symmetric multiprocessors. We first show that the infeasibility of certain weight assignments in multiprocessor environments results in unfairness or starvation in many existing proportional-shar ..."
Abstract
-
Cited by 62 (6 self)
- Add to MetaCart
In this paper, we present surplus fair scheduling (SFS), a proportional-share CPU scheduler designed for symmetric multiprocessors. We first show that the infeasibility of certain weight assignments in multiprocessor environments results in unfairness or starvation in many existing proportional-share schedulers. We present a novel weight readjustment algorithm to translate infeasible weight assignments to a set of feasible weights. We show that weight readjustment enables existing proportional-share schedulers to significantly reduce, but not eliminate, the unfairness in their allocations. We then present surplus fair scheduling, a proportional-share scheduler that is designed explicitly for multiprocessor environments. We implement our scheduler in the Linux kernel and demonstrate its efficacy through an experimental evaluation. Our results show that SFS can achieve proportionate allocation, application isolation and good interactive performance, albeit at a slight increase in scheduling overhead. We conclude from our results that a proportionalshare scheduler such as SFS is not only practical but also desirable for server operating systems.
Process migration
- ACM Computing Surveys
, 2000
"... A process is an operating system abstraction representing an instance of a running computer program. Process migration is the act of transferring a process between two machines during its execution. Several implementations ..."
Abstract
-
Cited by 62 (1 self)
- Add to MetaCart
A process is an operating system abstraction representing an instance of a running computer program. Process migration is the act of transferring a process between two machines during its execution. Several implementations
Using Name-Based Mappings to Increase Hit Rates
- IEEE/ACM TRANSACTIONS ON NETWORKING
, 1997
"... Clusters of identical intermediate servers are often created to improve availability and robustness in many domains. The use of proxy servers for the WWW and of Rendezvous Points in multicast routing are two such situations. However, this approach can be inefficient if identical requests are receive ..."
Abstract
-
Cited by 60 (4 self)
- Add to MetaCart
Clusters of identical intermediate servers are often created to improve availability and robustness in many domains. The use of proxy servers for the WWW and of Rendezvous Points in multicast routing are two such situations. However, this approach can be inefficient if identical requests are received and processed by multiple servers. We present an analysis of this problem, and develop a method called the Highest Random Weight (HRW) Mapping that eliminates these difficulties. Given an object name and a set of servers, HRW maps a request to a server using the object name, rather than any a priori knowledge of server states. Since HRW always maps a given object name to the same server within a given cluster, it may be used locally at client sites to achieve consensus on objectserver mappings. We present an analysis of HRW and validate it with simulation results showing that it gives faster service times than traditional request allocation schemes such as round-robin or least-loaded, and...
Deadline fair scheduling: Bridging the theory and practice of proportionate-fair scheduling in multiprocessor servers
- IN PROC. OF THE 7TH IEEE REAL-TIME TECHNOLOGY AND APPLICATIONS SYMPOSIUM
, 2001
"... In this paper, we present Deadline Fair Scheduling (DFS), a proportionate-fair CPU scheduling algorithm for multiprocessor servers. A particular focus of our work is to investigate practical issues in instantiating proportionatefair (P-fair) schedulers into conventional operating systems. We show vi ..."
Abstract
-
Cited by 55 (1 self)
- Add to MetaCart
In this paper, we present Deadline Fair Scheduling (DFS), a proportionate-fair CPU scheduling algorithm for multiprocessor servers. A particular focus of our work is to investigate practical issues in instantiating proportionatefair (P-fair) schedulers into conventional operating systems. We show via a simulation study that characteristics of conventional operating systems such as the asynchrony in scheduling multiple processors, frequent arrivals and departures of tasks, and variable quantum durations can cause proportionate-fair schedulers to become nonwork-conserving. To overcome this drawback, we combine DFS with an auxiliary work-conserving scheduler to ensure work-conserving behavior at all times. We then propose techniques to account for processor affinities while scheduling tasks in multiprocessor environments. We implement the resulting scheduler in the Linux kernel and evaluate its performance using various applications and benchmarks. Our experimental results show that DFS can achieve proportionate allocation, performance isolation and work-conserving behavior at the expense of a small increase in the scheduling overhead. We conclude that practical considerations such as work-conserving behavior and processor affinities when incorporated into a P-fair scheduler such as DFS can result in a practical approach for scheduling tasks in a multiprocessor operating system.
Evaluating the Performance of Cache-Affinity Scheduling in Shared-Memory Multiprocessors
- Journal of Parallel and Distributed Computing
, 1995
"... As a process executes on a CPU, it builds up state in that CPU's cache. In multiprogrammed workloads, the opportunity to reuse this state may be lost when a process gets rescheduled, either because intervening processes destroy its cache state or because the process may migrate to another processor. ..."
Abstract
-
Cited by 47 (1 self)
- Add to MetaCart
As a process executes on a CPU, it builds up state in that CPU's cache. In multiprogrammed workloads, the opportunity to reuse this state may be lost when a process gets rescheduled, either because intervening processes destroy its cache state or because the process may migrate to another processor. In this paper, we explore affinity scheduling, a technique that helps reduce cache misses by preferentially scheduling a process on a CPU where it has run recently. Our study focuses on a bus-based multiprocessor executing a variety of workloads, including mixes of scientific, software development, and database applications. In addition to quantifying the performance benefits of exploiting affinity, our study is distinctive in that it provides low-level data from a hardware performance monitor that details why the workloads perform as they do. Overall, for the workloads studied, we show that affinity scheduling reduces the number of cache misses by 7--36%, resulting in execution time improv...
Processor Allocation Policies for Message-Passing Parallel Computers
, 1994
"... When multiple jobs compete for processing resources on a parallel computer, the operating system kernel's processor allocation policy determines how many and which processors to allocate to each. This dissertation investigates the issues involved in constructing a processor allocation policy for lar ..."
Abstract
-
Cited by 44 (2 self)
- Add to MetaCart
When multiple jobs compete for processing resources on a parallel computer, the operating system kernel's processor allocation policy determines how many and which processors to allocate to each. This dissertation investigates the issues involved in constructing a processor allocation policy for large scale, message-passing parallel computers supporting a scientific workload. First, the issues that affect the performance of scheduling policies for message-passing parallel systems are examined. We argue that reasonable policies must provide nearly equal resource allocation to all runnable jobs and allocate, to a single job, processors that are in close proximity to one another. Second, the concept of efficiency preservation is defined as a characteristic of processor allocation policie...

