Results 1 - 10
of
151
Resource Containers: A New Facility for Resource Management in Server Systems
- In Operating Systems Design and Implementation
, 1999
"... General-purpose operating systems provide inadequate support for resource management in large-scale servers. Applications lack sufficient control over scheduling and management of machine resources, which makes it difficult to enforce priority policies, and to provide robust and controlled service. ..."
Abstract
-
Cited by 498 (10 self)
- Add to MetaCart
(Show Context)
General-purpose operating systems provide inadequate support for resource management in large-scale servers. Applications lack sufficient control over scheduling and management of machine resources, which makes it difficult to enforce priority policies, and to provide robust and controlled service. There is a fundamental mismatch between the original design assumptions underlying the resource management mechanisms of current general-purpose operating systems, and the behavior of modern server applications. In particular, the operating system’s notions of protection domain and resource principal coincide in the process abstraction. This coincidence prevents a process that manages large numbers of network connections, for example, from properly allocating system resources among those connections. We propose and evaluate a new operating system abstraction called a resource container, which separates the notion of a protection domain from that of a resource principal. Resource containers enable fine-grained resource management in server systems and allow the development of robust servers, with simple and firm control over priority policies. 1
Memory resource management in VMware ESX server,”
- In Proc. of the 5th OSDI,
, 2002
"... Abstract VMware ESX Server is a thin software layer designed to multiplex hardware resources efficiently among virtual machines running unmodified commodity operating systems. This paper introduces several novel ESX Server mechanisms and policies for managing memory. A ballooning technique reclaims ..."
Abstract
-
Cited by 449 (2 self)
- Add to MetaCart
(Show Context)
Abstract VMware ESX Server is a thin software layer designed to multiplex hardware resources efficiently among virtual machines running unmodified commodity operating systems. This paper introduces several novel ESX Server mechanisms and policies for managing memory. A ballooning technique reclaims the pages considered least valuable by the operating system running in a virtual machine. An idle memory tax achieves efficient memory utilization while maintaining performance isolation guarantees. Content-based page sharing and hot I/O page remapping exploit transparent page remapping to eliminate redundancy and reduce copying overheads. These techniques are combined to efficiently support virtual machine workloads that overcommit memory.
Eliminating receive livelock in an interrupt-driven kernel
- ACM Transactions on Computer Systems
, 1997
"... Most operating systems use interface interrupts to schedule network tasks. Interrupt-driven systems can provide low overhead and good latency at low of-fered load, but degrade significantly at higher arrival rates unless care is taken to prevent several pathologies. These are various forms of receiv ..."
Abstract
-
Cited by 322 (5 self)
- Add to MetaCart
(Show Context)
Most operating systems use interface interrupts to schedule network tasks. Interrupt-driven systems can provide low overhead and good latency at low of-fered load, but degrade significantly at higher arrival rates unless care is taken to prevent several pathologies. These are various forms of receive livelock, in which the system spends all its time processing interrupts, to the exclusion of other neces-sary tasks. Under extreme conditions, no packets are delivered to the user application or the output of the system. To avoid livelock and related problems, an operat-ing system must schedule network interrupt handling as carefully as it schedules process execution. We modified an interrupt-driven networking implemen-tation to do so; this eliminates receive livelock without degrading other aspects of system performance. We present measurements demonstrating the success of our approach. 1.
The Design, Implementation and Evaluation of SMART: A Scheduler for Multimedia Applications
, 1997
"... This paper argues for the need to design a new processor scheduling algorithm that can handle the mix of applications we see today. We present a scheduling algorithm which we have implemented in the Solaris UNIX operating system [Eykholt et al. 1992], and demonstrate its improved performance over ex ..."
Abstract
-
Cited by 242 (6 self)
- Add to MetaCart
(Show Context)
This paper argues for the need to design a new processor scheduling algorithm that can handle the mix of applications we see today. We present a scheduling algorithm which we have implemented in the Solaris UNIX operating system [Eykholt et al. 1992], and demonstrate its improved performance over existing schedulers in research and practice on real applications. In particular, we have quantitatively compared against the popular weighted fair queueing and UNIX SVR4 schedulers in supporting multimedia applications in a realistic workstation environment...
CPU Reservations and Time Constraints: Efficient, Predictable Scheduling of Independent Activities
, 1997
"... Workstations and personal computers are increasingly being used for applications with real-time characteristics such as speech understanding and synthesis, media computations and I/O, and animation, often concurrently executed with traditional non-real-time workloads. This paper presents a system th ..."
Abstract
-
Cited by 216 (8 self)
- Add to MetaCart
Workstations and personal computers are increasingly being used for applications with real-time characteristics such as speech understanding and synthesis, media computations and I/O, and animation, often concurrently executed with traditional non-real-time workloads. This paper presents a system that can schedule multiple independent activities so that: . activities can obtain minimum guaranteed execution rates with application-specified reservation granularities via CPU Reservations, . CPU Reservations, which are of the form "reserve X units of time out of every Y units", provide not just an average case execution rate of X/Y over long periods of time, but the stronger guarantee that from any instant of time, by Y time units later, the activity will have executed for at least X time units, . applications can use Time Constraints to schedule tasks by deadlines, with on-time completion guaranteed for tasks with accepted constraints, and . both CPU Reservations and Time Constraints...
A Proportional Share Resource Allocation Algorithm For Real-Time, Time-Shared Systems
, 1996
"... We propose and analyze a proportional share resource allocation algorithm for realizing real-time performance in time-shared operating systems. In a proportional share system, processes are assigned a weight which determines a share (percentage) of the resource they are to receive. The resource is t ..."
Abstract
-
Cited by 215 (17 self)
- Add to MetaCart
(Show Context)
We propose and analyze a proportional share resource allocation algorithm for realizing real-time performance in time-shared operating systems. In a proportional share system, processes are assigned a weight which determines a share (percentage) of the resource they are to receive. The resource is then allocated in discrete-sized time quanta in such a manner that each process makes progress at a precise, uniform rate. Proportional share allocation algorithms are of interest because (1) they provide a natural means of seamlessly integrating real- and non-real-time processing requirements in a general purpose operating system, (2) they are easy to implement (and in particular, easier than more traditional forms of real-time support such as periodic tasks), (3) they provide a simple and effective means of precisely controlling the real-time performance of a process including uniform, predictable degradation in times of system overload, and (4) they provide a natural mean of policing proce...
Delay Scheduling: A Simple Technique for Achieving Locality and Fairness in Cluster Scheduling
- In Proc. EuroSys
, 2010
"... As organizations start to use data-intensive cluster computing systems like Hadoop and Dryad for more applications, there is a growing need to share clusters between users. However, there is a conflict between fairness in scheduling and data locality (placing tasks on nodes that contain their input ..."
Abstract
-
Cited by 212 (21 self)
- Add to MetaCart
(Show Context)
As organizations start to use data-intensive cluster computing systems like Hadoop and Dryad for more applications, there is a growing need to share clusters between users. However, there is a conflict between fairness in scheduling and data locality (placing tasks on nodes that contain their input data). We illustrate this problem through our experience designing a fair scheduler for a 600-node Hadoop cluster at Facebook. To address the conflict between locality and fairness, we propose a simple algorithm called delay scheduling: when the job that should be scheduled next according to fairness cannot launch a local task, it waits for a small amount of time, letting other jobs launch tasks instead. We find that delay scheduling achieves nearly optimal data locality in a variety of workloads and can increase throughput by up to 2x while preserving fairness. In addition, the simplicity of delay scheduling makes it applicable under a wide variety of scheduling policies beyond fair sharing.
Stride Scheduling: Deterministic Proportional-Share Resource Management
, 1995
"... This paper presents stride scheduling, a deterministic scheduling technique that efficiently supports the same flexible resource management abstractions introduced by lottery scheduling. Compared to lottery scheduling, stride scheduling achieves significantly improved accuracy over relative throughp ..."
Abstract
-
Cited by 169 (1 self)
- Add to MetaCart
This paper presents stride scheduling, a deterministic scheduling technique that efficiently supports the same flexible resource management abstractions introduced by lottery scheduling. Compared to lottery scheduling, stride scheduling achieves significantly improved accuracy over relative throughput rates, with significantly lower response time variability. Stride scheduling implements proportional-share control over processor time and other resources by cross-applying elements of rate-based flow control algorithms designed for networks. We introduce new techniques to support dynamic changes and higher-level resource management abstractions. We also introduce a novel hierarchical stride scheduling algorithm that achieves better throughput accuracy and lower response time variability than prior schemes. Stride scheduling is evaluated using both simulations and prototypes implemented for the Linux kernel.
Dominant resource fairness: Fair allocation of multiple resource types
, 2011
"... We consider the problem of fair resource allocation in a system containing different resource types, where each user may have different demands for each resource. To address this problem, we propose Dominant Resource Fairness (DRF), a generalization of max-min fairness to multiple resource types. We ..."
Abstract
-
Cited by 146 (15 self)
- Add to MetaCart
We consider the problem of fair resource allocation in a system containing different resource types, where each user may have different demands for each resource. To address this problem, we propose Dominant Resource Fairness (DRF), a generalization of max-min fairness to multiple resource types. We show that DRF, unlike other possible policies, satisfies several highly desirable properties. First, DRF incentivizes users to share resources, by ensuring that no user is better off if resources are equally partitioned among them. Second, DRF is strategy-proof, as a user cannot increase her allocation by lying about her requirements. Third, DRF is envyfree, as no user would want to trade her allocation with that of another user. Finally, DRF allocations are Pareto efficient, as it is not possible to improve the allocation of a user without decreasing the allocation of another user. We have implemented DRF in the Mesos cluster resource manager, and show that it leads to better throughput and fairness than the slot-based fair sharing schemes in current cluster schedulers. 1
Microkernels meet recursive virtual machines
, 1996
"... This paper describes a novel approach to providing modular and extensible operating system functionality and encapsulated environments based on a synthesis of microkernel and virtual machine concepts. We have developed a software-based virtualizable architecture called Fluke that allows recursive vi ..."
Abstract
-
Cited by 132 (27 self)
- Add to MetaCart
This paper describes a novel approach to providing modular and extensible operating system functionality and encapsulated environments based on a synthesis of microkernel and virtual machine concepts. We have developed a software-based virtualizable architecture called Fluke that allows recursive virtual machines (virtual machines running on other virtual machines) to be implemented efficiently by a microkernel running on generic hardware. A complete virtual machine interface is provided at each level; efficiency derives from needing to implement only new functionality at each level. This infrastructure allows common OS functionality, such as process management, demand paging, fault tolerance, and debugging support, to be provided by cleanly modularized, independent, stackable virtual machine monitors, implemented as user processes. It can also provide uncommon or unique OS features, including the above features specialized for particular applications ’ needs, virtual machines transparently distributed cross-node, or security monitors that allow arbitrary untrusted binaries to be executed safely. Our prototype implementation of this model indicates that it is practical to modularize operating systems this way. Some types of virtual machine layers impose almost no overhead at all, while others impose some overhead (typically 0–35%), but only on certain classes of applications.