Results 1 - 10 of 20
PRESS: PRedictive Elastic ReSource Scaling for cloud systems
Abstract - Cited by 62 (8 self)
Cloud systems require elastic resource allocation to minimize resource provisioning costs while meeting service level objectives (SLOs). In this paper, we present a novel PRedictive Elastic reSource Scaling (PRESS) scheme for cloud systems. PRESS unobtrusively extracts fine-grained dynamic patterns in application resource demands and adjusts their resource allocations automatically. Our approach leverages lightweight signal processing and statistical learning algorithms to achieve online predictions of dynamic application resource requirements. We have implemented the PRESS system on Xen and tested it using RUBiS and an application load trace from Google. Our experiments show that we can achieve good resource prediction accuracy with less than 5% over-estimation error and near-zero under-estimation error, and that elastic resource scaling can significantly reduce both resource waste and SLO violations.
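The abstract's "light-weight signal processing" idea can be illustrated with a minimal sketch: find the dominant period in a resource-usage trace via FFT and replay the last full cycle as the forecast. The function names and the replay rule are illustrative assumptions, not PRESS's actual algorithm (which also uses statistical learning for workloads without a clear repeating pattern):

```python
import numpy as np

def dominant_period(trace):
    """Find the dominant repeating period in a usage trace via the FFT."""
    spectrum = np.abs(np.fft.rfft(trace - np.mean(trace)))
    freqs = np.fft.rfftfreq(len(trace))
    k = np.argmax(spectrum[1:]) + 1          # skip the DC component
    return int(round(1.0 / freqs[k]))

def predict_next_window(trace, horizon):
    """Predict future demand by replaying the last full dominant period."""
    p = dominant_period(trace)
    pattern = trace[-p:]
    reps = int(np.ceil(horizon / p))
    return np.tile(pattern, reps)[:horizon]
```

On a trace with a clean daily or per-request cycle, the spectral peak recovers the cycle length and the replayed pattern serves as a cheap online predictor.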
CloudScale: Elastic Resource Scaling for Multi-Tenant Cloud Systems
Abstract - Cited by 53 (6 self)
Elastic resource scaling lets cloud systems meet application service level objectives (SLOs) with minimum resource provisioning costs. In this paper, we present CloudScale, a system that automates fine-grained elastic resource scaling for multi-tenant cloud computing infrastructures. CloudScale employs online resource demand prediction and prediction error handling to achieve adaptive resource allocation without assuming any prior knowledge about the applications running inside the cloud. CloudScale can resolve scaling conflicts between applications using migration, and integrates dynamic CPU voltage/frequency scaling to achieve energy savings with minimal effect on application SLOs. We have implemented CloudScale on top of Xen and conducted extensive experiments using a set of CPU- and memory-intensive applications (RUBiS, Hadoop, IBM System S). The results show that CloudScale can achieve significantly higher SLO conformance than other alternatives with low resource and energy cost. CloudScale is non-intrusive and lightweight, and imposes negligible overhead (<2% CPU in Domain 0) on the virtualized computing cluster.
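The "prediction error handling" mentioned above amounts to padding a raw prediction with a safety margin so transient under-estimates do not violate the SLO. A minimal sketch, assuming a simple rule (base percentage plus the worst recent under-estimation) that is illustrative rather than CloudScale's actual burst-handling scheme:

```python
def padded_allocation(predicted, recent_errors, base_pad=0.05):
    """Pad a predicted demand with a safety margin derived from recent
    under-estimation errors (positive error = actual demand exceeded the
    prediction). Illustrative sketch, not CloudScale's real scheme."""
    under = [e for e in recent_errors if e > 0]
    margin = max(under) if under else 0.0
    return predicted * (1.0 + base_pad) + margin
```

The design choice here is asymmetry: over-estimates waste a little capacity, while under-estimates cause SLO violations, so only under-estimation history inflates the margin.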
Resource Allocation Algorithms for Virtualized Service Hosting Platforms
, 2010
Abstract - Cited by 29 (4 self)
Commodity clusters are used routinely for deploying service hosting platforms. Due to hardware and operation costs, clusters need to be shared among multiple services. Crucial for enabling such shared hosting platforms is virtual machine (VM) technology, which allows consolidation of hardware resources. A key challenge, however, is to make appropriate decisions when allocating hardware resources to service instances. In this work we propose a formulation of the resource allocation problem in shared hosting platforms for static workloads with servers that provide multiple types of resources. Our formulation supports a mix of best-effort and QoS scenarios, and, via a precisely defined objective function, promotes performance, fairness, and cluster utilization. Further, this formulation makes it possible to compute a bound on the optimal resource allocation. We propose several classes of resource allocation algorithms, which we evaluate in simulation. We are able to identify an algorithm that achieves average performance close to the optimal across many experimental scenarios. Furthermore, this algorithm runs in only a few seconds for large platforms and thus is usable in practice.
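The multi-resource placement problem this abstract formulates is often approached greedily. A common baseline (not necessarily one of the paper's proposed algorithms) is first-fit decreasing generalized to multiple resource dimensions:

```python
def first_fit_decreasing(demands, capacity, n_servers):
    """Greedy multi-resource placement: sort services by total demand,
    then place each on the first server with room in every resource
    dimension. Returns {service index: server index or None}."""
    free = [list(capacity) for _ in range(n_servers)]
    order = sorted(range(len(demands)), key=lambda i: -sum(demands[i]))
    placement = {}
    for i in order:
        placement[i] = None
        for s in range(n_servers):
            if all(d <= f for d, f in zip(demands[i], free[s])):
                for k, d in enumerate(demands[i]):
                    free[s][k] -= d
                placement[i] = s
                break
    return placement
```

Such a heuristic gives no fairness or QoS guarantees by itself; the paper's contribution is precisely an objective function and algorithms that go beyond this kind of baseline.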
A Dollar from 15 Cents: Cross-Platform Management for Internet Services
- In USENIX
, 2008
Abstract - Cited by 27 (7 self)
As Internet services become ubiquitous, the selection and management of diverse server platforms now affects the bottom line of almost every firm in every industry. Ideally, such cross-platform management would yield high performance at low cost, but in practice, the performance consequences of such decisions are often hard to predict. In this paper, we present an approach to guide cross-platform management for real-world Internet services. Our approach is driven by a novel performance model that predicts application-level performance across changes in platform parameters, such as processor cache sizes, processor speeds, etc., and can be calibrated with data commonly available in today’s production environments. Our model is structured as a composition of several empirically observed, parsimonious sub-models. These sub-models have few free parameters and can be calibrated with lightweight passive observations on a current production platform. We demonstrate the usefulness of our cross-platform model in two management problems. First, our model provides accurate performance predictions when selecting the next generation of processors to enter a server farm. Second, our model can guide platform-aware load balancing across heterogeneous server farms.
Automated Experiment-Driven Management of (Database) Systems
- In Proceedings of the 12th Workshop on Hot Topics in Operating Systems
, 2009
Abstract - Cited by 16 (5 self)
In this position paper, we argue that an important piece of the system administration puzzle has largely been left untouched by researchers. This piece involves mechanisms and policies to identify as well as collect missing instrumentation data; the missing data is essential to generate the knowledge required to address certain administrative tasks satisfactorily and efficiently. We introduce the paradigm of experiment-driven management which encapsulates such mechanisms and policies for a given administrative task. We outline the benefits that automated experiment-driven management brings to several long-standing problems in databases as well as other systems, and identify research challenges as well as initial solutions.
AGILE: elastic distributed resource scaling for Infrastructure-as-a-Service
Abstract - Cited by 13 (2 self)
Dynamically adjusting the number of virtual machines (VMs) assigned to a cloud application to keep up with load changes and interference from other users typically requires detailed application knowledge and an ability to know the future, neither of which is readily available to infrastructure service providers or application owners. The result is that systems either need to be over-provisioned (costly), or risk missing their performance Service Level Objectives (SLOs) and paying penalties (also costly). AGILE deals with both issues: it uses wavelets to provide a medium-term resource demand prediction with enough lead time to start up new application server instances before performance falls short, and it uses dynamic VM cloning to reduce application startup times. Tests using RUBiS and Google cluster traces show that AGILE can predict varying resource demands over the medium term with up to 3.42× better true positive rate and 0.34× the false positive rate of existing schemes. Given a target SLO violation rate, AGILE can efficiently handle dynamic application workloads, reducing both penalties and user dissatisfaction.
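The wavelet-based prediction AGILE describes can be caricatured with a one-level Haar transform: split the trace into a coarse approximation and fine detail, then forecast the coarse band and drop the detail. This is a toy stand-in, assuming plain linear extrapolation of the approximation coefficients; the paper's per-band predictor is more sophisticated:

```python
def haar_decompose(signal):
    """One level of the Haar wavelet transform: split a signal into a
    coarse approximation (pairwise means) and detail (pairwise half-
    differences)."""
    a = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal) - 1, 2)]
    d = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal) - 1, 2)]
    return a, d

def wavelet_trend_forecast(signal, steps):
    """Forecast by extrapolating the coarse Haar band linearly and
    ignoring fine-scale detail (a rough medium-term predictor sketch)."""
    a, _ = haar_decompose(signal)
    slope = a[-1] - a[-2]
    return [a[-1] + slope * (k + 1) for k in range(steps)]
```

Working in the coarse band is what buys the "medium-term" lead time: short-lived spikes land in the detail coefficients and do not perturb the trend forecast.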
SLA-Driven Adaptive Resource Management for Web Applications on a Heterogeneous Compute Cloud
- In Proceedings of the 1st International Conference on Cloud Computing (CloudCom ’09)
, 2009
Abstract - Cited by 9 (1 self)
Current service-level agreements (SLAs) offered by cloud providers make guarantees about quality attributes such as availability. However, although one of the most important quality attributes from the perspective of the users of a cloud-based Web application is its response time, current SLAs do not guarantee response time. Satisfying a maximum average response time guarantee for Web applications is difficult due to unpredictable traffic patterns, but in this paper we show how it can be accomplished through dynamic resource allocation in a virtual Web farm. We present the design and implementation of a working prototype built on an EUCALYPTUS-based heterogeneous compute cloud that actively monitors the response time of each virtual machine assigned to the farm and adaptively scales up the application to satisfy an SLA promising a specific average response time. We demonstrate the feasibility of the approach in an experimental evaluation with a testbed cloud and a synthetic workload. Adaptive resource management has the potential to increase the usability of Web applications while maximizing resource utilization.
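The monitor-and-scale loop described above can be sketched as a simple threshold controller: scale out when measured average response time exceeds the SLA target, scale in when it is comfortably below, otherwise hold. The thresholds and names are illustrative assumptions, not the paper's actual controller:

```python
def scaling_decision(avg_response_ms, target_ms, current_vms,
                     scale_down_factor=0.5, min_vms=1):
    """Threshold controller for an SLA on average response time.
    Returns the new VM count for the virtual Web farm."""
    if avg_response_ms > target_ms:
        return current_vms + 1          # SLA at risk: scale out
    if avg_response_ms < target_ms * scale_down_factor and current_vms > min_vms:
        return current_vms - 1          # ample headroom: reclaim a VM
    return current_vms                  # within band: hold steady
```

The dead band between the two thresholds is what prevents oscillation: without it, the controller would flap between adding and removing VMs on every measurement.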
Optimizing utility in cloud computing through autonomic workload execution
- IEEE Data Eng. Bull
Abstract - Cited by 8 (0 self)
Cloud computing provides services to potentially numerous remote users with diverse requirements. Although predictable performance can be obtained through the provision of carefully delimited services, it is straightforward to identify applications in which a cloud might usefully host services that support the composition of more primitive analysis services or the evaluation of complex data analysis requests. In such settings, a service provider must manage complex and unpredictable workloads. This paper describes how utility functions can be used to make explicit the desirability of different workload evaluation strategies, and how optimization can be used to select between such alternatives. The approach is illustrated for workloads consisting of workflows or queries.
The Case for Predictive Database Systems: Opportunities and Challenges
Abstract - Cited by 4 (0 self)
This paper argues that next generation database management systems should incorporate a predictive model management component to effectively support both inward-facing applications, such as self management, and user-facing applications such as data-driven predictive analytics. We draw an analogy between model management and data management functionality and discuss how model management can leverage profiling, physical design and query optimization techniques, as well as the pertinent challenges. We then describe the early design and architecture of Longview, a predictive DBMS prototype that we are building at Brown, along with a case study of how models can be used to predict query execution performance.
Automatic Tuning of Interactive Perception Applications
- In Proceedings of the Conference on Uncertainty in Artificial Intelligence (UAI)
, 2010
Abstract - Cited by 2 (1 self)
Interactive applications incorporating high-data-rate sensing and computer vision are becoming possible due to novel runtime systems and the use of parallel computation resources. To allow interactive use, such applications require careful tuning of multiple application parameters to meet required fidelity and latency bounds. This is a nontrivial task, often requiring expert knowledge, which becomes intractable as resources and application load characteristics change. This paper describes a method for automatic performance tuning that learns application characteristics and effects of tunable parameters online, and constructs models that are used to maximize fidelity for a given latency constraint. The paper shows that accurate latency models can be learned online, that knowledge of application structure can be used to reduce the complexity of the learning task, and that operating points can be found that achieve 90% of the optimal fidelity by exploring the parameter space only 3% of the time.