Mercury and Freon: Temperature Emulation and Management for Server Systems
Cited by 47 (6 self)
Power densities have been increasing rapidly at all levels of server systems. To counter the high temperatures resulting from these densities, systems researchers have recently started work on software-based thermal management. Unfortunately, research in this new area has been hindered by the limitations imposed by simulators and real measurements. In this paper, we introduce Mercury, a software suite that avoids these limitations by accurately emulating temperatures based on simple layout, hardware, and component-utilization data. Most importantly, Mercury runs the entire software stack natively, enables repeatable experiments, and allows the study of thermal emergencies without harming hardware reliability. We validate Mercury using real measurements and a widely used commercial simulator. We use Mercury to develop Freon, a system that manages thermal emergencies in a server cluster without unnecessary performance degradation. Mercury will soon become available from
DFTL: A Flash Translation Layer Employing Demand-based Selective Caching of Page-level Address Mappings
- Penn State University
, 2008
Cited by 31 (2 self)
Recent technological advances in the development of flash-memory-based devices have consolidated their leadership position as the preferred storage media in the embedded systems market and opened new vistas for deployment in enterprise-scale storage systems. Unlike hard disks, flash devices are free from any mechanical moving parts, have no seek or rotational delays, and consume less power. However, the internal idiosyncrasies of flash technology make its performance highly dependent on workload characteristics. The poor performance of random writes has been a major cause of concern, which needs to be addressed to better utilize the potential of flash in enterprise-scale environments. We examine one of the important causes of this poor performance: the design of the Flash Translation Layer
C-Oracle: Predictive Thermal Management for Data Centers
- In Symposium on High-Performance Computer Architecture
, 2008
Cited by 17 (2 self)
Thermal management has become a critical requirement for today’s power-dense server clusters, due to the negative impact of high temperatures on the reliability of computer hardware. Recognizing this fact, researchers have started to design software-based thermal management policies that leverage high-level information to control system-wide temperatures effectively. Unfortunately, designing these policies is currently a challenge, since it is difficult to predict the exact temperature and performance that would result from trying to react to a thermal emergency. Reactions that are excessively severe may cause unnecessary performance degradation and/or generate emergencies in other parts of the system, whereas reactions that are excessively mild may take relatively long to become effective (if at all), compromising the reliability of the system. To address this challenge, in this paper we propose C-Oracle, a software infrastructure for Internet services that dynamically predicts the temperature and performance impact of different thermal management reactions into the future, allowing the thermal management policy to select the best reaction at each point in time. C-Oracle makes predictions based on simple models of temperature, component utilization, and policy behavior that can be solved efficiently. We experimentally evaluate C-Oracle for thermal management policies based on load redistribution and dynamic voltage/frequency scaling in both single-tier and multi-tier services. Our results show that, regardless of management policy or service organization, C-Oracle enables non-trivial decisions that effectively manage thermal emergencies, while avoiding unnecessary performance degradation.
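The idea of predicting each candidate reaction before committing to one can be illustrated with a minimal Python sketch. It is not C-Oracle's actual model: the first-order temperature equation, the `Reaction` fields, and every constant (`ambient`, `heat_gain`, `rate`, the horizon of 20 steps) are assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Reaction:
    """A hypothetical thermal management reaction (all fields are assumptions)."""
    name: str
    utilization: float  # component utilization the reaction would leave (0..1)
    throughput: float   # fraction of full throughput retained (0..1)


def predict_temperature(temp, utilization, steps,
                        ambient=25.0, heat_gain=60.0, rate=0.1):
    """Assumed first-order model: temperature drifts toward a
    utilization-dependent steady state by a fixed fraction each step."""
    steady = ambient + heat_gain * utilization
    for _ in range(steps):
        temp += rate * (steady - temp)
    return temp


def select_reaction(temp, reactions, limit, steps=20):
    """Pick the highest-throughput reaction whose predicted temperature
    at the end of the horizon is at or below the limit."""
    safe = [r for r in reactions
            if predict_temperature(temp, r.utilization, steps) <= limit]
    if not safe:
        # Nothing is predicted to be safe: fall back to the most severe reaction.
        return min(reactions, key=lambda r: r.utilization)
    return max(safe, key=lambda r: r.throughput)


candidates = [
    Reaction("none", utilization=1.0, throughput=1.0),
    Reaction("mild", utilization=0.7, throughput=0.85),
    Reaction("severe", utilization=0.3, throughput=0.4),
]
print(select_reaction(70.0, candidates, limit=68.0).name)
```

With these made-up numbers, the sketch prefers the mild reaction: it is predicted to bring the temperature under the limit while giving up less throughput than the severe one, which mirrors the trade-off between excessively severe and excessively mild reactions described in the abstract.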
Modeling and Managing Thermal Profiles of Rack-mounted Servers with ThermoStat
- In Proceedings of HPCA
, 2007
Cited by 7 (0 self)
High power densities and the implications of high operating temperatures on the failure rates of components are key driving factors of temperature-aware computing. Computer architects and system software designers need to understand the thermal consequences of their proposals, and develop techniques to lower operating temperatures to reduce both transient and permanent component failures. Until recently, tools for understanding temperature ramifications of designs have been mainly restricted to industry for studying packaging and cooling mechanisms, with little access to such toolsets for academic researchers. Developing such tools is an arduous task since it usually requires cross-cutting areas of expertise spanning architecture, systems software, thermodynamics, and cooling systems. Recognizing the need for such tools, there has been recent work on modeling temperatures of processors at the microarchitectural level, which can be easily understood and employed by computer architects for processor designs. However, there is a dearth of such tools in the academic/research community for undertaking architectural/systems studies beyond a processor: a server box, a rack, or even a machine room. This paper presents a detailed 3-dimensional, Computational Fluid Dynamics-based thermal modeling tool, called ThermoStat, for rack-mounted server systems. Using this tool, we model a 20-node rack-mounted server system (each node with dual Xeon processors), and validate it with over 30 temperature sensor measurements at different points in the servers/rack. We conduct several experiments with this tool to show how different load conditions affect the thermal profile, and also illustrate how this tool can help design dynamic thermal management techniques.
SODA: Sensitivity Based Optimization of Disk Architecture
Cited by 6 (3 self)
Storage plays a pivotal role in the performance of many applications. Optimizing disk architectures is a design-time as well as a run-time issue and requires balancing between performance, power and capacity. The design space is large and there are many “knobs” that can be used to optimize disk drive behavior. Here we present a sensitivity-based optimization for disk architectures (SODA) which leverages results from digital circuit design. Using detailed models of the electro-mechanical behavior of disk drives and a suite of realistic workloads, we show how SODA can aid in design and runtime optimization.
Virtual Machine Power Metering and Provisioning
Cited by 6 (0 self)
Virtualization is often used in cloud computing platforms for its several advantages in efficiently managing resources. However, virtualization raises certain additional challenges, and one of them is the lack of power metering for virtual machines (VMs). Power management requirements in modern data centers have led to most new servers providing power usage measurement in hardware, and alternate solutions exist for older servers using circuit- and outlet-level measurements. However, VM power cannot be measured purely in hardware. We present a solution for VM power metering, named Joulemeter. We build power models to infer power consumption from resource usage at runtime and identify the challenges that arise when applying such models for VM power metering. We show how existing instrumentation in server hardware and hypervisors can be used to build the required power models on real platforms with low error. Our approach is designed to operate with extremely low runtime overhead while providing practically useful accuracy. We illustrate the use of the proposed metering capability for VM power capping, a technique to reduce power provisioning costs in data centers. Experiments are performed on server traces from several thousand production servers, hosting Microsoft’s real-world applications such as Windows Live Messenger. The results show that not only does VM power metering allow virtualized data centers to achieve the same savings that non-virtualized data centers achieved through physical server power capping, but also that it enables further savings in provisioning costs with virtualization.
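Inferring power from resource usage can be sketched in Python as fitting a model on server-level measurements and then applying it to a VM's own utilization. The abstract does not state the form of Joulemeter's models; the single-variable linear model on CPU utilization, the function names, and the sample readings below are all assumptions for illustration.

```python
def fit_linear(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def vm_dynamic_power(slope, vm_cpu_utilization):
    """Dynamic power (watts) attributed to one VM from its CPU utilization,
    using the slope learned from whole-server measurements."""
    return slope * vm_cpu_utilization


# Hypothetical whole-server samples: CPU utilization (0..1) and measured watts.
cpu_utilization = [0.0, 0.5, 1.0]
server_watts = [100.0, 150.0, 200.0]

idle_watts, watts_per_unit_util = fit_linear(cpu_utilization, server_watts)
print(idle_watts, watts_per_unit_util)
print(vm_dynamic_power(watts_per_unit_util, 0.25))
```

In this sketch the server's hardware power meter supplies the training data, and only the dynamic (utilization-dependent) part is attributed to a VM; how idle power should be divided among VMs is left open here.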
Thermal Modeling and Management of DRAM Memory Systems
- In Proceedings of ISCA
, 2007
Cited by 5 (1 self)
With increasing speed and power density, high-performance memories, including FB-DIMM (Fully Buffered DIMM) and DDR2 DRAM, now begin to require dynamic thermal management (DTM) as processors and hard drives did. The DTM of memories, nevertheless, is different in that it should take the processor performance and power consumption into consideration. Existing schemes have ignored that. In this study, we investigate a new approach that controls the memory thermal issues from the source generating memory activities – the processor. This approach smooths program execution when compared with shutting down memory abruptly, and therefore improves the overall system performance and power efficiency. For multicore systems, we propose two schemes called adaptive core gating and coordinated DVFS. The first scheme activates clock gating on selected processor cores and the second one scales down the frequency and voltage levels of processor cores when the memory is about to overheat. They can successfully control the memory activities and handle thermal emergencies. More importantly, they improve performance significantly under the given thermal envelope. Our simulation results show that adaptive core gating improves performance by up to 23.3% (16.3% on average) on a four-core system with FB-DIMM when compared with DRAM thermal shutdown; and coordinated DVFS with control-theoretic methods improves the performance by up to 18.5% (8.3% on average).
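The two schemes can be contrasted with a small threshold-based Python sketch. It is not the paper's policy (the abstract mentions control-theoretic methods for coordinated DVFS, which are not modeled here); the two thresholds `warn_c` and `critical_c`, the three-level response, and the function names are assumptions for illustration.

```python
def active_cores(mem_temp_c, total_cores, warn_c=80.0, critical_c=85.0):
    """Adaptive core gating sketch: clock-gate more cores as the memory
    temperature approaches its limit (thresholds are hypothetical)."""
    if mem_temp_c >= critical_c:
        return 1                        # keep a single core running
    if mem_temp_c >= warn_c:
        return max(1, total_cores // 2)  # gate half of the cores
    return total_cores                  # no gating needed


def dvfs_frequency(mem_temp_c, frequencies_ghz, warn_c=80.0, critical_c=85.0):
    """Coordinated DVFS sketch: keep all cores active but lower their
    frequency level as the memory heats up."""
    levels = sorted(frequencies_ghz, reverse=True)
    if mem_temp_c >= critical_c:
        return levels[-1]               # lowest available frequency
    if mem_temp_c >= warn_c:
        return levels[len(levels) // 2]  # a middle frequency level
    return levels[0]                    # full speed


for temp in (70.0, 82.0, 90.0):
    print(temp, active_cores(temp, 4), dvfs_frequency(temp, [3.0, 2.0, 1.0]))
```

Both functions throttle the processor, the source of memory traffic, rather than shutting the memory down, which is the design point the abstract argues for.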
FlashSim: A Simulator for NAND Flash-based Solid-State Drives
, 2009
Cited by 3 (0 self)
NAND Flash memory-based Solid-State Disks (SSDs) are becoming popular as the storage media in domains ranging from mobile laptops to enterprise-scale storage systems due to a number of benefits (e.g., lower weight, faster access times, lower power consumption, higher resistance to vibrations) they offer over the conventionally popular Hard Disk Drives (HDDs). While a number of well-regarded simulation environments exist for HDDs, the same is not yet true for SSDs. This is due to SSDs having been in the storage market for a relatively short time as well as the lack of publicly available information (hardware configuration and software methods) about state-of-the-art SSDs. We describe the design and implementation of FlashSim, a simulator aimed at filling this void in performance evaluation of emerging storage systems that employ SSDs. FlashSim is an event-driven simulator that follows the object-oriented programming paradigm for modularity. We have validated the performance of FlashSim against a number of commercial SSDs for behavioral similarity. We have also used FlashSim to compare the performance of SSD devices employing different Flash Translation Layer (FTL) schemes, and analyzed the energy consumption of different FTL schemes in the SSD. FlashSim has been written to be inter-operable with the well-regarded DiskSim simulator, thus enabling the simulation of a variety of “hybrid” storage systems employing combinations of SSDs and HDDs. Given the current interest in such hybrid systems as opposed to systems with SSDs replacing HDDs (due to higher price), we believe this to be an especially useful feature of FlashSim. We have made FlashSim freely available for download with the hope that it would be of use to researchers exploring the design of SSD-based systems.
Software thermal management of DRAM memory for multicore systems
- In SIGMETRICS ’08: Proceedings of the 2008 International Conference on Measurement and Modeling of Computer Systems
, 2008
Cited by 2 (1 self)
Thermal management of DRAM memory has become a critical issue for server systems. We have done, to the best of our knowledge, the first study of software thermal management for the memory subsystem on real machines. Two recently proposed DTM (Dynamic Thermal Management) policies have been improved, implemented in the Linux OS, and evaluated on two multicore servers, a Dell PowerEdge 1950 server and a customized Intel SR1500AL server testbed. The experimental results first confirm that a system-level memory DTM policy may significantly improve system performance and power efficiency, compared with the existing memory bandwidth throttling scheme. A policy called DTM-ACG (Adaptive Core Gating) shows performance improvement comparable to that reported previously. The average performance improvements are 13.3% and 7.2% on the PowerEdge 1950 and the SR1500AL (vs. 16.3% from the previous simulation-based study), respectively. We also have surprising findings that reveal the weakness of the previous study: the CPU heat dissipation and its impact on DRAM memories, which were ignored, are significant factors. We have observed that the second policy, called DTM-CDVFS (Coordinated Dynamic Voltage and Frequency Scaling), has much better performance than previously reported for this reason. The average improvements are 10.8% and 15.3% on the two machines (vs. 3.4% from the previous study), respectively. It also significantly reduces the processor power by 15.5% and energy by 22.7% on average.
Sensitivity-Based Optimization of Disk Architecture
Cited by 1 (0 self)
Many applications, especially those that run on servers, are I/O intensive and therefore require high-performance storage systems. These high-end storage systems consume a large amount of power, the bulk of which is due to the disk drives. Optimizing disk architectures is a design-time, as well as a run-time, issue, and requires performance and power trade-offs. A hard disk designer needs to balance between the disk rotational speed (rotations per minute, RPM), platter sizes, and the number of platters. The RPM and platter sizes affect performance, and all three have an impact on power. A data center manager might have specific energy budgets within which she has to extract as much performance as possible. Applications themselves may have specific optimization requirements. Therefore, there are different figures of merit, such as performance and energy, and a large space of design and runtime “knobs” that can be used to optimize disk drive behavior. Given such a large space, it is desirable to have a systematic methodology to optimally set these knobs to satisfy the figures of merit as efficiently as possible. In this paper, we present the Sensitivity-based Optimization methodology for Disk Architectures (SODA), which leverages results previously obtained in digital circuit design optimization scenarios. Using detailed models of the electromechanical behavior of disk drives, and a suite of realistic workloads, we show how SODA can aid in design and runtime optimization of disk drive architectures.

