Process Cruise Control: Event-Driven Clock Scaling for Dynamic Power Management
, 2002
Cited by 78 (5 self)
Scalability of the core frequency is a common feature of low-power processor architectures. Many heuristics for frequency scaling were proposed in the past to find the best trade-off between energy efficiency and computational performance. With complex applications exhibiting unpredictable behavior, these heuristics cannot reliably adjust the operation point of the hardware because they do not know where the energy is spent and why the performance is lost. Embedded hardware monitors in the form of event counters have proven to offer valuable information in the field of performance analysis. We will demonstrate that counter values can also reveal the power-specific characteristics of a thread. In this paper we propose an energy-aware scheduling policy for non-real-time operating systems that benefits from event counters. By exploiting the information from these counters, the scheduler determines the appropriate clock frequency for each individual thread running in a time-sharing environment. A recurrent analysis of the thread-specific energy and performance profile allows an adjustment of the frequency to the behavioral changes of the application. While the clock frequency may vary over a wide range, the application performance should only suffer slightly. Because of the similarity to a car cruise control, we called our scheduling policy Process Cruise Control. This adaptive clock scaling is accomplished by the operating system without any application support. Process Cruise Control has been implemented on the Intel XScale architecture, which offers a variety of frequencies and a set of configurable event counters. Energy measurements of the target architecture under variable load show the advantage of the proposed approach.
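As a rough illustration of choosing a per-thread clock frequency from event-counter data, the following Python sketch picks the lowest frequency whose predicted slowdown stays under a bound. It is not the policy from the paper: the two-component time model, the frequency list, the counter names (`stall_cycles`, `total_cycles`, assumed to be sampled at the highest frequency), and the 10% bound are all assumptions made for illustration.

```python
# Hypothetical frequency steps in MHz; not taken from the paper.
FREQS_MHZ = [333, 400, 466, 533, 600, 666, 733]


def memory_stall_fraction(stall_cycles, total_cycles):
    """Fraction of cycles spent waiting on memory, from two assumed counters."""
    if total_cycles <= 0:
        return 0.0
    return min(1.0, max(0.0, stall_cycles / total_cycles))


def predicted_slowdown(freq, f_max, stall_fraction):
    """Relative run-time increase at `freq` versus `f_max`.

    Assumed model: compute time scales with 1/f, memory stall time does not.
    """
    relative_time = (1.0 - stall_fraction) * (f_max / freq) + stall_fraction
    return relative_time - 1.0


def pick_frequency(stall_cycles, total_cycles, max_slowdown=0.10):
    """Lowest frequency whose predicted slowdown stays within `max_slowdown`."""
    f_max = max(FREQS_MHZ)
    m = memory_stall_fraction(stall_cycles, total_cycles)
    for freq in sorted(FREQS_MHZ):
        if predicted_slowdown(freq, f_max, m) <= max_slowdown:
            return freq
    return f_max
```

Under this assumed model, a thread that never stalls on memory stays at the highest frequency, while a thread that is stalled most of the time can be clocked much lower for the same performance bound. A scheduler would re-evaluate the choice each time slice as the counter values change.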
Compiler-Directed Dynamic Voltage Scaling for Memory-Bound Applications
, 2002
Cited by 22 (3 self)
This paper presents the design and implementation of a compiler algorithm that effectively reduces the energy usage of memory-bound applications via dynamic voltage scaling (DVS). The algorithm identifies program regions where the CPU can be slowed down with negligible performance penalty. It is implemented as a source-to-source level transformation using the SUIF2 compiler infrastructure. Physical measurements on a laptop with a 600 MHz to 1.2 GHz AMD Athlon 4 processor show that CPU energy savings in the range of 9.17% to 55.65% can be achieved with performance degradation in the range of 0.69% to 6.14% for the SPECfp95 benchmarks. On average, the energy and energy-delay product are reduced by 26.58% and 24.11%, respectively, at the cost of a performance slowdown of 3.26%. This paper also discusses a new methodology which attempts to approximate the minimum energy usage by any DVS algorithm. Our compiler-directed DVS algorithm is within 6% of the "optimal" case. To the best of our knowledge, this is one of the first works to evaluate DVS strategies by physical measurements.
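The energy-delay product (EDP) mentioned above is energy multiplied by execution time. The short Python sketch below shows how an energy reduction and a slowdown combine into an EDP change, using the average figures quoted in the abstract and assuming the 3.26% slowdown is an increase in execution time. The function name is illustrative.

```python
def edp_ratio(energy_reduction, slowdown):
    """EDP relative to baseline, given fractional energy reduction and slowdown.

    EDP = energy * delay, so the ratio is (1 - energy_reduction) * (1 + slowdown).
    """
    return (1.0 - energy_reduction) * (1.0 + slowdown)


# Average figures quoted in the abstract: 26.58% less energy, 3.26% slowdown.
ratio = edp_ratio(0.2658, 0.0326)
edp_reduction_percent = (1.0 - ratio) * 100.0
print(f"EDP reduction from the averages: {edp_reduction_percent:.2f}%")
```

Combining the two averages this way gives roughly 24.2%, close to but not identical with the reported 24.11%. The small gap is presumably because the reported figure averages per-benchmark EDP values rather than multiplying the averages.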
Profile Guided Selection of ARM and Thumb Instructions
, 2002
Cited by 21 (3 self)
The ARM processor core is a leading processor design for the embedded domain. In the embedded domain, both memory and energy are important concerns. For this reason the 32-bit ARM processor also supports the 16-bit Thumb instruction set. For a given program, typically the Thumb code is smaller than the ARM code. Therefore, by using Thumb code, the I-cache activity, and hence the energy consumed by the I-cache, can be reduced. However, the limitations of the Thumb instruction set, in comparison to the ARM instruction set, can often lead to generation of poorer quality code. Thus, while Thumb code may be smaller than ARM code, it may perform poorly and thus may not lead to overall energy savings.
Full-system power analysis and modeling for server environments
- In Workshop on Modeling, Benchmarking and Simulation (MOBS)
, 2006
Cited by 18 (1 self)
The increasing costs of power delivery and cooling, as well as the trend toward higher-density computer systems, have created a growing demand for better power management in server environments. Despite the increasing interest in this issue, little work has been done in quantitatively understanding power consumption trends and developing simple yet accurate models to predict full-system power. We study the component-level power breakdown and variation, as well as temporal workload-specific power consumption of an instrumented power-optimized blade server. Using this analysis, we examine the validity of prior ad hoc approaches to understanding power breakdown and quantify several interesting trends important for power modeling and management in the future. We also introduce Mantis, a nonintrusive method for modeling full-system power consumption and providing real-time power prediction. Mantis uses a one-time calibration phase to generate a model by correlating AC power measurements with user-level system utilization metrics. We experimentally validate the model on two server systems with drastically different power footprints and characteristics (a low-end blade and a high-end compute-optimized server) using a variety of workloads. Mantis provides power estimates with high accuracy for both overall and temporal power consumption, making it a valuable tool for power-aware scheduling and analysis.
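The calibration idea, correlating measured AC power with utilization metrics, can be sketched as an ordinary least-squares fit. The Python code below is a minimal sketch under stated assumptions, not the Mantis implementation: the linear model form, the two metrics (CPU and disk utilization), the function names, and the calibration data are all made up for illustration.

```python
def fit_linear_power_model(samples, power_watts):
    """Least-squares fit of power = b0 + b1*x1 + ... + bk*xk.

    `samples` is a list of utilization vectors, `power_watts` the measured
    AC power for each one. Solves the normal equations by Gauss-Jordan
    elimination with partial pivoting.
    """
    rows = [[1.0] + list(s) for s in samples]  # prepend intercept term
    n = len(rows[0])
    # Build the augmented normal-equation matrix [X^T X | X^T y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
         + [sum(r[i] * y for r, y in zip(rows, power_watts))]
         for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(a[k][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("calibration data is degenerate")
        a[col], a[pivot] = a[pivot], a[col]
        for k in range(n):
            if k != col:
                factor = a[k][col] / a[col][col]
                a[k] = [x - factor * p for x, p in zip(a[k], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]


def predict_power(coeffs, utilization):
    """Predicted power in watts for one utilization vector."""
    return coeffs[0] + sum(c * u for c, u in zip(coeffs[1:], utilization))


# Made-up calibration data: (cpu_util, disk_util) and measured watts,
# generated from power = 100 + 50*cpu + 20*disk.
calib_x = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.25)]
calib_y = [100.0, 150.0, 120.0, 170.0, 130.0]
model = fit_linear_power_model(calib_x, calib_y)
```

Once fitted, `predict_power(model, (cpu, disk))` gives a run-time estimate from utilization alone, which is what makes a one-time calibration useful: the power meter is needed only while collecting the calibration samples.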
Partitioned Instruction Cache Architecture for Energy Efficiency
- ACM Transactions on Embedded Computing Systems
, 2003
Cited by 10 (0 self)
This paper studies energy-efficient cache architectures in the memory hierarchy that can have a significant impact on the overall system energy consumption.
DRAMsim: A Memory System Simulator
- ACM SIGARCH Computer Architecture News
Cited by 10 (0 self)
As memory accesses become slower with respect to the processor and consume more power with increasing memory size, memory performance and power consumption have become increasingly important. With the trend to develop multi-threaded, multi-core processors, the demands on the memory system will continue to scale. However, determining the optimal memory system configuration is non-trivial. The memory system performance is sensitive to a large number of parameters. Each of these parameters takes on a number of values and interacts with the others in ways that make overall trends difficult to discern. A comparison of the memory system architectures becomes even harder when we add the dimensions of power consumption and manufacturing cost. Unfortunately, there is a lack of tools in the public domain that support such studies. Therefore, we introduce DRAMsim, a detailed and highly configurable C-based memory system simulator to fill this gap. DRAMsim implements detailed timing models for a variety of existing memories, including SDRAM, DDR, DDR2, DRDRAM and FB-DIMM, with the capability to easily vary their parameters. It also models the power consumption of SDRAM and its derivatives. It can be used as a standalone simulator or as part of a more comprehensive system-level model. We have successfully integrated DRAMsim into a variety of simulators including MASE [15], Sim-alpha [14], BOCHS [2] and GEMS [13]. The simulator can be downloaded from www.ece.umd.edu/dramsim.
Toward an evaluation infrastructure for power and energy optimizations
- In Workshop on High-Performance, Power-Aware Computing
, 2005
Cited by 9 (4 self)
Execution-driven simulators are often used for power/energy and performance evaluation. Simulators can provide semantic details but they provide insufficient speed and accuracy for compiler and OS research. Physical measurement is fast and objective but lacks a semantic connection between the measurement result and the evaluated program. The objective of our research is to bring together the advantages of simulation and physical measurement to build an infrastructure for power and energy optimization. Power and energy behavior is obtained through physical measurement. Simulation is used for observing the connection between power and energy behavior and the evaluated program. Our preliminary results demonstrate the ability of this infrastructure to capture detailed power behavior of any region of a program. To simplify the power/energy evaluation of programs with long execution times and overcome the limitation of physical devices, we propose using the SimPoint methodology developed by researchers at UC San Diego to find representative slices of a program. Through simulation, we validate the feasibility of the SimPoint idea in simplifying power/energy evaluation. We expect that this infrastructure will help researchers in OS/compiler power/energy optimization to evaluate their optimizations more efficiently and observe more optimization opportunities.
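The idea of evaluating only representative slices can be illustrated with a weighted estimate: each slice stands in for some fraction of the full run, and whole-program power is approximated by the weighted average of the per-slice measurements. The Python sketch below assumes the weights are already known (finding them is the clustering step of SimPoint, not shown here); the function names and numbers are hypothetical.

```python
def estimate_average_power(slices):
    """Weighted average of per-slice power.

    `slices` is a list of (weight, measured_power_watts) pairs, where each
    weight is the fraction of the full execution that the slice represents.
    """
    total_weight = sum(w for w, _ in slices)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * p for w, p in slices) / total_weight


def estimate_energy(slices, total_runtime_s):
    """Whole-program energy in joules: average power times total run time."""
    return estimate_average_power(slices) * total_runtime_s


# Hypothetical representative slices: (weight, watts).
example = [(0.5, 20.0), (0.3, 30.0), (0.2, 10.0)]
```

With these made-up values, only three short slices need to be measured instead of the whole run. The estimate is only as good as the assumption that each slice behaves like the portion of execution it represents.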
Into the Wild: Studying Real User Activity Patterns to Guide Power Optimizations for Mobile Architectures
Cited by 9 (1 self)
As the market for mobile architectures continues its rapid growth, it has become increasingly important to understand and optimize the power consumption of these battery-driven devices. While energy consumption has been heavily explored, there is one critical factor that is often overlooked – the end user. Ultimately, the energy consumption of a mobile architecture is defined by user activity. In this paper, we study mobile architectures in their natural environment – in the hands of the end user. Specifically, we develop a logger application for Android G1 mobile phones and release the logger into the wild to collect traces of real user activity. We then show how the traces can be used to characterize power consumption, and guide the development of power optimizations.
Compiler-Directed Dynamic Voltage and Frequency Scaling for CPU Power and Energy Reduction
, 2003
Cited by 8 (2 self)
Dissertation by Chung-Hsing Hsu. Dissertation Director: Ulrich Kremer. The high power consumption of a processor is becoming a critical problem for both battery-powered devices and high-performance computers. It reduces circuit reliability, complicates the cooling technology, shortens the battery lifetime, and increases the production and operation costs of a CPU. One effective technique, called dynamic voltage scaling (DVS), achieves CPU power reduction through lowering the CPU supply voltage and clock frequency at runtime. It is effective because the CPU power is proportional to the clock frequency and to the square of the supply voltage. However, the CPU power savings come at the cost of degraded performance due to the slower clock frequency. Furthermore, the longer the CPU runs, the more power other computer components (e.g., disk and screen) will consume; not to mention that a user may not be willing to sacrifice any performance. Therefore, DVS should only be applied when it will not noticeably affect performance.
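The proportionality stated in the abstract (CPU power scales with clock frequency and with the square of supply voltage) can be turned into a small calculation. The Python sketch below works in ratios so the proportionality constant cancels; the operating points are hypothetical, and the energy function assumes a fully CPU-bound workload with a fixed cycle count.

```python
def relative_power(freq, volt, base_freq, base_volt):
    """Dynamic CPU power relative to a baseline, using P proportional to f * V^2."""
    return (freq / base_freq) * (volt / base_volt) ** 2


def relative_energy(freq, volt, base_freq, base_volt):
    """Relative energy for a fixed cycle count.

    Run time scales with base_freq / freq, so the frequency terms cancel
    and energy scales with V^2 alone.
    """
    relative_time = base_freq / freq
    return relative_power(freq, volt, base_freq, base_volt) * relative_time


# Hypothetical operating points: halve the clock, drop the supply to 80%.
p = relative_power(600, 1.2, 1200, 1.5)
e = relative_energy(600, 1.2, 1200, 1.5)
```

In this example power falls to about 32% of the baseline, while energy falls only to about 64% because the work takes twice as long. This is why lowering the frequency alone saves no CPU energy under this model: the saving comes from the lower voltage that the lower frequency permits.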
Multiobjective Synthesis of Low-Power Real-Time Distributed Embedded Systems
, 2002
Cited by 8 (2 self)
This dissertation presents methods for automating the synthesis of embedded systems, i.e., special-purpose computers. In addition, it describes a method for analyzing the manner in which real-time operating system use influences embedded system power consumption.

