Vector Microprocessors
In Hot Chips VII, 1998
Cited by 62 (4 self)
Vector Microprocessors, by Krste Asanovic. Doctor of Philosophy in Computer Science, University of California, Berkeley. Professor John Wawrzynek, Chair. Most previous research into vector architectures has concentrated on supercomputing applications and small enhancements to existing vector supercomputer implementations. This thesis expands the body of vector research by examining designs appropriate for single-chip full-custom vector microprocessor implementations targeting a much broader range of applications. I present the design, implementation, and evaluation of T0 (Torrent-0): the first single-chip vector microprocessor. T0 is a compact but highly parallel processor that can sustain over 24 operations per cycle while issuing only a single 32-bit instruction per cycle. T0 demonstrates that vector architectures are well suited to full-custom VLSI implementation and that they perform well on many multimedia and human-machine interface tasks. The remainder of the thesis contains ...
Power-conscious Joint Scheduling of Periodic Task Graphs and Aperiodic Tasks in Distributed Real-time Embedded Systems
2000
Cited by 60 (2 self)
In this paper, we present a power-conscious algorithm for jointly scheduling multi-rate periodic task graphs and aperiodic tasks in distributed real-time embedded systems. While the periodic task graphs have hard deadlines, the aperiodic tasks can have either hard or soft deadlines. Periodic task graphs are first scheduled statically. Slots are created in this static schedule to accommodate hard aperiodic tasks. Soft aperiodic tasks are scheduled dynamically with an on-line scheduler. Flexibility is introduced into the static schedule and optimized to allow the on-line scheduler to make dynamic modifications to the static schedule. This helps minimize the response times of soft aperiodic tasks through both resource reclaiming and slack stealing. Of course, the validity of the static schedule is maintained. The on-line scheduler also employs dynamic voltage scaling and power management to obtain a power-efficient schedule. Experimental results show that the flexibility introduced into the static schedule helps improve the response times of soft aperiodic tasks by up to 43%. Dynamic voltage scaling and power management reduce power by up to 68%. The scheme in which the static schedule is allowed to be flexible achieves up to 32% more power saving compared to the scheme in which no flexibility is allowed, when both schemes are power-conscious. Our work gives an average architecture price saving of 30% over a previous approach for embedded system architectures synthesized with execution slots for hard aperiodic tasks present.
Power Optimization of Variable-Voltage Core-Based Systems
IEEE Trans. Computer-Aided Design, 1999
Cited by 56 (4 self)
The growing class of portable systems, such as personal computing and communication devices, has resulted in a new set of system design requirements, mainly characterized by the dominant importance of power minimization and design reuse. The energy efficiency of systems-on-a-chip (SOC) could be much improved if one were to vary the supply voltage dynamically at run time. We develop the design methodology for the low-power core-based real-time SOC based on dynamically variable voltage hardware. The key challenge is to develop effective scheduling techniques that treat voltage as a variable to be determined, in addition to the conventional task scheduling and allocation. Our synthesis technique also addresses the selection of the processor core and the determination of the instruction and data cache size and configuration so as to fully exploit dynamically variable voltage hardware, which results in significantly lower power consumption for a set of target applications than existing techniques. The highlight of the proposed approach is the nonpreemptive scheduling heuristic, which results in solutions very close to optimal ones for many test cases. The effectiveness of the approach is demonstrated on a variety of modern industrial-strength multimedia and communication applications.
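To illustrate why treating voltage as a scheduling variable can pay off, here is a minimal Python sketch. It is not the paper's synthesis technique. It assumes the standard CMOS switching-energy model (energy per cycle proportional to the square of the supply voltage) and, as a deliberate simplification, that clock frequency scales linearly with voltage. The function names and the numbers (a 100 MHz, 3.3 V core) are hypothetical.

```python
def dynamic_energy(cycles, voltage, capacitance=1.0):
    """Switching energy for `cycles` clock cycles: E = C * V^2 per cycle."""
    return capacitance * voltage ** 2 * cycles


def scaled_voltage(cycles, deadline, f_max, v_max):
    """Lowest supply voltage that still finishes `cycles` by `deadline`.

    Assumption (simplification): clock frequency is proportional to voltage.
    """
    f_needed = cycles / deadline
    if f_needed > f_max:
        raise ValueError("task cannot meet its deadline even at full speed")
    return v_max * f_needed / f_max


# Hypothetical task: 50 million cycles due in 1 s on a 100 MHz, 3.3 V core.
cycles, deadline = 50e6, 1.0
f_max, v_max = 100e6, 3.3

v = scaled_voltage(cycles, deadline, f_max, v_max)
saving = 1 - dynamic_energy(cycles, v) / dynamic_energy(cycles, v_max)
print(f"run at {v:.2f} V, saving {saving:.0%} of the switching energy")
```

Under these assumptions, a task that needs only half the peak clock rate can run at half the voltage and use a quarter of the switching energy, which is the slack a voltage-aware scheduler tries to exploit.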
Compiler-directed dynamic frequency and voltage scheduling
In Workshop on Power-Aware Computer Systems, 2000
Cited by 48 (8 self)
Modern architectures have a large gap between the speeds of the memory and the processor. Several techniques exist to bridge this gap, including memory pipelines (outstanding reads/writes), cache hierarchies, and large register sets. Most of these architectural features exploit the fact that computations have temporal and/or spatial locality. However, many computations have limited locality, or even no locality at all. In addition, the degree of locality may be different for different program regions. Such computations may lead to a significant mismatch between
Dynamic Voltage Scheduling Technique for Low-Power Multimedia Applications Using Buffers
2001
Cited by 24 (1 self)
As multimedia applications are used increasingly in many embedded systems, power-efficient design for these applications becomes more important than ever. This paper proposes a simple dynamic voltage scheduling technique that suits multimedia applications well. The proposed technique fully utilizes the idle intervals with buffers in a variable speed processor. The main theme of this paper is to determine the minimum buffer size to achieve the maximum energy saving in three cases: single-task, multiple subtasks, and multi-task. Experimental results show that the proposed technique is expected to obtain significant power reduction for several real-world multimedia applications.
A Fully Digital, Energy-Efficient, Adaptive Power-Supply Regulator
IEEE Journal of Solid-State Circuits, 1999
Cited by 15 (3 self)
A voltage scaling technique for energy-efficient operation requires an adaptive power-supply regulator to significantly reduce dynamic power consumption in synchronous digital circuits. A digitally controlled power converter that dynamically tracks circuit performance with a ring oscillator and regulates the supply voltage to the minimum required to operate at a desired frequency is presented. This paper investigates the issues involved in designing a fully digital power converter and describes a design fabricated in a MOSIS 0.8-µm process. A variable-frequency digital controller design takes advantage of the power savings available through adaptive supply-voltage scaling and demonstrates converter efficiency greater than 90% over a dynamic range of regulated voltage levels.
Synchroscalar: A multiple clock domain, power-aware, tile-based embedded processor
In Proceedings of the International Symposium on Computer Architecture, 2004
Cited by 15 (3 self)
We present Synchroscalar, a tile-based architecture for embedded processing that is designed to provide the flexibility of DSPs while approaching the power efficiency of ASICs. We achieve this goal by providing high parallelism and voltage scaling while minimizing control and communication costs. Specifically, Synchroscalar uses columns of processor tiles organized into statically-assigned frequency-voltage domains to minimize power consumption. Furthermore, while columns use SIMD control to minimize overhead, data-dependent computations can be supported by extremely flexible statically-scheduled communication between columns. We provide a detailed evaluation of Synchroscalar including SPICE simulation, wire and device models, synthesis of key components, cycle-level simulation, and compiler- and hand-optimized signal processing applications. We find that the goal of meeting, not exceeding, performance targets with data-parallel applications leads to designs that depart significantly from our intuitions derived from general-purpose microprocessor design. In particular, synchronous design and substantial global interconnect are desirable in the low-frequency, low-power domain. This global interconnect supports parallelization and reduces processor idle time, which are critical to energy-efficient implementations of high-bandwidth signal processing. Overall, Synchroscalar provides programmability while achieving power efficiencies within 8-30X of known ASIC implementations, which is 10-60X better than conventional DSPs. In addition, frequency-voltage scaling in Synchroscalar provides between 3-32% power savings in our application suite.
Application-Directed Voltage Scaling
2002
Cited by 14 (0 self)
Clock (and voltage) scheduling is an important technique to reduce the energy consumption of processors that support voltage scaling. It is difficult, however, to achieve good results using only statistics from the OS level when applications show bursty (unpredictable) behavior. We take the approach that such applications must be made power-aware and specify their Average Execution Time (AET) and the deadline to the scheduler controlling the clock speed and processor voltage. This paper describes our Energy Priority Scheduling (EPS) algorithm supporting power-aware applications. EPS orders tasks according to how tight their deadlines are and how often tasks overlap. Low-priority tasks are scheduled first, since they can be easily preempted to accommodate high-priority tasks later. The EPS algorithm does not always yield the optimal schedule, but has a low complexity. We have implemented EPS on a StrongARM-based variable-voltage platform. We conducted experiments with a modified video decoder that estimates the AET of each frame. Measurements show that application-directed voltage scaling reduces processor power consumption by 50% for the bursty video decoder without missing any frame deadlines.
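The idea of an application handing its AET and deadline to the clock scheduler can be sketched as follows. This is not the EPS algorithm, whose task ordering the abstract only outlines. It is a single-task illustration that picks the slowest clock level able to fit the estimated work before the deadline. The function name, the cycle counts, and the frequency levels are hypothetical.

```python
def pick_frequency(aet_cycles, deadline_s, levels_hz):
    """Return the slowest clock level that fits `aet_cycles` into `deadline_s`.

    Falls back to the fastest level when no level meets the deadline.
    """
    for f in sorted(levels_hz):
        if aet_cycles / f <= deadline_s:
            return f
    return max(levels_hz)


# Hypothetical platform with four clock levels and a 25 frames/s decoder.
levels = [50e6, 100e6, 150e6, 200e6]
frame_deadline = 0.04  # seconds per frame

for cycles in (1e6, 3e6, 10e6):
    f = pick_frequency(cycles, frame_deadline, levels)
    print(f"{cycles:.0e} cycles -> {f / 1e6:.0f} MHz")
```

Because the estimate comes from the application for each frame, light frames can run at the lowest level while heavy frames raise the clock only as far as needed.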
Energy consumption and garbage collection in low-powered computing
2002
Cited by 8 (3 self)
We have measured the energy efficiency of different memory management strategies on a high performance pocket computer. We conducted our study by measuring the energy consumption of eight C programs with four different memory management strategies each. The memory management strategies are: no deallocation, explicit deallocation, conservative mark-and-sweep garbage collection, and conservative mark-and-sweep incremental garbage collection. Our measurements show that different memory management strategies have very different energy requirements. In the most extreme case, one program consumed 40 times as much energy with incremental garbage collection as with explicit deallocation. We demonstrate that, although overall energy use is strongly correlated with execution time, the processor and peripheral energies separately do not correlate well with execution time.
Synthesis Of Power Efficient Systems-On-Silicon
In Asia and South Pacific Design Automation Conference, 1998
Cited by 6 (2 self)
We developed a new modular synthesis approach for the design of low-power core-based data-intensive application-specific systems on silicon. The power optimization is conducted in three steps: minimization of instruction cache misses, placement of frequently executed sequential basic blocks of code in consecutive Gray code addressed memory locations, and processor and cache application-driven selection for low power. In order to bridge the gap between the profiling and modeling tools from the two traditionally disjoint synthesis domains (architecture and CAD), we developed a new synthesis and evaluation platform. The platform integrates the existing modeling, profiling, and simulation tools with the developed system-level synthesis tools. The effectiveness of the approach is demonstrated on a variety of modern industrial-strength multimedia and communication applications.
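The Gray code addressing step mentioned above rests on a well-known property: consecutive Gray code values differ in exactly one bit, so stepping through sequential locations toggles fewer address lines than plain binary counting. The short Python sketch below only demonstrates that property. It is not the paper's placement procedure, and the helper names are hypothetical.

```python
def to_gray(n):
    """Convert a non-negative integer to its reflected binary Gray code."""
    return n ^ (n >> 1)


def bit_flips(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


# Count address-line toggles when stepping through addresses 0..7 in order.
steps = range(7)
binary_flips = sum(bit_flips(a, a + 1) for a in steps)
gray_flips = sum(bit_flips(to_gray(a), to_gray(a + 1)) for a in steps)

print(f"binary: {binary_flips} toggles, Gray: {gray_flips} toggles")
```

For this eight-address walk, binary counting toggles 11 address bits in total while Gray code toggles 7, one per step.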

