Searching for authors named "Andreas Moshovos" – sorted by Relevance.
-
Memory Dependence Speculation Tradeoffs
- We consider a variety of dynamic, hardware-based methods for exploiting load/store parallelism, including mechanisms that use memory dependence speculation. While previous work has also investigated such methods [19,4], this has been done primarily for split, distributed window processor models. We
- Cited by 5 (1 self)
-
Exploiting Coarse Grain Non-Shared Regions in Snoopy Coherent Multiprocessors
- It has been shown that many requests miss in all remote nodes in shared memory multiprocessors. We are motivated by the observation that this behavior extends to much coarser grain areas of memory. We define a region to be a continuous, aligned memory area whose size is a power of two and observe th
- Cited by 2 (1 self)
-
Power-Aware Register Renaming
- We propose power optimizations for the register renaming unit. Our optimizations reduce power dissipation in two ways. First, they reduce the number of read and write ports that are needed at the register alias table. Second, they reduce the number of internal checkpoints that are required to allow
- Cited by 3 (0 self)
-
Speculative memory cloaking and bypassing
- We revisit memory hierarchy design viewing memory as an inter-operation communication mechanism. We show how dynamically collected information about inter-operation memory communication can be used to improve memory latency. We propose two techniques: (1) Speculative Memory Cloaking, and (2) Specula
- Cited by 13 (2 self)
-
RegionScout: Exploiting Coarse Grain Sharing in Snoop-Based Coherence
- It has been shown that many requests miss in all remote nodes in shared memory multiprocessors. We are motivated by the observation that this behavior extends to much coarser grain areas of memory. We define a region to be a continuous, aligned memory area whose size is a power of two and observe th
- Cited by 25 (1 self)
-
Dynamic Speculation and Synchronization of Data Dependencies
- Data dependence speculation is used in instruction-level parallel (ILP) processors to allow early execution of an instruction before a logically preceding instruction on which it may be data dependent. If the instruction is independent, data dependence speculation succeeds; if not, it fails, and the
- Cited by 164 (21 self)
-
Microarchitectural Innovations: Boosting Microprocessor Performance beyond Semiconductor Technology Scaling
- plentiful transistors to build microprocessors, and applications continue to drive the demand for more powerful microprocessors. Weaving the “raw” semiconductor material into a microprocessor that offers the performance needed by modern and future applications is the role of computer architecture.
- Cited by 3 (1 self)
-
BranchTap: Improving performance with very few checkpoints through adaptive speculation control
- Checkpoint prediction and intelligent management have been recently proposed for reducing the number of coarse-grain checkpoints needed to achieve high performance through speculative execution. In this work, we take a closer look at various checkpoint prediction and management alternatives, compari
- Cited by 3 (0 self)
-
ReCast: Boosting Tag Line Buffer Coverage in Low-Power High-Level Caches 'for Free'
- We revisit the idea of using small line buffers in front of caches. We propose ReCast, a tiny tag set cache that filters a significant number of tag probes to the L2 tag array, thus reducing power. The key contribution in ReCast is S-Shift, a simple indexing function (no logic involved, just wires) th
- Cited by 2 (0 self)
-
L-CBF: a low-power, fast counting Bloom filter architecture
- We study the energy, latency and area characteristics of two Counting Bloom Filter implementations using a commercial 0.13µm technology and full custom layouts. The first implementation, S-CBF, uses an SRAM array of counts and a shared counter. The second, L-CBF, utilizes an array of up/down linear
- Cited by 2 (0 self)

