Abstract:
Data dependencies have become one of the main bottlenecks of current superscalar processors. Data speculation is gaining popularity as a mechanism to avoid the ordering imposed by data dependencies. Loads and stores are very good candidates for data speculation because their effective addresses behave regularly and are therefore highly predictable. In this paper we propose a mechanism called Address Prediction and Data Prefetching that allows load instructions to obtain their data at the decode stage. In addition, the effective addresses of load and store instructions are predicted. These instructions and those dependent on them are speculatively executed. The technique has been evaluated for an out-of-order processor with a realistic configuration. The performance gain is about 19% on average and is much higher for some benchmarks (up to 35%).

1 Introduction
Dependencies among the instructions that constitute a program are becoming one of the most important bottlenecks for curren...
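
The abstract's claim that the effective addresses of loads and stores are regular enough to predict can be illustrated with a small sketch. The abstract does not say which prediction scheme the paper uses, so the following assumes a simple stride-based table indexed by instruction address, with a saturating confidence counter. The class name, table size, threshold, and indexing function are all hypothetical choices made for illustration, not the authors' design.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """One predictor entry (hypothetical layout, not taken from the paper)."""
    last_addr: int = 0    # last effective address seen for this instruction
    stride: int = 0       # difference between the last two addresses
    confidence: int = 0   # saturating counter in the range 0..3


class StrideAddressPredictor:
    """Assumed stride-based address predictor, indexed by instruction PC."""

    def __init__(self, size: int = 1024, threshold: int = 2) -> None:
        self.size = size
        self.threshold = threshold
        self.table: dict[int, Entry] = {}

    def _index(self, pc: int) -> int:
        # Drop the low two bits (assumes 4-byte instructions) and wrap.
        # Entries are untagged, so distinct PCs may alias.
        return (pc >> 2) % self.size

    def predict(self, pc: int):
        """Return a predicted address, or None when confidence is too low."""
        entry = self.table.get(self._index(pc))
        if entry is None or entry.confidence < self.threshold:
            return None
        return entry.last_addr + entry.stride

    def update(self, pc: int, addr: int) -> None:
        """Train the entry with the address actually computed at execute."""
        idx = self._index(pc)
        entry = self.table.get(idx)
        if entry is None:
            self.table[idx] = Entry(last_addr=addr)
            return
        new_stride = addr - entry.last_addr
        if new_stride == entry.stride:
            entry.confidence = min(entry.confidence + 1, 3)
        else:
            entry.confidence = 0
            entry.stride = new_stride
        entry.last_addr = addr


predictor = StrideAddressPredictor()
load_pc = 0x400
for addr in (1000, 1008, 1016, 1024):   # a load walking an array of 8-byte items
    predictor.update(load_pc, addr)
predicted = predictor.predict(load_pc)
```

In a design like the one the abstract describes, a confident prediction would let the data be fetched early and dependent instructions issue speculatively, with recovery needed whenever the address computed later disagrees. This sketch models only the prediction table, not prefetching or recovery.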