Abstract:
To sustain instruction throughput rates in more aggressively clocked microarchitectures, microarchitects have incorporated larger and more complex branch predictors into their designs, taking advantage of the increasing numbers of transistors available on a chip. Unfortunately, because of penalties associated with their implementations, the extra accuracy provided by many branch predictors does not produce a proportionate increase in performance. Specifically, we show that the techniques used to hide the latency of a large and complex branch predictor do not scale well and will be unable to sustain IPC for deeper pipelines.