Trace Cache: a Low Latency Approach to High Bandwidth Instruction Fetching (1996) [257 citations — 11 self]

Abstract:

Superscalar processors require sufficient instruction fetch bandwidth to feed their highly parallel execution cores. Fetch bandwidth is determined by a number of factors, namely instruction cache hit rate, branch prediction accuracy, and taken branches in the instruction stream. Taken branches introduce the problem of noncontiguous instruction fetching: the dynamic instruction sequence exists in the cache, but the instructions are not in contiguous cache locations. This report considers the problem of fetching noncontiguous blocks of instructions in a single cycle. We propose the trace cache, a special instruction cache that captures dynamic instruction sequences. Each line in the trace cache stores a dynamic code sequence, which may contain one or more taken branches. Dynamic sequences are built up as the program executes. If a predicted dynamic sequence exists in the trace cache, it can be fed directly to the decoders. We investigate other methods for fetching noncontiguous instructi...
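The mechanism the abstract describes (lines that hold a dynamic instruction sequence, filled as the program executes, and supplied on a fetch when the predicted path matches) can be illustrated with a small software model. This is a minimal sketch under stated assumptions, not the authors' hardware design: the names `TraceLine`, `TraceCache`, `record`, and `lookup`, the limits `max_insts` and `max_branches`, and the choice to index one trace per start address are all hypothetical and are not taken from the abstract.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class TraceLine:
    start_pc: int               # address of the first instruction in the trace
    outcomes: Tuple[bool, ...]  # taken / not-taken flag for each branch in the trace
    pcs: List[int]              # the dynamic instruction sequence, as addresses


class TraceCache:
    """Toy model; the size limits and indexing scheme are assumptions."""

    def __init__(self, max_insts: int = 16, max_branches: int = 3) -> None:
        self.max_insts = max_insts
        self.max_branches = max_branches
        self.lines: Dict[int, TraceLine] = {}  # at most one trace per start address
        self._pcs: List[int] = []              # trace currently being built
        self._outcomes: List[bool] = []

    def record(self, pc: int, is_branch: bool = False, taken: bool = False) -> None:
        """Append one executed instruction to the trace under construction."""
        self._pcs.append(pc)
        if is_branch:
            self._outcomes.append(taken)
        # Close the trace once either (assumed) limit is reached.
        if len(self._pcs) == self.max_insts or len(self._outcomes) == self.max_branches:
            self._commit()

    def _commit(self) -> None:
        line = TraceLine(self._pcs[0], tuple(self._outcomes), list(self._pcs))
        self.lines[line.start_pc] = line
        self._pcs, self._outcomes = [], []

    def lookup(self, fetch_pc: int, predicted: Tuple[bool, ...]) -> Optional[List[int]]:
        """Return the stored sequence only if the predicted branch outcomes match it."""
        line = self.lines.get(fetch_pc)
        if line is None:
            return None
        if tuple(predicted[: len(line.outcomes)]) != line.outcomes:
            return None
        return line.pcs


# Executed path: 100, 104 (branch, taken to 200), 200, 204 (branch, not taken).
tc = TraceCache(max_insts=8, max_branches=2)
tc.record(100)
tc.record(104, is_branch=True, taken=True)
tc.record(200)
tc.record(204, is_branch=True, taken=False)  # second branch closes the trace

hit = tc.lookup(100, (True, False))    # predicted path matches the stored trace
miss = tc.lookup(100, (False, False))  # predicted path differs, so no trace is supplied
```

In this model a hit returns the noncontiguous addresses 100, 104, 200, 204 in one lookup, which is the single-cycle fetch across a taken branch that the abstract targets. On a miss, a real fetch unit would fall back to the conventional instruction cache; that path is omitted here.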

Citations

689 Improving direct-mapped cache performance by the addition of a small fully associative cache and prefetch buffers – Jouppi - 1990
374 A study of branch prediction strategies – Smith - 1981
221 Branch prediction strategies and branch target buffer – Lee, Smith - 1984
166 Improving the Accuracy of Dynamic Branch Prediction Using Branch Correlation – Pan, So, et al. - 1992
111 Optimization of instruction fetch mechanisms for high issue rates – Conte, Menezes, et al. - 1995
92 Increasing the instruction fetch rate via multiple branch prediction and a branch address cache – Yeh, Marr, et al. - 1993
84 Efficient program tracing – Larus - 1993
78 Branch History Table Prediction of Moving Target Branches Due to Subroutine Returns – Kaeli, Emma - 1991
62 Instruction fetching: Coping with code bloat – Uhlig, Nagle, et al. - 1995
62 A comprehensive instruction fetch mechanism for a processor supporting speculative execution – Yeh, Patt - 1992
47 The fill-unit approach to multiple instruction issue – Franklin, Smotherman - 1994
44 Hardware support for large atomic units in dynamically scheduled machines – Melvin, Shebanow, et al. - 1988
36 Control flow prediction with treelike subgraphs for superscalar processors – Dutta, Franklin - 1995
21 Two-level Adaptive Branch Prediction and Instruction Fetch Mechanisms for High Performance Superscalar Processors – Yeh - 1993
5 Generalized history table for branch prediction – Losq - 1982
1 Machine organization of the IBM RS/6000 processor – Grohoski - 1990