Results 1 - 10
of
42
Dynamic Branch Prediction with Perceptrons
"... This paper presents a new method for branch prediction. The key idea is to use one of the simplest possible neural networks, the perceptron, as an alternative to the commonly used two-bit counters. Our predictor achieves increased accuracy by making use of long branch histories, which are possible b ..."
Abstract
-
Cited by 123 (17 self)
- Add to MetaCart
This paper presents a new method for branch prediction. The key idea is to use one of the simplest possible neural networks, the perceptron, as an alternative to the commonly used two-bit counters. Our predictor achieves increased accuracy by making use of long branch histories, which are possible because the hardware resources for our method scale linearly with the history length. By contrast, other purely dynamic schemes require exponential resources. We describe our design and evaluate it with respect to two well known predictors. We show that for a 4K byte hardware budget our method improves misprediction rates for the SPEC 2000 benchmarks by 10.1 % over the gshare predictor. Our experiments also provide a better understanding of the situations in which traditional predictors do and do not perform well. Finally, we describe techniques that allow our complex predictor to operate in one cycle.
Data-Flow Prescheduling for Large Instruction Windows in Out-of-Order Processors
, 2001
"... The performance of out-of-order processors increases with the instruction window size. In conventional processors, the effective instruction window cannot be larger than the issue buffer. Determining which instructions from the issue buffer can be launched to the execution units is a timecritical op ..."
Abstract
-
Cited by 73 (0 self)
- Add to MetaCart
The performance of out-of-order processors increases with the instruction window size. In conventional processors, the effective instruction window cannot be larger than the issue buffer. Determining which instructions from the issue buffer can be launched to the execution units is a timecritical operation which complexity increases with the issue buffer size. We propose to relieve the issue stage by reordering instructions before they enter the issue buffer. This study introduces the general principle of data-flow prescheduling. Then we describe a possible implementation. Our preliminary results show that data-flow prescheduling makes it possible to enlarge the effective instruction window while keeping the issue buffer small. 1. Introduction Processor performance is strongly correlated with the clock cycle. Shorter clock cycle has been allowed by both improvements in silicon technology and careful processor design. As a consequence of this evolution, the IPC (average number of ins...
Neural Methods for Dynamic Branch Prediction
- ACM Transactions on Computer Systems
, 2002
"... This paper presents a new method for branch prediction that is highly accurate. The key idea is to use one of the simplest possible neural methods, the perceptron, as an alternative to the commonly used two-bit counters. The source of our predictor's accuracy is its ability to use long history lengt ..."
Abstract
-
Cited by 71 (9 self)
- Add to MetaCart
This paper presents a new method for branch prediction that is highly accurate. The key idea is to use one of the simplest possible neural methods, the perceptron, as an alternative to the commonly used two-bit counters. The source of our predictor's accuracy is its ability to use long history lengths, because the hardware resources for our method scale linearly, rather than exponentially, with the history length.
Dynamic history-length fitting: a third level of adaptivity for branch prediction
- SIGARCH Comput. Archit. News
, 1998
"... Accurate branch prediction is essential for obtaining high performance in pipelined superscalar processors that execute instructions speculatively. Some of the best current predictors combine a part of the branch address with a fixed amount of global history of branch outcomes in order to make a pre ..."
Abstract
-
Cited by 64 (1 self)
- Add to MetaCart
Accurate branch prediction is essential for obtaining high performance in pipelined superscalar processors that execute instructions speculatively. Some of the best current predictors combine a part of the branch address with a fixed amount of global history of branch outcomes in order to make a prediction. These predictors cannot perform uniformly well across all workloads because the best amount of history to be used depends on the code, the input data and the frequency of context switches. Consequently, all predictors that use a fixed history length are therefore unable to perform up to their maximum potential. We introduce a method —called DHLF — that dynamically determines the optimum history length during execution, adapting to the specific requirements of any code, input data and system workload. Our proposal adds an extra level of adaptivity to two-level adaptive branch predictors. The DHLF method can be applied to any one of the predictors that combine global branch history with the branch address. We apply the DHLF method to gshare (dhlf-gshare) and obtain near-optimal results for all SPECint95 benchmarks, with and without context switches. Some results are also presented for gskewed (dhlf-gskewed), confirming that other predictors can benefit from our proposal. 1.
Branch prediction, instruction-window size, and cache size: Performance tradeoffs and simulation techniques
- IEEE Transactions on Computers
, 1999
"... Design parameters interact in complex ways in modern processors, especially because out-of-order issue and decoupling buffers allow latencies to be overlapped. Tradeoffs among instruction-window size, branch-prediction accuracy, and instruction- and datacache size can change as these parameters move ..."
Abstract
-
Cited by 57 (18 self)
- Add to MetaCart
Design parameters interact in complex ways in modern processors, especially because out-of-order issue and decoupling buffers allow latencies to be overlapped. Tradeoffs among instruction-window size, branch-prediction accuracy, and instruction- and datacache size can change as these parameters move through different domains. For example, modeling unrealistic caches can under- or over-state the benefits of better prediction or a larger instruction window. Avoiding such pitfalls requires understanding how all these parameters interact. Because such methodological mistakes are common, this paper provides a comprehensive set of SimpleScalar simulation results from SPECint95 programs, showing the interactions among these major structures. In addition to presenting this database of simulation results, major mechanisms driving the observed tradeoffs are described. The paper also considers appropriate simulation techniques when sampling full-length runs with the SPEC reference inputs. In particular, the results show that branch mispredictions limit the benefits of larger instruction windows, that better branch prediction and better instruction cache behavior have synergistic effects, and that the benefits of larger instruction windows and larger data caches trade off and have overlapping effects. In addition, simulations of only 50 million instructions can yield representative results if these short windows are carefully selected.
An analysis of correlation and predictability: What makes two-level branch predictors work
- In Proceedings of the 25th Annual International Symposium on Computer Architecture
, 1998
"... Pipeline flushes due to branch mispredictions is one of the most serious problems facing the designer of a deeply pipelined, superscalar processor. Many branch predictors have been proposed to help alleviate this problem, including two-level adaptive branch predictors and hybrid branch predictors. N ..."
Abstract
-
Cited by 52 (4 self)
- Add to MetaCart
Pipeline flushes due to branch mispredictions is one of the most serious problems facing the designer of a deeply pipelined, superscalar processor. Many branch predictors have been proposed to help alleviate this problem, including two-level adaptive branch predictors and hybrid branch predictors. Numerous studies have shown which predictors and configurations best predict the branches in a given set of benchmarks. Some studies have also investigated effects, such as pattern history table interference, that can be detrimental to the performance of these predictors. However, little research has been done on which characteristics of branch behavior make predictors perform well. In this paper, we investigate and quantify reasons why branches are predictable. We show that some of this predictability is not captured by the two-level adaptive branch predictors. An understanding of the predictability of branches may lead to insights ultimately resulting in better or less complex predictors. We also investigate and quantify what fraction of the branches in each benchmark is predictable using each of the methods described in this paper. 1.
Dataflow Analysis of Branch Mispredictions and Its Application to Early Resolution of Branch Outcomes
, 1998
"... The goal of this study is twofold: to analyze in detail the nature of conditional branch mispredictions in correlationbased branch predictors, and, based on this analysis, to reduce the impact of branch mispredictions on processor performance by decreasing the branch resolution delay instead of impr ..."
Abstract
-
Cited by 42 (1 self)
- Add to MetaCart
The goal of this study is twofold: to analyze in detail the nature of conditional branch mispredictions in correlationbased branch predictors, and, based on this analysis, to reduce the impact of branch mispredictions on processor performance by decreasing the branch resolution delay instead of improving the branch prediction accuracy. We classify conditional branches with the highest number of mispredictions according to the nature of their branch condition analytical expression. Based on these expressions, we can analyze and even precisely explain the origin of mispredictions in many cases. Moreover, we find that many such branches belong to small sets of blocks inside loops, and within such sets we find that some of the branch expressions have regularity properties. We show how to exploit this regularity property by anticipating the branch outcome, where anticipation is a combination of value prediction and normal dataflow execution. We investigate a hardware mechanism to implement...
Variable Length Path Branch Prediction
- In Proceedings of the 8th International Conference on Architectural Support for Programming Languages and Operating Systems
, 1998
"... ing with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or permissions@acm.org. Variable Length Path Branch Predictio ..."
Abstract
-
Cited by 41 (2 self)
- Add to MetaCart
ing with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or permissions@acm.org. Variable Length Path Branch Prediction Jared Stark Marius Evers Yale N. Patt Department of Electrical Engineering and Computer Science The University of Michigan Ann Arbor, Michigan 48109-2122 fstarkj,olaf,pattg@eecs.umich.edu Abstract Accurate branch prediction is required to achieve high performance in deeply pipelined, wide-issue processors. Recent studies have shown that conditional and indirect (or computed) branch targets can be accurately predicted by recording the path, which consists of the target addresses of recent branches, leading up to the branch. In current path based branch predictors, the N most recent target addresses are hashed together to form an index into a table, where N is some fixed integer. The inde...
Design tradeoffs for the alpha EV8 conditional branch predictor
- in 29th Annual International Symposium on Computer Architecture
, 2002
"... This paper presents the Alpha EV8 conditional branch predictor. The Alpha EV8 microprocessor project, canceled in June 2001 in a late phase of development, envisioned an aggressive 8-wide issue out-of-order superscalar microarchitecture featuring a very deep pipeline and simultaneous multithreading. ..."
Abstract
-
Cited by 39 (1 self)
- Add to MetaCart
This paper presents the Alpha EV8 conditional branch predictor. The Alpha EV8 microprocessor project, canceled in June 2001 in a late phase of development, envisioned an aggressive 8-wide issue out-of-order superscalar microarchitecture featuring a very deep pipeline and simultaneous multithreading. Performance of such a processor is highly dependent on the accuracy of its branch predictor and consequently a very large silicon area was devoted to branch prediction on EV8. The Alpha EV8 branch predictor relies on global history and features a total of 352 Kbits. The focus of this paper is on the different trade-offs performed to overcome various implementation constraints for the EV8 branch predictor. One such instance is the pipelining of the predictor on two cycles to facilitate the prediction of up to 16 branches per cycle from any two dynamically successive, 8 instruction fetch blocks. This resulted in the use of three fetch-block old compressed branch history information for accesing the predictor. Implementation constraints also restricted the composition of the index functions for the predictor and forced the usage of only singleported memory cells. Nevertheless, we show that the Alpha EV8 branch predictor achieves prediction accuracy in the same range as the state-of-the-art academic global history branch predictors that do not consider implementation constraints in great detail. 1
Control-Flow Speculation through Value Prediction for Superscalar Processors
, 1998
"... In this paper, we introduce a new branch predictor that predicts the outcomes of branches by predicting the value of their inputs and performing an early computation of their results according to the predicted values. The design of a hybrid predictor comprising our branch predictor and a correlating ..."
Abstract
-
Cited by 26 (3 self)
- Add to MetaCart
In this paper, we introduce a new branch predictor that predicts the outcomes of branches by predicting the value of their inputs and performing an early computation of their results according to the predicted values. The design of a hybrid predictor comprising our branch predictor and a correlating branch predictor is presented. We also propose a new selector that chooses the most reliable prediction for each branch. This selector is based on the path followed to reach the branch. Results for immediate updates show that for a processor that already has a value prediction unit, our hybrid predictor, with a size of 4KB, achieves the same miss ratio as a conventional hybrid predictor of 64KB. The reduction in misprediction penalty is about 40% for all predictor sizes. Furthermore, if the cost of the value predictor is considered as an additional cost of the hybrid predictor, our proposal still reduces the miss ratio with respect to a conventional hybrid predictor for all different predic...

