Results 1 - 5 of 5
Analysis of Branch Prediction via Data Compression
- in Proceedings of the 7th International Conference on Architectural Support for Programming Languages and Operating Systems
, 1996
Cited by 79 (3 self)
Branch prediction is an important mechanism in modern microprocessor design. The focus of research in this area has been on designing new branch prediction schemes. In contrast, very few studies address the theoretical basis behind these prediction schemes. Knowing this theoretical basis helps us to evaluate how good a prediction scheme is and how much we can expect to improve its accuracy.
A System Level Perspective on Branch Architecture Performance
- THIS PAPER APPEARED IN THE 28TH INTL. SYMP. ON MICROARCHITECTURE
Cited by 18 (0 self)
Accurate instruction fetch and branch prediction is increasingly important on today’s wide-issue architectures. Fetch prediction is the process of determining the next instruction to request from the memory subsystem. Branch prediction is the process of predicting the likely outcome of branch instructions. Many branch and fetch prediction architectures have been proposed, from simple static techniques to more sophisticated hardware designs. All these previous studies compare differing branch prediction architectures in terms of misprediction rates, branch penalties, or an idealized cycles per instruction. This paper provides a system-level performance comparison of several branch architectures using a full pipeline-level architectural simulator. The performance of various branch architectures is reported using execution time and cycles per instruction. For the programs we measured, our simulations show that having no branch prediction increases the execution time by 27%. By comparison, a highly accurate 512-entry branch target buffer architecture increases execution time by 1.5% when compared to an architecture with perfect branch prediction. We also show that the most commonly used branch performance metrics, branch misprediction rates and the branch execution penalty, are highly correlated with program performance and are suitable metrics for architectural studies.
Limits to Branch Prediction
- In Proceedings of the Seventh International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS VII)
, 1996
Cited by 11 (1 self)
Branch prediction is an important mechanism in modern microprocessor design. The focus of research in this area has been on designing new branch prediction schemes. In contrast, very few studies address the inherent limit of predictability of programs themselves. Programs have an inherent limit of predictability due to the randomness of input data. Knowing the limit helps us to evaluate how good a prediction scheme is and how much we can expect to improve its accuracy. In this paper we propose two complementary approaches to estimating the limits of predictability: exact analysis of the program and the use of a universal compression/prediction algorithm, prediction by partial matching (PPM), that has been very successful in the field of data and image compression. We review the algorithmic basis for both PPM and some common branch predictors and show that two-level branch prediction, the best method currently in use, is a simplified version of PPM. To illustrate exact analysis, we use ...
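The idea of predicting a branch with PPM can be illustrated with a small sketch. This is not the paper's algorithm: the class name `PPMStylePredictor`, the helper `run`, the default order of 3, and the tie-based fallback rule are all assumptions made for illustration. The sketch keeps taken/not-taken counts for every context length from `max_order` down to 0 and predicts from the longest context that shows a clear majority, falling back to shorter contexts otherwise. Real PPM uses escape probabilities rather than this simple fallback.

```python
from collections import defaultdict


class PPMStylePredictor:
    """Illustrative PPM-like predictor over one stream of branch outcomes."""

    def __init__(self, max_order=3):
        self.max_order = max_order
        # counts[k][context] -> [not_taken_count, taken_count]
        self.counts = [defaultdict(lambda: [0, 0]) for _ in range(max_order + 1)]
        self.history = ()  # most recent outcomes, oldest first

    def _contexts(self):
        # Yield (order, context) pairs from the longest usable context to order 0.
        longest = min(self.max_order, len(self.history))
        for k in range(longest, -1, -1):
            yield k, self.history[len(self.history) - k:]

    def predict(self):
        for k, ctx in self._contexts():
            entry = self.counts[k].get(ctx)
            # Fall back to a shorter context if this one is unseen or tied.
            if entry is not None and entry[0] != entry[1]:
                return entry[1] > entry[0]
        return True  # assumed default: predict taken

    def update(self, taken):
        taken = bool(taken)
        for k, ctx in self._contexts():
            self.counts[k][ctx][int(taken)] += 1
        if self.max_order > 0:
            self.history = (self.history + (taken,))[-self.max_order:]


def run(predictor, outcomes):
    """Predict each outcome before revealing it; return per-branch correctness."""
    correct = []
    for taken in outcomes:
        correct.append(predictor.predict() == bool(taken))
        predictor.update(taken)
    return correct
```

Using only a single fixed context length, with no fallback to shorter ones, gives a scheme close in spirit to a two-level predictor with that many history bits, which is one way to read the abstract's remark that two-level prediction is a simplified version of PPM.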
Tagless Two-level Branch Prediction Schemes
, 1996
Cited by 3 (2 self)
Per-address two-level branch predictors have been shown to be among the best predictors and have been implemented in current microprocessors. However, as the cycle time of modern microprocessors continues to decrease, the implementation of set-associative per-address two-level branch predictors will become more difficult. In this paper, we revisit and analyze an alternative tagless, direct-mapped approach which is simpler, requires lower power, and has faster access time. The tagless predictor can also offer comparable performance to current set-associative designs since removal of tags allows more resources to be allocated for the predictor and branch target buffer (BTB). Further, removal of tags allows decoupling of the per-address predictors from the BTB, allowing the two components to be optimized individually. We show that tagless predictors are better than tagged predictors because of opportunities for better miss handling. Finally, we examine the system cost-benefit for tagless per...
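A tagless, direct-mapped per-address two-level predictor can be sketched as follows. This is a minimal illustration under stated assumptions, not the design evaluated in the paper: the class name `TaglessTwoLevelPredictor`, the table sizes, the 4-byte instruction alignment, and the single shared pattern table of 2-bit saturating counters are all hypothetical choices. The first level is a history table indexed by low-order address bits with no tag check, so two branches that map to the same entry simply share its history.

```python
class TaglessTwoLevelPredictor:
    """Illustrative tagless, direct-mapped per-address two-level predictor."""

    def __init__(self, bht_entries=512, history_bits=4):
        # Assumption: table size is a power of two so a mask can index it.
        assert bht_entries > 0 and bht_entries & (bht_entries - 1) == 0
        self.bht_mask = bht_entries - 1
        self.hist_mask = (1 << history_bits) - 1
        # First level: per-address history registers, no tags stored.
        self.bht = [0] * bht_entries
        # Second level: 2-bit saturating counters, initialised to weakly taken.
        self.pht = [2] * (1 << history_bits)

    def _index(self, pc):
        # Assumption: 4-byte aligned instructions, so drop the low two bits.
        return (pc >> 2) & self.bht_mask

    def predict(self, pc):
        history = self.bht[self._index(pc)]
        return self.pht[history] >= 2  # counter values 2 and 3 mean "taken"

    def update(self, pc, taken):
        i = self._index(pc)
        history = self.bht[i]
        counter = self.pht[history]
        self.pht[history] = min(3, counter + 1) if taken else max(0, counter - 1)
        # Shift the new outcome into this entry's history register.
        self.bht[i] = ((history << 1) | int(bool(taken))) & self.hist_mask
```

Because there is no tag comparison, a lookup never misses: every address yields a prediction, at the cost of possible interference between branches that alias to the same entry.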
Design optimization for high-speed per-address two-level branch predictors
- International Conference on Computer Design
, 1997
Per-address two-level branch predictors have been shown to be among the best predictors and have been implemented in current microprocessors. However, as the cycle time of modern microprocessors continues to decrease, the implementation of set-associative per-address two-level branch predictors will become more difficult. Instead, direct-mapped designs may be more attractive. In this paper, we investigate an alternative implementation of the per-address two-level predictor referred to as the tagless, direct-mapped predictor, which is simpler and has faster access time. The tagless predictor can offer comparable performance to current set-associative designs since removal of tags allows more resources to be allocated for the predictor and branch target buffer (BTB). Removal of tags also decouples

