The predictability of data values
- In Proceedings of the 30th International Symposium on Microarchitecture
, 1997
Phase Tracking and Prediction
, 2003
Cited by 233 (19 self)
In a single second a modern processor can execute billions of instructions. Obtaining a bird's eye view of the behavior of a program at these speeds can be a difficult task when all that is available is cycle by cycle examination. In many programs, behavior is anything but steady state, and understanding the patterns of behavior, at run-time, can unlock a multitude of optimization opportunities.
Trading Conflict and Capacity Aliasing in Conditional Branch Predictors
- In Proceedings of the 24th International Symposium on Computer Architecture
, 1997
Cited by 95 (8 self)
As modern microprocessors employ deeper pipelines and issue multiple instructions per cycle, they are becoming increasingly dependent on accurate branch prediction. Because hardware resources for branch-predictor tables are invariably limited, it is not possible to hold all relevant branch history for all active branches at the same time, especially for large workloads consisting of multiple processes and operating-system code. The problem that results, commonly referred to as aliasing in the branch-predictor tables, is in many ways similar to the misses that occur in finite-sized hardware caches. In this paper we propose a new classification for branch aliasing based on the three-Cs model for caches, and show that conflict aliasing is a significant source of mispredictions. Unfortunately, the obvious method for removing conflicts -- adding tags and associativity to the predictor tables -- is not a cost-effective solution. To address this problem, we propose the skewed branch predict...
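To make the notion of conflict aliasing concrete, here is a minimal sketch. It is not the skewed predictor the abstract proposes; it assumes a plain table of 2-bit saturating counters indexed by the branch address modulo the table size. The class name, function name, and branch addresses are hypothetical.

```python
from typing import Iterable, List, Tuple


class BimodalPredictor:
    """Untagged table of 2-bit saturating counters (illustrative assumption)."""

    def __init__(self, entries: int) -> None:
        self.entries = entries
        # Every counter starts at 1, i.e. "weakly not taken".
        self.counters: List[int] = [1] * entries

    def _index(self, pc: int) -> int:
        # No tags: two branches with the same index share one counter.
        return pc % self.entries

    def predict(self, pc: int) -> bool:
        return self.counters[self._index(pc)] >= 2

    def update(self, pc: int, taken: bool) -> None:
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def mispredictions(entries: int, stream: Iterable[Tuple[int, bool]]) -> int:
    """Count wrong predictions over a stream of (pc, taken) pairs."""
    predictor = BimodalPredictor(entries)
    misses = 0
    for pc, taken in stream:
        if predictor.predict(pc) != taken:
            misses += 1
        predictor.update(pc, taken)
    return misses


# Hypothetical workload: one always-taken branch interleaved with one
# never-taken branch. With 16 entries both addresses map to index 0.
STREAM = [(0x10, True), (0x20, False)] * 50
```

In this sketch the two branches destroy each other's state when they share an entry, and are predicted almost perfectly once the table is large enough to separate them. Comparing `mispredictions(16, STREAM)` with `mispredictions(64, STREAM)` shows the difference.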
The Case for Efficient File Access Pattern Modeling
, 1996
Cited by 63 (12 self)
Most modern I/O systems treat each file access independently. However, events in a computer system are driven by programs. Thus, accesses to files occur in consistent patterns and are by no means independent. The result is that modern I/O systems ignore useful information. Using traces of file system activity we show that file accesses are strongly correlated with preceding accesses. In fact, a simple last-successor model (one that predicts each file access will be followed by the same file that followed the last time it was accessed) successfully predicted the next file 72% of the time. We examine the ability of two previously proposed models for file access prediction in comparison to this baseline model and see a stark contrast in accuracy and high overheads in state space. We then enhance one of these models to address the issues of model space requirements. This new model is able to improve an additional 10% on the accuracy of the last-successor model, while working within a state...
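The last-successor model described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the class name, the `accuracy` helper, and the choice to score only accesses for which a prediction exists are assumptions.

```python
from typing import Dict, Iterable, Optional


class LastSuccessorPredictor:
    """Predict that a file is followed by whatever followed it last time."""

    def __init__(self) -> None:
        self.successor: Dict[str, str] = {}
        self.previous: Optional[str] = None

    def predict(self) -> Optional[str]:
        """Return the predicted next file, or None if nothing is known yet."""
        if self.previous is None:
            return None
        return self.successor.get(self.previous)

    def observe(self, name: str) -> None:
        """Record the access and remember it as the successor of the previous one."""
        if self.previous is not None:
            self.successor[self.previous] = name
        self.previous = name


def accuracy(trace: Iterable[str]) -> float:
    """Fraction of correct predictions among accesses that had a prediction."""
    predictor = LastSuccessorPredictor()
    hits = 0
    total = 0
    for name in trace:
        guess = predictor.predict()
        if guess is not None:
            total += 1
            if guess == name:
                hits += 1
        predictor.observe(name)
    return hits / total if total else 0.0
```

The model keeps one entry per file, which is why it serves as a low-cost baseline. A strictly repeating trace is predicted perfectly after the first pass, while a file whose successor alternates is always predicted wrongly.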
Accurate Indirect Branch Prediction
- In Proceedings of the 25th Annual International Symposium on Computer Architecture
, 1998
Cited by 60 (0 self)
Indirect branch prediction is likely to become increasingly important in the future because indirect branches occur more frequently in object-oriented programs. With misprediction rates of around 25% on current processors, indirect branches can incur a significant fraction of branch misprediction overhead even though they remain less frequent than the more predictable conditional branches. We investigate a wide range of two-level predictors dedicated exclusively to indirect branches. Starting with predictors that use full-precision addresses and unlimited tables, we progressively introduce hardware constraints and minimize the loss of predictor performance at each step. For programs from the SPECint95 suite as well as a suite of large C++ applications, a two-level predictor achieves a misprediction rate of 9.8% with a 1K-entry table and 7.3% with an 8K-entry table, representing more than a threefold improvement over an ideal BTB. A hybrid predictor further reduces the misprediction rates to 8.98% and 5.95%, respectively.
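As a rough illustration of the starting point mentioned above (full-precision addresses and unlimited tables), the sketch below keys a target table on the branch address plus the most recent targets. The class name, the global path history, and the use of a Python dictionary as an unbounded table are assumptions, not the paper's design. With `path_length=0` it degenerates into a BTB-like predictor that remembers only the last target per branch.

```python
from collections import deque
from typing import Deque, Dict, Iterable, Optional, Tuple

Key = Tuple[int, Tuple[int, ...]]


class TwoLevelIndirectPredictor:
    """Target table indexed by (branch address, recent target history)."""

    def __init__(self, path_length: int) -> None:
        # First level: the last `path_length` indirect-branch targets.
        self.history: Deque[int] = deque(maxlen=path_length)
        # Second level: unbounded table from (pc, history) to a target.
        self.table: Dict[Key, int] = {}

    def _key(self, pc: int) -> Key:
        return (pc, tuple(self.history))

    def predict(self, pc: int) -> Optional[int]:
        return self.table.get(self._key(pc))

    def update(self, pc: int, target: int) -> None:
        # Train the entry for the current history, then shift the history.
        self.table[self._key(pc)] = target
        self.history.append(target)


def count_misses(path_length: int, stream: Iterable[Tuple[int, int]]) -> int:
    """Count wrong or missing predictions over (pc, target) pairs."""
    predictor = TwoLevelIndirectPredictor(path_length)
    misses = 0
    for pc, target in stream:
        if predictor.predict(pc) != target:
            misses += 1
        predictor.update(pc, target)
    return misses


# Hypothetical branch at 0x400 whose target alternates between two callees.
ALTERNATING = [(0x400, 0xA000), (0x400, 0xB000)] * 20
```

On the alternating stream a last-target predictor is wrong every time, whereas one target of history is enough to learn the pattern after a short warm-up. Real designs have to compress the key into a limited number of table index bits, which is the trade-off the abstract explores.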
Implementations of Context Based Value Predictors
, 1997
Cited by 59 (3 self)
Execution paradigms that eliminate data dependences based on value prediction have been shown to have significant performance potential. High accuracy value prediction is essential for the success of such paradigms. Recently it was shown that context based prediction can predict values with high accuracy.
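One common reading of context-based value prediction is a two-table scheme: a per-instruction history of recent values selects an entry in a second table that holds the value seen after that history. The sketch below follows that reading under stated assumptions: unbounded dictionaries stand in for hardware tables, and all names are hypothetical rather than taken from the paper.

```python
from typing import Dict, Iterable, Optional, Tuple

Context = Tuple[int, ...]


class ContextValuePredictor:
    """Order-k context predictor with unbounded tables (illustrative only)."""

    def __init__(self, order: int) -> None:
        if order < 1:
            raise ValueError("order must be at least 1")
        self.order = order
        # Level 1: the last `order` values produced by each instruction.
        self.history: Dict[int, Context] = {}
        # Level 2: the value that last followed a given (pc, context).
        self.next_value: Dict[Tuple[int, Context], int] = {}

    def predict(self, pc: int) -> Optional[int]:
        context = self.history.get(pc, ())
        return self.next_value.get((pc, context))

    def update(self, pc: int, value: int) -> None:
        context = self.history.get(pc, ())
        self.next_value[(pc, context)] = value
        # Keep only the most recent `order` values as the new context.
        self.history[pc] = (context + (value,))[-self.order:]


def count_hits(order: int, pc: int, values: Iterable[int]) -> int:
    """Count correct predictions for one instruction's value stream."""
    predictor = ContextValuePredictor(order)
    hits = 0
    for value in values:
        if predictor.predict(pc) == value:
            hits += 1
        predictor.update(pc, value)
    return hits
```

A repeating but non-constant sequence such as 1, 2, 3, 1, 2, 3 defeats a last-value predictor, yet an order-2 context predictor learns it once every context has been seen.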
Transition phase classification and prediction
- In 11th International Symposium on High Performance Computer Architecture
, 2005
Cited by 50 (7 self)
Most programs are repetitive: similar behavior can be seen at different execution times. Proposed on-line systems automatically group these similar intervals of execution into phases, where the intervals in a phase have homogeneous behavior and similar resource requirements. These systems are driven by algorithms that dynamically classify intervals of execution into phases and predict phase changes. In this paper, we examine several improvements to dynamic phase classification and prediction. The first improvement is to appropriately deal with phase transitions. This modification identifies phase transitions for what they are, instead of classifying them into a new phase, which increases phase prediction accuracy. We also describe an adaptive system that dynamically adjusts classification thresholds and splits phases with poor homogeneity. This modification increases the homogeneity of the hardware metrics across the intervals in each phase. We improve phase prediction accuracy by applying confidence to phase prediction, and we develop architectures that can accurately predict the outcome of the next phase change and the length of the next phase.
Memory Dependence Prediction
, 1998
Cited by 39 (5 self)
As the existing techniques that empower modern high-performance processors are being refined and as the underlying technology trade-offs change, new bottlenecks are exposed and new challenges are raised. This thesis introduces a new tool, memory dependence prediction, that can be useful in combating these bottlenecks and meeting the new challenges. Memory dependence prediction is a technique to guess whether a load or a store will experience a dependence. Memory dependence prediction exploits regularity in the memory dependence stream of ordinary programs, a phenomenon which is also identified in this thesis. To demonstrate the utility of memory dependence prediction, this thesis also presents the following three novel microarchitectural techniques: 1. Dynamic Speculation/Synchronization of Memory Dependences: this thesis demonstrates that, to exploit parallelism over larger regions of code, waiting to determine the dependences a load has is not the best-performing option. Higher performance is possible if memory dependence speculation is used, especially if memory dependence prediction is used to guide this speculation.
Improving Branch Predictors by Correlating on Data Values
, 1999
Cited by 35 (0 self)
Branch predictors typically use combinations of branch PC bits and branch histories to make predictions. Recent improvements in branch predictors have come from reducing the effect of interference, i.e. multiple branches mapping to the same table entries. In contrast, the branch difference predictor (BDP) uses data values as additional information to improve the accuracy of conditional branch predictors. The BDP maintains a history of differences between branch source register operands, and feeds these into the prediction process. An important component of the BDP is a rare event predictor (REP) which reduces learning time and table interference. An REP is a cache-like structure designed to store patterns whose predictions differ from the norm. Initially, ideal interference-free predictors are evaluated to determine how data values improve correlation. Next, execution driven simulations of complete designs realize this potential. The BDP reduces the misprediction rate of five SPEC95 ...
Performance prediction based on inherent program similarity
- In PACT
, 2006
Cited by 35 (6 self)
A key challenge in benchmarking is to predict the performance of an application of interest on a number of platforms in order to determine which platform yields the best performance. This paper proposes an approach for doing this. We measure a number of microarchitecture-independent characteristics from the application of interest, and relate these characteristics to the characteristics of the programs from a previously profiled benchmark suite. Based on the similarity of the application of interest with programs in the benchmark suite, we make a performance prediction of the application of interest. We propose and evaluate three approaches (normalization, principal components analysis and genetic algorithm) to transform the raw data set of microarchitecture-independent characteristics into a benchmark space in which the relative distance is a measure for the relative performance differences. We evaluate our approach using all of the SPEC CPU2000 benchmarks and real hardware performance numbers from the SPEC website. Our framework estimates per-benchmark machine ranks with a 0.89 average and a 0.80 worst case rank correlation coefficient.
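The abstract reports rank correlation coefficients without naming the statistic. As an assumption, the sketch below uses Spearman's coefficient for untied data, computed as 1 - 6 * sum(d^2) / (n * (n^2 - 1)), where d is the difference between a machine's predicted and measured rank. The function names and the sample scores in the usage note are hypothetical.

```python
from typing import List, Sequence


def ranks(scores: Sequence[float]) -> List[int]:
    """Return 1-based ranks, rank 1 for the highest score. Assumes no ties."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    result = [0] * len(scores)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(predicted: Sequence[float], measured: Sequence[float]) -> float:
    """Spearman rank correlation between two untied score lists."""
    if len(predicted) != len(measured):
        raise ValueError("score lists must have the same length")
    n = len(predicted)
    if n < 2:
        raise ValueError("need at least two machines to compare rankings")
    squared_diffs = sum(
        (a - b) ** 2 for a, b in zip(ranks(predicted), ranks(measured))
    )
    return 1 - 6 * squared_diffs / (n * (n * n - 1))
```

For example, `spearman([10, 8, 6, 4, 2], [9, 10, 5, 3, 1])` compares a predicted ordering of five machines with a measured ordering in which only the top two are swapped. A value of 1 means the two rankings are identical and -1 means one is the exact reverse of the other.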