## Performance prediction based on inherent program similarity (2006)

Venue: PACT

Citations: 29 (6 self)

### BibTeX

@INPROCEEDINGS{Hoste06performanceprediction,
    author = {Kenneth Hoste and Aashish Phansalkar and Lieven Eeckhout and Andy Georges and Lizy K. John and Koen De Bosschere},
    title = {Performance prediction based on inherent program similarity},
    booktitle = {In PACT},
    year = {2006},
    pages = {114--122},
    publisher = {ACM Press}
}

### Abstract

A key challenge in benchmarking is to predict the performance of an application of interest on a number of platforms in order to determine which platform yields the best performance. This paper proposes an approach for doing this. We measure a number of microarchitecture-independent characteristics from the application of interest, and relate these characteristics to the characteristics of the programs from a previously profiled benchmark suite. Based on the similarity of the application of interest with programs in the benchmark suite, we make a performance prediction of the application of interest. We propose and evaluate three approaches (normalization, principal components analysis and genetic algorithm) to transform the raw data set of microarchitecture-independent characteristics into a benchmark space in which the relative distance is a measure for the relative performance differences. We evaluate our approach using all of the SPEC CPU2000 benchmarks and real hardware performance numbers from the SPEC website. Our framework estimates per-benchmark machine ranks with a 0.89 average and a 0.80 worst case rank correlation coefficient.
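The pipeline described in the abstract can be illustrated with a minimal sketch. It covers only the normalization variant (not PCA or the genetic algorithm), and several parts are assumptions rather than the paper's method: the inverse-distance weighting over the `k` nearest benchmarks, the choice of `k`, and all function names are hypothetical. The Spearman helper assumes tie-free data.

```python
import numpy as np

def zscore(features):
    """Rescale each characteristic (column) to zero mean and unit variance."""
    features = np.asarray(features, dtype=float)
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    std[std == 0] = 1.0  # constant columns carry no information; avoid 0/0
    return (features - mean) / std, mean, std

def predict_scores(bench_features, bench_scores, app_features, k=3):
    """Predict per-machine scores for one application (assumed scheme).

    bench_features: (n_benchmarks, n_characteristics)
    bench_scores:   (n_benchmarks, n_machines), higher is better
    app_features:   (n_characteristics,)
    """
    bench_scores = np.asarray(bench_scores, dtype=float)
    space, mean, std = zscore(bench_features)
    # Place the application in the same normalized benchmark space.
    app_point = (np.asarray(app_features, dtype=float) - mean) / std
    dist = np.linalg.norm(space - app_point, axis=1)  # Euclidean distance
    nearest = np.argsort(dist)[:k]
    weights = 1.0 / (dist[nearest] + 1e-9)  # closer benchmarks count more
    return weights @ bench_scores[nearest] / weights.sum()

def ranks(values):
    """Rank positions (0 = smallest); assumes no ties."""
    return np.argsort(np.argsort(np.asarray(values)))

def spearman(a, b):
    """Spearman rank correlation coefficient for tie-free data."""
    ra, rb = ranks(a), ranks(b)
    n = len(ra)
    d2 = np.sum((ra - rb) ** 2)
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))

# Made-up data: 3 benchmarks, 2 characteristics, 3 machines.
bench_features = [[0.0, 0.0], [1.0, 1.0], [10.0, 10.0]]
bench_scores = [[1.0, 2.0, 3.0], [1.5, 2.5, 3.5], [5.0, 1.0, 0.5]]
predicted = predict_scores(bench_features, bench_scores, [0.5, 0.5], k=2)
measured = np.array([1.2, 2.1, 3.4])  # hypothetical measured scores
print(spearman(predicted, measured))
```

Comparing the predicted and measured machine rankings with a rank correlation coefficient mirrors the evaluation metric reported in the abstract; the numbers used here are illustrative only.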

### Citations

778 | Applied multivariate statistical analysis
- Johnson, Wichern
- 2002
Citation Context: ..., the underlying program characteristic that causes the microarchitecture-independent characteristics to correlate, gets a higher weight in the Euclidean distance. Principal components analysis (PCA) [7] is a statistical data analysis technique that extracts uncorrelated dimensions from a data set. The input to PCA is a matrix in which the rows are the cases and the columns are the variables. In this...

751 | ATOM - A system for building customized program analysis tools
- Srivastava, Eustace
- 1994
Citation Context: ...ts. The binaries were taken from the SimpleScalar website; they are compiled for the Alpha ISA. Measuring the microarchitecture-independent characteristics discussed in section 2.1 is done using ATOM [15]. ATOM is a binary instrumentation tool that allows for instrumenting functions, basic blocks and instructions. The instrumentation itself is done offline, i.e., an instrumented binary is stored on di...

673 | Automatically characterizing large scale program behavior
- Sherwood, Perelman, et al.
- 2002
Citation Context: ...tion coefficient from 0.76 to 0.91. A large body of work has also been done on the correlation between microarchitecture-independent program characteristics and processor performance, see for example [1, 9, 14]. However, these techniques do not predict performance for an application of interest based on cross-program similarity. Instead, these techniques predict performance based on intra-program phase-leve...

122 | Analysis of benchmark characteristics and benchmark performance prediction
- Saavedra, Smith
- 1996
Citation Context: ...litator for our performance prediction approach is a good quantitative measure for program similarity. Several researchers have proposed methods for quantifying program similarity. Saavedra and Smith [13] use the squared Euclidean distance computed in a benchmark space built up using dynamic program characteristics at the Fortran programming language level such as operation mix, number of function cal...

117 | A first-order superscalar processor model
- Karkhanis, Smith
- 2004
Citation Context: ...ccurate performance estimates of the given application on the given microarchitecture. The work that gets close to such an approach is the superscalar processor model presented by Karkhanis and Smith [8] that estimates performance based on microarchitecture-dependent characteristics such as cache miss rates and branch misprediction rates. And various researchers have proposed techniques to predict ca...

87 | Analysis of branch prediction via data compression
- Chen, Coffey, et al.
- 1996
Citation Context: ...nches are for a given benchmark. In order to capture branch predictability in a microarchitecture-independent manner we used the Prediction by Partial Matching (PPM) predictor proposed by Chen et al. [2], which is a universal compression/prediction technique. A PPM predictor is built on the notion of a Markov predictor. A Markov predictor of order k predicts the next branch outcome based upon k prece...

63 | Register traffic analysis for streamlining inter-operation communication in fine-grain parallel processors
- Franklin, Sohi
- 1992
Citation Context: ...hieved for an idealized processor with a given window size of 32, 64, 128 and 256 in-flight instructions. Register traffic characteristics. We collect a number of characteristics concerning registers [6]. Our first characteristic is the average number of input operands to an instruction. Our second characteristic is the average degree of use, or the average number of times a register instance is cons...

62 | Measuring Program Similarity: Experiments with
- Phansalkar, Joshi, et al.
- 2005
Citation Context: ...al interesting insights into how benchmarks behave and into how (dis)similar benchmarks are from each other. Based on this prior work, researchers have proposed benchmark suite composition techniques [4, 5, 12]. These techniques first measure a number of program characteristics, then apply principal components analysis, and finally apply cluster analysis in order to find distinct groups of program behavior....

61 | Structures for Phase Classification
- Lau, Schoenmackers, et al.
- 2004
Citation Context: ...cks were touched and how many unique 4KB pages were touched for both instruction and data accesses. Data stream strides. The data stream is characterized with respect to local and global data strides [10]. A global stride is defined as the difference in the data memory addresses between temporally adjacent memory accesses. A local stride is defined identically except that both memory accesses come fro...

56 | Quantifying the Impact of Input Data Sets on Program Behavior and its Applications
- Eeckhout, Vandierendonck
- 2003
Citation Context: ...etained principal components. We subsequently normalize the principal components, i.e. we rescale the principal components to unit variance. This gives equal weight to all of the principal components [5]. 2.2.3 Genetic Algorithm Since we use the Euclidean distance as a distance measure in the benchmark space, we implicitly assume that the Euclidean distance in the (microarchitecture-independent) benc...

53 | A Statistically Rigorous Approach for Improving Simulation Techniques
- Yi, Lilja, et al.
- 2003
Citation Context: ...ions, etc. Conte [3] uses kiviat views to qualitatively compare program behavior based on microarchitecture-dependent characteristics such as cache miss rates, branch mispredict rates, etc. Yi et al. [17] use a Plackett-Burman design for classifying benchmarks based on how the benchmarks stress the same processor components to similar degrees. Vandierendonck and De Bosschere [16] rank benchmarks based...

49 | The strong correlation between code signatures and performance
- Lau, Sampson, et al.
- 2005
Citation Context: ...tion coefficient from 0.76 to 0.91. A large body of work has also been done on the correlation between microarchitecture-independent program characteristics and processor performance, see for example [1, 9, 14]. However, these techniques do not predict performance for an application of interest based on cross-program similarity. Instead, these techniques predict performance based on intra-program phase-leve...

42 | Miss rate prediction across all program inputs
- Zhong, Dropsho, et al.
- 2003
Citation Context: ...ch misprediction rates. And various researchers have proposed techniques to predict cache miss rates based on microarchitecture-independent characteristics such as the stack distance, see for example [18]. However, we are unaware of any work that proposes a superscalar processor model based on microarchitecture-independent characteristics solely - the major impediment for achieving this is a good mode...

23 | The Fuzzy Correlation between Code and Performance Predictability
- Annavaram, Rakvic, et al.
- 2004
Citation Context: ...tion coefficient from 0.76 to 0.91. A large body of work has also been done on the correlation between microarchitecture-independent program characteristics and processor performance, see for example [1, 9, 14]. However, these techniques do not predict performance for an application of interest based on cross-program similarity. Instead, these techniques predict performance based on intra-program phase-leve...

20 | Exploiting program microarchitecture independent characteristics and phase behavior for reduced benchmark suite simulation
- Eeckhout, Sampson, et al.
- 2005
Citation Context: ...al interesting insights into how benchmarks behave and into how (dis)similar benchmarks are from each other. Based on this prior work, researchers have proposed benchmark suite composition techniques [4, 5, 12]. These techniques first measure a number of program characteristics, then apply principal components analysis, and finally apply cluster analysis in order to find distinct groups of program behavior....

18 | Traffic Analysis for Streamlining Inter–Operation Communication
- Franklin, Sohi, et al.
- 1992
Citation Context: ...ream at the 32B block level 23 D-stream at the 4KB-page level 32, 64, 128 and 256 in-flight instructions. Register traffic characteristics. We collect a number of characteristics concerning registers [6]. Our first characteristic is the average number of input operands to an instruction. Our second characteristic is the average degree of use, or the average number of times a register instance is cons...

10 | Many Benchmarks Stress the Same Bottlenecks (Workshop on Computer Architecture Evaluation using Commercial Workloads)
- Vandierendonck, De Bosschere
- 2004
Citation Context: ...rates, etc. Yi et al. [17] use a Plackett-Burman design for classifying benchmarks based on how the benchmarks stress the same processor components to similar degrees. Vandierendonck and De Bosschere [16] rank benchmarks based on their uniqueness in the standard benchmark suite using the SPEC performance rating, i.e., the benchmarks that exhibit different speedups on most of the machines are given a h...

5 | Performance Prediction using Program Similarity
- Phansalkar, John
- 2006
Citation Context: ...ld a reduced benchmark suite from an existing benchmark suite. This reduced benchmark suite yields accurate performance predictions compared to the original benchmark suite. The current paper extends [11] which used the above workload characterization methodology consisting of principal components analysis and cluster analysis to predict performance for individual benchmarks. As shown in this paper, a...

1 | Insight, not (random) numbers. Keynote talk at the
- Conte
- 2005
Citation Context: ... a benchmark space built up using dynamic program characteristics at the Fortran programming language level such as operation mix, number of function calls, number of address computations, etc. Conte [3] uses kiviat views to qualitatively compare program behavior based on microarchitecture-dependent characteristics such as cache miss rates, branch mispredict rates, etc. Yi et al. [17] use a Plackett-...