## A predictive performance model for superscalar processors (2006)

### Download From

IEEE

### Download Links

- [drona.csa.iisc.ernet.in]
- DBLP

### Other Repositories/Bibliography

Venue: International Symposium on Microarchitecture

Citations: 23 (0 self)

### BibTeX

```
@INPROCEEDINGS{Joseph06apredictive,
  author    = {P. J. Joseph and Kapil Vaswani and Matthew J. Thazhuthaveetil},
  title     = {A predictive performance model for superscalar processors},
  booktitle = {International Symposium on Microarchitecture},
  year      = {2006}
}
```


### Abstract

Designing and optimizing high-performance microprocessors is an increasingly difficult task due to the size and complexity of the processor design space, the high cost of detailed simulation, and several constraints that a processor design must satisfy. In this paper, we propose the use of empirical non-linear modeling techniques to assist processor architects in making design decisions and resolving complex trade-offs. We propose a procedure for building accurate non-linear models that consists of the following steps: (i) selection of a small set of representative design points spread across the processor design space using Latin hypercube sampling, (ii) obtaining performance measures at the selected design points using detailed simulation, (iii) building non-linear models for performance using the function approximation capabilities of radial basis function networks, and (iv) validating the models using an independently and randomly generated set of design points. We evaluate our model-building procedure by constructing non-linear performance models for programs from the SPEC CPU2000 benchmark suite with a microarchitectural design space that consists of 9 key parameters. Our results show that the models, built using a relatively small number of simulations, achieve high prediction accuracy (only 2.8% error in CPI estimates on average) across a large processor design space. Our models can potentially replace detailed simulation for common tasks such as the analysis of key microarchitectural trends or searches for optimal processor design points.
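
As a rough illustration of step (iii) of the abstract's procedure, the sketch below fits a Gaussian RBF network to synthetic "design point → CPI" data by linear least squares. The data, center choice, and kernel width are stand-ins for illustration, not the paper's actual setup.

```python
import numpy as np

def rbf_design_matrix(X, centers, width):
    """Gaussian RBF activations: phi[i, j] = exp(-||x_i - c_j||^2 / (2 w^2))."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2 * width ** 2))

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(40, 2))   # stand-in "design points"
y = np.sin(3 * X[:, 0]) + X[:, 1] ** 2    # stand-in "simulated CPI" response

centers, width = X[:8], 0.5               # illustrative centers and width
H = rbf_design_matrix(X, centers, width)
weights, *_ = np.linalg.lstsq(H, y, rcond=None)  # output-layer weights

pred = H @ weights                        # model predictions at the samples
```

In the paper's actual procedure the centers come from a regression-tree scheme and the model size is chosen by an information criterion; here both are fixed by hand to keep the sketch short.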

### Citations

695 | The elements of statistical learning: data mining, inference, and prediction
- Hastie, Tibshirani, et al.
- 2001

Citation Context: ...suited α based on experimentation. We select a subset of the regression tree node centers as RBF centers such that the resulting model generalizes well. We achieve this using model selection criteria [7], which help to select a model that fits well on the training data and also has a small number of model parameters - in this case determined by the number of RBFs. We use Akaike’s Information Criteria...
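
The corrected Akaike criterion (AICc) this excerpt alludes to trades goodness of fit against model size. A minimal sketch, assuming a least-squares fit with `n` samples, `p` parameters (the RBF count here), and sum of squared errors `sse`; the example numbers are illustrative:

```python
import numpy as np

def aicc(sse, n, p):
    """Corrected Akaike Information Criterion for a least-squares model.
    Lower is better; the last term strengthens the penalty on p when n is small."""
    return n * np.log(sse / n) + 2 * p + 2 * p * (p + 1) / (n - p - 1)

# A slightly worse fit with far fewer parameters can still score better:
small = aicc(sse=10.0, n=50, p=3)    # few RBF centers
large = aicc(sse=9.9, n=50, p=20)    # many RBF centers, marginally lower error
```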

448 | SimpleScalar: An infrastructure for computer system modeling
- Austin, Larson, et al.
- 2002

Citation Context: ...h instruction window size [11]. We further illustrate this limitation of linear models using a simple experiment. We measured the change in superscalar processor performance as modeled by SimpleScalar [2] by varying two parameters, the L1 instruction cache size and the L2 cache latency, while keeping other microarchitectural parameters fixed. Figure 1 illustrates the variation in performance for vorte...

434 | Multivariable functional interpolation and adaptive networks
- Broomhead, Lowe
- 1988

Citation Context: ...e the use of nonlinear regression modeling techniques to build accurate predictive models for processor performance. Specifically, we choose to build models using Radial Basis Function (RBF) networks [3], primarily because of their ability to approximate many complex functions, and the relative ease with which these models can be generated. We propose a procedure, BuildRBFmodel, that can be used to c...

417 | A Comparison of Three Methods for Selecting Values of Input Variables in the Analysis of Output from a Computer Code
- McKay, Beckman, et al.
- Technometrics, 1979

Citation Context: ...hould space out points throughout the design space just close enough to capture variations in the response. We achieve good sampling of the design space by using a variant of Latin hypercube sampling [15]. In this scheme, the sample is ensured to have points corresponding to all settings of a parameter, and the settings of each of the parameters are randomly combined. For a typical set of processor mi...
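
A minimal Latin hypercube sampler matching the excerpt's description: along every dimension each of the n equal-width strata holds exactly one point, and strata are combined at random across dimensions. The unit-cube scaling is an assumption for illustration; real design parameters would be mapped back to their discrete settings.

```python
import numpy as np

def latin_hypercube(n, k, rng):
    """n points in [0,1)^k; each of the n strata per dimension holds one point."""
    # One independent shuffle of the strata per dimension, then a uniform
    # jitter inside each chosen stratum.
    strata = np.stack([rng.permutation(n) for _ in range(k)], axis=1)  # (n, k)
    return (strata + rng.uniform(size=(n, k))) / n

sample = latin_hypercube(8, 3, np.random.default_rng(1))
# In every column, the occupied stratum indices form a permutation of 0..n-1.
occupied = np.sort(np.floor(sample * 8).astype(int), axis=0)
```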

199 | MinneSPEC: A New SPEC Benchmark Workload for Simulation-Based Computer Architecture Research
- KleinOsowski, Lilja

Citation Context: ...ly configured verified simulator, alphasim [4], at several points in the design space. We used the simulator to run benchmarks from the SPEC CPU2000 integer suite using the lgred data set in the MinneSPEC [12] reduced data sets. This is done using traces generated with IBM PowerPC executables, compiled with the xlc compiler applying the -O3 option. We run the benchmarks to completion, and do not use any sampli...

110 | Measuring Experimental Error in Microprocessor Simulation
- Desikan, Burger, et al.
- 2001

Citation Context: ...vidually. To further verify the simulator’s accuracy across the processor design space, we validated trends in the summary statistics against another similarly configured verified simulator, alphasim [4], at several points in the design space. We used the simulator to run benchmarks from the SPEC CPU2000 integer suite using the lgred data set in MinneSPEC [12] reduced data sets. This is done using tra...

92 | Using Machine Learning to Focus Iterative Optimization
- Agakov, Bonilla, et al.
- 2006

Citation Context: ...ral parameters in superscalar processors. However, we observe that the accuracy of linear models in predicting processor performance is much lower than that of the non-linear models we present. Agakov et al. [1] develop statistical models relating program execution to compiler optimization sequences for several benchmarks, and use them to predict the best optimization sequence for a new program. The prediction...

92 | A generalized discrepancy and quadrature error bound
- Hickernell
- 1998

Citation Context: ...at should be included in the model. 2. An initial set of design points within the design space (a sample) is selected for simulation. We use a space-filling criterion called the L2-star discrepancy [8] for selecting the sample. 3. Processor response for the sample is obtained using detailed, cycle-accurate simulation. 4. The set of design points together with the response (sample data) is used to b...
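
The L2-star discrepancy mentioned here has a closed form for points in the unit cube (often attributed to Warnock); the sketch below implements it and checks it against a one-point case worked out analytically. This is a generic formulation, not code from the paper.

```python
import numpy as np

def l2_star_discrepancy(X):
    """Closed-form L2-star discrepancy of X (n x d) in [0,1]^d.
    Lower values indicate a more uniformly space-filling sample."""
    n, d = X.shape
    term1 = 3.0 ** (-d)
    term2 = (2.0 ** (1 - d) / n) * np.prod(1.0 - X ** 2, axis=1).sum()
    cross = 1.0 - np.maximum(X[:, None, :], X[None, :, :])  # pairwise products
    term3 = np.prod(cross, axis=2).sum() / n ** 2
    return np.sqrt(term1 - term2 + term3)

# Sanity check: one point at 0.5 in 1-D. The squared local discrepancy
# integrates to 2 * ∫_0^0.5 t^2 dt = 1/12, so the result is sqrt(1/12).
value = l2_star_discrepancy(np.array([[0.5]]))
```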

88 | HLS: Combining statistical and symbolic simulation to guide microprocessor design
- Oskin, Chong, et al.
- 2000

Citation Context: ...They are useful to evaluate and compare the performance of closely related designs, but they have not been demonstrated to be accurate across the entire feasible design space. Statistical simulation [5, 19] uses profiled program statistics as well as cache and branch prediction statistics to generate a synthetic instruction trace which can guide processor simulation. These simulations converge to a stea...

84 | Cross architecture performance predictions for scientific applications using parameterized models
- Marin, Mellor-Crummey
- 2004

Citation Context: ...se it to predict the best optimization sequence for a new program. The prediction is done using the model of the statistically closest program in terms of program properties. Marin and Mellor-Crummey [14] build models to predict application performance on processors using a combination of architecture-independent program parameters and micro-architectural parameters. The focus of the work is to predic...

83 | Accurate and efficient regression modeling for microarchitectural performance and power prediction
- Lee, Brooks
- 2006

Citation Context: ...pproaches can possibly be combined to learn a model relating application characteristics, input sizes, and micro-architectural parameters to execution speed. In parallel with our work, Lee and Brooks [13] and Ipek et al. [9] have independently developed predictive models for processors. Lee and Brooks use regression splines to build predictive models from simulation data. Ipek et al. use artificial ne...

59 | Efficiently Exploring Architecture Design Spaces via Predictive Modeling
- Ipek, Supinski, et al.

Citation Context: ...y be combined to learn a model relating application characteristics, input sizes, and micro-architectural parameters to execution speed. In parallel with our work, Lee and Brooks [13] and Ipek et al. [9] have independently developed predictive models for processors. Lee and Brooks use regression splines to build predictive models from simulation data. Ipek et al. use artificial neural networks for th...

37 | Theoretical Modeling of Superscalar Processor Performance
- Noonburg, Shen
- 1994

Citation Context: ...erformance and various microarchitectural parameters would, in theory, obviate the need for detailed, expensive simulations. However, existing analytical modeling techniques for processor performance [11, 16] are based on several simplifying assumptions and only model a small number of microarchitectural parameters. As a result, these models lack the accuracy or the flexibility for use in a real-world pro...

35 | Control flow modeling in statistical simulation for accurate and efficient processor design studies
- Eeckhout, Jr, et al.
- 2004

Citation Context: ...They are useful to evaluate and compare the performance of closely related designs, but they have not been demonstrated to be accurate across the entire feasible design space. Statistical simulation [5, 19] uses profiled program statistics as well as cache and branch prediction statistics to generate a synthetic instruction trace which can guide processor simulation. These simulations converge to a stea...

34 | Construction and use of linear regression models for processor performance analysis
- Joseph, Vaswani, et al.
- 2006

Citation Context: ...eal-world processor design cycle. As an alternative to analytical models, several empirical modeling techniques for processor performance have been proposed and evaluated. For instance, Joseph et al. [10] model performance as a linear combination of individual microarchitectural parameters and their interactions. The key characteristic of their approach is that the linear models are learnt from data o...

34 | Characterizing and comparing prevailing simulation techniques
- Yi, Kodakara, et al.
- 2005

Citation Context: ...ble to study a larger set of configurations. However, its accuracy has not been demonstrated across the entire design space. Its accuracy is typically tested by varying a few parameters [5]. Yi et al. [20] have quantified the significance of micro-architectural parameters by conducting simulations based on foldover Plackett-Burman experimental designs. Plackett-Burman designs are a way of choosing n pa...

12 | Combining regression trees and radial basis functions
- Orr, Hallman, et al.
- Division of Informatics, 1999

Citation Context: ...o that the whole network provides a good model of the CPI response metric for the entire parameter space. For this we use a scheme based on regression trees which was originally devised by Orr et al. [17]. In our context, this scheme identifies contiguous regions of the design space - specified by ranges of the design parameters - that have similar performance as measured by CPI. RBF centers are chose...

9 | Centered L2-discrepancy of random sampling and Latin hypercube design, and construction of uniform designs
- Fang, Ma, et al.

Citation Context: ...of pipeline depth, all reorder buffer sizes, all L2 cache sizes, and so on. This strategy has been shown to have better coverage as compared to a simple random selection of points in the design space [6]. To further improve the quality of Latin hypercube samples, we use space-filling measures to quantify the extent to which a sample covers the design space. The specific space-filling metric we use, r...

5 | Matlab functions for radial basis function networks
- Orr
- 2001

Citation Context: ...e, it moves deeper in the regression tree for considering additional centers in a similar fashion. 2.6. Implementation We use a modified version of the function rbf_rt_1 in Mark Orr’s MATLAB software [18] for our model construction. The software was updated to use AICc in the model selection. We used it to determine optimal values of the method parameters, pmin and α, to build the predictive model, an...

4 | Modeling superscalar processors
- Karkhanis, Smith
- 2004

Citation Context: ...Related Work As a result of the high cost of using simulators, efforts have been made at developing models as alternatives to simulation for exploring the processor design space. Theoretical models [11, 16] relate performance to a few key microarchitectural parameters and program characteristics. These models typically measure the program execution speed under an ideal micro-architecture, and then accou...