This paper shows that static profiling is effective, though not as effective as dynamic profiling. Hyperblock formation differs from superblock formation in some interesting ways that could make it an ...
|
290
|
Effective compiler support for predicated execution using the hyperblock
– Mahlke, Lin, et al.
- 1992
|
|
233
|
Conversion of control dependence to data dependence
– Allen, Kennedy, et al.
- 1983
|
|
191
|
Available instruction-level parallelism for superscalar and superpipelined machines
– Jouppi, Wall
- 1989
|
|
147
|
Achieving High Instruction Cache Performance with an Optimizing Compiler
– Hwu, Chang
- 1989
|
|
133
|
Highly concurrent scalar processing
– Hsu, Davidson
- 1986
|
|
111
|
Profile-guided automatic inline expansion for C programs
– Chang, Hwu
- 1992
|
|
100
|
The Cydra 5 departmental supercomputer
– Rau, Yen, et al.
- 1989
|
|
97
|
Limits on Multiple Instruction Issue
– Smith, Johnson, et al.
- 1989
|
|
91
|
HPL PlayDoh architecture specification: Version 1.0
– Kathail, Schlansker, et al.
- 1994
|
|
69
|
Parallelization of loops with exits on pipelined architectures
– Tirumalai, Lee, et al.
- 1990
|
|
61
|
Reverse if-conversion
– Warter, Mahlke, et al.
- 1993
|
|
54
|
Region-Based Compilation: An Introduction and Motivation
– Hank, Hwu, et al.
- 1995
|
|
48
|
Superblock formation using static program analysis
– Hank, Mahlke, et al.
- 1993
|
|
43
|
The importance of prepass code scheduling for superscalar and superpipelined processors
– Chang, Lavery, et al.
- 1995
|
|
41
|
A machine description language for compilation
– Gyllenhaal
- 1994
|
|
40
|
Control and data dependence for program transformations
– Towle
- 1976
|
|
35
|
Exploiting Instruction Level Parallelism in the Presence of Conditional Branches
– Mahlke
- 1995
|
|
33
|
Data Relocation and Prefetching in Programs with Large Data Sets
– Yamada
- 1995
|
|
30
|
Height reduction of control recurrences for ILP processors
– Schlansker, Kathail, et al.
- 1994
|
|
28
|
Memory Disambiguation to Facilitate Instruction-Level Parallelism Compilation
– Gallagher
- 1995
|
|
28
|
Design and implementation of a portable global code optimizer
– Mahlke
- 1991
|
|
26
|
Modulo scheduling with isomorphic control transformations
– Warter
- 1993
|
|
24
|
Compiler-Controlled Speculation
– Bringmann
- 1995
|
|
24
|
Machine independent register allocation for the IMPACT-I C compiler
– Hank
- 1993
|
|
23
|
Data Preload for Superscalar and VLIW Processors
– Chen
- 1993
|
|
20
|
Template for code generation development using the IMPACT-I C compiler
– Bringmann
- 1992
|
|
17
|
Compiler support for multiple instruction issue architectures
– Chang
- 1991
|
|
17
|
An optimizing compiler code generator: A platform for RISC performance analysis
– Chen
- 1991
|
|
14
|
Loop transformations for parallel compilers
– Subramanian
- 1993
|
|
12
|
Compiler support for predicated execution in superscalar processors
– Lin
- 1992
|
|
12
|
Data dependence analysis for Fortran programs in the IMPACT compiler
– Haab
- 1995
|
|
12
|
Compiler support for SPARC architecture processors
– Ouellette
- 1994
|
|
10
|
Architectural and software support for executing numerical applications on high performance computers
– Anik
- 1993
|
|
7
|
An instruction-level performance analysis of the Multiflow TRACE 14/300
– Schuette, Shen
- 1991
|
|
6
|
A comparison of full and partial predicated execution support for ILP processors
– Mahlke, Hank, et al.
- 1995
|
|
4
|
On predicated execution, Hewlett Packard Laboratories
– Park, Schlansker
- 1991
|
|
4
|
Dynamic memory disambiguation using the Memory Conflict Buffer
– Gallagher, Chen, et al.
- 1994
|