Predicting Indirect Branches via Data Compression
, 1998
Cited by 22 (2 self)
Branch prediction is a key mechanism used to achieve high performance on multiple issue, deeply pipelined processors. By predicting the branch outcome at the instruction fetch stage of the pipeline, superscalar processors are better able to exploit Instruction Level Parallelism (ILP) by providing a larger window of instructions. However, when a branch is mispredicted, instructions from the mispredicted path must be discarded. Therefore, branch prediction accuracy is critical to achieve high performance. Existing branch prediction schemes can accurately predict the direction of conditional branches, but they have difficulty predicting the correct targets of indirect branches. Indirect branches occur frequently in Object-Oriented Languages (OOL), as well as in Dynamically-Linked Libraries (DLLs), two programming environments rapidly increasing in popularity. In addition, certain language constructs such as multi-way control transfers (e.g., switches), and architectural features such as 6...
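The abstract is cut off before it describes the predictor, so the following is only a generic sketch of how a compression-style (PPM-like) model could predict indirect branch targets. It is an assumption suggested by the paper's title, not the authors' design. The class name `PPMTargetPredictor`, the `max_order` parameter, and the sample addresses are all hypothetical. The idea is to keep counts of which target followed each recent-target context, predict from the longest context seen so far, and fall back to shorter contexts otherwise.

```python
from collections import Counter, defaultdict


class PPMTargetPredictor:
    """Hypothetical PPM-like predictor over the history of branch targets."""

    def __init__(self, max_order=2):
        self.max_order = max_order
        # tables[k] maps the last k targets (a tuple) to counts of the next target.
        self.tables = [defaultdict(Counter) for _ in range(max_order + 1)]
        self.history = []

    def _context(self, k):
        return tuple(self.history[-k:]) if k else ()

    def predict(self):
        # Try the longest available context first, then fall back to shorter ones.
        for k in range(min(self.max_order, len(self.history)), -1, -1):
            counts = self.tables[k].get(self._context(k))
            if counts:
                return counts.most_common(1)[0][0]
        return None  # nothing learned yet

    def update(self, target):
        # Record the actual target under every context length available.
        for k in range(min(self.max_order, len(self.history)) + 1):
            self.tables[k][self._context(k)][target] += 1
        self.history.append(target)
        self.history = self.history[-self.max_order:] if self.max_order else []


def count_hits(predictor, trace):
    """Predict each target in `trace`, then train on it; return correct predictions."""
    hits = 0
    for target in trace:
        if predictor.predict() == target:
            hits += 1
        predictor.update(target)
    return hits


# Hypothetical trace: the target after 0x10 depends on what preceded 0x10.
PATTERN = [0x10, 0x20, 0x10, 0x30]
```

In this trace a history-free predictor can at best guess the most frequent target, while a two-target context is enough to disambiguate every step of the pattern.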
Improving Performance By Branch Reordering
, 1998
Cited by 11 (4 self)
1 INTRODUCTION
2 RELATED WORK
3 DETECTING A SEQUENCE OF REORDERABLE BRANCHES
3.1 Detecting a Sequence of Reorderable Branches with a Common Successor
3.2 Detecting a Sequence of Reorderable Range Conditions Comparing a Common Variable to Constants
4 HANDLING SIDE EFFECTS IN A COMMON VARIABLE SEQUENCE
5 PERFORMING PROFILING
5.1 Producing Profile Information for Common Successor Sequence
5.2 Producing Profile Information for Common Variable Sequence
6 SELECTING THE ORDERING OF BRANCHES
6.1 Selecting the Order of a Sequence of Branches with Common Successors
6.2 Selecting the Order of a Sequence of Range Conditions Comparing a Common Variable
7 IMPROVING THE SELECTED SEQUENCE OF RANGE CONDITIONS
8 APPLYING THE REORDERING TRANSFORMATION
9 RES...
Efficient and effective branch reordering using profile data
- ACM Transactions on Programming Languages and Systems (TOPLAS)
, 2002
Cited by 11 (1 self)
The conditional branch has long been considered an expensive operation. The relative cost of conditional branches has increased as recently designed machines rely on deeper pipelines and higher multiple issue. Reducing the number of conditional branches executed often results in a substantial performance benefit. This paper describes a code-improving transformation to reorder sequences of conditional branches that compare a common variable to constants. The goal is to obtain an ordering in which the fewest branches in the sequence are executed on average. First, sequences of branches that can be reordered are detected in the control flow. Second, profiling information is collected to predict the probability that each branch will transfer control out of the sequence. Third, the cost of performing each conditional branch is estimated. Fourth, the most beneficial ordering of the branches based on the estimated probability and cost is selected. The most beneficial ordering often includes the insertion of additional conditional branches that did not previously exist in the sequence. Finally, the control flow is restructured to reflect the new ordering. On average, applying the transformation reduces instructions executed by about 8% and branches performed by about 13%, and decreases execution time by about 4%.
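The selection step (choosing an order from estimated probabilities and costs) can be illustrated with a deliberately simplified model. The sketch assumes the branches test mutually exclusive conditions, each with a known exit probability and a fixed cost, and it ignores the insertion of extra branches that the paper mentions. Under those assumptions, sorting by probability per unit cost minimizes the expected cost. The function names and the profile numbers are hypothetical and are not taken from the paper.

```python
from itertools import permutations


def expected_cost(order, prob, cost):
    """Expected cost of testing branches in `order` until one leaves the sequence."""
    remaining = 1.0  # probability that control is still inside the sequence
    total = 0.0
    for name in order:
        total += remaining * cost[name]  # runs only if no earlier branch exited
        remaining -= prob[name]
    return total


def select_order(prob, cost):
    """Order branches by exit probability per unit cost, highest first."""
    return sorted(prob, key=lambda name: prob[name] / cost[name], reverse=True)


def brute_force_order(prob, cost):
    """Exhaustive search, used only to cross-check the greedy rule on small inputs."""
    return min(permutations(prob), key=lambda order: expected_cost(order, prob, cost))


# Hypothetical profile: exit probabilities and per-branch costs (arbitrary units).
prob = {"A": 0.1, "B": 0.5, "C": 0.4}
cost = {"A": 1.0, "B": 2.0, "C": 1.0}
```

With this profile, `select_order(prob, cost)` places `C` first because it has the highest probability-to-cost ratio, and `brute_force_order` reaches the same expected cost, which gives a small sanity check on the greedy rule.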
Functioning without closure: Type-safe customized function representations for Standard ML
- In Proc. 2001 Int’l Conf. Functional Programming
, 2001
Cited by 7 (6 self)
The CIL compiler for core Standard ML compiles whole ML programs using a novel typed intermediate language that supports the generation of type-safe customized data representations. In this paper, we present empirical data comparing the relative efficacy of several different flow-based customization strategies for function representations. We develop a cost model to interpret dynamic counts of operations required for each strategy. In this cost model, customizing the representation of closed functions gives a 12–17% improvement on average over uniform closure representations, depending on the layout of the closure. We also present data on the relative effectiveness of various strategies for reducing representation pollution, i.e., situations where flow constraints require the representation of a value to be less efficient than it would be in ideal circumstances. For the benchmarks tested and the types of representation pollution detected by our compiler, the pollution removal strategies we consider often cost more in overhead than they gain via enabled customizations. Notable exceptions are selective defunctionalization, a function representation strategy that often achieves significant customization benefits via aggressive pollution removal, and a simple form of flow-directed inlining, in which pollution removal allows multiple functions to be inlined at the same call site.
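For readers unfamiliar with defunctionalization, the following sketch shows the general idea in its plain, untyped form: each function value becomes a small record holding its free variables, and a single `apply` function dispatches on the record's kind. It is not the CIL compiler's typed, selective variant, and every name in it (`make_adder`, `Adder`, `apply`, and so on) is hypothetical.

```python
from dataclasses import dataclass


# Closure representation: each function value captures its free variable.
def make_adder(n):
    return lambda x: x + n


def make_scaler(k):
    return lambda x: x * k


# Defunctionalized representation: one record type per original lambda,
# holding exactly the free variables that lambda needed.
@dataclass(frozen=True)
class Adder:
    n: int


@dataclass(frozen=True)
class Scaler:
    k: int


def apply(fn, x):
    """Single dispatch point that replaces calls through closures."""
    if isinstance(fn, Adder):
        return x + fn.n
    if isinstance(fn, Scaler):
        return x * fn.k
    raise TypeError(f"unknown function record: {fn!r}")


def run_closures(functions, x):
    return [f(x) for f in functions]


def run_defunctionalized(records, x):
    return [apply(r, x) for r in records]
```

Both representations compute the same results; the difference lies in how a function value is stored and invoked.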
Indirect Branch Prediction using Data Compression Techniques
- Journal of Instruction Level Parallelism
, 1999
Cited by 6 (0 self)
Branch prediction is a key mechanism used to achieve high performance on multiple issue, deeply pipelined processors. By predicting the branch outcome at the instruction fetch stage of a pipeline, superscalar processors become able to exploit Instruction Level Parallelism (ILP) by providing a larger window of instructions. However, when a branch is mispredicted, instructions from the mispredicted path must be discarded. Therefore, branch prediction accuracy is critical to achieve high performance. Existing branch prediction schemes can accurately predict the direction of conditional branches, but have difficulties predicting the correct targets of indirect branches. Indirect branches occur frequently in Object-Oriented Languages (OOL), as well as in Dynamically-Linked Libraries (DLLs), two programming environments rapidly increasing in popularity. In addition, certain language constructs such as multi-way control transfers (e.g., switches), and architectural features such as ...
Feedback-directed switch-case statement optimization
- In 4th Workshop on Compile and Runtime Techniques for Parallel Computing
, 2005
Cited by 3 (2 self)
This paper presents two new feedback-guided techniques to generate code for switch-case statements: hot default case promotion (DP) and switch-case statement partitioning (SP). DP improves case dispatch, while SP simplifies case dispatch, improves instruction layout, and enables further inlining. An extensive experimental study reveals performance variations of up to 4.9% among different strategies. The largest performance improvement of DP and SP over existing O3 optimization in the Open Research Compiler (ORC) is 1.7%. A microarchitecture-level performance study provides insights on the basis for this performance improvement.
Improving Switch Lowering for The LLVM Compiler System
- Proc. of the 2007 Spring Young Researchers Colloquium on Software Engineering (SYRCoSE'2007)
, 2007
Switch-case statements (or switches) provide a natural way to express multiway branching control flow semantics. They are common in many applications, including compilers, parsers, text processing programs, and virtual machines. Various optimizations for switches have been studied for many years. This paper describes the switch lowering refactoring recently made for the LLVM Compiler System.
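As a rough illustration of what switch lowering decides, the sketch below chooses between two textbook dispatch strategies: a jump table for dense case values and a binary comparison tree for sparse ones. It is a generic model and not LLVM's implementation. The thresholds `min_cases` and `min_density`, the function names, and the string labels are assumptions made for the example, and a non-empty set of integer case values is assumed.

```python
def choose_lowering(case_values, min_cases=4, min_density=0.4):
    """Pick a dispatch strategy from the distribution of case values."""
    values = sorted(set(case_values))
    span = values[-1] - values[0] + 1
    density = len(values) / span
    if len(values) >= min_cases and density >= min_density:
        return "jump_table"
    return "compare_tree"


def build_jump_table(cases, default):
    """cases maps an int case value to a handler label; holes get the default."""
    low, high = min(cases), max(cases)
    table = [cases.get(v, default) for v in range(low, high + 1)]
    return low, table


def dispatch_jump_table(x, low, table, default):
    index = x - low
    if 0 <= index < len(table):  # one range check, then an indexed load
        return table[index]
    return default


def dispatch_compare_tree(x, sorted_cases, default):
    """Binary search over (value, label) pairs sorted by value."""
    lo, hi = 0, len(sorted_cases) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        value, label = sorted_cases[mid]
        if x == value:
            return label
        if x < value:
            hi = mid - 1
        else:
            lo = mid + 1
    return default
```

The jump table trades memory for a constant number of operations per dispatch, while the comparison tree needs a logarithmic number of comparisons but wastes no space on gaps between case values.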
First published in CVu vol. 21 no.?
When writing software, a common requirement is for the execution of some sequence of statements to depend on a variable having a particular value. Programming languages provide various constructs to support this requirement, e.g., the if-statement (which often supports checking against a single value) and the switch-statement (which supports checking against a set of values). Measurements show that approximately
Commentary
, 2005
The material in the C99 subsections is copyright © ISO. The material in the C90 and C++ sections that is quoted from the respective language standards is copyright © ISO. Credits and permissions for quoted material are given where that material appears.

