Results 1–10 of 13
Effectively Exploiting Indirect Jumps
Software: Practice and Experience, 1997
Cited by 14 (2 self)
Abstract:
This dissertation describes a general code-improving transformation that can coalesce conditional branches into an indirect jump from a table. Applying this transformation allows an optimizer to exploit indirect jumps for many other coalescing opportunities besides the translation of multiway branch statements. First, data-flow analysis is performed to detect a set of coalescent conditional branches, which are often separated by blocks of intervening instructions. Second, several techniques are applied to reduce the cost of performing an indirect jump operation, often requiring the execution of only two instructions on a SPARC. Finally, the control flow is restructured using code duplication to replace the set of branches with an indirect jump. Thus, the transformation essentially provides early resolution of conditional branches that may originally have been some distance from the point where the indirect jump is inserted. The transformation can frequently be applied, often with significant reductions in the number of instructions executed, total cache work, and execution time. In fact, over twice the benefit was achieved from exploiting indirect jumps as a general code-improving transformation instead of using the traditional approach of producing indirect jumps as an intermediate code generation decision. In addition, the author shows that with comparable branch target buffer support, indirect jumps improve branch prediction since they cause fewer mispredictions than the set of branches they replaced.
Efficient Multiway Radix Search Trees
1996
Cited by 6 (0 self)
Abstract:
This paper discusses only its application to switch statements. There has been considerable work in the past ([2], [3], [5], [6] and [10]) on the Pascal case statement and code generation. The generation of code for switch statements is discussed in [4] and [11]. A scheme similar to MRST, but restricted to binary radix search trees, appears in [9]. (Preprint submitted to Elsevier, 8 August 1997.) Applications for fast sparse switch statements are many and varied. Two examples are: let L be a Common-Lisp-like language with dynamic type dispatch on function arguments, and let F be an n-argument generic function in L, with ...
Coalescing Conditional Branches into Efficient Indirect Jumps
Proceedings of the International Static Analysis Symposium, 1997
Cited by 5 (4 self)
Abstract:
Indirect jumps from tables are traditionally generated by compilers only as an intermediate code generation decision when translating multiway selection statements. However, making this decision during intermediate code generation poses problems. The research described in this paper resolves these problems by using several types of static analysis as a framework for a code-improving transformation that exploits indirect jumps from tables. First, control-flow analysis is performed that provides opportunities for coalescing branches generated from other control statements besides multiway selection statements. Second, the optimizer uses various techniques to reduce the cost of indirect jump operations by statically analyzing the context of the surrounding code. Finally, path and branch prediction analysis is used to provide a more accurate estimate of the benefit of coalescing a detected set of branches into a single indirect jump. The results indicate that the coalescing transformation can frequently be applied, with significant reductions in the number of instructions executed and total cache work. This paper shows that static analysis can be used to implement an effective improving transformation for exploiting indirect jumps.
RE2C: A More Versatile Scanner Generator
ACM Lett. Program. Lang. Syst., 1994
Cited by 5 (0 self)
Abstract:
It is usually claimed that lexical analysis routines are still coded by hand, despite the widespread availability of scanner generators, for efficiency reasons. While efficiency is a consideration, there exist freely available scanner generators such as GLA [7] that can generate scanners that are faster than most hand-coded ones. However, most generated scanners are tailored for a particular environment, and retargeting these scanners to other environments, if possible, is usually complex enough to make a hand-coded scanner more appealing. In this paper we describe RE2C, a scanner generator that generates scanners which are not only faster (and usually smaller) than those produced by any other scanner generator known to the authors, including GLA, but also adapt easily to any environment. Categories and Subject Descriptors: D.3.2 [Programming Languages]: Language Classifications: specialized application languages; D.3.4 [Programming Languages]: Processors. General Terms: Al...
Feedback-directed switch-case statement optimization
In 4th Workshop on Compile and Runtime Techniques for Parallel Computing, 2005
Cited by 4 (2 self)
Abstract:
This paper presents two new feedback-guided techniques to generate code for switch-case statements: hot default case promotion (DP) and switch-case statement partitioning (SP). DP improves case dispatch, while SP simplifies case dispatch, improves instruction layout, and enables further inlining. An extensive experimental study reveals performance variations of up to 4.9% among different strategies. The largest performance improvement of DP and SP over the existing -O3 optimization in the Open Research Compiler (ORC) is 1.7%. A microarchitecture-level performance study provides insights into the basis for this performance improvement.
Lucid And Efficient Case Analysis
1995
Cited by 1 (0 self)
Abstract:
This paper describes a new scheme for building static search trees, using multiway radix search trees. We present this method for code generation of switch statements in imperative languages. We show that, for sparse case sets, the method produces faster code on average than existing methods, requiring O(1) time with a small constant for the average search. We then apply this method to the problem of code generation for generic functions in object-oriented languages, and find that its use improves clarity as well as efficiency.
Keywords: algorithms, compilers, switch statements, code generation, code optimization, object-oriented methods.
1. Introduction. Switch statements in C, like case statements in Pascal and Ada, are useful conditional control constructs. These statements represent multiway tree control structures, whereas if statements represent binary tree control structures. In this paper we present a new code generation method for switch statements, which on average gener...
On Conditional Branches in Optimal Decision Trees
2006
Abstract:
The decision tree is one of the most fundamental programming abstractions. A commonly used type of decision tree is the alphabetic binary tree, which uses (without loss of generality) "less than" versus "greater than or equal to" tests in order to determine one of n outcome events. The process of finding an optimal alphabetic binary tree for a known probability distribution on outcome events usually rests on the assumption that the cost (time) per decision is uniform and thus independent of the outcome of the decision. This assumption, however, is incorrect for software optimized for a given microprocessor, e.g., in compiling switch statements or in fine-tuning program bottlenecks: the cost of the more likely decision outcome can be less, often far less, than that of the less likely outcome. Here we formulate a variety of O(n^3)-time, O(n^2)-space dynamic programming algorithms to solve such an optimal binary decision tree problem, optimizing for the behavior of processors with predictive branch capabilities, both static and dynamic. In the static case, we use existing results to arrive at entropy-based performance bounds. Solutions to this formulation are often faster in practice than "optimal" decision trees as formulated in the literature, and, for small problems, are easily worth the extra complexity of finding the better solution. This can be applied to fast implementation of Huffman coding.
On Conditional Branches in Optimal Search Trees
Abstract:
Algorithms for efficiently finding optimal binary alphabetic (decision) trees are well established and in common use. However, such algorithms, e.g., the Hu-Tucker algorithm, usually rest on the assumption that the cost per decision is uniform and thus independent of the outcome of the decision. Algorithms without this assumption generally use one cost if the decision outcome is "less than" and another cost otherwise. In practice, neither assumption is accurate for software optimized for today's microprocessors. Such software generally has one cost for the more likely decision outcome and a greater cost, often far greater, for the less likely decision outcome. This problem and generalizations thereof are applicable to hard-coding static decision tree instances in software, e.g., for optimizing program bottlenecks or for compiling switch statements. An O(n^3)-time, O(n^2)-space dynamic programming algorithm can solve this optimal binary decision tree problem, and this approach has many generalizations that optimize for the behavior of processors with predictive branch capabilities, both static and dynamic. Solutions to this formulation are often faster in practice than "optimal" decision trees as formulated in the literature. Different decision paradigms can sometimes yield even better performance.