Results 1–10 of 35
Iterated Register Coalescing
ACM Transactions on Programming Languages and Systems, 1996
"... An important function of any register allocator is to target registers so as to eliminate copy instructions. Graphcoloring register allocation is an elegant approach to this problem. If the source and destination of a move instruction do not interfere, then their nodes can be coalesced in the inter ..."
Abstract

Cited by 147 (4 self)
An important function of any register allocator is to target registers so as to eliminate copy instructions. Graph-coloring register allocation is an elegant approach to this problem. If the source and destination of a move instruction do not interfere, then their nodes can be coalesced in the interference graph. Chaitin's coalescing heuristic could make a graph uncolorable (i.e., introduce spills); Briggs et al. demonstrated a conservative coalescing heuristic that preserves colorability. But Briggs's algorithm is too conservative, and leaves too many move instructions in our programs. We show how to interleave coloring reductions with Briggs's coalescing heuristic, leading to an algorithm that is safe but much more aggressive.

1 Introduction
Graph coloring is a powerful approach to register allocation and can have a significant impact on the execution of compiled code. A good register allocator does copy propagation, eliminating many move instructions by "coloring" the source tempor...
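A minimal sketch of the conservative (Briggs) coalescing test this abstract refers to; this is not code from the paper, and the adjacency-set graph representation, node names, and K (the number of available colors) are our own assumptions:

```python
# Briggs test: merging move-related nodes a and b is safe if the merged
# node would have fewer than K neighbors of significant degree (>= K),
# so the merge cannot turn a K-colorable graph into an uncolorable one.
# Graph representation (dict: node -> set of neighbors) is our choice.

def briggs_safe(graph, a, b, K):
    """Return True if coalescing a and b passes the Briggs test."""
    combined = (graph[a] | graph[b]) - {a, b}   # neighbors of merged node
    significant = 0
    for n in combined:
        degree = len(graph[n])
        # A neighbor adjacent to both a and b loses one edge on merge.
        if a in graph[n] and b in graph[n]:
            degree -= 1
        if degree >= K:
            significant += 1
    return significant < K

def coalesce(graph, a, b):
    """Merge node b into node a, updating adjacency sets in place."""
    for n in graph.pop(b):
        graph[n].discard(b)
        if n != a:
            graph[n].add(a)
            graph[a].add(n)
    return graph
```

The paper's contribution is interleaving this test with the simplify phase, so that removed low-degree nodes make more coalesces pass the test.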
Scalable Packet Classification
In ACM SIGCOMM, 2001
"... Packet classification is important for applications such as firewalls, intrusion detection, and differentiated services. Existing algorithms for packet classification reported in the literature scale poorly in either time or space as filter databases grow in size. Hardware solutions such as TCAMs do ..."
Abstract

Cited by 117 (6 self)
Packet classification is important for applications such as firewalls, intrusion detection, and differentiated services. Existing algorithms for packet classification reported in the literature scale poorly in either time or space as filter databases grow in size. Hardware solutions such as TCAMs do not scale to large classifiers. However, even for large classifiers (say 100,000 rules), any packet is likely to match a few (say 10) rules. Our paper seeks to exploit this observation to produce a scalable packet classification scheme called Aggregated Bit Vector (ABV). Our paper takes the bit vector search algorithm (BV) described in [11] (which takes linear time) and adds two new ideas, recursive aggregation of bit maps and filter rearrangement, to create ABV (which can take logarithmic time for many databases). We show that ABV outperforms BV by an order of magnitude using simulations on both industrial firewall databases and synthetically generated databases.
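A toy sketch of the aggregation idea the abstract describes, not the paper's implementation; the block size, bit layout, and names below are our own assumptions:

```python
# Each field lookup yields a bit vector over rules; a packet matches the
# rules in the AND of all field vectors.  The aggregation idea: keep one
# summary bit per block of A bits, AND the cheap aggregates first, and
# only AND the real blocks where every aggregate bit is set.

A = 4  # aggregate block size (small here purely for illustration)

def aggregate(blocks):
    """One summary bit per block: 1 iff the block is non-zero."""
    return [1 if b else 0 for b in blocks]

def abv_intersect(field_vectors):
    """field_vectors: one bit vector per field, each stored as a list of
    A-bit integer blocks.  Returns indices of rules matched by all fields."""
    aggs = [aggregate(v) for v in field_vectors]
    matches = []
    for i in range(len(field_vectors[0])):
        # Cheap test on the aggregates first ...
        if all(agg[i] for agg in aggs):
            # ... only then AND the real A-bit blocks.
            block = field_vectors[0][i]
            for v in field_vectors[1:]:
                block &= v[i]
            for bit in range(A):
                if block & (1 << bit):
                    matches.append(i * A + bit)
    return matches
```

Since any packet matches only a few rules, most aggregate ANDs are zero and most blocks are never touched; the paper's filter rearrangement further clusters set bits to make this true more often.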
Practical Improvements to the Construction and Destruction of Static Single Assignment Form
1998
"... Static single assignment (SSA) form is a program representation becoming increasingly popular for compilerbased code optimization. In this paper, we address three problems that have arisen in our use of SSA form. Two are variations to the SSA construction algorithms presented by Cytron et al. The f ..."
Abstract

Cited by 69 (4 self)
Static single assignment (SSA) form is a program representation becoming increasingly popular for compiler-based code optimization. In this paper, we address three problems that have arisen in our use of SSA form. Two are variations to the SSA construction algorithms presented by Cytron et al. The first variation is a version of...
An Introduction to Machine SUIF and its Portable Libraries for Analysis and Optimization
Harvard University, 2002
"... ..."
A Simple, Fast Dominance Algorithm
"... The problem of finding the dominators in a controlflow graph has a long history in the literature. The original algorithms su#ered from a large asymptotic complexity but were easy to understand. Subsequent work improved the time bound, but generally sacrificed both simplicity and ease of implemen ..."
Abstract

Cited by 31 (0 self)
The problem of finding the dominators in a control-flow graph has a long history in the literature. The original algorithms suffered from a large asymptotic complexity but were easy to understand. Subsequent work improved the time bound, but generally sacrificed both simplicity and ease of implementation. This paper returns to a simple formulation of dominance as a global data-flow problem. Some insights into the nature of dominance lead to an implementation of an O(N²) algorithm that runs faster, in practice, than the classic Lengauer-Tarjan algorithm, which has a time bound of O(E log(N)). We compare the algorithm to Lengauer-Tarjan because it is the best known and most widely used of the fast algorithms for dominance. Working from the same implementation insights, we also rederive (from earlier work on control dependence by Ferrante, et al.) a method for calculating dominance frontiers that we show is faster than the original algorithm by Cytron, et al. The aim of this paper is not to present a new algorithm, but, rather, to make an argument based on empirical evidence that algorithms with discouraging asymptotic complexities can be faster in practice than those more commonly employed. We show that, in some cases, careful engineering of simple algorithms can overcome theoretical advantages, even when problems grow beyond realistic sizes. Further, we argue that the algorithms presented herein are intuitive and easily implemented, making them excellent teaching tools.
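The iterative data-flow formulation the abstract describes can be sketched roughly as follows; the graph encoding and names are our own, and this is an illustration of the approach rather than the paper's code:

```python
# Immediate dominators kept in a single map and refined to a fixed
# point in reverse postorder, with a two-finger "intersect" that walks
# postorder numbers upward along the current idom chains.
# Graph encoding (dict: node -> successor list) is our assumption.

def compute_idoms(succ, entry):
    # Postorder numbering via iterative DFS.
    order, seen, stack = [], {entry}, [(entry, iter(succ.get(entry, [])))]
    while stack:
        node, it = stack[-1]
        child = next(it, None)
        if child is None:
            order.append(node)
            stack.pop()
        elif child not in seen:
            seen.add(child)
            stack.append((child, iter(succ.get(child, []))))
    po = {n: i for i, n in enumerate(order)}        # postorder number
    pred = {n: [] for n in order}
    for n in order:
        for s in succ.get(n, []):
            if s in po:
                pred[s].append(n)

    idom = {entry: entry}

    def intersect(x, y):
        while x != y:
            while po[x] < po[y]:
                x = idom[x]
            while po[y] < po[x]:
                y = idom[y]
        return x

    changed = True
    while changed:
        changed = False
        for n in reversed(order):                   # reverse postorder
            if n == entry:
                continue
            processed = [p for p in pred[n] if p in idom]
            new = processed[0]
            for p in processed[1:]:
                new = intersect(new, p)
            if idom.get(n) != new:
                idom[n] = new
                changed = True
    return idom
```

On reducible flow graphs this typically converges in two passes, which is why the simple formulation beats its pessimistic bound in practice.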
Inductive Graphs and Functional Graph Algorithms
2001
"... We propose a new style of writing graph algorithms in functional languages which is based on an alternative view of graphs as inductively defined data types. We show how this graph model can be implemented efficiently, and then we demonstrate how graph algorithms can be succinctly given by recursive ..."
Abstract

Cited by 18 (2 self)
We propose a new style of writing graph algorithms in functional languages which is based on an alternative view of graphs as inductively defined data types. We show how this graph model can be implemented efficiently, and then we demonstrate how graph algorithms can be succinctly given by recursive function definitions based on the inductive graph view. We also regard this as a contribution to the teaching of algorithms and data structures in functional languages since we can use the functional-style graph algorithms instead of the imperative algorithms that are dominant today.

Keywords: Graphs in Functional Languages, Recursive Graph Algorithms, Teaching Graph Algorithms in Functional Languages
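The inductive view can be imitated outside a functional language; the following sketch (our own transliteration into Python, not the paper's code) shows a `match`-style decomposition and a DFS written by recursion on it:

```python
# In the inductive view, a graph is either empty or a node's context
# plus the rest of the graph.  "match" splits a chosen node out, and
# algorithms recurse on that decomposition; visited-set bookkeeping
# falls out of removing each matched node from the remaining graph.
# Encoding (dict: node -> set of successors) is our own simplification.

def match(v, graph):
    """Split graph into (context of v, remaining graph), or (None, graph)."""
    if v not in graph:
        return None, graph
    succs = graph[v]
    rest = {u: nbrs - {v} for u, nbrs in graph.items() if u != v}
    return (v, succs), rest

def dfs(worklist, graph):
    """Depth-first node order via recursion on match."""
    if not worklist or not graph:
        return []
    v, rest_work = worklist[0], worklist[1:]
    ctx, rest_graph = match(v, graph)
    if ctx is None:                      # v already consumed: skip it
        return dfs(rest_work, graph)
    _, succs = ctx
    return [v] + dfs(sorted(succs) + rest_work, rest_graph)
```

Rebuilding the remaining graph on every match is deliberately naive here; the paper's point is that a real implementation can make the decomposition efficient while keeping algorithms this short.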
Optimization of simple tabular reduction for table constraints
In Proceedings of CP’08, 2008
"... Abstract. Table constraints play an important role within constraint programming. Recently, many schemes or algorithms have been proposed to propagate table constraints or/and to compress their representation. We show that simple tabular reduction (STR), a technique proposed by J. Ullmann to dynamic ..."
Abstract

Cited by 17 (10 self)
Table constraints play an important role within constraint programming. Recently, many schemes or algorithms have been proposed to propagate table constraints and/or to compress their representation. We show that simple tabular reduction (STR), a technique proposed by J. Ullmann to dynamically maintain the tables of supports, is very often the most efficient practical approach to enforce generalized arc consistency within MAC. We also describe an optimization of STR which allows limiting the number of operations related to validity checking or search of supports. Interestingly enough, this optimization makes STR potentially r times faster where r is the arity of the constraint(s). The results of an extensive experimentation that we have conducted with respect to random and structured instances indicate that the optimized algorithm we propose is usually around twice as fast as the original STR and can be up to one order of magnitude faster than previous state-of-the-art algorithms on some series of instances.
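The basic (unoptimized) STR pass that the abstract builds on can be sketched as follows; the data layout and names are our own simplification, and the paper's optimization additionally limits the per-variable validity and support checks:

```python
# Simple tabular reduction: the constraint's table keeps only currently
# valid tuples.  One pass drops tuples containing a pruned value while
# collecting the values still supported; domain values with no
# supporting tuple left are then removed.

def str_propagate(table, domains):
    """table: list of tuples; domains: list of sets, one per variable.
    Returns (reduced table, reduced domains); an empty table means failure."""
    arity = len(domains)
    supported = [set() for _ in range(arity)]
    valid = []
    for tup in table:
        if all(tup[i] in domains[i] for i in range(arity)):
            valid.append(tup)
            for i in range(arity):
                supported[i].add(tup[i])
    new_domains = [domains[i] & supported[i] for i in range(arity)]
    return valid, new_domains
```

Because the reduced table is carried along during search (and restored on backtracking), later passes scan only tuples that were still valid, which is where the practical speed comes from.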
Maintaining Generalized Arc Consistency on Ad Hoc r-ary Constraints
"... In many reallife problems, constraints are explicitly defined as a set of solutions. This ad hoc (table) representation uses exponential memory and makes support checking (for enforcing GAC) difficult. In this paper, we address both problems simultaneously by representing an ad hoc constraint with ..."
Abstract

Cited by 17 (0 self)
In many real-life problems, constraints are explicitly defined as a set of solutions. This ad hoc (table) representation uses exponential memory and makes support checking (for enforcing GAC) difficult. In this paper, we address both problems simultaneously by representing an ad hoc constraint with a multi-valued decision diagram (MDD), a memory-efficient data structure that supports fast support search. We explain how to convert a table constraint into an MDD constraint and how to maintain GAC on the MDD constraint. Thanks to a sparse set data structure, our MDD-based GAC algorithm, mddc, achieves full incrementality in constant time. Our experiments on structured problems, car sequencing and still-life, show that mddc is a fast GAC algorithm for ad hoc constraints. It can replace a Boolean sequence constraint [1], and scales up well for structural MDD constraints with 208 variables and 340984 nodes. We also show why it is possible for mddc to be faster than the state-of-the-art generic GAC algorithms in [2–4]. Its efficiency on non-structural ad hoc constraints is justified empirically.
An experimental study of a parallel shortest path algorithm for solving large-scale graph instances
Ninth Workshop on Algorithm Engineering and Experiments (ALENEX 2007), 2007
"... We present an experimental study of the single source shortest path problem with nonnegative edge weights (NSSP) on largescale graphs using the $\Delta$stepping parallel algorithm. We report performance results on the Cray MTA2, a multithreaded parallel computer. The MTA2 is a highend shared m ..."
Abstract

Cited by 12 (3 self)
We present an experimental study of the single-source shortest path problem with non-negative edge weights (NSSP) on large-scale graphs using the Δ-stepping parallel algorithm. We report performance results on the Cray MTA-2, a multithreaded parallel computer. The MTA-2 is a high-end shared-memory system offering two unique features that aid the efficient parallel implementation of irregular algorithms: the ability to exploit fine-grained parallelism, and low-overhead synchronization primitives. Our implementation exhibits remarkable parallel speedup when compared with competitive sequential algorithms, for low-diameter sparse graphs. For instance, Δ-stepping on a directed scale-free graph of 100 million vertices and 1 billion edges takes less than ten seconds on 40 processors of the MTA-2, with a relative speedup of close to 30. To our knowledge, these are the first performance results of a shortest path problem on realistic graph instances in the order of billions of vertices and edges.
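For orientation, a sequential sketch of the Δ-stepping bucket structure; the paper's contribution is the parallel MTA-2 implementation, and the graph encoding and `delta` choice here are our own assumptions:

```python
# Delta-stepping groups tentative distances into buckets of width delta.
# Within the lowest non-empty bucket, light edges (weight <= delta) are
# relaxed repeatedly until the bucket empties; heavy edges of the nodes
# settled there are then relaxed once.  In the parallel version each
# frontier is processed concurrently.

import math
from collections import defaultdict

def delta_stepping(graph, source, delta):
    """graph: dict node -> list of (neighbor, weight); returns dict of
    shortest distances from source (non-negative weights assumed)."""
    dist = defaultdict(lambda: math.inf)
    buckets = defaultdict(set)

    def relax(v, d):
        if d < dist[v]:
            if dist[v] < math.inf:
                buckets[int(dist[v] // delta)].discard(v)
            dist[v] = d
            buckets[int(d // delta)].add(v)

    relax(source, 0)
    while buckets:
        i = min(buckets)
        settled = set()
        # Phase 1: exhaust the bucket over light edges; relaxations may
        # re-insert nodes into this same bucket.
        while buckets.get(i):
            frontier = buckets.pop(i)
            settled |= frontier
            for u in frontier:
                for v, w in graph.get(u, []):
                    if w <= delta:
                        relax(v, dist[u] + w)
        buckets.pop(i, None)
        # Phase 2: heavy edges once per settled node; their targets land
        # in strictly later buckets.
        for u in settled:
            for v, w in graph.get(u, []):
                if w > delta:
                    relax(v, dist[u] + w)
    return dict(dist)
```

With delta at one extreme (delta ≥ max path weight) this degenerates toward Bellman-Ford, and at the other (delta smaller than any weight) toward Dijkstra; the interesting regime is in between, where each bucket is a large parallel frontier.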
How efficient can memory checking be?
2008
"... We consider the problem of memory checking, where a user wants to maintain a large database on a remote server but has only limited local storage. The user wants to use the small (but trusted and secret) local storage to detect faults in the large (but public and untrusted) remote storage. A memory ..."
Abstract

Cited by 12 (1 self)
We consider the problem of memory checking, where a user wants to maintain a large database on a remote server but has only limited local storage. The user wants to use the small (but trusted and secret) local storage to detect faults in the large (but public and untrusted) remote storage. A memory checker receives from the user store and retrieve operations to the large database. The checker makes its own requests to the (untrusted) remote storage and receives answers to these requests. It then uses these responses, together with its small private and reliable local memory, to ascertain that all requests were answered correctly, or to report faults in the remote storage (the public memory). A fruitful line of research investigates the complexity of memory checking in terms of the number of queries the checker issues per user request (query complexity) and the size of the reliable local memory (space complexity). Blum et al., who first formalized the question, distinguished between online checkers (that report faults as soon as they occur) and offline checkers (that report faults only at the end of a long sequence of operations). In this work we revisit the question of memory checking, asking how efficient can memory checking be?