An Empirical Assessment of Algorithms for Constructing a Minimum Spanning Tree
, 1994
We address the question of theoretical vs. practical behavior of algorithms for the minimum spanning tree problem. We review the factors that influence the actual running time of an algorithm, from choice of language, machine, and compiler, through low-level implementation choices, to purely algorithmic issues. We discuss how to design a careful experimental comparison between various alternatives. Finally, we present the results from a study in which we used multiple languages, compilers, and machines; all the major variants of the comparison-based algorithms; and eight varieties of graphs in five families, with sizes of up to 0.5 million vertices (in sparse graphs) or 1.3 million edges (in dense graphs).
TREE COMPRESSION AND OPTIMIZATION WITH APPLICATIONS
, 1990
Different methods for compressing trees are surveyed and developed. Tree compression can be seen as a tradeoff between time and space, in which we can choose different strategies depending on whether we prefer better compression results or more efficient operations on the compressed structure. Of special interest is the case where space can be saved while preserving the functionality of the operations; this is called data optimization. The general compression scheme employed here consists of separate linearization of the tree structure and of the data stored in the tree. Some applications of the tree compression methods are also explored. These include the syntax-directed compression of program files, the compression of pixel trees, trie compaction, and dictionaries maintained as implicit data structures.
New upper bounds for pairing heaps
 In Scandinavian Workshop on Algorithm Theory (LNCS 1851)
, 2000
Pairing heaps are shown to have constant amortized time Insert and Meld, thus showing that pairing heaps have the same amortized runtimes as Fibonacci heaps for all operations but Decrease-key.
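For readers unfamiliar with the structure, the following is a minimal sketch of a textbook two-pass pairing heap supporting Insert, Meld, find-min, and delete-min. It is an illustration only, not the construction or analysis from the paper; the names `PairingNode`, `meld`, `insert`, `find_min`, and `delete_min` are hypothetical, and Decrease-key is omitted.

```python
class PairingNode:
    """One node of a pairing heap (hypothetical helper for this sketch)."""
    def __init__(self, key):
        self.key = key
        self.children = []  # newest child is stored last


def meld(a, b):
    """Link two heaps: the root with the larger key becomes a child of the other."""
    if a is None:
        return b
    if b is None:
        return a
    if b.key < a.key:
        a, b = b, a
    a.children.append(b)
    return a


def insert(root, key):
    """Insert is a meld with a one-node heap."""
    return meld(root, PairingNode(key))


def find_min(root):
    """The minimum is always at the root."""
    return root.key


def delete_min(root):
    """Remove the root, then recombine its children with two-pass pairing."""
    kids = root.children[::-1]  # visit the newest child first
    # First pass: meld the children in adjacent pairs.
    paired = []
    for i in range(0, len(kids), 2):
        second = kids[i + 1] if i + 1 < len(kids) else None
        paired.append(meld(kids[i], second))
    # Second pass: meld the paired trees from last to first.
    new_root = None
    for tree in reversed(paired):
        new_root = meld(tree, new_root)
    return root.key, new_root
```

In this sketch Insert and Meld each perform a single link, which is the behavior whose amortized cost the abstract addresses; all of the restructuring work is deferred to delete-min.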
An Empirical Analysis of Algorithms for Constructing a Minimum Spanning Tree
 DIMACS Series in Discrete Mathematics and Theoretical Computer Science
, 1991
We compare algorithms for the construction of a minimum spanning tree through large-scale experimentation on randomly generated graphs of different structures and different densities. In order to extrapolate with confidence, we use graphs with up to 130,000 nodes (sparse) or 750,000 edges (dense). Algorithms included in our experiments are Prim's algorithm (implemented with a variety of priority queues), Kruskal's algorithm (using presorting or demand sorting), Cheriton and Tarjan's algorithm, and Fredman and Tarjan's algorithm. We also ran a large variety of tests to investigate low-level implementation decisions for the data structures, as well as to enable us to eliminate the effect of compilers and architectures. Within the range of sizes used, Prim's algorithm, using pairing heaps or sometimes binary heaps, is clearly preferable. While versions of Prim's algorithm using efficient implementations of Fibonacci heaps or rank-relaxed heaps often approach and (on the densest graphs) so...
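As a point of reference for the binary-heap variant of Prim's algorithm mentioned above, here is a minimal sketch using Python's standard `heapq` module. It is not the implementation evaluated in the study: it assumes a connected undirected graph given as an adjacency dictionary, and it uses lazy deletion (stale heap entries are skipped) in place of a decrease-key operation. The function name `prim_mst_weight` and the input format are hypothetical.

```python
import heapq


def prim_mst_weight(adj, start):
    """Return the total weight of a minimum spanning tree.

    adj maps each vertex to a list of (neighbor, weight) pairs and is
    assumed to describe a connected undirected graph.
    """
    visited = {start}
    # Heap of (weight, vertex) candidates for edges leaving the tree.
    frontier = [(w, v) for v, w in adj[start]]
    heapq.heapify(frontier)
    total = 0
    while frontier and len(visited) < len(adj):
        w, v = heapq.heappop(frontier)
        if v in visited:
            continue  # stale entry: v joined the tree via a cheaper edge
        visited.add(v)
        total += w
        for u, wu in adj[v]:
            if u not in visited:
                heapq.heappush(frontier, (wu, u))
    return total


# Small example graph (made up for illustration).
example = {
    "a": [("b", 1), ("c", 3)],
    "b": [("a", 1), ("c", 2), ("d", 5)],
    "c": [("a", 3), ("b", 2), ("d", 4)],
    "d": [("b", 5), ("c", 4)],
}
print(prim_mst_weight(example, "a"))  # edges a-b, b-c, c-d: 1 + 2 + 4 = 7
```

Lazy deletion keeps the sketch short at the cost of a heap that can hold up to one entry per edge; swapping in a heap with a true decrease-key operation is the kind of priority-queue choice the experiments compare.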
High-Performance Algorithm Engineering for Computational Phylogenetics
 J. Supercomputing
, 2002
A phylogeny is the evolutionary history of a group of organisms; systematists (and other biologists) attempt to reconstruct this history from various forms of data about contemporary organisms. Phylogeny reconstruction is a crucial step in the understanding of evolution as well as an important tool in biological, pharmaceutical, and medical research. Phylogeny reconstruction from molecular data is very difficult: almost all optimization models give rise to NP-hard (and thus computationally intractable) problems. Yet approximations must be of very high quality in order to avoid outright biological nonsense. Thus many biologists have been willing to run farms of processors for many months in order to analyze just one dataset. High-performance algorithm engineering offers a battery of tools that can reduce, sometimes spectacularly, the running time of existing phylogenetic algorithms, as well as help designers produce better algorithms. We present an overview of algorithm engineering techniques, illustrating them with an application to the "breakpoint analysis" method of Sankoff et al., which resulted in the GRAPPA software suite. GRAPPA demonstrated a speedup in running time by over eight orders of magnitude over the original implementation on a variety of real and simulated datasets. We show how these algorithmic engineering techniques are directly applicable to a large variety of challenging combinatorial problems in computational biology.
A Practical Shortest Path Algorithm with Linear Expected Time
 SUBMITTED TO SIAM J. ON COMPUTING
, 2001
We present an improvement of the multilevel bucket shortest path algorithm of Denardo and Fox [9] and justify this improvement, both theoretically and experimentally. We prove that if the input arc lengths come from a natural probability distribution, the new algorithm runs in linear average time while the original algorithm does not. We also describe an implementation of the new algorithm. Our experimental data suggests that the new algorithm is preferable to the original one in practice. Furthermore, for integral arc lengths that fit into a word of today's computers, the performance is close to that of breadth-first search, suggesting limitations on further practical improvements.
Algorithms and Experiments: The New (and Old) Methodology
 J. Univ. Comput. Sci
, 2001
The last twenty years have seen enormous progress in the design of algorithms, but little of it has been put into practice. Because many recently developed algorithms are hard to characterize theoretically and have large running-time coefficients, the gap between theory and practice has widened over these years. Experimentation is indispensable in the assessment of heuristics for hard problems, in the characterization of asymptotic behavior of complex algorithms, and in the comparison of competing designs for tractable problems. Implementation, although perhaps not rigorous experimentation, was characteristic of early work in algorithms and data structures. Donald Knuth has throughout insisted on testing every algorithm and conducting analyses that can predict behavior on actual data; more recently, Jon Bentley has vividly illustrated the difficulty of implementation and the value of testing. Numerical analysts have long understood the need for standardized test suites to ensure robustness, precision and efficiency of numerical libraries. It is only recently, however, that the algorithms community has shown signs of returning to implementation and testing as an integral part of algorithm development. The emerging disciplines of experimental algorithmics and algorithm engineering have revived and are extending many of the approaches used by computing pioneers such as Floyd and Knuth and are placing on a formal basis many of Bentley's observations. We reflect on these issues, looking back at the last thirty years of algorithm development and forward to new challenges: designing cache-aware algorithms, algorithms for mixed models of computation, algorithms for external memory, and algorithms for scientific research.
Pairing heaps with O(log log n) decrease cost
 In 20th ACM-SIAM Symposium on Discrete Algorithms
, 2009
We give a variation of the pairing heaps for which the time bounds for all the operations match the lower bound proved by Fredman for a family of similar self-adjusting heaps. Namely, our heap structure requires O(1) for insert and find-min, O(log n) for delete-min, and O(log log n) for decrease-key and meld (all the bounds are in the amortized sense except for find-min).