Results 11 - 20
of
35
Engineering an External Memory Minimum Spanning Tree Algorithm
- In Proc. 3rd IFIP Intl. Conf. on Theoretical Computer Science
, 2004
"... Abstract We develop an external memory algorithm for computing minimum spanning trees. The algorithm is considerably simpler than previously known external memory algorithms for this problem and needs a factor of at least four less I/Os for realistic inputs. Our implementation indicates that this al ..."
Abstract
-
Cited by 14 (2 self)
- Add to MetaCart
Abstract We develop an external memory algorithm for computing minimum spanning trees. The algorithm is considerably simpler than previously known external memory algorithms for this problem and needs a factor of at least four less I/Os for realistic inputs. Our implementation indicates that this algorithm processes graphs only limited by the disk capacity of most current machines in time no more than a factor 2–5 of a good internal algorithm with sufficient memory space.
Experimental Evaluation of a New Shortest Path Algorithm
- in ALENEX, 2002
, 2001
"... We evaluate the practical eciency of a new shortest path algorithm for undirected graphs which was developed by the rst two authors. This algorithm works on the fundamental comparison-addition model. ..."
Abstract
-
Cited by 11 (4 self)
- Add to MetaCart
We evaluate the practical eciency of a new shortest path algorithm for undirected graphs which was developed by the rst two authors. This algorithm works on the fundamental comparison-addition model.
Algorithms and Experiments: The New (and Old) Methodology
- J. Univ. Comput. Sci
, 2001
"... The last twenty years have seen enormous progress in the design of algorithms, but little of it has been put into practice. Because many recently developed algorithms are hard to characterize theoretically and have large running-time coefficients, the gap between theory and practice has widened over ..."
Abstract
-
Cited by 8 (4 self)
- Add to MetaCart
The last twenty years have seen enormous progress in the design of algorithms, but little of it has been put into practice. Because many recently developed algorithms are hard to characterize theoretically and have large running-time coefficients, the gap between theory and practice has widened over these years. Experimentation is indispensable in the assessment of heuristics for hard problems, in the characterization of asymptotic behavior of complex algorithms, and in the comparison of competing designs for tractable problems. Implementation, although perhaps not rigorous experimentation, was characteristic of early work in algorithms and data structures. Donald Knuth has throughout insisted on testing every algorithm and conducting analyses that can predict behavior on actual data; more recently, Jon Bentley has vividly illustrated the difficulty of implementation and the value of testing. Numerical analysts have long understood the need for standardized test suites to ensure robustness, precision and efficiency of numerical libraries. It is only recently, however, that the algorithms community has shown signs of returning to implementation and testing as an integral part of algorithm development. The emerging disciplines of experimental algorithmics and algorithm engineering have revived and are extending many of the approaches used by computing pioneers such as Floyd and Knuth and are placing on a formal basis many of Bentley's observations. We reflect on these issues, looking back at the last thirty years of algorithm development and forward to new challenges: designing cache-aware algorithms, algorithms for mixed models of computation, algorithms for external memory, and algorithms for scientific research.
MCSTL: The Multi-Core Standard Template Library
"... Abstract. 1 Future gain in computing performance will not stem from increased clock rates, but from even more cores in a processor. Since automatic parallelization is still limited to easily parallelizable sections of the code, most applications will soon have to support parallelism explicitly. The ..."
Abstract
-
Cited by 7 (3 self)
- Add to MetaCart
Abstract. 1 Future gain in computing performance will not stem from increased clock rates, but from even more cores in a processor. Since automatic parallelization is still limited to easily parallelizable sections of the code, most applications will soon have to support parallelism explicitly. The Multi-Core Standard Template Library (MCSTL) simplifies parallelization by providing efficient parallel implementations of the algorithms in the C++ Standard Template Library. Thus, simple recompilation will provide partial parallelization of applications that make consistent use of the STL. We present performance measurements on several architectures. For example, our sorter achieves a speedup of 21 on an 8-core 32-thread SUN T1. 1
Optimal incremental sorting
- In Proc. 8th Workshop on Algorithm Engineering and Experiments (ALENEX
, 2006
"... Let A be a set of size m. Obtaining the first k ≤ m elements of A in ascending order can be done in optimal O(m+k log k) time. We present an algorithm (online on k) which incrementally gives the next smallest element of the set, so that the first k elements are obtained in optimal time for any k. We ..."
Abstract
-
Cited by 7 (5 self)
- Add to MetaCart
Let A be a set of size m. Obtaining the first k ≤ m elements of A in ascending order can be done in optimal O(m+k log k) time. We present an algorithm (online on k) which incrementally gives the next smallest element of the set, so that the first k elements are obtained in optimal time for any k. We also give a practical algorithm with the same complexity on average, which improves in practice the existing online algorithm. As a direct application, we use our technique to implement Kruskal’s Minimum Spanning Tree algorithm, where our solution is competitive with the best current implementations. We finally show that our technique can be applied to several other problems, such as obtaining an interval of the sorted sequence and implementing heaps. 1
Scanning Multiple Sequences Via Cache Memory
- Algorithmica
, 2003
"... We consider the simple problem of scanning multiple sequences. There are k sequences of total length N which are to be scanned concurrently. One pointer into each sequence is maintained and an adversary specifies which pointer is to be advanced. The concept of scanning multiple sequence is ubiquitou ..."
Abstract
-
Cited by 5 (0 self)
- Add to MetaCart
We consider the simple problem of scanning multiple sequences. There are k sequences of total length N which are to be scanned concurrently. One pointer into each sequence is maintained and an adversary specifies which pointer is to be advanced. The concept of scanning multiple sequence is ubiquitous in algorithms designed for hierarchical memory.
On the Representation and Multiplication of Hypersparse Matrices
, 2008
"... Multicore processors are marking the beginning of a new era of computing where massive parallelism is available and necessary. Slightly slower but easy to parallelize kernels are becoming more valuable than sequentially faster kernels that are unscalable when parallelized. In this paper, we focus on ..."
Abstract
-
Cited by 4 (4 self)
- Add to MetaCart
Multicore processors are marking the beginning of a new era of computing where massive parallelism is available and necessary. Slightly slower but easy to parallelize kernels are becoming more valuable than sequentially faster kernels that are unscalable when parallelized. In this paper, we focus on the multiplication of sparse matrices (SpGEMM). We first present the issues with existing sparse matrix representations and multiplication algorithms that make them unscalable to thousands of processors. Then, we develop and analyze two new algorithms that overcome these limitations. We consider our algorithms first as the sequential kernel of a scalable parallel sparse matrix multiplication algorithm and second as part of a polyalgorithm for SpGEMM that would execute different kernels depending on the sparsity of the input matrices. Such a sequential kernel requires a new data structure that exploits the hypersparsity of the individual submatrices owned by a single processor after the 2D partitioning. We experimentally evaluate the performance and characteristics of our algorithms and show that they scale significantly better than existing kernels.
LEDA-SM: External Memory Algorithms and Data Structures in Theory and Practice
, 2001
"... Data to be processed has dramatically increased during the last years. Nowadays, external memory (mostly hard disks) has to be used to store this massive data. Algorithms and data structures that work on external memory have different properties and specialties that distinguish them from algorithms ..."
Abstract
-
Cited by 3 (0 self)
- Add to MetaCart
Data to be processed has dramatically increased during the last years. Nowadays, external memory (mostly hard disks) has to be used to store this massive data. Algorithms and data structures that work on external memory have different properties and specialties that distinguish them from algorithms and data structures, developed for the RAM model. In this thesis, we first explain the functionality of external memory, which is realized by disk drives. We then introduce the most important theoretical I/O models. In the main part, we present the C++ class library LEDA-SM. Library LEDA-SM is an extension of the LEDA library towards external memory computation and consists of a collection of algorithms and data structures that are designed to work efficiently in external memory. In the last two chapters, we present new external memory data structures for external memory priority queues and new external memory construction algorithms for suffix arrays. These new proposals are theoretically analyzed and experimentally tested. All proposals are implemented using the LEDA-SM library. Their efficiency is evaluated by performing a large number of experiments.
Experimental study of high performance priority queues, 2007. Undergraduate Honors Thesis
, 2007
"... The priority queue is a very important and widely used data structure in computer science, with a variety of applications including Dijkstra’s Single Source Shortest Path algorithm on sparse graph types. This study presents the experimental results of a variety of priority queues. The focus of the e ..."
Abstract
-
Cited by 2 (1 self)
- Add to MetaCart
The priority queue is a very important and widely used data structure in computer science, with a variety of applications including Dijkstra’s Single Source Shortest Path algorithm on sparse graph types. This study presents the experimental results of a variety of priority queues. The focus of the experiments is to measure the speed and performance of highly specialized priority queues in out-of-core and memory intensive situations. The priority queues are run in-core on small input sizes as well as out-of-core using large input sizes and restricted memory. The experiments compare a variety of well-known priority queue implementations such as Binary Heap with highly specialized implementations, such as 4-ary Aligned Heap, Chowdhury and Ramachandran’s Auxiliary Buffer Heap, and Fast Binary Heap. The experiments include Cache-Aware as well as Cache-Oblivious priority queues. The results indicate that the high-performance priority queues easily outperform traditional implementations. Also, overall the Auxiliary Buffer Heap has the best performance among the priority queues considered in most in-core and out-of-core situations. 1 1

