Results 1–10 of 11
Efficient Single-Pass Index Construction for Text Databases
 Journal of the American Society for Information Science and Technology
, 2003
Abstract

Cited by 53 (2 self)
Efficient construction of inverted indexes is essential to the provision of search over large collections of text data. In this paper, we review the principal approaches to inversion, analyse their theoretical cost, and present experimental results. We identify the drawbacks of existing inversion approaches and propose a single-pass inversion method that, in contrast to previous approaches, does not require the complete vocabulary of the indexed collection in main memory, can operate within limited resources, and does not sacrifice speed with high temporary storage requirements. We show that the performance of the single-pass approach can be improved by constructing inverted files in segments, reducing the cost of disk accesses during inversion of large volumes of data.
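As a rough illustration of the single-pass idea described in this abstract, the sketch below buffers postings in memory, flushes a sorted run whenever a (simulated) memory budget is exhausted, and merges the runs at the end. The function name, the budget parameter, and the use of in-memory lists as stand-ins for on-disk runs are illustrative assumptions, not the paper's implementation:

```python
from collections import defaultdict
import heapq

def invert_single_pass(docs, memory_limit=1000):
    """Single-pass inversion sketch: buffer postings in memory, flush a
    sorted run when the budget is hit, then merge the runs.
    `memory_limit` counts buffered postings, standing in for RAM; runs
    are kept as lists here where the paper would write them to disk."""
    runs = []                       # each run: sorted list of (term, doc_id)
    buffer = defaultdict(list)
    buffered = 0

    def flush():
        nonlocal buffered
        runs.append([(t, d) for t in sorted(buffer) for d in buffer[t]])
        buffer.clear()
        buffered = 0

    for doc_id, text in enumerate(docs):
        for term in text.split():
            buffer[term].append(doc_id)
            buffered += 1
            if buffered >= memory_limit:
                flush()
    if buffered:
        flush()

    # Merge the sorted runs into the final inverted index,
    # dropping duplicate postings for the same (term, doc) pair.
    index = defaultdict(list)
    for term, doc_id in heapq.merge(*runs):
        if not index[term] or index[term][-1] != doc_id:
            index[term].append(doc_id)
    return dict(index)
```

Because each run is flushed in sorted order, the final merge is a cheap k-way merge rather than a full re-sort, which is the source of the method's bounded-memory behaviour.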
Automated Configuration of Algorithms for Solving Hard Computational Problems
, 2009
Abstract

Cited by 29 (11 self)
The best-performing algorithms for many hard problems are highly parameterized. Selecting the best heuristics and tuning their parameters for optimal overall performance is often a difficult, tedious, and unsatisfying task. This thesis studies the automation of this important part of algorithm design: the configuration of discrete algorithm components and their continuous parameters to construct an algorithm with desirable empirical performance characteristics. Automated configuration procedures can facilitate algorithm development and be applied on the end-user side to optimize performance for new instance types and optimization objectives. The use of such procedures separates high-level cognitive tasks carried out by humans from tedious low-level tasks that can be left to machines. We introduce two alternative algorithm configuration frameworks: iterated local search in parameter configuration space and sequential optimization based on response surface models. To the best of our knowledge, our local search approach is the first that goes beyond local optima. Our model-based search techniques significantly outperform existing techniques and extend them in ways crucial for general algorithm configuration: they can handle categorical parameters, optimization objectives defined across multiple instances, and tens of thousands ...
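The first of the two frameworks, iterated local search in parameter configuration space, can be sketched in miniature as follows. The one-exchange neighbourhood, perturbation step, and acceptance rule below are simple illustrative choices under assumed names, not necessarily the thesis's actual procedure:

```python
import random

def iterated_local_search(space, cost, iters=20, seed=0):
    """Iterated local search over a discrete parameter space (sketch).
    `space`: {param: [candidate values]}; `cost`: config dict -> float
    to minimise (e.g. mean runtime over a set of benchmark instances)."""
    rng = random.Random(seed)

    def neighbours(cfg):
        # One-exchange neighbourhood: change a single parameter value.
        for p, vals in space.items():
            for v in vals:
                if v != cfg[p]:
                    yield {**cfg, p: v}

    def local_search(cfg):
        # First-improvement descent until no neighbour is better.
        improved = True
        while improved:
            improved = False
            for nb in neighbours(cfg):
                if cost(nb) < cost(cfg):
                    cfg, improved = nb, True
                    break
        return cfg

    best = local_search({p: rng.choice(v) for p, v in space.items()})
    for _ in range(iters):
        # Perturb a couple of parameters, re-descend, accept if better.
        cand = dict(best)
        for p in rng.sample(list(space), k=min(2, len(space))):
            cand[p] = rng.choice(space[p])
        cand = local_search(cand)
        if cost(cand) < cost(best):
            best = cand
    return best
```

The perturbation step is what lets the search escape the local optima that plain hill-climbing gets stuck in, which is the property the abstract highlights.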
Automatic Array Algorithm Animation in C++
, 1999
Abstract

Cited by 11 (1 self)
This paper describes an elegant method for automatically animating an arbitrary array algorithm in C++.
Computer-aided design of high-performance algorithms
, 2008
Abstract

Cited by 9 (7 self)
High-performance algorithms play an important role in many areas of computer science and are core components of many software systems used in real-world applications. Traditionally, the creation of these algorithms requires considerable expertise and experience, often in combination with a substantial amount of trial and error. Here, we outline a new approach to the process of designing high-performance algorithms that is based on the use of automated procedures for exploring potentially very large spaces of candidate designs. We contrast this computer-aided design approach with the traditional approach and discuss why it can be expected to yield better-performing, yet simpler algorithms. Finally, we sketch out the high-level design of a software environment that supports our new design approach. Existing work on algorithm portfolios, algorithm selection, algorithm configuration and parameter tuning, but also on general methods for discrete and continuous optimisation, fits naturally into our design approach and can be integrated into the proposed software environment.
Efficient Trie-Based Sorting of Large Sets of Strings
 Proceedings of the Australasian Computer Science Conference
, 2003
Abstract

Cited by 5 (2 self)
Sorting is a fundamental algorithmic task. Many general-purpose sorting algorithms have been developed, but efficiency gains can be achieved by designing algorithms for specific kinds of data, such as strings. In previous work we have shown that our burstsort, a trie-based algorithm for sorting strings, is, for large data sets, more efficient than all previous algorithms for this task. In this paper we re-evaluate some of the implementation details of burstsort, in particular the method for managing buckets held at leaves. We show that a better choice of data structures further improves the efficiency, at a small additional cost in memory. For sets of around 30,000,000 strings, our improved burstsort is nearly twice as fast as the previous best sorting algorithm.
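For readers unfamiliar with burstsort, a minimal sketch of the trie-of-buckets idea follows: strings are routed into buckets indexed by successive characters, a bucket that grows past a threshold is "burst" into child buckets on the next character, and small buckets are finished with a conventional sort. The plain-list bucket representation here is a deliberate simplification of the data structures the paper evaluates:

```python
def burstsort(strings, threshold=4):
    """Simplified burstsort sketch. A trie node is a dict mapping a
    character to either a bucket (list of strings) or a child node;
    the "" key holds strings that end at that node's depth."""
    root = {}

    def insert(node, s, depth):
        if len(s) == depth:
            node.setdefault("", []).append(s)   # string ends here
            return
        c = s[depth]
        slot = node.setdefault(c, [])
        if isinstance(slot, list):
            slot.append(s)
            if len(slot) > threshold:           # burst the bucket
                child = {}
                for t in slot:
                    insert(child, t, depth + 1)
                node[c] = child
        else:
            insert(slot, s, depth + 1)

    for s in strings:
        insert(root, s, 0)

    out = []
    def collect(node):
        # "" sorts before any character, so end-of-string strings
        # are emitted first, preserving lexicographic order.
        for c in sorted(node):
            slot = node[c]
            if isinstance(slot, list):
                out.extend(sorted(slot))        # finish small buckets
            else:
                collect(slot)
    collect(root)
    return out
```

The cache-efficiency argument behind burstsort is that each string's characters are touched only while it sits in a small bucket, rather than repeatedly across the whole array as in comparison sorts.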
Partitioning schemes for quicksort and quickselect
, 2003
Abstract

Cited by 1 (0 self)
We introduce several modifications of the partitioning schemes used in Hoare's quicksort and quickselect algorithms, including ternary schemes which identify keys less than or greater than the pivot. We give estimates for the numbers of swaps made by each scheme. Our computational experiments indicate that ternary schemes allow quickselect to identify all keys equal to the selected key at little additional cost. Key words: sorting, selection, quicksort, quickselect, partitioning.
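A ternary (three-way) partition of the kind this abstract refers to can be sketched inside quickselect as follows. This uses the classic Dijkstra-style three-way scheme for illustration, not necessarily any of the paper's specific swap-saving variants:

```python
import random

def quickselect(a, k):
    """Quickselect with a ternary partition: one pass places keys
    < pivot on the left, keys == pivot in the middle, keys > pivot on
    the right, so all keys equal to the selected key are identified
    together. Selects the k-th smallest (0-based); mutates `a`."""
    lo, hi = 0, len(a) - 1
    while True:
        pivot = a[random.randint(lo, hi)]
        lt, i, gt = lo, lo, hi          # invariant: a[lo..lt-1] < pivot,
        while i <= gt:                  # a[lt..i-1] == pivot, a[gt+1..hi] > pivot
            if a[i] < pivot:
                a[lt], a[i] = a[i], a[lt]
                lt += 1; i += 1
            elif a[i] > pivot:
                a[i], a[gt] = a[gt], a[i]
                gt -= 1
            else:
                i += 1
        if k < lt:
            hi = lt - 1
        elif k > gt:
            lo = gt + 1
        else:
            return a[k]                 # a[lt..gt] all equal the answer
```

When the selected key has duplicates, the whole run a[lt..gt] equals it after the final partition, which is exactly the "identify all keys equal to the selected key" property the experiments measure.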
On the Adaptiveness of Quicksort. Gerth Stølting Brodal, Rolf Fagerberg, Gabriel Moruz
, 2004
Abstract
Quicksort was first introduced in 1961 by Hoare. Many variants have been developed, the best of which are among the fastest generic sorting algorithms available, as testified by the choice of Quicksort as the default sorting algorithm in most programming libraries. Some sorting algorithms are adaptive, i.e. they have a complexity analysis which is better for inputs which are nearly sorted, according to some specified measure of presortedness. Quicksort is not among these, as it uses Ω(n log n) comparisons even when the input is already sorted. However, in this paper we demonstrate empirically that the actual running time of Quicksort is adaptive with respect to the presortedness measure Inv. Differences close to a factor of two are observed between instances with low and high Inv value. We then show that for the randomized version of Quicksort, the number of element swaps performed is provably adaptive with respect to the measure Inv. More precisely, we prove that randomized Quicksort performs expected O(n(1 + log(1 + Inv/n))) element swaps, where Inv denotes the number of inversions in the input sequence. This result provides a theoretical explanation ...
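The presortedness measure Inv used in this abstract counts the number of out-of-order pairs (inversions) in the input. A compact way to compute it, useful for reproducing the kind of low-Inv versus high-Inv experiment described above, is to count cross-inversions during a merge sort; this helper is illustrative and not from the paper:

```python
def inversions(a):
    """Count Inv, the number of pairs (i, j) with i < j and a[i] > a[j],
    in O(n log n) by counting cross-inversions during a merge sort."""
    def sort_count(xs):
        if len(xs) <= 1:
            return xs, 0
        mid = len(xs) // 2
        left, inv_l = sort_count(xs[:mid])
        right, inv_r = sort_count(xs[mid:])
        merged, inv, i, j = [], inv_l + inv_r, 0, 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                merged.append(left[i]); i += 1
            else:
                merged.append(right[j]); j += 1
                inv += len(left) - i   # every remaining left item is inverted
        merged += left[i:] + right[j:]
        return merged, inv
    return sort_count(list(a))[1]
```

A sorted input has Inv = 0 and a reversed input has Inv = n(n-1)/2, so the paper's O(n(1 + log(1 + Inv/n))) swap bound interpolates between O(n) for nearly sorted data and the usual O(n log n) behaviour.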