Results 11 - 20
of
41
Cache-aware and cache-oblivious adaptive sorting
- In Proc. 32nd International Colloquium on Automata, Languages, and Programming, Lecture Notes in Computer Science
, 2005
"... Abstract. Two new adaptive sorting algorithms are introduced which perform an optimal number of comparisons with respect to the number of inversions in the input. The first algorithm is based on a new linear time reduction to (non-adaptive) sorting. The second algorithm is based on a new division pr ..."
Abstract
-
Cited by 7 (4 self)
- Add to MetaCart
Abstract. Two new adaptive sorting algorithms are introduced which perform an optimal number of comparisons with respect to the number of inversions in the input. The first algorithm is based on a new linear time reduction to (non-adaptive) sorting. The second algorithm is based on a new division protocol for the GenericSort algorithm by Estivill-Castro and Wood. From both algorithms we derive I/O-optimal cache-aware and cache-oblivious adaptive sorting algorithms. These are the first I/Ooptimal adaptive sorting algorithms. 1
A meticulous analysis of mergesort programs
- in Proceedings of the 3rd Italian Conference on Algorithms and Complexity, Lecture Notes in Computer Science 1203, Springer-Verlag
, 1997
"... Abstract. The efficiency of mergesort programs is analysed under a simple unit-cost model. In our analysis the time performance of the sorting programs includes the costs of key comparisons, element moves and address calcula-tions. The goal is to establish the best possible time-bound relative to th ..."
Abstract
-
Cited by 6 (1 self)
- Add to MetaCart
Abstract. The efficiency of mergesort programs is analysed under a simple unit-cost model. In our analysis the time performance of the sorting programs includes the costs of key comparisons, element moves and address calcula-tions. The goal is to establish the best possible time-bound relative to the model when sorting n integers. By the well-known information-theoretic ar-gument n log 2 n- O(n) is a lower bound for the integer-sorting problem in our framework. New implementations for two-way and four-way bottom-up mergesort are given, the worst-case complexities of which are shown to be bounded by 5.5nlog 2 n + O(n) and 3.25nlog 2 n + O(n), respectively. The theoretical findings are backed up with a series of experiments which show the practical relevance of our analysis when implementing library routines for internal-memory computations. 1
Affordable Fault Tolerance through Adaptation
- Parallel and Distributed Processing, Lecture Notes in Computer Science 1388
, 1998
"... . Fault-tolerant programs are typically not only difficult to implement but also incur extra costs in terms of performance or resource consumption. Failures are typically relatively rare but the fault-tolerance overhead must be paid regardless if any failures occur during the program execution. This ..."
Abstract
-
Cited by 6 (3 self)
- Add to MetaCart
. Fault-tolerant programs are typically not only difficult to implement but also incur extra costs in terms of performance or resource consumption. Failures are typically relatively rare but the fault-tolerance overhead must be paid regardless if any failures occur during the program execution. This paper presents an approach that reduces the cost of fault-tolerance, namely, adaptations to a change in failure model. In particular, a program that assumes no failures (or only benign failures) is combined with a component that is responsible for detecting if failures occur and then switching to a fault-tolerant algorithm. Provided that the detection and adaptation mechanisms are not too expensive, this approach results in a program with smaller fault-tolerance overhead and thus a better performance than a traditional fault-tolerant program. Thus, the high cost of fault-tolerance is only paid when failures actually occur. 1 Introduction Fault tolerance is becoming increasingly important w...
Numerical Representations as Higher-Order Nested Datatypes
, 1998
"... Number systems serve admirably as templates for container types: a container object of size n is modelled after the representation of the number n and operations on container objects are modelled after their number-theoretic counterparts. Binomial queues are probably the first data structure that wa ..."
Abstract
-
Cited by 5 (2 self)
- Add to MetaCart
Number systems serve admirably as templates for container types: a container object of size n is modelled after the representation of the number n and operations on container objects are modelled after their number-theoretic counterparts. Binomial queues are probably the first data structure that was designed with this analogy in mind. In this paper we show how to express these so-called numerical representations as higher-order nested datatypes. A nested datatype allows to capture the structural invariants of a numerical representation, so that the violation of an invariant can be detected at compile-time. We develop a programming method which allows to adapt algorithms to the new representation in a mostly straightforward manner. The framework is employed to implement three different container types: binary random-access lists, binomial queues, and 2-3 finger search trees. The latter data structure, which is treated in some depth, can be seen as the main innovation from a data-struct...
Algorithm Selection for Sorting and Probabilistic Inference: A Machine Learning-Based Approach
- KANSAS STATE UNIVERSITY
, 2003
"... The algorithm selection problem aims at selecting the best algorithm for a given computational problem instance according to some characteristics of the instance. In this dissertation, we first introduce some results from theoretical investigation of the algorithm selection problem. We show, by Rice ..."
Abstract
-
Cited by 5 (0 self)
- Add to MetaCart
The algorithm selection problem aims at selecting the best algorithm for a given computational problem instance according to some characteristics of the instance. In this dissertation, we first introduce some results from theoretical investigation of the algorithm selection problem. We show, by Rice's theorem, the nonexistence of an automatic algorithm selection program based only on the description of the input instance and the competing algorithms. We also describe an abstract theoretical framework of instance hardness and algorithm performance based on Kolmogorov complexity to show that algorithm selection for search is also incomputable. Driven by the theoretical results, we propose a machine learning-based inductive approach using experimental algorithmic methods and machine learning techniques to solve the algorithm selection problem. Experimentally, we have
On the adaptiveness of quicksort
- IN: WORKSHOP ON ALGORITHM ENGINEERING & EXPERIMENTS, SIAM
, 2005
"... Quicksort was first introduced in 1961 by Hoare. Many variants have been developed, the best of which are among the fastest generic sorting algorithms available, as testified by the choice of Quicksort as the default sorting algorithm in most programming libraries. Some sorting algorithms are adapti ..."
Abstract
-
Cited by 5 (1 self)
- Add to MetaCart
Quicksort was first introduced in 1961 by Hoare. Many variants have been developed, the best of which are among the fastest generic sorting algorithms available, as testified by the choice of Quicksort as the default sorting algorithm in most programming libraries. Some sorting algorithms are adaptive, i.e. they have a complexity analysis which is better for inputs which are nearly sorted, according to some specified measure of presortedness. Quicksort is not among these, as it uses Ω(n log n) comparisons even when the input is already sorted. However, in this paper we demonstrate empirically that the actual running time of Quicksort is adaptive with respect to the presortedness measure Inv. Differences close to a factor of two are observed between instances with low and high Inv value. We then show that for the randomized version of Quicksort, the number of element swaps performed is provably adaptive with respect to the measure Inv. More precisely, we prove that randomized Quicksort performs expected O(n(1+log(1+ Inv/n))) element swaps, where Inv denotes the number of inversions in the input sequence. This result provides a theoretical explanation for the observed behavior, and gives new insights on the behavior of the Quicksort algorithm. We also give some empirical results on the adaptive behavior of Heapsort and Mergesort.
Optimal adaptive algorithms for finding the nearest and farthest point on a parametric black-box curve
- In Proceedings of the 20th Annual ACM Symposium on Computational Geometry
, 2004
"... We consider a general model for representing and manipulating parametric curves, in which a curve is specified by a black box mapping a parameter value between 0 and 1 to a point in Euclidean d-space. In this model, we consider the nearest-point-on-curve and farthest-point-on-curve problems: given a ..."
Abstract
-
Cited by 4 (2 self)
- Add to MetaCart
We consider a general model for representing and manipulating parametric curves, in which a curve is specified by a black box mapping a parameter value between 0 and 1 to a point in Euclidean d-space. In this model, we consider the nearest-point-on-curve and farthest-point-on-curve problems: given a curve C and a point p, find a point on C nearest to p or farthest from p. In the general black-box model, no algorithm can solve these problems. Assuming a known bound on the speed of the curve (a Lipschitz condition), the answer can be estimated up to an additive error of ε using O(1/ε) samples, and this bound is tight in the worst case. However, many instances can be solved with substantially fewer samples, and we give algorithms that adapt to the inherent difficulty of the particular instance, up to a logarithmic factor. More precisely, if OPT(C, p, ε) is the minimum number of samples of C that every correct algorithm must perform to achieve tolerance ε, then our algorithm performs O(OPT(C, p, ε) log(ε −1 /OPT(C, p, ε))) samples. Furthermore, any algorithm requires Ω(k log(ε −1 /k)) samples for some instance C ′ with OPT(C ′ , p, ε) = k; except that, for the nearest-point-on-curve problem when the distance between C and p is less than ε, OPT is 1 but the upper and lower bounds on the number of samples are both Θ(1/ε). When bounds on relative error are desired, we give algorithms that perform O(OPT · log(2 + (1 + ε −1) · m −1 /OPT)) samples (where m is the exact minimum or maximum distance from p to C) and prove that Ω(OPT·log(1/ε)) samples are necessary on some problem instances. 1.
An Efficient Reference-based Approach to Outlier Detection in Large Datasets
"... A bottleneck to detecting distance and density based outliers is that a nearest-neighbor search is required for each of the data points, resulting in a quadratic number of pairwise distance evaluations. In this paper, we propose a new method that uses the relative degree of density with respect to a ..."
Abstract
-
Cited by 4 (0 self)
- Add to MetaCart
A bottleneck to detecting distance and density based outliers is that a nearest-neighbor search is required for each of the data points, resulting in a quadratic number of pairwise distance evaluations. In this paper, we propose a new method that uses the relative degree of density with respect to a fixed set of reference points to approximate the degree of density defined in terms of nearest neighbors of a data point. The running time of our algorithm based on this approximation is O(Rn log n) where n is the size of dataset and R is the number of reference points. Candidate outliers are ranked based on the outlier score assigned to each data point. Theoretical analysis and empirical studies show that our method is effective, efficient, and highly scalable to very large datasets. 1
Adaptive comparison-based algorithms for evaluating set queries
, 2004
"... c○Mehdi Mirzazadeh 2004I hereby declare that I am the sole author of this thesis. This is a true copy of the thesis, including any required final revisions, as accepted by my examiners. I understand that my thesis may be made electronically available to the public. In this thesis we study a problem ..."
Abstract
-
Cited by 3 (1 self)
- Add to MetaCart
c○Mehdi Mirzazadeh 2004I hereby declare that I am the sole author of this thesis. This is a true copy of the thesis, including any required final revisions, as accepted by my examiners. I understand that my thesis may be made electronically available to the public. In this thesis we study a problem that arises in answering boolean queries submitted to a search engine. Usually a search engine stores the set of IDs of documents containing each word in a pre-computed sorted order and to evaluate a query like “computer AND science ” the search engine has to evaluate the union of the sets of documents containing the words “computer ” and “science”. More complex queries will result in more complex set expressions. In this thesis we consider the problem of evaluation of a set expression with union and intersection as operators and ordered sets as operands. We explore properties of comparison-based algorithms for the problem. A proof of a set expression is the set of comparisons that a comparison-based algorithm performs before it can determine the result of the expression. We discuss the properties of the proofs of set expressions and based on how complex the smallest proofs of a set expression E are, we define a measurement for determining how difficult it is for E to be computed. Then, we design an algorithm that is adaptive to the difficulty of the input expression and we show that the running time of the algorithm is roughly proportional to difficulty of the input expression, where the factor is roughly logarithmic in the number of the operands of the input expression. Key words: Adaptive algorithm, comparison-based algorithm, search engines, algorithms.
Adaptive search algorithm for patterns, in succinctly encoded XML
, 2006
"... Abstract. We propose an adaptive algorithm for context queries (queries expressed as preorder and ancestordescendant relations on labeled nodes), which can be used to find patterns in XML documents. Our algorithm takes advantage of the correlation between terms of the query without any preprocessed ..."
Abstract
-
Cited by 2 (2 self)
- Add to MetaCart
Abstract. We propose an adaptive algorithm for context queries (queries expressed as preorder and ancestordescendant relations on labeled nodes), which can be used to find patterns in XML documents. Our algorithm takes advantage of the correlation between terms of the query without any preprocessed information, and it runs in time (kd(lg lg min(n,s)+lg lg(r))) in the RAM model, where k is the number of terms in the query, d is the non-deterministic complexity of the query on the multi-labeled tree (i.e. the minimum number of operations required to check the answer to the query), n is the number of nodes in the tree, s is the number of relations between nodes and labels, and r is the maximal number of nodes matching a label on any rooted path in the tree.

