Results 11-20 of 101
On a multivariate contraction method for random recursive structures with applications to Quicksort
, 2001
Cited by 33 (18 self)
The contraction method for recursive algorithms is extended to the multivariate analysis of vectors of parameters of recursive structures and algorithms. We prove a general multivariate limit law which also leads to an approach to asymptotic covariances and correlations of the parameters. As an application, the asymptotic correlations and a bivariate limit law for the number of key comparisons and exchanges of median-of-(2t + 1) Quicksort are given. Moreover, for the Quicksort programs analyzed by Sedgewick, the exact order of the standard deviation and a limit law follow, considering all the parameters counted by Sedgewick.
Optimal Sampling Strategies in Quicksort and Quickselect
 PROC. OF THE 25TH INTERNATIONAL COLLOQUIUM (ICALP98), VOLUME 1443 OF LNCS
, 1998
Cited by 29 (4 self)
It is well known that the performance of quicksort can be substantially improved by selecting the median of a sample of three elements as the pivot of each partitioning stage. This variant is easily generalized to samples of size s = 2k + 1. For large samples the partitions are better as the median of the sample makes a more accurate estimate of the median of the array to be sorted, but the amount of additional comparisons and exchanges to find the median of the sample also increases. We show that the optimal sample size to minimize the average total cost of quicksort (which includes both comparisons and exchanges) is s = a · √n + o(√n). We also give a closed expression for the constant factor a, which depends on the median-finding algorithm and the costs of elementary comparisons and exchanges. The result above holds in most situations, unless the cost of an exchange exceeds by far the cost of a comparison. In that particular case, it is better to select not the median of...
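As a rough illustration of the sampling idea in this abstract, the Python sketch below picks each pivot as the median of a random sample whose size is the largest odd number not exceeding a · √n. The function names, the default a = 1.0, and the use of sorted() to find the sample median are assumptions made for illustration; the paper's closed-form value of a is not reproduced here, and the sketch returns a sorted copy rather than partitioning in place.

```python
import math
import random


def sample_size(n, a=1.0):
    """Largest odd sample size not exceeding a * sqrt(n), at least 1.

    `a` is a placeholder constant, not the optimal value from the paper.
    """
    s = min(int(a * math.sqrt(n)), n)
    if s % 2 == 0:
        s -= 1  # keep the sample odd so it has a unique median
    return max(1, s)


def quicksort(items, a=1.0, rng=random):
    """Return a sorted copy, using the median of a random sample as pivot."""
    n = len(items)
    if n <= 1:
        return list(items)
    s = sample_size(n, a)
    # Simplification: sort the sample to find its median.
    sample = sorted(rng.sample(items, s))
    pivot = sample[s // 2]
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    return quicksort(smaller, a, rng) + equal + quicksort(larger, a, rng)
```

Calling quicksort(list_of_keys) sorts with the default constant; passing a larger a trades more work per pivot for more balanced partitions, which is the trade-off the abstract describes.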
Optimizing Sorting with Genetic Algorithms
 In The International Symposium on Code Generation and Optimization
, 2005
Improving Memory Performance of Sorting Algorithms
 ACM J. Exp. Algorithmics
, 2000
Cited by 22 (4 self)
Li Xiao, Xiaodong Zhang, and Stefan A. Kubricht
Existing restructured algorithms (e.g., [4]) mainly attempt to reduce capacity misses on direct-mapped caches. In this paper, we report substantial performance improvement obtained by further exploiting memory locality to reduce other types of cache misses, such as conflict misses and TLB misses. We present several restructured mergesort and quicksort algorithms and their implementations by fully using existing processor hardware facilities (such as cache associativity and TLB), by integrating tiling and padding techniques, and by properly partitioning the data set for cache op...
A Practical Quicksort Algorithm for Graphics Processors
, 2008
Cited by 18 (0 self)
In this paper we present GPUQuicksort, an efficient Quicksort algorithm suitable for highly parallel multicore graphics processors. Quicksort has previously been considered an inefficient sorting solution for graphics processors, but we show that GPUQuicksort often performs better than the fastest known sorting implementations for graphics processors, such as radix and bitonic sort. Quicksort can thus be seen as a viable alternative for sorting large quantities of data on graphics processors.
GPUQuicksort: A Practical Quicksort Algorithm for Graphics Processors
Cited by 14 (0 self)
In this paper we describe GPUQuicksort, an efficient Quicksort algorithm suitable for highly parallel multicore graphics processors. Quicksort has previously been considered an inefficient sorting solution for graphics processors, but we show that in CUDA, NVIDIA’s programming platform for general purpose computations on graphical processors, GPUQuicksort performs better than the fastest known sorting implementations for graphics processors, such as radix and bitonic sort. Quicksort can thus be seen as a viable alternative for sorting large quantities of data on graphics processors.
Use of Representative Operation Counts in Computational Testing of Algorithms
, 1996
Cited by 13 (1 self)
In the mathematical programming literature, researchers have conducted a large number of computational studies to assess the empirical behavior of various algorithms and have utilized CPU time as the primary measure of performance. CPU time has the following drawbacks as a measure of an algorithm's performance: it is implementation dependent, hard to replicate, and limited in the insight it provides into an algorithm's behavior. In this paper, we illustrate the notion of representative operation counts that can complement the conventional CPU time analysis and can help us (i) to identify the asymptotic bottleneck operations in an algorithm, (ii) to estimate an algorithm's running time for different problem sizes, and (iii) to obtain a fairer comparison of several algorithms. These concepts are easily incorporated into empirical studies and often yield valuable insights into an algorithm's behavior.
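To make the idea of operation counts concrete, here is a minimal Python sketch that instruments insertion sort to report key comparisons and element moves alongside its result. Insertion sort and the function name are choices made for this illustration only; the paper itself concerns mathematical programming algorithms and does not prescribe this code.

```python
def insertion_sort_counts(values):
    """Sort a copy of `values`; return (sorted_list, comparisons, moves).

    The two counters are implementation-independent measures that can be
    reported next to, or instead of, CPU time.
    """
    data = list(values)
    comparisons = 0
    moves = 0
    for i in range(1, len(data)):
        key = data[i]
        j = i - 1
        while j >= 0:
            comparisons += 1  # one key comparison per loop test
            if data[j] <= key:
                break
            data[j + 1] = data[j]  # shift the larger element right
            moves += 1
            j -= 1
        data[j + 1] = key
    return data, comparisons, moves
```

On a reversed input of n keys both counters equal n(n - 1)/2, and on an already sorted input the comparison count is n - 1 with no moves. Counts like these can be extrapolated to other problem sizes and reproduced exactly on any machine, unlike timings.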
A Simple, Fast Parallel Implementation of Quicksort and its Performance Evaluation on SUN Enterprise 10000
Cited by 12 (1 self)
This paper looks into the behavior of a simple, fine-grain parallel extension of Quicksort for cache-coherent shared address space multiprocessors. Quicksort has many nice properties: i) it is fast and general purpose; it is widely believed that Quicksort is the fastest general-purpose sorting algorithm, on average, and for a large number of elements [Blelloch et al. 1991; Dusseau et al. 1996; Helman et al. 1996b; Sohn and Kodama 1998], ii) it is in-place, iii) it exhibits good cache performance and iv) it is simple to implement. The new generation of hardware-coherent, shared address space multiprocessor systems with their already dominant position on the tightly-coupled multiprocessor systems are our target systems. The implementation of the parallel Quicksort algorithm utilizes the capabilities that these new systems have to offer and uses the following algorithmic techniques: Cache-efficient. Each processor tries to use all keys when sequentially passing through the keys of a cached block from the key array
Impact of PCI-Bus Load on Applications in a PC Architecture
, 2003
Cited by 12 (0 self)
Any data exchanged between the processor and main memory uses the memory bus, sharing it with data exchanged between I/O devices and main memory. If the processor and a device try to transfer data at the same time, an impact can be seen on the processor as well as on the device. As a result, the execution time of an application on the processor may increase due to the memory-bus load generated by I/O devices. In real-time environments, this impact can result in missed deadlines and a behavior that is different from that intended by the designer of the system. This paper
Transitional Behaviors of the Average Cost of Quicksort With Median-of-(2t + 1)
, 2001
Cited by 11 (6 self)
A fine analysis is given of the transitional behavior of the average cost of quicksort with median-of-three. Asymptotic formulae are derived for the stepwise improvement of the average cost of quicksort when iterating median-of-three k rounds for all possible values of k. The methods used are general enough to apply to quicksort with median-of-(2t + 1) and to explain in a precise manner the transitional behaviors of the average cost from insertion sort to quicksort proper. Our results also imply nontrivial bounds on the expected height, "saturation level", and width in a random locally balanced binary search tree.