## On sorting, heaps, and minimum spanning trees

Venue: | Algorithmica |

Citations: | 1 - 1 self |

### BibTeX

@ARTICLE{Navarro_onsorting,

author = {Gonzalo Navarro and Rodrigo Paredes},

title = {On sorting, heaps, and minimum spanning trees},

journal = {Algorithmica},

year = {2010}

}

### Abstract

Let A be a set of size m. Obtaining the first k ≤ m elements of A in ascending order can be done in optimal O(m + k log k) time. We present Incremental Quicksort (IQS), an algorithm (online on k) which incrementally gives the next smallest element of the set, so that the first k elements are obtained in optimal expected time for any k. Based on IQS, we present the Quickheap (QH), a simple and efficient priority queue for main and secondary memory. Quickheaps are comparable with classical binary heaps in simplicity, yet are more cache-friendly. This makes them an excellent alternative for a secondary memory implementation. We show that the expected amortized CPU cost per operation over a Quickheap of m elements is O(log m), and this translates into O((1/B)log(m/M)) I/O cost with main memory size M and block size B, in a cache-oblivious fashion. As a direct application, we use our techniques to implement classical Minimum Spanning Tree (MST) algorithms. We use IQS to implement Kruskal’s MST algorithm and QHs to implement Prim’s. Experimental results show that IQS, QHs, external QHs, and our Kruskal’s and Prim’s MST variants are competitive, and in many cases better in practice than current state-of-the-art alternative (and much more sophisticated) implementations.
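The incremental idea in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the usual description of IQS as Quickselect that keeps its pivot positions on a stack so later calls can reuse them. The names `partition` and `incremental_quicksort`, the Lomuto-style partition, and the uniform random pivot choice are all choices made for this sketch.

```python
import random


def partition(a, pivot_idx, lo, hi):
    """Partition a[lo..hi] around a[pivot_idx]; return the pivot's final index."""
    a[pivot_idx], a[hi] = a[hi], a[pivot_idx]
    pivot = a[hi]
    store = lo
    for i in range(lo, hi):
        if a[i] < pivot:
            a[i], a[store] = a[store], a[i]
            store += 1
    a[store], a[hi] = a[hi], a[store]
    return store


def incremental_quicksort(a, rng=random):
    """Lazily yield the elements of `a` in ascending order (reorders `a` in place)."""
    # The stack holds positions of pivots already in their final place.
    # The sentinel len(a) acts as a fictitious pivot past the end.
    stack = [len(a)]
    for idx in range(len(a)):
        # Quickselect for the minimum of a[idx .. stack[-1] - 1],
        # remembering every pivot position for later calls.
        while stack[-1] != idx:
            p = rng.randint(idx, stack[-1] - 1)
            p = partition(a, p, idx, stack[-1] - 1)
            stack.append(p)
        stack.pop()  # a[idx] is a pivot, hence the smallest remaining element
        yield a[idx]
```

Because the function is a generator, the caller decides when to stop: for example, `itertools.islice(incremental_quicksort(data), k)` pulls only the first k elements and leaves the rest of the array partially partitioned.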

### Citations

8530 | Introduction to algorithms
- Cormen, Leiserson, et al.
- 2001
Citation Context ...y of an arbitrary element (increaseKey and decreaseKey, respectively), delete an arbitrary element from the priority queue (delete), and so on. The classic PQ implementation uses a binary heap [42, 11]. Wegener [41] proposes a bottom-up deletion algorithm, which addresses operation extractMin performing only log₂ m + O(1) key comparisons per extraction on average, in heaps of m elements. Other well... |

2362 | Modern Information Retrieval
- Baeza-Yates, Ribeiro-Neto
- 1999
Citation Context ...llest elements from a fixed set without knowing how many elements we will end up needing. Prominent examples are Kruskal’s Minimum Spanning Tree (MST) algorithm [24] and ranking by Web search engines [3]. Given a graph, Kruskal’s MST algorithm processes the edges one by one, from smallest to largest, until it forms the MST. At this point, remaining edges are not considered. Web search engines display... |

1681 | R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing
- R Development Core Team
Citation Context ...were coded in C++, and compiled with g++ version 3.3.6 optimized with -O3. For each experimental datum shown, we averaged over 50 repetitions. The weighted least-squares fittings were performed with R [37]. In order to illustrate the precision of our fittings, we also show the average percent error of residuals with respect to real values, |(y − ŷ)/y| · 100%, for fittings belonging to around the large... |

729 | Amortized efficiency of list update and paging rules - Sleator, Tarjan - 1985 |

603 | Data Structures and Network Algorithms
- Tarjan
- 1983
Citation Context ...orithms to solve this problem are Kruskal’s [24] and Prim’s [31], whose basic versions have complexity O(m log m) and O(n²), respectively. There are several other MST algorithms compiled by Tarjan [35]. Recently, Chazelle [9] gave an O(mα(m,n)) time algorithm. Later, Pettie and Ramachandran [30] proposed an algorithm that runs in optimal time O(T*(m,n)), where T*(m,n) is the minimum number of e... |

575 | Fibonacci heaps and their uses in improved network optimization algorithms
- Fredman, Tarjan
- 1987
Citation Context ...ctMin performing only log₂ m + O(1) key comparisons per extraction on average, in heaps of m elements. Other well-known priority queues are sequence heaps [32], binomial queues [40], Fibonacci heaps [17], pairing heaps [16], skew heaps [34], and van Emde Boas queues [38]. All are based on binary comparisons, except the last, which handles an integer universe [0,m]. 1.2 External Memory Priority Queue... |

531 | Shortest connection networks and some generalizations
- Prim
- 1957
Citation Context ...enarios. In fact, we plug them in the classic Minimum Spanning Tree (MST) techniques: We use incremental quicksort to boost Kruskal’s MST algorithm [24], and a quickheap to boost Prim’s MST algorithm [31]. Given a random graph G(V,E), we compute its MST in O(|E| + |V| log² |V|) average time. Experimental results show that IQS, QHs, external QHs and our Kruskal’s and Prim’s MST variants are extremely... |

436 | On the Shortest Spanning Subtree of a Graph and the Travelling Salesman Problem
- Kruskal
- 1956
Citation Context ...e cases where we need to obtain the smallest elements from a fixed set without knowing how many elements we will end up needing. Prominent examples are Kruskal’s Minimum Spanning Tree (MST) algorithm [24] and ranking by Web search engines [3]. Given a graph, Kruskal’s MST algorithm processes the edges one by one, from smallest to largest, until it forms the MST. At this point, remaining edges are not ... |

369 | Time bounds for selection
- Blum, Floyd, et al.
- 1973
Citation Context ...A of m numbers and an integer k ≤ m, output the smallest k elements of A in ascending order. This can be easily solved by first finding the k-th smallest element of A using the O(m)-time Select algorithm [5], and then collecting and sorting the elements smaller than the k-th element. The resulting complexity, O(m + k log k), is optimal under the comparison model, as every cell must be inspected and there... |

320 | External Memory Algorithms and Data Structures: Dealing with Massive Data
- Vitter
- 2001
Citation Context ... the queue. Some external memory PQs are buffer trees [2, 20], M/B-ary heaps [25, 14], and Array Heaps [8], all of which achieve the lower bound of Θ((1/B) log_{M/B}(m/B)) amortized I/Os per operation [39]. Those structures, however, are rather complex to implement and heavyweight in practice (in extra space and time) [6]. Other techniques are simple but do not perform so well (in theory or in practice... |

199 | Design and implementation of an efficient priority queue
- van Emde Boas, Kaas, et al.
- 1977
Citation Context ...on average, in heaps of m elements. Other well-known priority queues are sequence heaps [32], binomial queues [40], Fibonacci heaps [17], pairing heaps [16], skew heaps [34], and van Emde Boas queues [38]. All are based on binary comparisons, except the last, which handles an integer universe [0,m]. 1.2 External Memory Priority Queues When working in the secondary memory scenario, we assume that we h... |

152 | Organization and maintenance of large ordered indices
- Bayer, McCreight
- 1970
Citation Context ...ther complex to implement and heavyweight in practice (in extra space and time) [6]. Other techniques are simple but do not perform so well (in theory or in practice), for example those using B-trees [4]. A practical comparison of existing secondary memory PQs was carried out by Brengel et al. [6], where in addition they adapt two-level radix heaps [1] to secondary memory (R-Heaps), and also simplify... |

149 | The buffer tree: A new technique for optimal I/O-algorithms
- Arge
- 1995
Citation Context ...herefore, in the case of quickheaps, the term Φ(qh_N) − Φ(qh_0) is O(1) expected. So, we can omit this term, writing the N amortized costs directly as ĉ_i = c_i − Φ(qh_i) + Φ(qh_{i−1}). Figure 7: The quickheap potential debt function is computed as twice the sum of the lengths of the segments (drawn with lines) delimited by idx and pivots in S[0] to S[H]. In the figure, Φ(qh) =... |

148 | Algorithm 232: Heapsort
- Williams
- 1964
Citation Context ...inishes at some unknown value k ∈ [0,m − 1]. One can do this by using Select to find each of the first k elements, for an overall cost of O(km). This can be improved by transforming A into a min-heap [42] in time O(m) [15] and then performing k extractions. This premature cut-off of the heapsort algorithm [42] has O(m + k log m) worst-case complexity. Note that m + k log m = O(m + k log k), as they ca... |

138 | Amortized computational complexity
- Tarjan
- 1985
Citation Context ...e, the expected value of the sum of the array segment sizes is Θ(m). 4.2 The Potential Debt Method To carry out the amortized analysis of quickheaps we use a slight variation of the potential method ([36] and [11, Chapter 17]), which we call the potential debt method. In the potential method, the idea is to determine an amortized cost for each operation type. The potential method overcharges some oper... |

103 | Faster algorithms for the shortest path problem
- Ahuja, Mehlhorn, et al.
- 1990
Citation Context ...to Moret and Shapiro [27], these are the fastest priority queue implementations in practice). Using the same amount of memory, our external QH performs up to 3 times fewer I/O accesses than R-Heaps [1] and up to 5 times fewer than Array-Heaps [8], which are the best alternatives tested in the survey by Brengel et al. [6]. External-memory Sequence Heaps [32], however, are faster than QHs, yet these ... |

86 | A data structure for manipulating priority queues
- Vuillemin
- 1978
Citation Context ...resses operation extractMin performing only log₂ m + O(1) key comparisons per extraction on average, in heaps of m elements. Other well-known priority queues are sequence heaps [32], binomial queues [40], Fibonacci heaps [17], pairing heaps [16], skew heaps [34], and van Emde Boas queues [38]. All are based on binary comparisons, except the last, which handles an integer universe [0,m]. 1.2 External... |

75 | Improved algorithms and data structures for solving graph problems in external memory
- Kumar, Schwabe
- 1996
Citation Context ...y, insert, findMin and extractMin. This is because others, like delete or decreaseKey, need at least one random access to the queue. Some external memory PQs are buffer trees [2, 20], M/B-ary heaps [25, 14], and Array Heaps [8], all of which achieve the lower bound of Θ((1/B) log_{M/B}(m/B)) amortized I/Os per operation [39]. Those structures, however, are rather complex to implement and heavyweight in pra... |

74 | A minimum spanning tree algorithm with inverse-Ackermann type complexity - Chazelle - 2000 |

67 | Finding minimum spanning trees
- Cheriton, Tarjan
- 1976
Citation Context ...of Kruskal’s, Prim’s and Tarjan’s algorithms, concluding that the best in practice (albeit not in theory) is Prim’s using pairing heaps [16]. Their experiments show that neither Cheriton and Tarjan’s [10] nor Fredman and Tarjan’s algorithm [17] ever approach the speed of Prim’s algorithm using pairing heaps. Moreover, they show that it is possible to use heaps to improve Kruskal’s algorithm. The idea ... |

48 | The pairing heap: A new form of self-adjusting heap
- Fredman, Sedgewick, et al.
- 1986
Citation Context ...eart (and much more sophisticated) alternative implementations. IQS is approximately four times faster than the classic alternative to solve the online problem. QHs are competitive with pairing heaps [16] and up to four times faster than binary heaps [42] (according to Moret and Shapiro [27], these are the fastest priority queue implementations in practice). Using the same amount of memory, our exte... |

46 | An optimal minimum spanning tree algorithm
- Pettie, Ramachandran
Citation Context ...plexity O(m log m) and O(n²), respectively. There are several other MST algorithms compiled by Tarjan [35]. Recently, Chazelle [9] gave an O(mα(m,n)) time algorithm. Later, Pettie and Ramachandran [30] proposed an algorithm that runs in optimal time O(T*(m,n)), where T*(m,n) is the minimum number of edge-weight comparisons needed to determine the MST of any graph G(V,E) with m edges and n verti... |

46 | Self adjusting heaps
- Sleator, Tarjan
- 1986
Citation Context ...key comparisons per extraction on average, in heaps of m elements. Other well-known priority queues are sequence heaps [32], binomial queues [40], Fibonacci heaps [17], pairing heaps [16], skew heaps [34], and van Emde Boas queues [38]. All are based on binary comparisons, except the last, which handles an integer universe [0,m]. 1.2 External Memory Priority Queues When working in the secondary memor... |

45 | Fast priority queues for cached memory
- Sanders
- 1999
Citation Context ...= o(m^c) for any c > 0, in which case m dominates k log m. However, according to experiments this scheme is much slower than the offline practical algorithm [26] if a classical heap is used. Sanders [32] proposes sequence heaps, a cache-aware priority queue, to solve the online problem. Sequence heaps are optimized to insert and extract all the elements in the priority queue at a small amortized cost... |

39 | On the limits of cache-obliviousness - Brodal, Fagerberg - 2003 |

38 | Stxxl: Standard Template Library for XXL Data Sets
- Dementiev, Kettner, et al.
- 2005
Citation Context ...same issue, Sanders introduced sequence heaps [32], which can be seen as a simplification of the improved Array-Heaps [6]. Sanders reports that sequence heaps are faster than the improved Array-Heaps [12, 13]. 1.3 Minimum Spanning Trees Assume that G(V,E) is a connected undirected graph with a nonnegative cost function weight_e assigned to its edges e ∈ E. A minimum spanning tree mst of the graph G(V,E) is... |

35 | Algorithm 65: Find
- Hoare
- 1961
Citation Context ... is not a big achievement because the same can be obtained using a priority queue. However, IQS performs better in practice than the best existing online algorithm. Essentially, IQS calls Quickselect [19] to find the smallest element of arrays A[0,m − 1], A[1,m − 1], ..., A[k − 1,m − 1]. This naturally leaves the k smallest elements sorted in A[0,k − 1]. The key point to avoid the O(km) complexity is ... |

31 | The birth of the giant component
- Janson, Knuth, et al.
- 1993
Citation Context ... whose edge costs are assigned at random independently of the rest (using any continuous distribution), the subgraph composed by V with the edges reviewed by Kruskal’s algorithm is a random graph [21]. Based on that analysis [21, p. 349], we expect to finish the MST construction (that is, to connect the random subgraph) upon reviewing (1/2) n ln n + (1/2) γ n + 1/4 + O(1/n) edges, which can be much sm... |

26 | Bottom-up-heapsort, a new variant of heapsort beating, on an average, quicksort (if n is not very small)
- Wegener
- 1993
Citation Context ...element (increaseKey and decreaseKey, respectively), delete an arbitrary element from the priority queue (delete), and so on. The classic PQ implementation uses a binary heap [42, 11]. Wegener [41] proposes a bottom-up deletion algorithm, which addresses operation extractMin performing only log₂ m + O(1) key comparisons per extraction on average, in heaps of m elements. Other well-known priorit... |

20 | Heaps and heapsort on secondary storage - Fadel, Jakobsen, et al. - 1999 |

20 | An empirical analysis of algorithms for constructing a minimum spanning tree
- Moret, Shapiro
- 1991
Citation Context ..., Prim’s implemented with PHs (Prim2), our Prim’s implementation using QHs (Prim3) and the iMax algorithm (iMax) [22, 23]. We have used pairing heaps as they have shown good performance in this application [27, 23]. According to the experiments of Section 7.1, we preferred classical heaps using the bottom-up heuristic (HEx) over sequence heaps (SH) to implement Kruskal2 in these experiments (as we expect to ext... |

15 | An Experimental Study of Priority Queues in External Memory
- Brengel, Crauser, et al.
- 1999
Citation Context ...memory, our external QH performs up to 3 times fewer I/O accesses than R-Heaps [1] and up to 5 times fewer than Array-Heaps [8], which are the best alternatives tested in the survey by Brengel et al. [6]. External-memory Sequence Heaps [32], however, are faster than QHs, yet these are much more sophisticated and not cache-oblivious. Finally, our Kruskal’s version is much faster than any other Kruskal... |

12 | Algorithm 245 (TREESORT)
- Floyd
- 1964
Citation Context ...known value k ∈ [0,m − 1]. One can do this by using Select to find each of the first k elements, for an overall cost of O(km). This can be improved by transforming A into a min-heap [42] in time O(m) [15] and then performing k extractions. This premature cut-off of the heapsort algorithm [42] has O(m + k log m) worst-case complexity. Note that m + k log m = O(m + k log k), as they can differ only if k... |

10 | A Practical Minimum Spanning Tree Algorithm Using the Cycle Property
- Katriel, Sanders, et al.
- 2003
Citation Context ... graph density. As a matter of fact, it is faster than Prim’s algorithm [31], even as optimized by Moret and Shapiro [27], and also competitive with the best alternative implementations we could find [22, 23]. On the other hand, our Prim’s version is rather similar to our Kruskal’s one, yet it is resistant to some Kruskal’s worst cases, such as the lollipop graph. The rest of this paper is organized as fo... |

8 | Graphs for Metric Space Searching
- Paredes
- 2008
Citation Context ...on of the MST of a graph. Section 7 gives our experimental results. Finally, in Section 8 we give our conclusions and some directions for further work. Pseudo-codes and more experiments are available [28]. 1.1 Priority Queues A priority queue (PQ) is a data structure which allows maintaining a set of elements in a partially ordered way, enabling efficient element insertion (insert), minimum finding (f... |

7 | Early experiences in implementing the buffer tree
- Hutchinson, Maheshwari, et al.
- 1997
Citation Context ...basic operations, namely, insert, findMin and extractMin. This is because others, like delete or decreaseKey, need at least one random access to the queue. Some external memory PQs are buffer trees [2, 20], M/B-ary heaps [25, 14], and Array Heaps [8], all of which achieve the lower bound of Θ((1/B) log_{M/B}(m/B)) amortized I/Os per operation [39]. Those structures, however, are rather complex to implemen... |

7 | Partial quicksort
- Martinez
- 2004
Citation Context ...he selection and sorting algorithms, obtaining O(m + k log k) expected complexity. Recently, it has been shown that the selection and sorting steps can be interleaved, which improves the constant terms [26]. To solve the online problem (incremental sort), we have to select the smallest element, then the second smallest, and so on until the process finishes at some unknown value k ∈ [0,m − 1]. One can do... |

7 | Optimal incremental sorting
- Paredes, Navarro
- 2006
Citation Context ...ucleus Center for Web Research, Grant P04-067-F, Mideplan, Chile; Yahoo! Research grant “Compact Data Structures”; and Fondecyt grant 1-080019, Chile. Early parts of this work appeared in ALENEX 2006 [29]. This problem can be called Incremental Sorting. It can be stated as follows: Given a set A of m numbers, output the elements of A from smallest to largest, so that the process can be stopped after... |

2 | Worst-case external-memory priority queues
- Brodal, Katajainen
- 1998
Citation Context ...stest priority queue implementations in practice). Using the same amount of memory, our external QH performs up to 3 times fewer I/O accesses than R-Heaps [1] and up to 5 times fewer than Array-Heaps [8], which are the best alternatives tested in the survey by Brengel et al. [6]. External-memory Sequence Heaps [32], however, are faster than QHs, yet these are much more sophisticated and not cache-obl... |

2 | Algorithm Engineering for Large Data Sets
- Dementiev
- 2006
Citation Context ...same issue, Sanders introduced sequence heaps [32], which can be seen as a simplification of the improved Array-Heaps [6]. Sanders reports that sequence heaps are faster than the improved Array-Heaps [12, 13]. 1.3 Minimum Spanning Trees Assume that G(V,E) is a connected undirected graph with a nonnegative cost function weight_e assigned to its edges e ∈ E. A minimum spanning tree mst of the graph G(V,E) is... |
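Several of the contexts above describe the classical baseline that the paper compares against: build a min-heap over the edges in O(m) time, then extract edges one at a time for Kruskal's algorithm and stop as soon as the tree is complete. The sketch below illustrates that baseline only, using Python's standard `heapq` module. It is not the authors' IQS-based variant, and the function name `kruskal_lazy`, the `(weight, u, v)` edge format, and the union-find with path halving are assumptions made for the example.

```python
import heapq


def kruskal_lazy(n, edges):
    """Kruskal's MST with lazy edge extraction from a binary heap.

    n: number of vertices, labelled 0..n-1 (graph assumed connected).
    edges: iterable of (weight, u, v) tuples.
    Returns (total_weight, list of (u, v, weight) tree edges).
    """
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    heap = list(edges)
    heapq.heapify(heap)  # O(m) heap construction

    mst, total = [], 0
    # Stop as soon as n - 1 edges are chosen; remaining edges are never sorted.
    while heap and len(mst) < n - 1:
        w, u, v = heapq.heappop(heap)
        ru, rv = find(u), find(v)
        if ru != rv:  # edge joins two components, so keep it
            parent[ru] = rv
            mst.append((u, v, w))
            total += w
    return total, mst
```

Swapping the heap for an incremental sorter that yields edges in ascending order would leave the loop body unchanged, which is the sense in which the abstract speaks of plugging incremental sorting into Kruskal's algorithm.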