I/O-efficient batched union-find and its applications to terrain analysis
- In Proc. 22nd Annual Symposium on Computational Geometry
, 2006
Cited by 14 (8 self)
Despite extensive study over the last four decades and numerous applications, no I/O-efficient algorithm is known for the union-find problem. In this paper we present an I/O-efficient algorithm for the batched (off-line) version of the union-find problem. Given any sequence of N union and find operations, where each union operation joins two distinct sets, our algorithm uses O(SORT(N)) = O((N/B) log_{M/B}(N/B)) I/Os, where M is the memory size and B is the disk block size. This bound is asymptotically optimal in the worst case. If there are union operations that join a set with itself, our algorithm uses O(SORT(N) + MST(N)) I/Os, where MST(N) is the number of I/Os needed to compute the minimum spanning tree of a graph with N edges. We also describe a simple and practical O(SORT(N) log(N/M))-I/O algorithm for this problem, which we have implemented. We are interested in the union-find problem because of its applications in terrain analysis. A terrain can be abstracted as a height function defined over R^2, and many problems that deal with such functions require a union-find data structure. With the emergence of modern mapping technologies, huge amounts of elevation data are being generated that are too large to fit in memory, so I/O-efficient algorithms are needed to process this data efficiently. In this paper, we study two terrain-analysis problems that benefit from a union-find data structure: (i) computing topological persistence and (ii) constructing the contour tree. We give the first O(SORT(N))-I/O algorithms for these two problems, assuming that the input terrain is represented as a triangular mesh with N vertices. Finally, we report some preliminary experimental results, showing that our algorithms give order-of-magnitude improvement over previous methods on large data sets that do not fit in memory.
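To give a feel for the sorting bound quoted in the abstract above, the following minimal sketch evaluates the expression (N/B) log_{M/B}(N/B) for concrete parameter values. The function name `sort_io_bound` is hypothetical, the constants hidden by the O-notation are ignored, and clamping the logarithm to at least 1 (so the result never drops below one scan of the data) is an assumption of this sketch rather than something stated in the abstract.

```python
import math


def sort_io_bound(n: int, m: int, b: int) -> float:
    """Evaluate (N/B) * log_{M/B}(N/B), the SORT(N) expression without constants.

    n: number of items, m: memory size in items, b: block size in items.
    The names and the clamping below are illustrative assumptions.
    """
    if n <= 0 or b <= 0 or m <= b:
        raise ValueError("need n > 0, b > 0, and m > b")
    blocks = n / b  # number of disk blocks occupied by the input
    # Assumption: treat the logarithmic factor as at least 1 (one pass over the data).
    passes = max(1.0, math.log(blocks, m / b))
    return blocks * passes


if __name__ == "__main__":
    # Hypothetical parameters: 2^30 items, 2^20 items of memory, 2^10 items per block.
    print(sort_io_bound(2**30, 2**20, 2**10))
```

With these hypothetical parameters the input occupies 2^20 blocks and the logarithmic factor is about 2, so the expression comes to roughly 2.1 million block transfers, far fewer than the 2^30 transfers of one I/O per item.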
The filter-kruskal minimum spanning tree algorithm
, 2009
Cited by 4 (0 self)
We present Filter-Kruskal, a simple modification of Kruskal’s algorithm that avoids sorting edges that are “obviously” not in the MST. For arbitrary graphs with random edge weights Filter-Kruskal runs in time O(m + n log n log(m/n)), i.e., in linear time for not too sparse graphs. Experiments indicate that the algorithm has very good practical performance over the entire range of edge densities. An equally simple parallelization seems to be the currently best practical algorithm on multicore machines.
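The idea summarized in the abstract above can be sketched as a quicksort-style recursion: partition the edges around a pivot weight, process the light half first, then discard heavy edges whose endpoints are already connected before recursing on what remains. The sketch below is a reading of that idea, not the authors' implementation. The names (`DisjointSets`, `filter_kruskal`, `minimum_spanning_forest`), the random pivot choice, and the `threshold` value are all assumptions made for illustration.

```python
import random


class DisjointSets:
    """Union-find with union by rank and path compression."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:  # second pass: compress the path
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):
        """Merge the sets of a and b; return False if already merged."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1
        return True


def kruskal(edges, sets, tree):
    """Plain Kruskal: sort (weight, u, v) edges, keep those joining two components."""
    for w, u, v in sorted(edges):
        if sets.union(u, v):
            tree.append((w, u, v))


def filter_kruskal(edges, sets, tree, threshold=16):
    """Partition around a pivot, recurse on light edges, filter, recurse on heavy."""
    if len(edges) <= threshold:  # threshold is an arbitrary illustrative value
        kruskal(edges, sets, tree)
        return
    pivot = random.choice(edges)[0]  # assumption: random pivot weight
    light = [e for e in edges if e[0] <= pivot]
    heavy = [e for e in edges if e[0] > pivot]
    if not heavy:  # pivot was the maximum weight: fall back to avoid endless recursion
        kruskal(edges, sets, tree)
        return
    filter_kruskal(light, sets, tree, threshold)
    # Filter step: drop heavy edges whose endpoints are already in one component.
    heavy = [e for e in heavy if sets.find(e[1]) != sets.find(e[2])]
    filter_kruskal(heavy, sets, tree, threshold)


def minimum_spanning_forest(n, edges):
    """Return the edges of a minimum spanning forest on vertices 0..n-1."""
    sets, tree = DisjointSets(n), []
    filter_kruskal(list(edges), sets, tree)
    return tree
```

Calling `minimum_spanning_forest(4, [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3)])` returns three edges of total weight 7. The filter step is where heavy edges are dropped without ever being sorted.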
Applications of forbidden 0-1 matrices to search tree and path compression based data structures
, 2009
Cited by 4 (3 self)
In this paper we improve, reprove, and simplify a variety of theorems concerning the performance of data structures based on path compression and search trees. We apply a technique very familiar to computational geometers but still foreign to many researchers in (non-geometric) algorithms and data structures, namely, to bound the complexity of an object via its forbidden substructures. To analyze an algorithm or data structure in the forbidden substructure framework one proceeds in three discrete steps. First, one transcribes the behavior of the algorithm as some combinatorial object M; for example, M may be a graph, sequence, permutation, matrix, set system, or tree. (The size of M should ideally be linear in the running time.) Second, one shows that M excludes some forbidden substructure P, and third, one bounds the size of any object avoiding this substructure. The power of this framework derives from the fact that M lies in a more pristine environment and that upper bounds on the size of a P-free object M may be reused in different contexts. All of our proofs begin by transcribing the individual operations of a dynamic data structure
Union-Find with Constant Time Deletions
Cited by 3 (1 self)
A union-find data structure maintains a collection of disjoint sets under makeset, union and find operations. Kaplan, Shafrir and Tarjan [SODA 2002] designed data structures for an extension of the union-find problem in which elements of the sets maintained may be deleted. The cost of a delete operation in their implementations is the same as the cost of a find operation. They left open the question whether delete operations can be implemented more efficiently than find operations. We resolve this open problem by presenting a relatively simple modification of the classical union-find data structure that supports delete, as well as makeset and union, operations in constant time, while still supporting find operations in O(log n) worst-case time and O(α(n)) amortized time, where n is the number of elements in the set returned by the find operation, and α(n) is a functional inverse of Ackermann’s function.
I/O-Efficient Batched Union-Find and Its . . .
Despite extensive study over the last four decades and numerous applications, no I/O-efficient algorithm is known for the union-find problem. In this paper we present an I/O-efficient algorithm for the batched (off-line) version of the union-find problem. Given any sequence of N mixed union and find operations, where each union operation joins two distinct sets, our algorithm uses O(SORT(N)) = O((N/B) log_{M/B}(N/B)) I/Os, where M is the memory size and B is the disk block size. This bound is asymptotically optimal in the worst case. If there are union operations that join a set with itself, our algorithm uses O(SORT(N) + MST(N)) I/Os, where MST(N) is the number of I/Os needed to compute the minimum spanning tree of a graph with N edges. We also describe a simple and practical O(SORT(N) log(N/M))-I/O algorithm, which we have implemented. The main motivation for our study of the union-find problem arises from problems in terrain analysis. A terrain can be abstracted as a height function defined over R^2, and many problems that deal with such functions require a union-find data structure. With the emergence of modern mapping technologies, huge amounts of data are being generated that are too large to fit in memory, so I/O-efficient algorithms are needed to process this data efficiently. In this paper, we study two terrain analysis problems that benefit from a union-find data structure: (i) computing topological persistence and (ii) constructing the contour tree. We give the first O(SORT(N))-I/O algorithms for these two problems, assuming that the input terrain is represented as a triangular mesh with N vertices. Finally, we report some preliminary experimental results, showing that our algorithms give order-of-magnitude improvement over previous methods on large data sets that do not fit in memory.
Topics in Algorithms -- Data Structures and . . .
, 2005
This dissertation is divided into two parts. Part I concerns algorithms and data structures on trees or involving trees. Here we study three different problems: efficient binary dispatching in object-oriented languages, tree inclusion, and union-find with deletions. The results in Part II fall within the heading of approximation algorithms. Here we study variants of the k-center problem and hardness of approximation of the dial-a-ride problem. Binary Dispatching. The dispatching problem for object-oriented languages is the problem of determining the most specialized method to invoke for calls at run-time. This can be a critical component of execution performance. The unary dispatching problem is equivalent to the tree color problem. The binary dispatching problem can be seen as a 2-dimensional generalization of the tree color problem which we call the bridge color problem. We give

