Linear-Time Surface Reconstruction, 2009
Cited by 1 (1 self)
By now it is moderately well understood how to take as input a set of points that lie on an unknown manifold, and produce as output a piecewise linear approximation to the manifold. The question I explore here is: how fast can we do it? Previous solutions achieve a sequential runtime of O(n log n) in low ambient dimensions. I show that if the points are specified in floating point coordinates, we in fact achieve linear work, and we can run in logarithmically many parallel rounds.
Geometric Minimum Spanning Trees with GEOFILTERKRUSKAL
Let P be a set of points in R^d. We propose GEOFILTERKRUSKAL, an algorithm that computes the minimum spanning tree of P using well-separated pair decomposition in combination with a simple modification of Kruskal's algorithm. When P is sampled from a uniform random distribution, we show that our algorithm takes one parallel sort plus a linear number of additional steps, with high probability, to compute the minimum spanning tree. Experiments show that our algorithm works better in practice for most data distributions compared to the current state of the art [31]. Our algorithm is easy to parallelize and, to our knowledge, is currently the best practical algorithm on multi-core machines for d > 2.
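For orientation, the baseline that such geometric methods improve on can be sketched as plain Kruskal over the complete Euclidean graph. This is not GEOFILTERKRUSKAL: it uses no well-separated pair decomposition and no filtering, and it sorts all O(n^2) edges. The function name `euclidean_mst` is a hypothetical one chosen for this illustration.

```python
import itertools
import math


def euclidean_mst(points):
    """Return MST edges (i, j, length) of the complete Euclidean graph.

    Baseline Kruskal only (assumption: small inputs), with a union-find
    structure to detect cycles. All pairwise edges are generated and sorted.
    """
    parent = list(range(len(points)))

    def find(x):
        # Walk to the root, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in itertools.combinations(range(len(points)), 2)
    )

    tree = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # edge joins two different components
            parent[ri] = rj
            tree.append((i, j, length))
            if len(tree) == len(points) - 1:
                break
    return tree
```

The quadratic edge set is exactly what a well-separated pair decomposition is meant to avoid, so this sketch is useful only as a correctness reference on small point sets.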
RELEVANCE-DRIVEN ACQUISITION AND RAPID ON-SITE ANALYSIS OF 3D GEOSPATIAL DATA
One central problem in geospatial applications using 3D models is the tradeoff between detail and acquisition cost during acquisition, as well as processing speed during use. Commonly used laser-scanning technology can be used to record spatial data in various levels of detail. Much detail, even on a small scale, requires the complete scan to be conducted at high resolution and leads to long acquisition time, as well as a great amount of data and complex processing. Therefore, we propose a new scheme for the generation of geospatial 3D models that is driven by relevance rather than data. As part of that scheme we present a novel acquisition and analysis workflow, as well as supporting data models. The workflow includes on-site data evaluation (e.g. quality of the scan) and presentation (e.g. visualization of the quality), which demands fast data processing. Thus, we employ high-performance graphics cards (GPGPU) to effectively process and analyze large volumes of LIDAR data. In particular we present a density calculation based on k-nearest-neighbor determination using OpenCL. The presented GPGPU-accelerated workflow enables fast data acquisition with highly detailed relevant objects and minimal storage requirements.
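A density calculation based on k-nearest neighbors can be illustrated with a small CPU-side sketch. The abstract does not give the paper's formula, so the estimator below is an assumption: density at a point is taken as k divided by the volume of the ball whose radius is the distance to the k-th nearest neighbor. It is plain Python with a brute-force search, not the OpenCL implementation, and `knn_density` is a hypothetical name.

```python
import math


def knn_density(points, k):
    """Estimate local density at each 3D point from its k-th neighbor distance.

    Assumed estimator: k / (4/3 * pi * r_k^3), where r_k is the distance to
    the k-th nearest other point. Brute force, O(n^2 log n) overall.
    """
    densities = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        r = dists[k - 1]  # distance to the k-th nearest neighbor
        volume = (4.0 / 3.0) * math.pi * r ** 3
        # Coincident points give zero volume; report infinite density there.
        densities.append(k / volume if volume > 0 else math.inf)
    return densities
```

Each point's neighbor search is independent of the others, which is what makes this kind of computation a natural fit for a GPU.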
Efficient Parallel kNN Joins for Large Data in MapReduce
In data mining applications and spatial and multimedia databases, a useful tool is the kNN join, which is to produce the k nearest neighbors (NN), from a dataset S, of every point in a dataset R. Since it involves both the join and the NN search, performing kNN joins efficiently is a challenging task. Meanwhile, applications continue to witness a quick (exponential in some cases) increase in the amount of data to be processed. A popular model nowadays for large-scale data processing is the shared-nothing cluster on a number of commodity machines using MapReduce [6]. Hence, how to execute kNN joins efficiently on large data that are stored in a MapReduce cluster is an intriguing problem that meets many practical needs. This work proposes novel (exact and approximate) algorithms in MapReduce to perform efficient parallel kNN joins on large data. We demonstrate our ideas using Hadoop. Extensive experiments in large real and synthetic datasets, with tens or hundreds of millions of records in both R and S and up to 30 dimensions, have demonstrated the efficiency, effectiveness, and scalability of our methods.
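The kNN join itself, as defined above, can be stated in a few lines. The sketch below is a single-machine, brute-force reference under the assumption of Euclidean distance. It is not one of the paper's MapReduce algorithms, and `knn_join` is a hypothetical name.

```python
import heapq
import math


def knn_join(R, S, k):
    """For every point in R, return the indices of its k nearest points in S.

    Exact brute-force join: O(|R| * |S|) distance evaluations.
    Result maps each index in R to a list of indices in S, nearest first.
    """
    return {
        i: heapq.nsmallest(k, range(len(S)), key=lambda j: math.dist(r, S[j]))
        for i, r in enumerate(R)
    }
```

For example, `knn_join([(0, 0)], [(0, 1), (4, 4), (1, 1)], 2)` pairs the single point of R with indices 0 and 2 of S. The quadratic cost of this reference is the reason partitioned and approximate schemes matter at the data sizes the abstract mentions.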

