Delaunay Triangulation with Transactions and Barriers
IEEE Intl. Symp. on Workload Characterization, 2007
Cited by 37 (10 self)
Transactional memory has been widely hailed as a simpler alternative to locks in multithreaded programs, but few nontrivial transactional programs are currently available. We describe an open-source implementation of Delaunay triangulation that uses transactions as one component of a larger parallelization strategy. The code is written in C++, for use with the RSTM software transactional memory library (also open source). It employs one of the fastest known sequential algorithms to triangulate geometrically partitioned regions in parallel; it then employs alternating, barrier-separated phases of transactional and partitioned work to stitch those regions together. Experiments on multiprocessor and multicore machines confirm excellent single-thread performance and good speedup with increasing thread count. Since execution time is dominated by geometrically partitioned computation, performance is largely insensitive to the overhead of transactions, but highly sensitive to any costs imposed on sharable data that are currently “privatized”.
Alchemist: A transparent dependence distance profiling infrastructure
In CGO ’09: Proceedings of the 2009 International Symposium on Code Generation and Optimization, 2009
Cited by 14 (1 self)
Effectively migrating sequential applications to take advantage of parallelism available on multicore platforms is a well-recognized challenge. This paper addresses important aspects of this issue by proposing a novel profiling technique to automatically detect available concurrency in C programs. The profiler, called Alchemist, operates completely transparently to applications, and identifies constructs at various levels of granularity (e.g., loops, procedures, and conditional statements) as candidates for asynchronous execution. Various dependences, including read-after-write (RAW), write-after-read (WAR), and write-after-write (WAW), are detected between a construct and its continuation, the execution following the completion of the construct. The time-ordered distance between program points forming a dependence gives a measure of the effectiveness of parallelizing that construct, as well as identifying the transformations necessary to facilitate such parallelization. Using the notion of post-dominance, our profiling algorithm builds an execution index tree at runtime. This tree is used to differentiate among multiple instances of the same static construct, and leads to improved accuracy in the computed profile, useful to better identify constructs that are amenable to parallelization. Performance results indicate that the profiles generated by Alchemist pinpoint strong candidates for parallelization, and can help significantly ease the burden of application migration to multicore environments.
Keywords: profiling; program dependence; parallelization; execution indexing
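The three dependence kinds named in this abstract, and the notion of a time-ordered distance between the two accesses, can be illustrated with a small sketch. This is not Alchemist's algorithm, which relates constructs to their continuations through an execution index tree. It is a hypothetical trace-based classifier that assumes the program's memory accesses are already available as (timestamp, operation, address) tuples; the name `classify_dependences` and the trace format are invented for illustration.

```python
from typing import Dict, Iterable, List, Tuple

# Assumed trace format: (timestamp, "R" or "W", address label).
Access = Tuple[int, str, str]
Dependence = Tuple[str, str, int]  # (kind, address, time distance)


def classify_dependences(trace: Iterable[Access]) -> List[Dependence]:
    """Report RAW, WAR, and WAW dependences with their time distances."""
    last_write: Dict[str, int] = {}               # address -> time of latest write
    reads_since_write: Dict[str, List[int]] = {}  # address -> reads after that write
    deps: List[Dependence] = []

    for time, op, addr in trace:
        if op == "R":
            # A read after an earlier write to the same address is RAW.
            if addr in last_write:
                deps.append(("RAW", addr, time - last_write[addr]))
            reads_since_write.setdefault(addr, []).append(time)
        elif op == "W":
            # A write after earlier reads of the same address is WAR.
            for read_time in reads_since_write.get(addr, []):
                deps.append(("WAR", addr, time - read_time))
            # A write after an earlier write to the same address is WAW.
            if addr in last_write:
                deps.append(("WAW", addr, time - last_write[addr]))
            last_write[addr] = time
            reads_since_write[addr] = []
        else:
            raise ValueError(f"unknown operation: {op!r}")
    return deps


if __name__ == "__main__":
    trace = [(1, "W", "x"), (3, "R", "x"), (4, "W", "x"), (6, "W", "y"), (9, "W", "y")]
    for kind, addr, distance in classify_dependences(trace):
        print(kind, addr, distance)
```

In this toy model a short distance suggests that the two accesses are tightly coupled, while a long distance leaves more room for the intervening work to overlap; how Alchemist actually turns distances into parallelization advice is described in the paper itself.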
Parallel Poisson Surface Reconstruction
Cited by 7 (1 self)
In this work we describe a parallel implementation of the Poisson Surface Reconstruction algorithm based on multigrid domain decomposition. We compare implementations using different models of data sharing between processors and show that a parallel implementation with distributed memory provides the best scalability. Using our method, we are able to parallelize the reconstruction of models from one billion data points on twelve processors across three machines, providing a ninefold speedup in running time without sacrificing reconstruction accuracy.
Engineering a compact parallel Delaunay algorithm in 3D
In Proceedings of the ACM Symposium on Computational Geometry, 2006
Cited by 7 (0 self)
We describe an implementation of a compact parallel algorithm for 3D Delaunay tetrahedralization on a 64-processor shared-memory machine. Our algorithm uses a concurrent version of the Bowyer-Watson incremental insertion, and a thread-safe, space-efficient structure for representing the mesh. Using the implementation we are able to generate significantly larger Delaunay meshes than have previously been generated: 10 billion tetrahedra on a 64-processor SMP using 200 GB of RAM. The implementation makes use of a locality-based relabeling of the vertices that serves three purposes: it is used as part of the space-efficient representation, it improves the memory locality, and it reduces the overhead necessary for locks. The implementation also makes use of a caching technique to avoid excessive decoding of vertex information, a technique for backing out of insertions that collide, and a shared work queue for maintaining points that have yet to be inserted.
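Bowyer-Watson insertion depends on finding the simplices whose circumscribing sphere contains the point being inserted. As a minimal 2D illustration only (the paper's implementation is 3D, concurrent, and compressed), the sketch below uses the standard in-circle determinant. The function names `orient2d`, `in_circumcircle`, and `cavity` are hypothetical, and the code uses plain floating-point arithmetic rather than the robust predicates a production mesher would need.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Triangle = Tuple[Point, Point, Point]


def orient2d(a: Point, b: Point, c: Point) -> float:
    """Twice the signed area of triangle abc (positive if counterclockwise)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def in_circumcircle(a: Point, b: Point, c: Point, p: Point) -> bool:
    """True if p lies strictly inside the circle through a, b, and c."""
    # Translate so that p is the origin, then evaluate the 3x3 in-circle
    # determinant by cofactor expansion along its squared-length column.
    ax, ay = a[0] - p[0], a[1] - p[1]
    bx, by = b[0] - p[0], b[1] - p[1]
    cx, cy = c[0] - p[0], c[1] - p[1]
    det = (
        (ax * ax + ay * ay) * (bx * cy - cx * by)
        - (bx * bx + by * by) * (ax * cy - cx * ay)
        + (cx * cx + cy * cy) * (ax * by - bx * ay)
    )
    # The determinant's sign flips with the triangle's orientation, so
    # multiply by the orientation to accept either vertex order.
    return det * orient2d(a, b, c) > 0


def cavity(triangles: Sequence[Triangle], p: Point) -> List[Triangle]:
    """Triangles whose circumcircle contains p (brute-force scan)."""
    return [t for t in triangles if in_circumcircle(t[0], t[1], t[2], p)]


if __name__ == "__main__":
    tris = [((0.0, 0.0), (1.0, 0.0), (0.0, 1.0)),
            ((1.0, 0.0), (2.0, 1.0), (1.0, 2.0))]
    print(cavity(tris, (0.25, 0.25)))
```

In a full Bowyer-Watson step the triangles returned by `cavity` would be deleted and the resulting hole re-triangulated by connecting its boundary edges to the new point; a real implementation would walk the mesh from a nearby triangle instead of scanning every element.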
Practical Parallel Divide-and-Conquer Algorithms
1997
Cited by 7 (2 self)
Nested data parallelism has been shown to be an important feature of parallel languages, allowing the concise expression of algorithms that operate on irregular data structures such as graphs and sparse matrices. However, previous nested data-parallel languages have relied on a vector PRAM implementation layer that cannot be efficiently mapped to MPPs with high interprocessor latency. This thesis shows that by restricting the problem set to that of data-parallel divide-and-conquer algorithms, I can maintain the expressibility of full nested data-parallel languages while achieving good efficiency on current distributed-memory machines. Specifically, I define
Compact Data Structures with Fast Queries
2005
Cited by 5 (0 self)
Many applications dealing with large data structures can benefit from keeping them in compressed form. Compression has many benefits: it can allow a representation to fit in main memory rather than swapping out to disk, and it improves cache performance since it allows more data to fit into the cache. However, a data structure is only useful if it allows the application to perform fast queries (and updates) to the data.
Efficient Parallel Implementations of 2D Delaunay Triangulation with High Performance Fortran
2000
Parallel Delaunay triangulation for Particle Finite Element Methods
2006
Cited by 2 (0 self)
Delaunay triangulation is a geometric problem that is relatively difficult to parallelize. Parallel algorithms are usually characterized by considerable interprocessor communication or significant serialized parts. In this paper, we propose a method that achieves high speedups, but needs information regarding locally maximum element circumspheres prior to the beginning of the algorithm. Such information is directly available in iterative methods, like the Particle Finite Element Methods. The developed parallel Delaunay triangulation method has minimal communication requirements, is quite simple, and achieves high parallel efficiency.
Constrained Delaunay triangulation using plane subdivision
Proceedings of the 8th Central European Seminar on Computer Graphics, 2004
Cited by 1 (0 self)
This paper presents an algorithm for obtaining a constrained Delaunay triangulation from a given planar graph. The main advantage over other algorithms is that I use Žalik’s efficient algorithm, which uses a plane subdivision to obtain a Delaunay triangulation. It is used for insertion of points into an existing triangulation. The other part of the algorithm presents a method for inserting edges, already proposed by Anglada. The algorithm is fast and efficient and therefore appropriate for GIS applications.