A Parallel Hashed Oct-Tree N-Body Algorithm
, 1993
Cited by 147 (11 self)
We report on an efficient adaptive N-body method which we have recently designed and implemented. The algorithm computes the forces on an arbitrary distribution of bodies in a time which scales as N log N with the particle number. The accuracy of the force calculations is analytically bounded, and can be adjusted via a user-defined parameter between a few percent relative accuracy, down to machine arithmetic accuracy. Instead of using pointers to indicate the topology of the tree, we identify each possible cell with a key. The mapping of keys into memory locations is achieved via a hash table. This allows the program to access data in an efficient manner across multiple processors. Performance of the parallel program is measured on the 512 processor Intel Touchstone Delta system. We also comment on a number of wide-ranging applications which can benefit from application of this type of algorithm.
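The key-and-hash-table idea in the abstract above can be illustrated with a minimal sketch. It assumes a Morton-style key (coordinate bits interleaved, with a leading 1 bit marking the tree depth) and uses a Python dict as the hash table. The abstract does not specify how keys are built, so the construction and every function name here are hypothetical.

```python
def cell_key(ix, iy, iz, level):
    """Key for the cell at integer coordinates (ix, iy, iz) on a given tree level.

    Assumption: a leading 1 bit is prepended so that keys from different
    levels never collide; three bits are appended per level (one octant).
    """
    key = 1
    for bit in range(level - 1, -1, -1):
        octant = (((ix >> bit) & 1) << 2) | (((iy >> bit) & 1) << 1) | ((iz >> bit) & 1)
        key = (key << 3) | octant
    return key


def parent_key(key):
    """Drop the last octant to get the key of the enclosing cell."""
    return key >> 3


def child_keys(key):
    """Keys of the eight possible daughter cells."""
    return [(key << 3) | octant for octant in range(8)]


# The hash table: tree topology is implied by the keys, not by pointers.
cells = {}


def insert_cell(ix, iy, iz, level, data):
    """Store cell data under its key."""
    cells[cell_key(ix, iy, iz, level)] = data


def find_parent(ix, iy, iz, level):
    """Look up the parent cell's data, or None if it is not stored."""
    return cells.get(parent_key(cell_key(ix, iy, iz, level)))
```

Because a parent or child key is computed by bit shifts alone, any cell can be requested by key without following pointers, which is what makes lookups uniform whether the data is local or held elsewhere.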
`N-Body' Problems in Statistical Learning
, 2001
Cited by 90 (12 self)
We present efficient algorithms for all-point-pairs problems, or 'N-body'-like problems, which are ubiquitous in statistical learning. We focus on six examples, including nearest-neighbor classification, kernel density estimation, outlier detection, and the two-point correlation.
Abstractions for Recursive Pointer Data Structures: Improving the Analysis and Transformation of Imperative Programs
, 1992
"... Even though impressive progress has been made... ..."
Astrophysical N-body Simulations Using Hierarchical Tree Data Structures
, 1992
Cited by 86 (11 self)
We report on recent large astrophysical N-body simulations executed on the Intel Touchstone Delta system. We review the astrophysical motivation, and the numerical techniques, and discuss steps taken to parallelize these simulations. The methods scale as O(N log N), for large values of N, and also scale linearly with the number of processors. The performance, sustained for a duration of 67 hours, was between 5.1 and 5.4 Gflop/sec on a 512 processor system.
The Anchors Hierarchy: Using the Triangle Inequality to Survive High Dimensional Data
 In Twelfth Conference on Uncertainty in Artificial Intelligence
, 2000
Cited by 75 (8 self)
This paper is about metric data structures in high-dimensional or non-Euclidean space that permit cached sufficient statistics accelerations of learning algorithms.
Commutativity Analysis: A New Analysis Technique for Parallelizing Compilers
 ACM TRANSACTIONS ON PROGRAMMING LANGUAGES AND SYSTEMS
, 1997
Cited by 71 (9 self)
This article presents a new analysis technique, commutativity analysis, for automatically parallelizing computations that manipulate dynamic, pointer-based data structures. Commutativity analysis views the computation as composed of operations on objects. It then analyzes the program at this granularity to discover when operations commute (i.e., generate the same final result regardless of the order in which they execute). If all of the operations required to perform a given computation commute, the compiler can automatically generate parallel code. We have implemented a prototype compilation system that uses commutativity analysis as its primary analysis technique.
A General Data Dependence Test for Dynamic, Pointer-Based Data Structures
 In Proc. ACM PLDI
, 1994
Cited by 71 (2 self)
Optimizing compilers require accurate dependence testing to enable numerous, performance-enhancing transformations. However, data dependence testing is a difficult problem, particularly in the presence of pointers. Though existing approaches work well for pointers to named memory locations (i.e. other variables), they are overly conservative in the case of pointers to unnamed memory locations. The latter occurs in the context of dynamic, pointer-based data structures, used in a variety of applications ranging from system software to computational geometry to N-body and circuit simulations. In this paper we present a new technique for performing more accurate data dependence testing in the presence of dynamic, pointer-based data structures. We will demonstrate its effectiveness by breaking false dependences that existing approaches cannot, and provide results which show that removing these dependences enables significant parallelization of a real application.
Dynamic Feedback: An Effective Technique for Adaptive Computing
, 1997
Cited by 60 (4 self)
This paper presents dynamic feedback, a technique that enables computations to adapt dynamically to different execution environments. A compiler that uses dynamic feedback produces several different versions of the same source code; each version uses a different optimization policy. The generated code alternately performs sampling phases and production phases. Each sampling phase measures the overhead of each version in the current environment. Each production phase uses the version with the least overhead in the previous sampling phase. The computation periodically resamples to adjust dynamically to changes in the environment.
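The alternation of sampling and production phases described above can be sketched as a small runtime loop. This is only an illustration under stated assumptions: the "versions" are ordinary Python callables rather than compiler-generated code, overhead is approximated by elapsed wall-clock time per item, and the function name, parameters, and phase sizes are hypothetical.

```python
import time


def dynamic_feedback(versions, work_items, sample_size=2, production_size=10,
                     clock=time.perf_counter):
    """Process work_items in order, alternating sampling and production phases.

    Assumption: every version computes the same result, differing only in cost.
    Returns the results and the index of the version chosen after each
    sampling phase.
    """
    results = []
    chosen = []
    pos = 0
    while pos < len(work_items):
        # Sampling phase: run each version on a small batch and time it.
        costs = []
        for version in versions:
            batch = work_items[pos:pos + sample_size]
            if not batch:
                costs.append(float("inf"))  # nothing left to measure with
                continue
            start = clock()
            results.extend(version(item) for item in batch)
            costs.append((clock() - start) / len(batch))
            pos += len(batch)
        best = costs.index(min(costs))
        chosen.append(best)

        # Production phase: use the cheapest version until the next resample.
        batch = work_items[pos:pos + production_size]
        results.extend(versions[best](item) for item in batch)
        pos += len(batch)
    return results, chosen
```

Shorter production phases track a changing environment more closely but spend a larger share of the run executing versions that may be slow; the sizes used here are arbitrary.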
A Portable Parallel Particle Program
 Computer Physics Communications
, 1995
Cited by 53 (7 self)
We describe our implementation of the parallel hashed oct-tree (HOT) code, and in particular its application to neighbor finding in a smoothed particle hydrodynamics (SPH) code. We also review the error bounds on the multipole approximations involved in treecodes, and extend them to include general cell-cell interactions. Performance of the program on a variety of problems (including gravity, SPH, vortex method and panel method) is measured on several parallel and sequential machines.
1 Introduction
There are two strategies that can be applied in the quest for more knowledge from bigger and better particle simulations. One can use the brute force approach; simple algorithms on bigger and faster machines (and bigger and faster now means massively parallel). To compute the gravitational force and potential for a single interaction takes 28 floating point operations (here we count a division as 4 floating point operations and a square root as 4 floating point operations). A typical grav...
Commutativity analysis: A new analysis framework for parallelizing compilers
 In Programming Language Design and Implementation (PLDI)
, 1996
Cited by 48 (8 self)
This paper presents a new analysis technique, commutativity analysis, for automatically parallelizing computations that manipulate dynamic, pointer-based data structures. Commutativity analysis views the computation as composed of operations on objects. It then analyzes the program at this granularity to discover when operations commute (i.e. generate the same final result regardless of the order in which they execute). If all of the operations required to perform a given computation commute, the compiler can automatically generate parallel code. We have implemented a prototype compilation system that uses commutativity analysis as its primary analysis framework. We have used this system to automatically parallelize two complete scientific computations: the Barnes-Hut N-body solver and the Water code. This paper presents performance results for the generated parallel code running on the Stanford DASH machine. These results provide encouraging evidence that commutativity analysis can serve as the basis for a successful parallelizing compiler.
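The notion of commuting operations used above can be made concrete with a small runtime check. Note that this is not the static compiler analysis the abstract describes: it is a brute-force illustration that tries every execution order on a toy object, and the `Body` class, its methods, and the `commutes` helper are all hypothetical.

```python
from itertools import permutations


class Body:
    """Toy object with one accumulating and one overwriting operation."""

    def __init__(self):
        self.force = 0

    def add_force(self, f):
        self.force += f  # integer accumulation: order does not matter

    def set_force(self, f):
        self.force = f   # overwrite: the last writer wins


def commutes(make_obj, operations, state):
    """Return True if every execution order yields the same final state."""
    outcomes = set()
    for order in permutations(operations):
        obj = make_obj()
        for op in order:
            op(obj)
        outcomes.add(state(obj))
    return len(outcomes) == 1


adds = [lambda b, f=f: b.add_force(f) for f in (1, 2, 3)]
sets = [lambda b, f=f: b.set_force(f) for f in (1, 2)]

adds_commute = commutes(Body, adds, lambda b: b.force)
sets_commute = commutes(Body, sets, lambda b: b.force)
```

Here the accumulating updates give the same final state in every order, so they could safely run in parallel (given atomic updates), while the overwriting updates do not. Integers are used deliberately: floating-point addition is not exactly associative, so a real force accumulation would commute only up to rounding.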