Results 1 - 5 of 5
Distributed Learning on Very Large Data Sets
In Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2000
Abstract

Cited by 11 (4 self)
One approach to learning from intractably large data sets is to utilize all the training data by learning models on tractably sized subsets of the data. The subsets of data may be disjoint or partially overlapping. The individual learned models may be combined into a single model, or a voting approach may be used to combine the classifications of a set of models. An approach to learning models in parallel from arbitrarily large training data sets and combining them into a classifier is described. The training sets are disjoint in the work described here. A parallel implementation on the DOE's ASCI Red parallel supercomputer is described. Results with data sets small enough to be handled by a single processor show that data sets can be divided into a moderate number of distinct subsets without degrading classifier accuracy. Speedup results are shown for a parallel implementation on the ASCI Red with data sets too large to be handled on a single processor. Training sets of size 3 to 50 millio...
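The idea in this abstract (disjoint subsets, one model per subset, classifications combined by voting) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the toy nearest-centroid learner, the synthetic two-class data, and the helper names `split_disjoint`, `train_centroids`, `predict`, `vote`, and `make_data`. The paper's learner and its parallel implementation are not reproduced.

```python
from collections import Counter
import random


def split_disjoint(rows, k):
    """Partition rows into k disjoint subsets (round-robin assignment)."""
    return [rows[i::k] for i in range(k)]


def train_centroids(rows):
    """Toy learner (assumption): store the per-class mean feature vector."""
    sums, counts = {}, Counter()
    for features, label in rows:
        acc = sums.setdefault(label, [0.0] * len(features))
        for j, x in enumerate(features):
            acc[j] += x
        counts[label] += 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}


def predict(model, features):
    """Return the label whose centroid is nearest in squared Euclidean distance."""
    return min(
        model,
        key=lambda lab: sum((a - b) ** 2 for a, b in zip(model[lab], features)),
    )


def vote(models, features):
    """Combine the per-subset models by majority vote."""
    return Counter(predict(m, features) for m in models).most_common(1)[0][0]


def make_data(n, seed=0):
    """Synthetic data (assumption): class "a" near (0, 0), class "b" near (5, 5)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = rng.choice(["a", "b"])
        center = 0.0 if label == "a" else 5.0
        rows.append(([rng.gauss(center, 1.0), rng.gauss(center, 1.0)], label))
    return rows


data = make_data(600)
subsets = split_disjoint(data, 5)                # five disjoint training sets
models = [train_centroids(s) for s in subsets]   # each could be trained on its own processor
```

Calling `vote(models, [0.2, -0.1])` asks each of the five models for a label and returns the most common answer. Because each model is trained independently, the training step is the part that would be distributed across processors.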
Coherent Culling and Shading for Large Molecular Dynamics Visualization
In Eurographics/IEEE Symposium on Visualization, 2010
Abstract

Cited by 5 (3 self)
Molecular dynamics simulations are a principal tool for studying molecular systems. Such simulations are used to investigate molecular structure, dynamics, and thermodynamical properties, as well as a replacement for, or complement to, costly and dangerous experiments. With the increasing availability of computational power the resulting data sets are becoming increasingly larger, and benchmarks indicate that the interactive visualization on desktop computers poses a challenge when rendering substantially more than millions of glyphs. Trading visual quality for rendering performance is a common approach when interactivity has to be guaranteed. In this paper we address both problems and present a method for high-quality visualization of massive molecular dynamics data sets. We employ several optimization strategies on different levels of granularity, such as data quantization, data caching in video memory, and a two-level occlusion culling strategy: coarse culling via hardware occlusion queries and a vertex-level culling using maximum depth mipmaps. To ensure optimal image quality we employ GPU raycasting and deferred shading with smooth normal vector generation. We demonstrate that our method allows us to interactively render data sets containing tens of millions of high-quality glyphs.
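As a rough illustration of what a maximum depth mipmap is, the sketch below builds one on the CPU and uses it for a conservative visibility test. It assumes a square, power-of-two depth buffer in which smaller values are closer to the viewer. The function names `build_max_depth_mipmaps` and `is_occluded` are hypothetical, and the paper's GPU-side, vertex-level implementation is not reproduced here.

```python
def build_max_depth_mipmaps(depth):
    """Return [level0, level1, ...]; each texel of level i+1 stores the
    maximum (farthest) depth of the matching 2x2 block in level i."""
    levels = [depth]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        half = len(prev) // 2
        levels.append([
            [
                max(
                    prev[2 * y][2 * x], prev[2 * y][2 * x + 1],
                    prev[2 * y + 1][2 * x], prev[2 * y + 1][2 * x + 1],
                )
                for x in range(half)
            ]
            for y in range(half)
        ])
    return levels


def is_occluded(levels, level, x, y, nearest_depth):
    """Conservative test: an object is hidden if even its nearest point lies
    behind the farthest depth already drawn in that screen region."""
    return nearest_depth > levels[level][y][x]


# Example 4x4 depth buffer (values are illustrative only).
depth_buffer = [
    [0.2, 0.3, 0.9, 0.9],
    [0.1, 0.4, 0.9, 0.8],
    [0.5, 0.5, 0.7, 0.6],
    [0.5, 0.5, 0.6, 0.6],
]
levels = build_max_depth_mipmaps(depth_buffer)
```

Storing the maximum rather than the average keeps the test conservative: a glyph is only rejected when every covered pixel is already closer than the glyph could be.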
De Novo Ultrascale Atomistic Simulations On High-End Parallel Supercomputers
International Journal of High Performance Computing Applications
Abstract

Cited by 4 (2 self)
We present a de novo hierarchical simulation framework for first-principles based predictive simulations of materials and their validation on high-end parallel supercomputers and geographically distributed clusters. In this framework, high-end chemically reactive and nonreactive molecular dynamics (MD) simulations explore a wide solution space to discover microscopic mechanisms that govern macroscopic material properties, into which highly accurate quantum mechanical (QM) simulations are embedded to validate the discovered mechanisms and quantify the uncertainty of the solution. The framework includes an embedded divide-and-conquer (EDC) algorithmic framework for the design of linear-scaling simulation algorithms with minimal bandwidth complexity and tight error control. The EDC framework also enables adaptive hierarchical simulation with automated
Approximate covering detection among content-based subscriptions using space filling curves
In IEEE International Conference on Distributed Computing Systems, 2007
Abstract

Cited by 3 (0 self)
We consider a problem that arises during the propagation of subscriptions in a content-based publish-subscribe system. Subscription covering is a promising optimization that reduces the number of subscriptions propagated, and hence the size of routing tables in a content-based publish-subscribe system. However, detecting covering relationships among subscriptions can be an expensive computational task that potentially reduces the utility of covering as an optimization. We introduce an alternate approach, approximate subscription covering, which provides much of the benefit of subscription covering at a fraction of its cost. By forgoing an exhaustive search for covering subscriptions in favor of an approximate search, it is shown that the time complexity of covering detection can be dramatically reduced. The trade-off between efficiency of covering detection and the approximation error is demonstrated through the analysis of indexes for multi-attribute subscriptions based on space filling curves.
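To make the two ingredients named in this abstract concrete, the sketch below shows an exact covering check for range subscriptions and a Z-order (Morton) key as one example of a space filling curve. It assumes that a subscription is a tuple of inclusive `(low, high)` integer ranges, one per attribute. The names `covers`, `z_order_key`, and `flatten` are hypothetical, and the paper's approximate index and its error analysis are not reproduced.

```python
def covers(s1, s2):
    """True if every range of s1 contains the matching range of s2, so any
    event matching s2 also matches s1."""
    return all(
        lo1 <= lo2 and hi2 <= hi1
        for (lo1, hi1), (lo2, hi2) in zip(s1, s2)
    )


def z_order_key(point, bits=8):
    """Interleave the low `bits` bits of each non-negative integer coordinate
    into a single Z-order (Morton) key."""
    key = 0
    dims = len(point)
    for bit in range(bits):
        for dim, coord in enumerate(point):
            key |= ((coord >> bit) & 1) << (bit * dims + dim)
    return key


def flatten(subscription):
    """Treat a subscription with d ranges as a point with 2d coordinates
    (an illustrative choice, not necessarily the paper's mapping)."""
    return tuple(v for rng in subscription for v in rng)


wide = ((0, 10), (0, 10))     # matches events with both attributes in 0..10
narrow = ((2, 5), (3, 4))     # a stricter subscription
key = z_order_key(flatten(narrow), bits=4)
```

Here `covers(wide, narrow)` holds, so a broker that already forwards `wide` would not need to propagate `narrow`. A one-dimensional key such as `z_order_key` lets multi-attribute subscriptions be stored in an ordinary sorted index, which is what makes a cheaper, approximate search possible.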