The buffer tree: A new technique for optimal I/O-algorithms
 University of Aarhus
, 1995
A Functional Approach to External Graph Algorithms
 Algorithmica
, 1998
Cited by 109 (2 self)
We present a new approach for designing external graph algorithms and use it to design simple external algorithms for computing connected components, minimum spanning trees, bottleneck minimum spanning trees, and maximal matchings in undirected graphs and multigraphs. Our I/O bounds compete with those of previous approaches. Unlike previous approaches, ours is purely functional, without side effects, and is thus amenable to standard checkpointing and programming language optimization techniques. This is an important practical consideration for applications that may take hours to run.

1 Introduction

We present a divide-and-conquer approach for designing external graph algorithms, i.e., algorithms on graphs that are too large to fit in main memory. Our approach is simple to describe and implement: it builds a succession of graph transformations that reduce to sorting, selection, and a recursive bucketing technique. No sophisticated data structures are needed. We apply our t...
GraphChi: Large-Scale Graph Computation on Just a PC
 In Proceedings of the 10th USENIX conference on Operating Systems Design and Implementation, OSDI’12
, 2012
Cited by 109 (6 self)
Current systems for graph computation require a distributed computing cluster to handle very large real-world problems, such as analysis on social networks or the web graph. While distributed computational resources have become more accessible, developing distributed graph algorithms still remains challenging, especially to non-experts. In this work, we present GraphChi, a disk-based system for computing efficiently on graphs with billions of edges. By using a well-known method to break large graphs into small parts, and a novel parallel sliding windows method, GraphChi is able to execute several advanced data mining, graph mining, and machine learning algorithms on very large graphs, using just a single consumer-level computer. We further extend GraphChi to support graphs that evolve over time, and demonstrate that, on a single computer, GraphChi can process over one hundred thousand graph updates per second, while simultaneously performing computation. We show, through experiments and theoretical analysis, that GraphChi performs well on both SSDs and rotational hard drives. By repeating experiments reported for existing distributed systems, we show that, with only a fraction of the resources, GraphChi can solve the same problems in very reasonable time. Our work makes large-scale graph computation available to anyone with a modern PC.
A survey on pagerank computing
 Internet Mathematics
, 2005
Cited by 104 (0 self)
This survey reviews the research related to PageRank computing. Components of a PageRank vector serve as authority weights for web pages independent of their textual content, solely based on the hyperlink structure of the web. PageRank is typically used as a web search ranking component. This defines the importance of the model and the data structures that underlie PageRank processing. Computing even a single PageRank is a difficult computational task. Computing many PageRanks is a much more complex challenge. Recently, significant effort has been invested in building sets of personalized PageRank vectors. PageRank is also used in many diverse applications other than ranking. We are interested in the theoretical foundations of the PageRank formulation, in the acceleration of PageRank computing, in the effects of particular aspects of web graph structure on the optimal organization of computations, and in PageRank stability. We also review alternative models that lead to authority indices similar to PageRank and the role of such indices in applications other than web search. We also discuss link-based search personalization and outline some aspects of PageRank infrastructure from associated measures of convergence to link preprocessing.
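For readers unfamiliar with how a PageRank vector is obtained from hyperlink structure alone, the following is a minimal power-iteration sketch. It is not taken from the survey: the function name `pagerank`, the damping factor of 0.85, the uniform teleportation vector, and the uniform redistribution of rank from pages without out-links are all assumptions chosen for illustration.

```python
def pagerank(out_links, damping=0.85, tol=1e-10, max_iter=200):
    """Power iteration over a dict mapping each page to a list of pages it links to.

    Assumptions (illustrative, not from the survey): uniform teleportation,
    and rank held by dangling pages is spread uniformly over all pages.
    """
    nodes = sorted(set(out_links) | {v for targets in out_links.values() for v in targets})
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(max_iter):
        # Total rank currently sitting on pages with no out-links.
        dangling = sum(rank[u] for u in nodes if not out_links.get(u))
        # Every page receives the teleportation share plus its share of dangling rank.
        new_rank = {u: (1.0 - damping) / n + damping * dangling / n for u in nodes}
        # Each page passes its damped rank equally to the pages it links to.
        for u, targets in out_links.items():
            if targets:
                share = damping * rank[u] / len(targets)
                for v in targets:
                    new_rank[v] += share
        delta = sum(abs(new_rank[u] - rank[u]) for u in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Tiny hypothetical web graph: a links to b and c, b links to c, c links to a.
ranks = pagerank({"a": ["b", "c"], "b": ["c"], "c": ["a"]})
```

The sketch keeps the whole graph in memory, so it illustrates only the iteration itself and none of the acceleration or infrastructure techniques the survey covers.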
Interactive Out-Of-Core Isosurface Extraction
Cited by 94 (18 self)
In this paper, we present a novel out-of-core technique for the interactive computation of isosurfaces from volume data. Our algorithm minimizes the main memory and disk space requirements on the visualization workstation, while speeding up isosurface extraction queries. Our overall approach is a two-level indexing scheme. First, by our metacell technique, we partition the original dataset into clusters of cells, called metacells. Secondly, we produce metaintervals associated with the metacells, and build an indexing data structure on the metaintervals. We separate the cell information, kept only in metacells in disk, from the indexing structure, which is also in disk and only contains pointers to metacells. Our metacell technique is an I/O-efficient approach for computing a k-d-tree-like partition of the dataset. Our indexing data structure, the binary-blocked I/O interval tree, is a new I/O-optimal data structure to perform stabbing queries that report from a set of metainte...
Adaptive TetraPuzzles: Efficient Out-of-Core Construction and Visualization of Gigantic Multiresolution Polygonal Models
 ACM Transactions on Graphics
, 2004
Cited by 83 (32 self)
We describe an efficient technique for out-of-core construction and accurate view-dependent visualization of very large surface models. The method uses a regular conformal hierarchy of tetrahedra to spatially partition the model. Each tetrahedral cell contains a precomputed simplified version of the original model, represented using cache coherent indexed strips for fast rendering. The representation is constructed during a fine-to-coarse simplification of the surface contained in diamonds (sets of tetrahedral cells sharing their longest edge). The construction preprocess operates out-of-core and parallelizes nicely. Appropriate boundary constraints are introduced in the simplification to ensure that all conforming selective subdivisions of the tetrahedron hierarchy lead to correctly matching surface patches. For each frame at runtime, the hierarchy is traversed coarse-to-fine to select diamonds of the appropriate resolution given the view parameters. The resulting system can interactively render high quality views of out-of-core models of hundreds of millions of triangles at over 40 Hz (or 70M triangles/s) on current commodity graphics platforms.
I/O Optimal Isosurface Extraction
, 1997
Cited by 77 (17 self)
In this paper we give I/O-optimal techniques for the extraction of isosurfaces from volumetric data, by a novel application of the I/O-optimal interval tree of Arge and Vitter. The main idea is to preprocess the dataset once and for all to build an efficient search structure in disk, and then each time we want to extract an isosurface, we perform an output-sensitive query on the search structure to retrieve only those active cells that are intersected by the isosurface. During the query operation, only two blocks of main memory space are needed, and only those active cells are brought into the main memory, plus some negligible overhead of disk accesses. This implies that we can efficiently visualize very large datasets on workstations with just enough main memory to hold the isosurfaces themselves. The implementation is delicate but not complicated. We give the first implementation of the I/O-optimal interval tree, and also implement our methods as an I/O filter for Vtk's isosurface ext...
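The query described in this abstract is a stabbing query: each cell is represented by the interval between its minimum and maximum scalar value, and an isovalue q "stabs" exactly the active cells. The sketch below shows the idea with a plain in-memory centered interval tree. It is not the I/O-optimal interval tree of Arge and Vitter, and it does no disk blocking at all; the class name `IntervalNode`, the method `stab`, and the `(lo, hi, cell_id)` tuple layout are hypothetical choices made for illustration.

```python
class IntervalNode:
    """In-memory centered interval tree over (lo, hi, cell_id) tuples with lo <= hi."""

    def __init__(self, intervals):
        # Use the median endpoint as the center so both subtrees shrink.
        endpoints = sorted(x for lo, hi, _ in intervals for x in (lo, hi))
        self.center = endpoints[len(endpoints) // 2]
        here = [iv for iv in intervals if iv[0] <= self.center <= iv[1]]
        left = [iv for iv in intervals if iv[1] < self.center]
        right = [iv for iv in intervals if iv[0] > self.center]
        # Intervals containing the center, sorted two ways for output-sensitive scans.
        self.by_lo = sorted(here, key=lambda iv: iv[0])
        self.by_hi = sorted(here, key=lambda iv: iv[1], reverse=True)
        self.left = IntervalNode(left) if left else None
        self.right = IntervalNode(right) if right else None

    def stab(self, q):
        """Return the ids of all intervals that contain the value q."""
        out = []
        if q < self.center:
            # Stored intervals reach past q on the right; check only their left ends.
            for lo, _, cell_id in self.by_lo:
                if lo > q:
                    break
                out.append(cell_id)
            if self.left:
                out.extend(self.left.stab(q))
        elif q > self.center:
            # Symmetric case: check only the right ends, largest first.
            for _, hi, cell_id in self.by_hi:
                if hi < q:
                    break
                out.append(cell_id)
            if self.right:
                out.extend(self.right.stab(q))
        else:
            out.extend(cell_id for _, _, cell_id in self.by_lo)
        return out


# Hypothetical cells given as (min value, max value, id).
cells = [(0.0, 2.0, "c0"), (1.0, 3.0, "c1"), (2.5, 4.0, "c2"), (5.0, 6.0, "c3")]
tree = IntervalNode(cells)
active = sorted(tree.stab(2.75))  # ids of cells whose value range contains 2.75
```

Each scan stops at the first interval that misses q, so the work at a node is proportional to the number of cells reported there, which is the output-sensitive behaviour the abstract refers to.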
External Memory Data Structures
, 2001
Cited by 76 (32 self)
In many massive dataset applications the data must be stored in space- and query-efficient data structures on external storage devices. Often the data needs to be changed dynamically. In this chapter we discuss recent advances in the development of provably worst-case efficient external memory dynamic data structures. We also briefly discuss some of the most popular external data structures used in practice.
The Buffer Tree: A Technique for Designing Batched External Data Structures
, 2003
Cited by 75 (14 self)
We present a technique for designing external memory data structures that support batched operations I/O efficiently. We show how the technique can be used to develop external versions of a search tree, a priority queue, and a segment tree, and give examples of how these structures can be used to develop I/O-efficient algorithms. The developed algorithms are either extremely simple or straightforward generalizations of known internal memory algorithms, given the developed external data structures.
I/O-Complexity of Graph Algorithms
, 1999
Cited by 74 (0 self)
We show lower bounds of $\Omega\left(\frac{E}{V}\,\mathrm{Sort}(V)\right)$ for the I/O-complexity of graph-theoretic problems like connected components, biconnected components, and minimum spanning trees, where $E$ and $V$ are the number of edges and vertices in the input graph, respectively. We also present a deterministic $O\left(\frac{E}{V}\,\mathrm{Sort}(V)\cdot\max\left(1,\log\log\frac{VBD}{E}\right)\right)$ algorithm for the problem of graph connectivity, where $B$ and $D$ denote respectively the block size and number of disks. Our algorithm includes a breadth first search; this may be of independent interest.

1 Introduction

Data sets of many modern applications are too large to fit into main memory, and must reside on disk. To run such applications efficiently, it is often necessary to explicitly manage disk accesses as a part of the algorithm. In other words, the algorithm must be designed for a model that includes disk, rather than the customary RAM model. Recently, this area has received a lot of attention, and algorithms have been developed for...