Results 1  10
of
42
ExternalMemory Graph Algorithms
, 1995
"... We present a collection of new techniques for designing and analyzing efficient externalmemory algorithms for graph problems and illustrate how these techniques can be applied to a wide variety of specific problems. Our results include: ffl Proximateneighboring. We present a simple method for der ..."
Abstract

Cited by 186 (23 self)
 Add to MetaCart
We present a collection of new techniques for designing and analyzing efficient externalmemory algorithms for graph problems and illustrate how these techniques can be applied to a wide variety of specific problems. Our results include: ffl Proximateneighboring. We present a simple method for deriving externalmemory lower bounds via reductions from a problem we call the "proximate neighbors" problem. We use this technique to derive nontrivial lower bounds for such problems as list ranking, expression tree evaluation, and connected components. ffl PRAM simulation. We give methods for efficiently simulating PRAM computations in external memory, even for some cases in which the PRAM algorithm is not workoptimal. We apply this to derive a number of optimal (and simple) externalmemory graph algorithms. ffl Timeforward processing. We present a general technique for evaluating circuits (or "circuitlike" computations) in external memory. We also use this in a deterministic list rank...
ANF: A Fast and Scalable Tool for Data Mining in Massive Graphs
 NTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING
, 2002
"... Graphs are an increasingly important data source, with such important graphs as the Internet and the Web. Other familiar graphs include CAD circuits, phone records, gene sequences, city streets, social networks and academic citations. Any kind of relationship, such as actors appearing in movies, can ..."
Abstract

Cited by 124 (18 self)
 Add to MetaCart
(Show Context)
Graphs are an increasingly important data source, with such important graphs as the Internet and the Web. Other familiar graphs include CAD circuits, phone records, gene sequences, city streets, social networks and academic citations. Any kind of relationship, such as actors appearing in movies, can be represented as a graph. This work presents a data mining tool, called ANF, that can quickly answer a number of interesting questions on graphrepresented data, such as the following. How robust is the Internet to failures? What are the most influential database papers? Are there gender differences in movie appearance patterns? At its core, ANF is based on a fast and memoryefficient approach for approximating the complete "neighbourhood function" for a graph. For the Internet graph (268K nodes), ANF's highlyaccurate approximation is more than 700 times faster than the exact computation. This reduces the running time from nearly a day to a matter of a minute or two, allowing users to perform ad hoc drilldown tasks and to repeatedly answer questions about changing data sources. To enable this drilldown, ANF employs new techniques for approximating neighbourhoodtype functions for graphs with distinguished nodes and/or edges. When compared to the best existing approximation, ANF's approach is both faster and more accurate, given the same resources. Additionally, unlike previous approaches, ANF scales gracefully to handle disk resident graphs. Finally, we present some of our results from mining large graphs using ANF.
External Memory Data Structures
, 2001
"... In many massive dataset applications the data must be stored in space and query efficient data structures on external storage devices. Often the data needs to be changed dynamically. In this chapter we discuss recent advances in the development of provably worstcase efficient external memory dynami ..."
Abstract

Cited by 76 (32 self)
 Add to MetaCart
In many massive dataset applications the data must be stored in space and query efficient data structures on external storage devices. Often the data needs to be changed dynamically. In this chapter we discuss recent advances in the development of provably worstcase efficient external memory dynamic data structures. We also briefly discuss some of the most popular external data structures used in practice.
Efficient ExternalMemory Data Structures and Applications
, 1996
"... In this thesis we study the Input/Output (I/O) complexity of largescale problems arising e.g. in the areas of database systems, geographic information systems, VLSI design systems and computer graphics, and design I/Oefficient algorithms for them. A general theme in our work is to design I/Oeffic ..."
Abstract

Cited by 41 (10 self)
 Add to MetaCart
(Show Context)
In this thesis we study the Input/Output (I/O) complexity of largescale problems arising e.g. in the areas of database systems, geographic information systems, VLSI design systems and computer graphics, and design I/Oefficient algorithms for them. A general theme in our work is to design I/Oefficient algorithms through the design of I/Oefficient data structures. One of our philosophies is to try to isolate all the I/O specific parts of an algorithm in the data structures, that is, to try to design I/O algorithms from internal memory algorithms by exchanging the data structures used in internal memory with their external memory counterparts. The results in the thesis include a technique for transforming an internal memory tree data structure into an external data structure which can be used in a batched dynamic setting, that is, a setting where we for example do not require that the result of a search operation is returned immediately. Using this technique we develop batched dynamic external versions of the (onedimensional) rangetree and the segmenttree and we develop an external priority queue. Following our general philosophy we show how these structures can be used in standard internal memory sorting algorithms
On externalmemory MST, SSSP and multiway planar graph separation
 In Proc. 8th Scandinavian Workshop on Algorithmic Theory, volume 1851 of LNCS
, 2000
"... Recently external memory graph algorithms have received considerable attention because massive graphs arise naturally in many applications involving massive data sets. Even though a large number of I/Oefficient graph algorithms have been developed, a number of fundamental problems still remain open ..."
Abstract

Cited by 32 (2 self)
 Add to MetaCart
Recently external memory graph algorithms have received considerable attention because massive graphs arise naturally in many applications involving massive data sets. Even though a large number of I/Oefficient graph algorithms have been developed, a number of fundamental problems still remain open. In this paper we develop an improved algorithm for the problem of computing a minimum spanning tree of a general graph, as well as new algorithms for the single source shortest paths and the multiway graph separation problems on planar graphs.
On External Memory MST, SSSP and Multiway Planar Graph Separation (Extended Abstract)
, 2000
"... Recently external memory graph algorithms have received considerable attention because massive graphs arise naturally in many applications involving massive data sets. Even though a large number of I/Oefficient graph algorithms have been developed, a number of fundamental problems still remain ..."
Abstract

Cited by 32 (11 self)
 Add to MetaCart
Recently external memory graph algorithms have received considerable attention because massive graphs arise naturally in many applications involving massive data sets. Even though a large number of I/Oefficient graph algorithms have been developed, a number of fundamental problems still remain open. In this paper we develop improved algorithms for the problem of computing a minimum spanning tree of a general graph G = (V; E), as well as new algorithms for the single source shortest paths and the multiway graph separation problems on planar graphs.
Optimal Dynamic Range Searching in Nonreplicating Index Structures
 In Proc. International Conference on Database Theory, LNCS 1540
, 1997
"... We consider the problem of dynamic range searching in tree structures that do not replicate data. We propose a new dynamic structure, called the Otree, that achieves a query time complexity of O(n (d\Gamma1)=d ) on n ddimensional points and an amortized insertion/deletion time complexity of O(l ..."
Abstract

Cited by 32 (2 self)
 Add to MetaCart
(Show Context)
We consider the problem of dynamic range searching in tree structures that do not replicate data. We propose a new dynamic structure, called the Otree, that achieves a query time complexity of O(n (d\Gamma1)=d ) on n ddimensional points and an amortized insertion/deletion time complexity of O(log n). We show that this structure is optimal when data is not replicated. In addition to optimal query and insertion/deletion times, the Otree also supports exact match queries in worstcase logarithmic time. 1 Introduction Given a set S of ddimensional points, a range query q is specified by d 1dimensional intervals [q s i ; q e i ], one for each dimension i, and retrieves all points p = (p 1 ; p 2 ; : : : p d ) in S such that h8i 2 f1; : : : ; dg : q s i p i q e i i. This type of searching in multidimensional space has important applications in geographic information systems, image databases, and computer graphics. Several structures such as the range trees [3], Prange trees [29...
I/OEfficient Algorithms for Problems on Gridbased Terrains (Extended Abstract)
 In Proc. Workshop on Algorithm Engineering and Experimentation
, 2000
"... Lars Arge Laura Toma Jeffrey Scott Vitter Center for Geometric Computing Department of Computer Science Duke University Durham, NC 277080129 Abstract The potential and use of Geographic Information Systems (GIS) is rapidly increasing due to the increasing availability of massive amoun ..."
Abstract

Cited by 30 (14 self)
 Add to MetaCart
(Show Context)
Lars Arge Laura Toma Jeffrey Scott Vitter Center for Geometric Computing Department of Computer Science Duke University Durham, NC 277080129 Abstract The potential and use of Geographic Information Systems (GIS) is rapidly increasing due to the increasing availability of massive amounts of geospatial data from projects like NASA's Mission to Planet Earth. However, the use of these massive datasets also exposes scalability problems with existing GIS algorithms. These scalability problems are mainly due to the fact that most GIS algorithms have been designed to minimize internal computation time, while I/O communication often is the bottleneck when processing massive amounts of data.
ExternalMemory Algorithms with Applications in Geographic Information Systems
, 1996
"... ..."
(Show Context)
Mining Patterns from Graph Traversals
 Data and Knowledge Engineering
, 2001
"... In data models that have graph representations, users navigate following the links of the graph structure. Conducting data mining on collected information about user accesses in such models, involves the determination of frequently occurring access sequences. In this paper, we examine the problem of ..."
Abstract

Cited by 28 (3 self)
 Add to MetaCart
(Show Context)
In data models that have graph representations, users navigate following the links of the graph structure. Conducting data mining on collected information about user accesses in such models, involves the determination of frequently occurring access sequences. In this paper, we examine the problem of finding traversal patterns from such collections. The determination of patterns is based on the graph structure of the model. For this purpose, we present three algorithms, one which is levelwise with respect to the lengths of the patterns and two which are not. Additionally, we consider the fact that accesses within patterns may be interleaved with random accesses due to navigational purposes. The definition of the pattern type generalizes existing ones in order to take into account this fact. The performance of all algorithms and their sensitivity to several parameters is examined experimentally.