Parallel triangle counting in massive streaming graphs
in Proc. of CIKM, 2013
Abstract

Cited by 7 (2 self)
The number of triangles in a graph is a fundamental metric, used in social network analysis, link classification and recommendation, and more. Driven by these applications and the trend that modern graph datasets are both large and dynamic, we present the design and implementation of a fast and cache-efficient parallel algorithm for estimating the number of triangles in a massive undirected graph whose edges arrive as a stream. It brings together the benefits of streaming algorithms and parallel algorithms. By building on the streaming algorithms framework, the algorithm has a small memory footprint. By leveraging the parallel cache-oblivious framework, it makes efficient use of the memory hierarchy of modern multicore machines without needing to know its specific parameters. We prove theoretical bounds on accuracy, memory access cost, and parallel runtime complexity, and show empirically that the algorithm yields accurate results and substantial speedups compared to an optimized sequential implementation. (This is an expanded version of a CIKM’13 paper of the same title.)
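As a rough illustration of what estimating a triangle count from an edge stream can look like, the sketch below uses plain uniform edge sampling: each arriving edge is kept with probability p, triangles are counted exactly in the sample, and the count is scaled by 1/p^3. This is not the parallel cache-oblivious algorithm the abstract describes; it is a simpler, sequential stand-in, and the function names `count_triangles` and `estimate_triangles` are hypothetical.

```python
import random
from collections import defaultdict


def count_triangles(edges):
    """Exact triangle count of an undirected simple graph given as (u, v) pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops; duplicate edges collapse in the sets
            adj[u].add(v)
            adj[v].add(u)
    total = 0
    for u in adj:
        for v in adj[u]:
            if u < v:
                # Only common neighbours w > v, so each triangle u < v < w is counted once.
                total += sum(1 for w in adj[u] & adj[v] if w > v)
    return total


def estimate_triangles(edge_stream, p, seed=None):
    """One pass over the stream: keep each edge with probability p (assumed sampling
    scheme, not the paper's), then scale the sampled triangle count by 1 / p**3."""
    rng = random.Random(seed)
    sample = [edge for edge in edge_stream if rng.random() < p]
    return count_triangles(sample) / p ** 3


if __name__ == "__main__":
    # Complete graph on 4 vertices, which contains 4 triangles.
    k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
    print(count_triangles(k4))
    print(estimate_triangles(k4, p=0.5, seed=1))
```

A triangle survives sampling only if all three of its edges do, which happens with probability p^3, so dividing by p^3 makes the estimate unbiased. Memory use scales with the number of sampled edges, and the variance grows as p shrinks.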
Graph stream algorithms: A survey
2013
Abstract

Cited by 6 (1 self)
Over the last decade, there has been considerable interest in designing algorithms for processing massive graphs in the data stream model. The original motivation was twofold: a) in many applications, the dynamic graphs that arise are too large to be stored in the main memory of a single machine and b) considering graph problems yields new insights into the complexity of stream computation. However, the techniques developed in this area are now finding applications in other areas including data structures for dynamic graphs, approximation algorithms, and distributed and parallel computation. We survey the state-of-the-art results, identify general techniques, and highlight some simple algorithms that illustrate basic ideas.
gSparsify: Graph Motif Based Sparsification for Graph Clustering
Abstract
Graph clustering is a fundamental problem that partitions the vertices of a graph into clusters with the objective of optimizing the intuitive notions of intra-cluster density and inter-cluster sparsity. In many real-world applications, however, the sheer size and inherent complexity of graphs may render existing graph clustering methods inefficient or incapable of yielding quality graph clusters. In this paper, we propose gSparsify, a graph sparsification method that preferentially retains a small subset of edges which are more likely to be within clusters, while eliminating others with little or no structural correlation to clusters. The resultant simplified graph is succinct in size with core cluster structures well preserved, thus enabling faster graph clustering without compromising clustering quality. We take a quantitative approach to modeling the evidence that edges within densely knit clusters are frequently involved in small-size graph motifs, which are adopted as prime features to differentiate edges with varied cluster significance. Path-based indexes and path-join algorithms are further designed to compute the graph-motif based cluster significance of edges for graph sparsification. We perform experimental studies on real-world graphs, and the results demonstrate that gSparsify can bring significant speedup to existing graph clustering methods with an improvement in graph clustering quality.
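The idea of ranking edges by how often they take part in small motifs can be sketched as follows. The sketch assumes triangles as the only motif and a single global cut-off, and it scores edges by direct neighbour-set intersection rather than the path-based indexes and path-join algorithms mentioned in the abstract. The function name `sparsify_by_triangle_support` and the `keep_ratio` parameter are hypothetical and do not come from gSparsify itself.

```python
from collections import defaultdict


def sparsify_by_triangle_support(edges, keep_ratio):
    """Keep the fraction `keep_ratio` of edges that lie on the most triangles.

    Assumption: triangle support stands in for the abstract's "cluster
    significance"; the real method may use other motifs and selection rules.
    """
    adj = defaultdict(set)
    edge_list = []
    seen = set()
    for u, v in edges:
        key = (u, v) if u <= v else (v, u)  # canonical form of an undirected edge
        if u == v or key in seen:
            continue  # skip self-loops and duplicate edges
        seen.add(key)
        edge_list.append(key)
        adj[u].add(v)
        adj[v].add(u)

    # Support of an edge = number of common neighbours of its endpoints,
    # i.e. the number of triangles that contain the edge.
    support = {(u, v): len(adj[u] & adj[v]) for u, v in edge_list}

    k = max(1, round(keep_ratio * len(edge_list)))
    ranked = sorted(edge_list, key=lambda e: support[e], reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Two triangles joined by a single bridge edge (2, 3).
    g = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
    print(sparsify_by_triangle_support(g, keep_ratio=0.85))
```

In the toy graph, the bridge between the two triangles has zero support and is the first edge to be dropped, which matches the intuition that inter-cluster edges rarely take part in small motifs.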
Continuous query processing; Temporal analytics; Dynamic social networks; Incremental computation.