Results 1–10 of 43
The WebGraph Framework I: Compression Techniques
In Proc. of the Thirteenth International World Wide Web Conference, 2003
Abstract

Cited by 269 (33 self)
Studying web graphs is often difficult due to their large size. Recently, several proposals have been published about various techniques that allow a web graph to be stored in memory in limited space, exploiting the inner redundancies of the web. The WebGraph framework is a suite of codes, algorithms and tools that aims at making it easy to manipulate large web graphs. This paper presents the compression techniques used in WebGraph, which are centred around referentiation and intervalisation (which in turn are dual to each other).
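To make the two techniques concrete, here is a toy Python sketch of intervalisation with gap-coded residuals (referentiation, the copying of a similar earlier successor list, is omitted). The `MIN_INTERVAL_LENGTH` threshold of 3 and all names are this sketch's own assumptions, not WebGraph's actual encoder:

```python
MIN_INTERVAL_LENGTH = 3  # assumed threshold, tunable in the real framework

def encode_successors(succ):
    """Split a sorted successor list into maximal intervals of consecutive
    IDs (intervalisation) plus leftover residuals stored as gaps.
    Returns (intervals as (left extreme, length) pairs, gap-coded residuals)."""
    intervals, residuals = [], []
    i = 0
    while i < len(succ):
        j = i
        while j + 1 < len(succ) and succ[j + 1] == succ[j] + 1:
            j += 1
        if j - i + 1 >= MIN_INTERVAL_LENGTH:
            intervals.append((succ[i], j - i + 1))
        else:
            residuals.extend(succ[i:j + 1])
        i = j + 1
    # residuals are sorted, so storing differences (gaps) keeps numbers small
    gaps = [residuals[0]] + [b - a for a, b in
                             zip(residuals, residuals[1:])] if residuals else []
    return intervals, gaps

iv, gaps = encode_successors([5, 6, 7, 8, 20, 21, 30])
# iv == [(5, 4)], gaps == [20, 1, 9]
```

The small gap values are what make instantaneous codes (as in the WebGraph paper) effective downstream.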
GraphChi: Large-Scale Graph Computation on Just a PC
In Proceedings of the 10th USENIX Conference on Operating Systems Design and Implementation, OSDI'12, 2012
Abstract

Cited by 109 (6 self)
Current systems for graph computation require a distributed computing cluster to handle very large real-world problems, such as analysis on social networks or the web graph. While distributed computational resources have become more accessible, developing distributed graph algorithms still remains challenging, especially to non-experts. In this work, we present GraphChi, a disk-based system for computing efficiently on graphs with billions of edges. By using a well-known method to break large graphs into small parts, and a novel parallel sliding windows method, GraphChi is able to execute several advanced data mining, graph mining, and machine learning algorithms on very large graphs, using just a single consumer-level computer. We further extend GraphChi to support graphs that evolve over time, and demonstrate that, on a single computer, GraphChi can process over one hundred thousand graph updates per second, while simultaneously performing computation. We show, through experiments and theoretical analysis, that GraphChi performs well on both SSDs and rotational hard drives. By repeating experiments reported for existing distributed systems, we show that, with only a fraction of the resources, GraphChi can solve the same problems in very reasonable time. Our work makes large-scale graph computation available to anyone with a modern PC.
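The shard layout behind the parallel sliding windows method can be sketched as follows; this is a toy reconstruction of the idea, not the system's on-disk format:

```python
def build_shards(edges, num_vertices, P):
    """Toy version of GraphChi's shard layout: vertices are split into P
    equal intervals, shard p holds every edge whose destination lies in
    interval p, and each shard is kept sorted by source vertex."""
    size = (num_vertices + P - 1) // P          # interval width
    shards = [[] for _ in range(P)]
    for src, dst in edges:
        shards[dst // size].append((src, dst))
    for shard in shards:
        shard.sort()                            # sort by source vertex
    return shards
```

When interval p is processed, shard p is loaded in full (it holds all in-edges of the interval), while the out-edges of the interval occupy one contiguous sorted-by-source window of every other shard, so each of the P shards is read and written sequentially, which is what makes rotational disks viable.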
Efficient Aggregation for Graph Summarization
Abstract

Cited by 81 (5 self)
Graphs are widely used to model real-world objects and their relationships, and large graph datasets are common in many application domains. To understand the underlying characteristics of large graphs, graph summarization techniques are critical. However, existing graph summarization methods are mostly statistical (studying statistics such as degree distributions, hop-plots and clustering coefficients). These statistical methods are very useful, but the resolutions of the summaries are hard to control. In this paper, we introduce two database-style operations to summarize graphs. Like the OLAP-style aggregation methods that allow users to drill-down or roll-up to control the resolution of summarization, our methods provide an analogous functionality for large graph datasets. The first operation, called SNAP, produces a summary graph by grouping nodes based on user-selected node attributes and relationships. The second operation, called k-SNAP, further allows users to control the resolutions of summaries and provides the “drill-down” and “roll-up” abilities to navigate through summaries with different resolutions. We propose an efficient algorithm to evaluate the SNAP operation. In addition, we prove that the k-SNAP computation is NP-complete. We propose two heuristic methods to approximate the k-SNAP results. Through extensive experiments on a variety of real and synthetic datasets, we demonstrate the effectiveness and efficiency of the proposed methods.
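A minimal sketch of the SNAP grouping idea follows: start from the groups induced by a user-selected attribute, then repeatedly split any group whose members disagree on the set of groups their neighbors belong to. The function names and the fixed-point test are this sketch's own, not the paper's algorithm:

```python
def snap(nodes, attr, adj):
    """Refine the attribute-induced grouping until every group is
    homogeneous in the set of groups its members' neighbors fall into."""
    group = {v: attr[v] for v in nodes}
    while True:
        # signature of v: its own group plus the set of its neighbors' groups
        sig = {v: (group[v], frozenset(group[u] for u in adj[v]))
               for v in nodes}
        relabel = {s: i for i, s in
                   enumerate(sorted(set(sig.values()), key=repr))}
        new_group = {v: relabel[sig[v]] for v in nodes}
        if len(set(new_group.values())) == len(set(group.values())):
            return new_group   # no group split: grouping is compatible with
                               # both attributes and relationships
        group = new_group
```

For example, with attributes `{1: 'S', 2: 'S', 3: 'P', 4: 'P'}` and a single edge 1–3, node 2 is split away from node 1 because it has no neighbor in the 'P' group.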
A Geometric Preferential Attachment Model of Networks
In Algorithms and Models for the Web-Graph: Third International Workshop, WAW 2004, 2004
Abstract

Cited by 62 (4 self)
We study a random graph Gn that combines certain aspects of geometric random graphs and preferential attachment graphs. This model yields a graph with a power-law degree distribution where the expansion property depends on a tunable parameter of the model. The vertices of Gn are n sequentially generated points x1, x2, ..., xn chosen uniformly at random from the unit sphere in R^3. After generating xt, we randomly connect it to m points from those points in x1, x2, ..., xt−1.
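The generation process above can be sketched in a few lines. This is an illustrative combination of geometry and preferential attachment under assumed rules (attach to up to m earlier points within an assumed radius r, weighted by degree + 1); the paper's exact attachment rule differs in its details:

```python
import math
import random

def random_sphere_point():
    # uniform point on the unit sphere in R^3 (z and longitude uniform)
    z = random.uniform(-1.0, 1.0)
    theta = random.uniform(0.0, 2.0 * math.pi)
    s = math.sqrt(1.0 - z * z)
    return (s * math.cos(theta), s * math.sin(theta), z)

def geometric_pa(n, m, r):
    """Sequentially generate n points; point x_t connects to up to m earlier
    points within distance r, chosen proportionally to degree + 1."""
    pts, deg, edges = [], [], []
    for t in range(n):
        x = random_sphere_point()
        cand = [i for i in range(t) if math.dist(pts[i], x) <= r]
        chosen = set()
        if cand:
            weights = [deg[i] + 1 for i in cand]
            for _ in range(min(m, len(cand))):
                chosen.add(random.choices(cand, weights)[0])
        for i in chosen:
            edges.append((i, t))
            deg[i] += 1
        pts.append(x)
        deg.append(len(chosen))
    return pts, edges
```

Shrinking r localizes attachment and weakens expansion, which is the tunable trade-off the abstract refers to.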
A Fast and Compact Web Graph Representation
Abstract

Cited by 33 (16 self)
Compressed graph representation has become an attractive research topic because of its applications in the manipulation of huge Web graphs in main memory. By far the best current result is the technique by Boldi and Vigna, which takes advantage of several particular properties of Web graphs. In this paper we show that the same properties can be exploited with a different and elegant technique, built on Re-Pair compression, which achieves about the same space but much faster navigation of the graph. Moreover, the technique has the potential of adapting well to secondary memory. In addition, we introduce an approximate Re-Pair version that works efficiently with limited main memory.
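In this line of work, Re-Pair is applied to the concatenated adjacency lists of the graph. The core loop of Re-Pair on a plain sequence of non-negative integers can be sketched as follows (a quadratic-time toy, not the efficient implementation):

```python
from collections import Counter

def repair(seq):
    """Minimal Re-Pair sketch over a non-empty sequence of non-negative
    integers: repeatedly replace the most frequent adjacent pair with a
    fresh symbol, recording a grammar rule, until no pair occurs twice."""
    seq = list(seq)
    rules = {}
    next_sym = max(seq) + 1
    while True:
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, freq = pairs.most_common(1)[0]
        if freq < 2:
            break
        rules[next_sym] = pair
        out, i = [], 0
        while i < len(seq):          # left-to-right, non-overlapping replace
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(next_sym)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
        next_sym += 1
    return seq, rules

compressed, grammar = repair([1, 2, 1, 2, 1, 2])
# compressed == [4, 3], grammar == {3: (1, 2), 4: (3, 3)}
```

Decompression simply expands rules recursively, and random navigation stays fast because a rule expands in time proportional to its output, which is the property the paper exploits.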
Compact representations of simplicial meshes in two and three dimensions
International Journal of Computational Geometry and Applications, 2003
Abstract

Cited by 24 (6 self)
We describe data structures for representing simplicial meshes compactly while supporting online queries and updates efficiently. Our data structure requires about a factor of five less memory than the most efficient standard data structures for triangular or tetrahedral meshes, while efficiently supporting traversal among simplices, storing data on simplices, and insertion and deletion of simplices. Our implementation of the data structures uses about 5 bytes/triangle in two dimensions (2D) and 7.5 bytes/tetrahedron in three dimensions (3D). We use the data structures to implement 2D and 3D incremental algorithms for generating a Delaunay mesh. The 3D algorithm can generate 100 million tetrahedra with 1 GByte of memory, including the space for the coordinates and all data used by the algorithm. The algorithm runs as fast as Shewchuk's Pyramid code, the most efficient we know of, while using a factor of 3.5 less memory overall.
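A quick sanity check of the stated numbers (7.5 bytes per tetrahedron, 100 million tetrahedra, 1 GByte total) shows the connectivity budget they imply:

```python
# Back-of-envelope check: connectivity alone at the reported rate.
tets = 100_000_000
bytes_per_tet = 7.5
connectivity_gib = tets * bytes_per_tet / 2**30
print(round(connectivity_gib, 2))   # ~0.70 GiB for connectivity, leaving
                                    # roughly 0.3 GiB for coordinates and
                                    # the algorithm's working state
```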
Discovery-Driven Graph Summarization
Abstract

Cited by 21 (2 self)
Large graph datasets are ubiquitous in many domains, including social networking and biology. Graph summarization techniques are crucial in such domains as they can assist in uncovering useful insights about the patterns hidden in the underlying data. One important type of graph summarization is to produce small and informative summaries based on user-selected node attributes and relationships, allowing users to interactively drill-down or roll-up to navigate through summaries with different resolutions. However, two key components are missing from the previous work in this area that limit the use of this method in practice. First, the previous work only deals with categorical node attributes. Consequently, users have to manually bucketize numerical attributes based on domain knowledge, which is not always possible. Moreover, users often have to manually iterate through many resolutions of summaries to identify the most interesting ones. This paper addresses both of these key issues to make the interactive graph summarization approach more useful in practice. We first present a method to automatically categorize numerical attribute values by exploiting the domain knowledge hidden inside the node attribute values and graph link structures. Furthermore, we propose an interestingness measure for graph summaries to point users to the potentially most insightful summaries. Using two real datasets, we demonstrate the effectiveness and efficiency of our techniques.
An Experimental Analysis of a Compact Graph Representation
In ALENEX04, 2004
Abstract

Cited by 19 (6 self)
In previous work we described a method for compactly representing graphs with small separators and presented preliminary experimental results. In this paper we extend the experimental results in several ways, including extensions for dynamic insertion and deletion of edges, a comparison of a variety of coding schemes, and an implementation of two applications using the representation.
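The payoff of separator-based vertex renumbering can be sketched as follows: once neighbors have nearby IDs, adjacency lists become sequences of small signed gaps that a variable-length byte code stores in few bytes. The encoding below (zigzag sign handling, degree prefix) is an illustrative scheme, not the paper's exact format:

```python
def varbyte(n):
    """Variable-length byte code for a non-negative integer:
    7 payload bits per byte, high bit marks continuation."""
    out = bytearray()
    while True:
        b = n & 0x7F
        n >>= 7
        if n:
            out.append(b | 0x80)
        else:
            out.append(b)
            return bytes(out)

def encode_adjacency(adj):
    """Store each vertex's degree, then its neighbors as zigzag-coded
    signed gaps from the previous ID; small gaps -> few bytes."""
    blob = bytearray()
    for u in sorted(adj):
        blob += varbyte(len(adj[u]))
        prev = u
        for v in sorted(adj[u]):
            d = v - prev
            blob += varbyte(2 * abs(d) + (1 if d < 0 else 0))  # zigzag sign
            prev = v
    return bytes(blob)
```

A separator-based ordering keeps the gaps `v - prev` small across most edges, which is exactly the structure a small-separator graph guarantees.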
Linear-time compression of bounded-genus graphs into an information-theoretically optimal number of bits
In: 13th Symposium on Discrete Algorithms (SODA), 2002
Abstract

Cited by 17 (1 self)
1 Introduction. This extended abstract summarizes a new result for the graph compression problem, addressing how to compress a graph G into a binary string Z with the requirement that Z can be decoded to recover G. Graph compression finds important applications in the 3D model compression of computer graphics [12, 17-20] and the compact routing tables of computer networks [7]. For brevity, let a π-graph stand for a graph with property π. The information-theoretically optimal number of bits required to represent an n-node π-graph is ⌈log2 N_π(n)⌉, where N_π(n) is the number of distinct n-node π-graphs. Although determining or approximating closed forms of N_π(n) for nontrivial classes of π is challenging, we provide a linear-time methodology for graph compression schemes that are information-theoretically optimal with respect to continuous superadditive functions (abbreviated as optimal for the rest of the extended abstract). Specifically, if π satisfies certain properties, then we can compress any n-node m-edge π-graph G into a binary string Z such that G and Z can be computed from each other in O(m + n) time, and the bit count of Z is at most β(n) + o(β(n)) for any continuous superadditive function β(n) with log2 N_π(n) ≤ β(n) + o(β(n)). Our methodology is applicable to general classes of graphs; this extended abstract focuses on graphs with sublinear genus. For example, if the input n-node π-graph G is equipped with an embedding on its genus surface, which is a reasonable assumption for graphs arising from 3D model compression, then our methodology is applicable to any π satisfying the following statements:
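As a worked instance of the ⌈log2 N_π(n)⌉ bound (labeled trees are our own example here, not a class treated in the abstract): Cayley's formula gives N_π(n) = n^(n−2) for the class π = labeled trees, so the optimal size is ⌈(n−2) log2 n⌉ bits, far below the n(n−1)/2 bits of an adjacency matrix:

```python
import math

def optimal_tree_bits(n):
    # ceil(log2 N_pi(n)) with N_pi(n) = n^(n-2) by Cayley's formula
    return math.ceil((n - 2) * math.log2(n))

n = 1000
print(optimal_tree_bits(n))   # 9946 bits, information-theoretic optimum
print(n * (n - 1) // 2)       # 499500 bits for a naive adjacency matrix
```

The ratio grows without bound, which is why specializing the encoding to the class π pays off.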
Expansion and lack thereof in randomly perturbed graphs
Internet Math
Abstract

Cited by 16 (3 self)
Developing models of complex networks has been a major industry in the fields of physics, mathematics, and computer science during the last decade. Empirical study of numerous large networks harvested from the real world has revealed that, unlike the classical models of random graphs developed