Results

**1 - 2** of **2**

### Streaming Graph Partitioning in the Planted Partition Model

The sheer increase in the size of graph data has created a lot of interest in developing efficient distributed graph processing frameworks. Popular existing frameworks such as GraphLab and Pregel rely on balanced graph partitioning in order to minimize communication and achieve work balance. In this work we contribute to the recent research line of streaming graph partitioning [30, 31, 34], which computes an approximately balanced k-partitioning of the vertex set of a graph in a single pass over the graph stream using degree-based criteria. This graph partitioning framework is well tailored to processing large-scale and dynamic graphs. We introduce the use of higher-length walks for streaming graph partitioning and show that their use incurs only a minor computational cost and can significantly improve the quality of the graph partition. We perform an average-case analysis of our algorithm using the planted partition model [7, 25]. We complement the recent results of Stanton [30] by showing that our proposed method recovers the true partition with high probability even when the gap of the model tends to zero as the size of the graph grows. Furthermore, among the wide range of choices for the length of the walks, we show that the proposed length is optimal. Finally, we perform simulations which indicate that our asymptotic results hold even for small graph sizes.
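To make the setting in this abstract concrete, here is a minimal sketch of single-pass streaming partitioning on a planted partition graph. It is an assumption-laden illustration, not the paper's walk-based method: it uses a plain greedy rule (place each arriving vertex in the non-full part that already holds most of its neighbors), and the names `planted_partition`, `stream_greedy`, and `cut_fraction`, along with the parameter values, are hypothetical choices for this example.

```python
import random


def planted_partition(n, k, p, q, rng):
    """Sample a graph with k planted clusters: edge prob. p inside, q across."""
    labels = [v % k for v in range(n)]
    adj = [[] for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            prob = p if labels[u] == labels[v] else q
            if rng.random() < prob:
                adj[u].append(v)
                adj[v].append(u)
    return adj, labels


def stream_greedy(adj, k, order):
    """One pass over the vertex stream with a hard balance constraint."""
    n = len(adj)
    capacity = -(-n // k)  # ceil(n / k): maximum size of any part
    part = [-1] * n        # -1 marks a vertex that has not arrived yet
    sizes = [0] * k
    for v in order:
        # Count neighbors of v that were already placed in each part.
        counts = [0] * k
        for u in adj[v]:
            if part[u] != -1:
                counts[part[u]] += 1
        # Among parts with room, prefer more neighbors, then the smaller part.
        open_parts = [i for i in range(k) if sizes[i] < capacity]
        best = max(open_parts, key=lambda i: (counts[i], -sizes[i]))
        part[v] = best
        sizes[best] += 1
    return part


def cut_fraction(adj, part):
    """Fraction of edges whose endpoints land in different parts."""
    total = cut = 0
    for u, nbrs in enumerate(adj):
        for v in nbrs:
            if u < v:
                total += 1
                cut += part[u] != part[v]
    return cut / total if total else 0.0


rng = random.Random(0)
adj, labels = planted_partition(120, 2, 0.5, 0.05, rng)
order = list(range(120))
rng.shuffle(order)  # vertices arrive in random order
part = stream_greedy(adj, 2, order)
print("greedy cut fraction:", cut_fraction(adj, part))
print("planted cut fraction:", cut_fraction(adj, labels))
```

Comparing the two printed fractions gives a rough sense of how close a one-pass heuristic gets to the planted partition on a small instance; the greedy rule here looks only at direct neighbors, whereas the abstract argues for using longer walks.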

### Scalable Large Near-Clique Detection in Large-Scale Networks via Sampling

Extracting dense subgraphs from large graphs is a key primitive in a variety of graph mining applications, ranging from mining social networks and the Web graph to bioinformatics [41]. In this paper we focus on a family of poly-time solvable formulations, known as the k-clique densest subgraph problem (k-Clique-DSP) [57]. When k = 2, the problem becomes the well-known densest subgraph problem (DSP) [22, 31, 33, 39]. Our main contribution is a sampling scheme that gives a densest subgraph sparsifier, yielding a randomized algorithm that produces high-quality approximations while providing significant speedups and improved space complexity. We also extend this family of formulations to bipartite graphs by introducing the (p, q)-biclique densest subgraph problem ((p,q)-Biclique-DSP), and devise an ex-