Finding the Hierarchy of Dense Subgraphs using Nucleus Decompositions
Cited by 3 (1 self)
Finding dense substructures in a graph is a fundamental graph mining operation, with applications in bioinformatics, social networks, and visualization, to name a few. Yet most standard formulations of this problem (like clique, quasi-clique, k-densest subgraph) are NP-hard. Furthermore, the goal is rarely to find the “true optimum”, but to identify many (if not all) dense substructures, understand their distribution in the graph, and ideally determine relationships among them. Current dense subgraph finding algorithms usually optimize some objective, and only find a few such subgraphs without providing any structural relations. We define the nucleus decomposition of a graph, which represents the graph as a forest of nuclei. Each nucleus is a subgraph where smaller cliques are present in many larger cliques. The forest of nuclei is a hierarchy by containment, where the edge density increases as we proceed towards leaf nuclei. Sibling nuclei can have limited intersections, which enables discovering overlapping dense subgraphs. With the right parameters, the nucleus decomposition generalizes the classic notions of k-cores and k-truss decompositions. We give provably efficient algorithms for nucleus decompositions, and empirically evaluate their behavior in a variety of real graphs. The tree of nuclei consistently gives a global, hierarchical snapshot of dense substructures, and outputs dense subgraphs of higher quality than other state-of-the-art solutions. Our algorithm can process graphs with tens of millions of edges in less than an hour.
∗Work done while the author was interning at Sandia National Laboratories, Livermore, CA.
Locally Estimating Core Numbers
Cited by 2 (0 self)
Graphs are a powerful way to model interactions and relationships in data from a wide variety of application domains. In this setting, entities represented by vertices at the “center” of the graph are often more important than those associated with vertices on the “fringes”. For example, central nodes tend to be more critical in the spread of information or disease and play an important role in clustering/community formation. Identifying such “core” vertices has recently received additional attention in the context of network experiments, which analyze the response when a random subset of vertices is exposed to a treatment (e.g., inoculation, free product samples, etc.). Specifically, the likelihood of having many central vertices in any exposure subset can have a significant impact on the experiment. We focus on using k-cores and core numbers to measure the extent to which a vertex is central in a graph. Existing algorithms for computing the core number of a vertex require the entire graph as input, an unrealistic scenario in many real-world applications. Moreover, in the context of network experiments, the subgraph induced by the treated vertices is only known in a probabilistic sense. We introduce a new method for estimating the core number based only on the properties of the graph within a region of radius δ around the vertex, and prove an asymptotic error bound of our estimator on random graphs. Further, we empirically validate the accuracy of our estimator for small values of δ on a representative corpus of real data sets. Finally, we evaluate the impact of improved local estimation on an open problem in network experimentation posed by Ugander et al.
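For orientation, the global baseline that the abstract contrasts with (computing core numbers from the entire graph) can be sketched with the classic peeling procedure. This is not the local estimator proposed in the paper; the function name `core_numbers` and the edge-list input format are assumptions made for illustration, and the simple minimum search makes this version quadratic rather than linear.

```python
from collections import defaultdict


def core_numbers(edges):
    """Return {vertex: core number} for an undirected simple graph.

    Hypothetical helper: repeatedly peel a vertex of minimum remaining
    degree; the largest minimum degree seen so far is its core number.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Pick the vertex with the smallest degree among unpeeled vertices.
        v = min(remaining, key=degree.__getitem__)
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        # Removing v lowers the degree of its unpeeled neighbors.
        for w in adj[v]:
            if w in remaining:
                degree[w] -= 1
    return core


if __name__ == "__main__":
    # A triangle a-b-c with a pendant vertex d attached to a.
    print(core_numbers([("a", "b"), ("b", "c"), ("a", "c"), ("a", "d")]))
```

The sketch needs the full adjacency structure up front, which is exactly the requirement the abstract describes as unrealistic in many applications.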
Efficient Densest Subgraph Computation in Evolving Graphs
2015
Cited by 1 (0 self)
Densest subgraph computation has emerged as an important primitive in a wide range of data analysis tasks such as community and event detection. Social media such as Facebook and Twitter are highly dynamic with new friendship links and tweets being generated incessantly, calling for efficient algorithms that can handle very large and highly dynamic input data. While either scalable or dynamic algorithms for finding densest subgraphs have been proposed, a viable and satisfactory solution for addressing both the dynamic aspect of the input data and its large size is still missing. We study the densest subgraph problem in the dynamic graph model, for which we present the first scalable algorithm with provable guarantees. In our model, edges are added adversarially while they are removed uniformly at
OLAK: An Efficient Algorithm to Prevent Unraveling in Social Networks
In this paper, we study the problem of the anchored k-core. Given a graph G, an integer k and a budget b, we aim to identify b vertices in G so that we can determine the largest induced subgraph J in which every vertex, except the b vertices, has at least k neighbors in J. This problem was introduced by Bhawalkar and Kleinberg et al. in the context of user engagement in social networks, where a user may leave a community if he/she has fewer than k friends engaged. The problem has been shown to be NP-hard and inapproximable. A polynomial-time algorithm for graphs with bounded treewidth has been proposed. However, this assumption usually does not hold in real-life graphs, and their techniques cannot be extended to handle general graphs. Motivated by this, we propose an efficient algorithm, namely onion-layer based anchored k-core (OLAK), for the anchored k-core problem on large-scale graphs. To facilitate computation of the anchored k-core, we design an onion layer structure, which is generated by a simple onion-peeling-like algorithm against a small set of vertices in the graph. We show that computation of the best anchor can simply be conducted upon the vertices on the onion layers, which significantly reduces the search space. Based on the well-organized layer structure, we develop efficient candidate exploration, early termination and pruning techniques to further speed up computation. Comprehensive experiments on 10 real-life graphs demonstrate the effectiveness and efficiency of our proposed methods.
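To make the definition of the subgraph J concrete, the following minimal sketch computes J for an anchor set that is already given. It does not choose the anchors and is not the OLAK algorithm; the function name `anchored_k_core` and the edge-list input are assumptions made for illustration.

```python
from collections import defaultdict, deque


def anchored_k_core(edges, k, anchors=()):
    """Return the vertex set of the anchored k-core for fixed anchors.

    Hypothetical helper: non-anchor vertices with fewer than k surviving
    neighbors are peeled repeatedly; anchors are never peeled.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    anchors = set(anchors)
    alive = set(adj) | anchors
    degree = {v: len(adj[v]) for v in alive}

    # Start with every non-anchor vertex that already violates the constraint.
    queue = deque(v for v in alive if v not in anchors and degree[v] < k)
    queued = set(queue)
    while queue:
        v = queue.popleft()
        alive.discard(v)
        for w in adj[v]:
            if w in alive:
                degree[w] -= 1
                if w not in anchors and w not in queued and degree[w] < k:
                    queue.append(w)
                    queued.add(w)
    return alive


if __name__ == "__main__":
    # Triangle x-y-z with a tail z-p-q.
    g = [("x", "y"), ("y", "z"), ("x", "z"), ("z", "p"), ("p", "q")]
    print(sorted(anchored_k_core(g, 2)))         # plain 2-core
    print(sorted(anchored_k_core(g, 2, {"q"})))  # anchoring q also retains p
```

In the example, anchoring the single vertex q keeps its neighbor p engaged as well, which is the unraveling-prevention effect the abstract describes.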
Fast Hierarchy Construction for Dense Subgraphs
Discovering dense subgraphs and understanding the relations among them is a fundamental problem in graph mining. We want to not only identify dense subgraphs, but also build a hierarchy among them (e.g., larger but sparser subgraphs formed by two smaller dense subgraphs). Peeling algorithms (k-core, k-truss, and nucleus decomposition) have been effective in locating many dense subgraphs. However, constructing a hierarchical representation of density structure, even correctly computing the connected k-cores and k-trusses, has been mostly overlooked. Keeping track of connected components during peeling requires an additional traversal operation, which is as expensive as the peeling process. In this paper, we start with a thorough survey and point to nuances in problem formulations that lead to significant differences in runtimes. We then propose efficient and generic algorithms to construct the hierarchy of dense subgraphs for k-core, k-truss, or any nucleus decomposition. Our algorithms leverage the disjoint-set forest data structure to efficiently construct the hierarchy during traversal. Furthermore, we introduce a new idea to avoid traversal. We construct the subgraphs while visiting neighborhoods in the peeling process, and build the relations to previously constructed subgraphs. We also consider an existing idea to find the k-core hierarchy and adapt it efficiently for our objectives. Experiments on different types of large-scale real-world networks show significant speedups over naive algorithms and existing alternatives. Our algorithms also outperform the hypothetical limits of any possible traversal-based solution.
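The disjoint-set forest mentioned in the abstract can be illustrated with a small sketch that groups the vertices of a k-core into connected components once core numbers are known. This shows only the data structure on one level of the hierarchy, not the paper's hierarchy-construction algorithms; the names `DisjointSet` and `connected_k_cores` and the precomputed `core` dictionary are assumptions made for illustration.

```python
class DisjointSet:
    """Disjoint-set forest with union by size and path compression."""

    def __init__(self):
        self.parent = {}
        self.size = {}

    def add(self, x):
        if x not in self.parent:
            self.parent[x] = x
            self.size[x] = 1

    def find(self, x):
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every vertex on the path at the root.
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return ra
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra  # attach the smaller tree under the larger
        self.size[ra] += self.size[rb]
        return ra


def connected_k_cores(edges, core, k):
    """Return the connected components (as vertex sets) of the k-core.

    `core` maps each vertex to its precomputed core number; the k-core
    consists of the vertices whose core number is at least k.
    """
    ds = DisjointSet()
    for v, c in core.items():
        if c >= k:
            ds.add(v)
    for u, v in edges:
        if core.get(u, 0) >= k and core.get(v, 0) >= k:
            ds.union(u, v)
    groups = {}
    for v in list(ds.parent):
        groups.setdefault(ds.find(v), set()).add(v)
    return list(groups.values())
```

Running `connected_k_cores` for decreasing k shows how separate dense components merge into larger, sparser ones, which is the containment relation a hierarchy has to record.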
Querying K-Truss Community in Large and Dynamic Graphs
Community detection, which discovers densely connected structures in a network, has been studied extensively. In this paper, we study online community search, which is practically useful but less studied in the literature. Given a query vertex in a graph, the problem is to find meaningful communities that the vertex belongs to in an online manner. We propose a novel community model based on the k-truss concept, which brings nice structural and computational properties. We design a compact and elegant index structure which supports the efficient search of k-truss communities with a linear cost with respect to the community size. In addition, we investigate the k-truss community search problem in a dynamic graph setting with frequent insertions and deletions of graph vertices and edges. Extensive experiments on large real-world networks demonstrate the effectiveness and efficiency of our community model and search algorithms.
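As background for the k-truss concept that the community model builds on, here is a minimal sketch that extracts the plain k-truss of a graph, using the standard definition that every edge must lie in at least k - 2 triangles of the subgraph. It does not implement the paper's community model, index structure, or dynamic maintenance; the function name `k_truss` and the edge-list input are assumptions made for illustration.

```python
from collections import defaultdict


def k_truss(edges, k):
    """Return the edges of the k-truss as a set of frozenset pairs.

    Hypothetical helper: repeatedly delete every edge whose endpoints
    share fewer than k - 2 common neighbors, until no such edge remains.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    while True:
        # The number of common neighbors equals the number of triangles
        # that contain the edge (its support).
        weak = [(u, v) for u in adj for v in adj[u]
                if len(adj[u] & adj[v]) < k - 2]
        if not weak:
            break
        for u, v in weak:
            adj[u].discard(v)
            adj[v].discard(u)

    return {frozenset((u, v)) for u in adj for v in adj[u]}


if __name__ == "__main__":
    # A 4-clique on a, b, c, d with a triangle d-e-f hanging off d.
    g = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"),
         ("c", "d"), ("d", "e"), ("e", "f"), ("d", "f")]
    print(len(k_truss(g, 3)), len(k_truss(g, 4)))
```

This sketch rescans all edges in every round for clarity; practical implementations maintain edge supports incrementally instead.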
Influential Community Search in Large Networks
Community search is the problem of finding densely connected subgraphs that satisfy the query conditions in a network, and it has attracted much attention in recent years. However, none of the previous studies on community search consider the influence of a community. In this paper, we introduce a novel community model called k-influential community based on the concept of k-core, which can capture the influence of a community. Based on the new community model, we propose a linear-time online search algorithm to find the top-r k-influential communities in a network. To further speed up the influential community search algorithm, we devise a linear-space index structure which supports efficient search of the top-r k-influential communities in optimal time. We also propose an efficient algorithm to maintain the index when the network is frequently updated. We conduct extensive experiments on 7 real-world large networks, and the results demonstrate the efficiency and effectiveness of the proposed methods.