Results 1-10 of 25
Mobility Performance of Macrocell-Assisted Small Cells in Manhattan Model
 Vehicular Technology Conference (VTC Spring), 2014 IEEE 79th
, 2014
Cited by 16 (1 self)
Recent research efforts have made notable progress in improving the performance of (exhaustive) maximal clique enumeration (MCE). However, existing algorithms still suffer from exploring the huge search space of MCE. Furthermore, their results are often undesirable as many of the returned maximal cliques have large overlapping parts. This redundancy leads to problems in both computational efficiency and usefulness of MCE. In this paper, we aim at providing a concise and complete summary of the set of maximal cliques, which is useful to many applications. We propose the notion of τ-visible MCE to achieve this goal and design algorithms to realize the notion. Based on the refined output space, we further consider applications including an efficient computation of the top-k results with diversity and an interactive clique exploration process. Our experimental results demonstrate that our approach is capable of producing output of high usability and our algorithms achieve superior efficiency over classic MCE algorithms.
Massive graph triangulation
 In ACM SIGMOD Conference on Management of Data
, 2013
Cited by 12 (0 self)
This paper studies I/O-efficient algorithms for settling the classic triangle listing problem, whose solution is a basic operator in dealing with many other graph problems. Specifically, given an undirected graph G, the objective of triangle listing is to find all the cliques involving 3 vertices in G. The problem has been well studied in internal memory, but remains an urgent and difficult challenge when G does not fit in memory, forcing any algorithm to incur frequent I/O accesses. Although previous research has attempted to tackle the challenge, the state-of-the-art solutions rely on a set of crippling assumptions to guarantee good performance. Motivated by this, we develop a new algorithm that is provably I/O and CPU efficient at the same time, without making any assumption on the input G at all. The algorithm uses ideas drastically different from all the previous approaches, and outperformed the existing competitors by over an order of magnitude in our extensive experimentation.
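To make the problem statement concrete, here is a minimal in-memory sketch of triangle listing in Python. It is not the I/O-efficient algorithm from the paper, which the abstract does not describe in detail; the function name `list_triangles` and the edge-list input format are assumptions chosen for illustration.

```python
from itertools import combinations

def list_triangles(edges):
    """Return every triangle (3-clique) of an undirected graph as a sorted tuple.

    Hypothetical helper for illustration only: the whole graph is held in memory.
    """
    # Build adjacency sets, ignoring self-loops and duplicate edges.
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    triangles = []
    for u in sorted(adj):
        # Only pair up higher-ordered neighbours so each triangle is reported once.
        higher = sorted(w for w in adj[u] if w > u)
        for v, w in combinations(higher, 2):
            if w in adj[v]:
                triangles.append((u, v, w))
    return triangles
```

Ordering the vertices and only looking "upwards" is a common way to avoid reporting the same triangle three times; the external-memory setting studied in the paper needs a different strategy because the adjacency sets no longer fit in RAM.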
Scalable Maximum Clique Computation Using MapReduce
Cited by 9 (1 self)
We present a scalable and fault-tolerant solution for the maximum clique problem based on the MapReduce framework. The key contribution that enables us to effectively use MapReduce is a recursive partitioning method that partitions the graph into several subgraphs of similar size. After partitioning, the maximum cliques of the different partitions can be computed independently, and the computation is sped up using a branch and bound method. Our experiments show that our approach leads to good scalability, which is unachievable by other partitioning methods since they result in partitions of different sizes and hence lead to load imbalance. Our method is more scalable than an MPI algorithm, and is simpler and more fault tolerant.
Fast algorithms for maximal clique enumeration with limited memory
 In Proceedings of the ACM SIGKDD international
, 2012
Cited by 8 (5 self)
Maximal clique enumeration (MCE) is a long-standing problem in graph theory and has numerous important applications. Though extensively studied, most existing algorithms become impractical when the input graph is too large and is disk-resident. We first propose an efficient partition-based algorithm for MCE that addresses the problem of processing large graphs with limited memory. We then further reduce the high cost of CPU computation of MCE by a careful nested partition based on a cost model. Finally, we parallelize our algorithm to further reduce the overall running time. We verified the efficiency of our algorithms by experiments in large real-world graphs.
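For readers unfamiliar with MCE, the sketch below shows the classic in-memory Bron-Kerbosch algorithm with pivoting in Python. It is a textbook baseline, not the partition-based or parallel algorithm proposed in the paper; the function name `maximal_cliques` and the dict-of-sets input are assumptions made for this example.

```python
def maximal_cliques(adj):
    """Enumerate all maximal cliques with Bron-Kerbosch and pivoting.

    adj maps each vertex to the set of its neighbours (no self-loops).
    Hypothetical helper for illustration; assumes the graph fits in memory.
    """
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates to extend it, x: already-explored vertices.
        if not p and not x:
            cliques.append(sorted(r))
            return
        # Pivot on the vertex that covers the most candidates to prune branches.
        pivot = max(p | x, key=lambda u: len(p & adj[u]))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return sorted(cliques)
```

The recursion keeps the whole adjacency structure in memory, which is exactly the assumption that breaks down for disk-resident graphs and motivates partitioning the input.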
TF-Label: a Topological-Folding Labeling Scheme for Reachability Querying in a Large Graph
, 2013
Cited by 6 (0 self)
Reachability querying is a basic graph operation with numerous important applications in databases, network analysis, computational biology, software engineering, etc. Although many indexes have been proposed to answer reachability queries, most of them are only efficient for handling relatively small graphs. We propose TF-label, an efficient and scalable labeling scheme for processing reachability queries. TF-label is constructed based on a novel topological folding (TF) that recursively folds an input graph into half so as to reduce the label size, thus improving query efficiency. We show that TF-label is efficient to construct and propose efficient algorithms and optimization schemes. Our experiments verify that TF-label is significantly more scalable and efficient than the state-of-the-art methods in both index construction and query processing.
Large Scale Cohesive Subgraphs Discovery for Social Network Visual Analysis
Cited by 5 (1 self)
Graphs are widely used in large scale social network analysis nowadays. Not only do analysts need to focus on cohesive subgraphs to study patterns among social actors, but normal users are also interested in discovering what is happening in their neighborhood. However, effectively storing large scale social networks and efficiently identifying cohesive subgraphs is challenging. In this work we introduce a novel subgraph concept to capture the cohesion in social interactions, and propose an I/O efficient approach to discover cohesive subgraphs. Besides, we propose an analytic system which allows users to perform intuitive, visual browsing on large scale social networks. Our system stores the network as a social graph in the graph database, retrieves a local cohesive subgraph based on the input keywords, and then hierarchically visualizes the subgraph on an orbital layout, in which more important social actors are located in the center. By summarizing textual interactions between social actors as a tag cloud, we provide a way to quickly locate active social communities and their interactions in a unified view.
Finding the Hierarchy of Dense Subgraphs using Nucleus Decompositions
Cited by 3 (1 self)
Finding dense substructures in a graph is a fundamental graph mining operation, with applications in bioinformatics, social networks, and visualization to name a few. Yet most standard formulations of this problem (like clique, quasi-clique, k-densest subgraph) are NP-hard. Furthermore, the goal is rarely to find the “true optimum”, but to identify many (if not all) dense substructures, understand their distribution in the graph, and ideally determine relationships among them. Current dense subgraph finding algorithms usually optimize some objective, and only find a few such subgraphs without providing any structural relations. We define the nucleus decomposition of a graph, which represents the graph as a forest of nuclei. Each nucleus is a subgraph where smaller cliques are present in many larger cliques. The forest of nuclei is a hierarchy by containment, where the edge density increases as we proceed towards leaf nuclei. Sibling nuclei can have limited intersections, which enables discovering overlapping dense subgraphs. With the right parameters, the nucleus decomposition generalizes the classic notions of k-cores and k-truss decompositions. We give provably efficient algorithms for nucleus decompositions, and empirically evaluate their behavior in a variety of real graphs. The tree of nuclei consistently gives a global, hierarchical snapshot of dense substructures, and outputs dense subgraphs of higher quality than other state-of-the-art solutions. Our algorithm can process graphs with tens of millions of edges in less than an hour. (∗Work done while the author was interning at Sandia National Laboratories, Livermore, CA.)
Rectangle counting in large bipartite graphs (long version), http://www.cse.cuhk.edu.hk/~jcheng/rect.pdf
, 2013
Cited by 2 (0 self)
Rectangles are the smallest cycles (i.e., cycles of length 4) and most elementary substructures in a bipartite graph. Similar to triangle counting in unipartite graphs, rectangle counting has many important applications where data is modeled as bipartite graphs. However, efficient algorithms for rectangle counting are lacking. We propose three different types of algorithms to cope with different data volumes and the availability of computing resources. We verified the efficiency of our algorithms with experiments on both large real-world and synthetic bipartite graphs. Keywords: bipartite graphs; rectangle counting.
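As an illustration of what is being counted, the following Python sketch counts rectangles (4-cycles) in a small bipartite graph by tallying common neighbours of left-vertex pairs. It is a simple in-memory baseline and not one of the three algorithm types proposed in the paper; the name `count_rectangles` and the `(left, right)` edge-pair format are assumptions.

```python
from collections import Counter
from itertools import combinations

def count_rectangles(edges):
    """Count 4-cycles in a bipartite graph given as (left, right) edge pairs.

    Hypothetical helper for illustration; left vertices must be mutually comparable.
    """
    # Group left vertices by the right vertex they attach to (duplicates dropped).
    left_of = {}
    for left, right in set(edges):
        left_of.setdefault(right, set()).add(left)

    # common[(a, b)] = number of right vertices adjacent to both a and b.
    common = Counter()
    for lefts in left_of.values():
        for pair in combinations(sorted(lefts), 2):
            common[pair] += 1

    # Two left vertices with c common neighbours close c * (c - 1) / 2 rectangles.
    return sum(c * (c - 1) // 2 for c in common.values())
```

The cost is driven by the number of left-vertex pairs sharing a right neighbour, which grows quadratically around high-degree vertices; that blow-up is one reason scalable variants are needed for large graphs.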
Approximate closest community search in networks.
, 2015
Cited by 1 (1 self)
Recently, there has been significant interest in the study of the community search problem in social and information networks: given one or more query nodes, find densely connected communities containing the query nodes. However, most existing studies do not address the "free rider" issue, that is, nodes far away from query nodes and irrelevant to them are included in the detected community. Some state-of-the-art models have attempted to address this issue, but not only are their formulated problems NP-hard, they do not admit any approximations without restrictive assumptions, which may not always hold in practice. In this paper, given an undirected graph G and a set of query nodes Q, we study community search using the k-truss based community model. We formulate our problem of finding a closest truss community (CTC), as finding a connected k-truss subgraph with the largest k that contains Q, and has the minimum diameter among such subgraphs. We prove this problem is NP-hard. Furthermore, it is NP-hard to approximate the problem within a factor (2−ε), for any ε > 0. However, we develop a greedy algorithmic framework, which first finds a CTC containing Q, and then iteratively removes the furthest nodes from Q, from the graph. The method achieves 2-approximation to the optimal solution. To further improve the efficiency, we make use of a compact truss index and develop efficient algorithms for k-truss identification and maintenance as nodes get eliminated. In addition, using bulk deletion optimization and local exploration strategies, we propose two more efficient algorithms. One of them trades some approximation quality for efficiency while the other is a very efficient heuristic. Extensive experiments on 6 real-world networks show the effectiveness and efficiency of our community model and search algorithms.
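The k-truss model used above can be illustrated with a short Python sketch, assuming the standard definition that every edge of a k-truss lies in at least k − 2 triangles of the subgraph. The code only peels a graph down to its k-truss; it does not handle query nodes, connectivity, or diameter, and it is not the greedy CTC framework or the truss index from the paper. The name `k_truss` is hypothetical.

```python
def k_truss(edges, k):
    """Return the edges of the k-truss of an undirected graph.

    Illustration only: repeatedly delete edges contained in fewer than
    k - 2 triangles until no such edge remains.
    """
    adj = {}
    for u, v in edges:
        if u != v:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)

    changed = True
    while changed:
        changed = False
        for u in list(adj):
            for v in list(adj[u]):
                # Support of (u, v) = number of common neighbours = triangles on the edge.
                if len(adj[u] & adj[v]) < k - 2:
                    adj[u].discard(v)
                    adj[v].discard(u)
                    changed = True

    return sorted((u, v) for u in adj for v in adj[u] if u < v)
```

This naive peeling rescans all edges on every pass; an index that tracks edge support incrementally would avoid the repeated scans, which is the kind of maintenance cost the abstract refers to.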
Distributed Maximal Clique Computation
Cited by 1 (1 self)
Maximal cliques are important substructures in graph analysis. Many algorithms for computing maximal cliques have been proposed in the literature; however, most of them are sequential algorithms that cannot scale due to the high complexity of the problem, while existing parallel algorithms for computing maximal cliques are mostly immature and especially suffer from skewed workload. In this paper, we first propose a distributed algorithm built on a share-nothing architecture for computing the set of maximal cliques. We effectively address the problem of skewed workload distribution due to high-degree vertices, which also leads to drastically reduced worst-case time complexity for computing maximal cliques in common real-world graphs. Then, we also devise algorithms to support efficient update maintenance of the set of maximal cliques when the underlying graph is updated. We verify the efficiency of our algorithms for computing and updating the set of maximal cliques with a range of real-world graphs from different application domains.