Truss Decomposition in Massive Networks
Cited by 23 (5 self)
The k-truss is a type of cohesive subgraph proposed recently for the study of networks. While the problem of computing most cohesive subgraphs is NP-hard, there exists a polynomial-time algorithm for computing the k-truss. Compared with the k-core, which is also efficient to compute, the k-truss represents the “core” of a k-core: it keeps the key information of the k-core while filtering out less important information. However, existing algorithms for computing the k-truss are inefficient for handling today’s massive networks. We first improve the existing in-memory algorithm for computing the k-truss in networks of moderate size. Then, we propose two I/O-efficient algorithms to handle massive networks that cannot fit in main memory. Our experiments on real datasets verify the efficiency of our algorithms and the value of the k-truss.
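As a rough illustration of the k-truss abstract above, the sketch below peels edges from a small in-memory graph. It assumes the standard definition (the k-truss is the largest subgraph in which every edge lies in at least k - 2 triangles of that subgraph), which the abstract itself does not spell out. The function name `k_truss` and the edge-list input are hypothetical, and this naive loop is not the improved in-memory algorithm or either I/O-efficient algorithm from the paper.

```python
from collections import defaultdict


def k_truss(edges, k):
    """Return the sorted edges of the k-truss of an undirected graph.

    Assumed definition: every surviving edge must lie in at least
    k - 2 triangles formed by surviving edges. Vertices are assumed
    to be mutually comparable (e.g. integers).
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    # Naive peeling: repeatedly drop edges with too few triangles
    # until a full pass removes nothing.
    changed = True
    while changed:
        changed = False
        for u in list(adj):
            for v in list(adj[u]):
                # Common neighbours of u and v = triangles on edge (u, v).
                if u < v and len(adj[u] & adj[v]) < k - 2:
                    adj[u].discard(v)
                    adj[v].discard(u)
                    changed = True

    return sorted((u, v) for u in adj for v in adj[u] if u < v)
```

For example, on a 4-clique with one extra pendant edge, `k_truss(edges, 4)` keeps the six clique edges and drops the pendant edge, because the pendant edge lies in no triangle.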
Mobility Performance of Macrocell-Assisted Small Cells in Manhattan Model
 Vehicular Technology Conference (VTC Spring), 2014 IEEE 79th
, 2014
Cited by 15 (0 self)
Recent research efforts have made notable progress in improving the performance of (exhaustive) maximal clique enumeration (MCE). However, existing algorithms still suffer from exploring the huge search space of MCE. Furthermore, their results are often undesirable as many of the returned maximal cliques have large overlapping parts. This redundancy leads to problems in both computational efficiency and usefulness of MCE. In this paper, we aim at providing a concise and complete summary of the set of maximal cliques, which is useful to many applications. We propose the notion of τ-visible MCE to achieve this goal and design algorithms to realize the notion. Based on the refined output space, we further consider applications including an efficient computation of the top-k results with diversity and an interactive clique exploration process. Our experimental results demonstrate that our approach is capable of producing output of high usability and our algorithms achieve superior efficiency over classic MCE algorithms.
Massive graph triangulation
 In ACM SIGMOD Conference on Management of Data
, 2013
Cited by 12 (0 self)
This paper studies I/O-efficient algorithms for settling the classic triangle listing problem, whose solution is a basic operator in dealing with many other graph problems. Specifically, given an undirected graph G, the objective of triangle listing is to find all the cliques involving 3 vertices in G. The problem has been well studied in internal memory, but remains an urgent and difficult challenge when G does not fit in memory, which forces any algorithm to incur frequent I/O accesses. Although previous research has attempted to tackle the challenge, the state-of-the-art solutions rely on a set of crippling assumptions to guarantee good performance. Motivated by this, we develop a new algorithm that is provably I/O and CPU efficient at the same time, without making any assumption on the input G at all. The algorithm uses ideas drastically different from all the previous approaches, and outperformed the existing competitors by more than an order of magnitude in our extensive experimentation.
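To make the triangle listing problem in the abstract above concrete, here is a minimal internal-memory sketch. It uses the well-known trick of orienting each edge from lower to higher (degree, id) rank so that every triangle is reported exactly once. The function name `list_triangles` and the edge-list input are assumptions for illustration; this is the in-memory setting the abstract calls well studied, not the paper's I/O-efficient algorithm.

```python
from collections import defaultdict


def list_triangles(edges):
    """List every triangle (3-clique) of an undirected graph once.

    Vertices are assumed to be mutually comparable (e.g. integers).
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    # Total order on vertices: by degree, ties broken by vertex id.
    rank = {v: (len(adj[v]), v) for v in adj}
    # Keep only neighbours of higher rank (an acyclic orientation).
    out = {v: {w for w in adj[v] if rank[w] > rank[v]} for v in adj}

    triangles = []
    for u in out:
        for v in out[u]:
            # w completes a triangle if both u and v point to it.
            for w in out[u] & out[v]:
                triangles.append(tuple(sorted((u, v, w))))
    return sorted(triangles)
```

Because each triangle is found only from its lowest-ranked vertex through its middle-ranked vertex, no deduplication pass is needed.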
Fast algorithms for maximal clique enumeration with limited memory
 In Proceedings of the ACM SIGKDD international
, 2012
Cited by 8 (5 self)
Maximal clique enumeration (MCE) is a long-standing problem in graph theory and has numerous important applications. Though extensively studied, most existing algorithms become impractical when the input graph is too large and is disk-resident. We first propose an efficient partition-based algorithm for MCE that addresses the problem of processing large graphs with limited memory. We then further reduce the high cost of CPU computation of MCE by a careful nested partition based on a cost model. Finally, we parallelize our algorithm to further reduce the overall running time. We verified the efficiency of our algorithms by experiments on large real-world graphs.
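For readers unfamiliar with MCE, the sketch below shows the classic Bron–Kerbosch algorithm with pivoting on a graph that fits in memory. It is a swapped-in textbook baseline, not the partition-based, nested-partition, or parallel algorithm the abstract describes. The function name `maximal_cliques` and the edge-list input are hypothetical.

```python
from collections import defaultdict


def maximal_cliques(edges):
    """Enumerate all maximal cliques (Bron-Kerbosch with pivoting)."""
    neighbours = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            neighbours[u].add(v)
            neighbours[v].add(u)
    adj = dict(neighbours)
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates, x: already-explored vertices.
        if not p and not x:
            cliques.append(sorted(r))  # nothing can extend r: maximal
            return
        # Pivot on the vertex covering the most candidates, so only
        # candidates outside its neighbourhood need to be branched on.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return sorted(cliques)
```

On a triangle 1-2-3 with an extra edge 3-4, this returns the two maximal cliques `[1, 2, 3]` and `[3, 4]`.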
TF-Label: a Topological-Folding Labeling Scheme for Reachability Querying in a Large Graph
, 2013
Cited by 6 (0 self)
Reachability querying is a basic graph operation with numerous important applications in databases, network analysis, computational biology, software engineering, etc. Although many indexes have been proposed to answer reachability queries, most of them are only efficient for handling relatively small graphs. We propose TF-label, an efficient and scalable labeling scheme for processing reachability queries. TF-label is constructed based on a novel topological folding (TF) that recursively folds an input graph in half so as to reduce the label size, thus improving query efficiency. We show that TF-label is efficient to construct and propose efficient algorithms and optimization schemes. Our experiments verify that TF-label is significantly more scalable and efficient than the state-of-the-art methods in both index construction and query processing.
K-reach: Who is in your small world
 PVLDB
Cited by 5 (3 self)
We study the problem of answering k-hop reachability queries in a directed graph, i.e., whether there exists a directed path of length k, from a source query vertex to a target query vertex in the input graph. The problem of k-hop reachability is a generalization of classic reachability (where k = ∞). Existing indexes for processing classic reachability queries, as well as for processing shortest path queries, are not applicable or not efficient for processing k-hop reachability queries. We propose an index for processing k-hop reachability queries, which is simple in design and efficient to construct. Our experimental results on a wide range of real datasets show that our index is more efficient than the state-of-the-art indexes even for processing classic reachability queries, for which these indexes are primarily designed. We also show that our index is efficient in answering k-hop reachability queries.
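The k-hop query described in the abstract above can be answered without any index by a depth-bounded breadth-first search, sketched below as a baseline. The sketch assumes "a directed path of length k" means a path of at most k edges, the reading under which k = ∞ reduces to classic reachability. The function name `k_hop_reachable` and the adjacency-dictionary input are hypothetical, and this is not the index the paper proposes.

```python
from collections import deque


def k_hop_reachable(adj, source, target, k):
    """Return True if target is reachable from source within k hops.

    adj maps each vertex to an iterable of its out-neighbours.
    Pass k = float("inf") for classic (unbounded) reachability.
    """
    if source == target:
        return True  # zero hops
    seen = {source}
    frontier = deque([(source, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == k:
            continue  # hop budget spent; do not expand further
        for nxt in adj.get(node, ()):
            if nxt == target:
                return True
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, dist + 1))
    return False
```

Each query costs time linear in the part of the graph explored, which is the per-query cost that a reachability index is meant to avoid on large graphs.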
Finding Maximum Clique in Stochastic Graphs Using Distributed Learning Automata
 International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems
, 2015
Cited by 4 (2 self)
Because of the unpredictable, uncertain and time-varying nature of real networks, stochastic graphs, in which the weights associated with the edges are random variables, may be a better graph model for real-world networks. Once the graph model is chosen to be a stochastic graph, every feature of the graph, such as path, clique, spanning tree and dominating set, to mention a few, should be treated as a stochastic feature. For example, if a stochastic graph is chosen as the graph model of an online social network, with community structure defined in terms of cliques and the associations among the individuals within a community defined as random variables, the concept of a stochastic clique may be used to study community structure properties. In this paper, the maximum clique in a stochastic graph is first defined, and then several learning-automata-based algorithms are proposed for solving the maximum clique problem in a stochastic graph where the probability distribution functions of the weights associated with the edges of the graph are unknown. It is shown that by a proper choice of the parameters of the proposed algorithms, one can make the probability of finding the maximum clique in a stochastic graph as close to unity as possible. Experimental results show that the proposed algorithms significantly reduce the number of samples that need to be taken from the edges of the stochastic graph, compared with the number of samples needed by the standard sampling method at a given confidence level.
Towards Cohesive Anomaly Mining
Cited by 1 (0 self)
In some applications, such as bioinformatics, social network analysis, and computational criminology, it is desirable to find compact clusters formed by a (very) small portion of objects in a large data set. Since such clusters are comprised of a small number of objects, they are extraordinary and anomalous with respect to the entire data set. This specific type of clustering task cannot be solved well by the conventional clustering methods since generally those methods try to assign most of the data objects into clusters. In this paper, we model this novel and application-inspired task as the problem of mining cohesive anomalies. We propose a general framework and a principled approach to tackle the problem. The experimental results on both synthetic and real data sets verify the effectiveness and efficiency of our approach.
Efficiently Estimating Motif Statistics of Large Networks
Exploring statistics of locally connected subgraph patterns (also known as network motifs) has helped researchers better understand the structure and function of biological and online social networks (OSNs). Nowadays the massive size of some critical networks, often stored in already overloaded relational databases, effectively limits the rate at which nodes and edges can be explored, making it a challenge to accurately discover subgraph statistics. In this work, we propose sampling methods to accurately estimate subgraph statistics from as few queried nodes as possible. We present sampling algorithms that efficiently and accurately estimate subgraph properties of massive networks. Our algorithms require no precomputation or complete network topology information. At the same time, we provide theoretical guarantees of convergence. We perform experiments using widely known data sets, and show that for the same accuracy, our algorithms require an order of magnitude fewer queries (samples) than the current state-of-the-art algorithms.