Results 1  10
of
986
Finding community structure in networks using the eigenvectors of matrices
, 2006
"... We consider the problem of detecting communities or modules in networks, groups of vertices with a higherthanaverage density of edges connecting them. Previous work indicates that a robust approach to this problem is the maximization of the benefit function known as “modularity ” over possible div ..."
Abstract

Cited by 502 (0 self)
 Add to MetaCart
(Show Context)
We consider the problem of detecting communities or modules in networks, groups of vertices with a higherthanaverage density of edges connecting them. Previous work indicates that a robust approach to this problem is the maximization of the benefit function known as “modularity ” over possible divisions of a network. Here we show that this maximization process can be written in terms of the eigenspectrum of a matrix we call the modularity matrix, which plays a role in community detection similar to that played by the graph Laplacian in graph partitioning calculations. This result leads us to a number of possible algorithms for detecting community structure, as well as several other results, including a spectral measure of bipartite structure in networks and a new centrality measure that identifies those vertices that occupy central positions within the communities to which they belong. The algorithms and measures proposed are illustrated with applications to a variety of realworld complex networks.
Comparing community structure identification
 Journal of Statistical Mechanics: Theory and Experiment
, 2005
"... ..."
(Show Context)
Data Clustering: 50 Years Beyond KMeans
, 2008
"... Organizing data into sensible groupings is one of the most fundamental modes of understanding and learning. As an example, a common scheme of scientific classification puts organisms into taxonomic ranks: domain, kingdom, phylum, class, etc.). Cluster analysis is the formal study of algorithms and m ..."
Abstract

Cited by 294 (7 self)
 Add to MetaCart
Organizing data into sensible groupings is one of the most fundamental modes of understanding and learning. As an example, a common scheme of scientific classification puts organisms into taxonomic ranks: domain, kingdom, phylum, class, etc.). Cluster analysis is the formal study of algorithms and methods for grouping, or clustering, objects according to measured or perceived intrinsic characteristics or similarity. Cluster analysis does not use category labels that tag objects with prior identifiers, i.e., class labels. The absence of category information distinguishes data clustering (unsupervised learning) from classification or discriminant analysis (supervised learning). The aim of clustering is exploratory in nature to find structure in data. Clustering has a long and rich history in a variety of scientific fields. One of the most popular and simple clustering algorithms, Kmeans, was first published in 1955. In spite of the fact that Kmeans was proposed over 50 years ago and thousands of clustering algorithms have been published since then, Kmeans is still widely used. This speaks to the difficulty of designing a general purpose clustering algorithm and the illposed problem of clustering. We provide a brief overview of clustering, summarize well known clustering methods, discuss the major challenges and key issues in designing clustering algorithms, and point out some of the emerging and useful research directions, including semisupervised clustering, ensemble clustering, simultaneous feature selection, and data clustering and large scale data clustering.
BUBBLE Rap: Socialbased forwarding in delay tolerant networks
 in Proc. ACM MobiHoc
, 2008
"... In this paper we seek to improve our understanding of human mobility in terms of social structures, and to use these structures in the design of forwarding algorithms for Pocket Switched Networks (PSNs). Taking human mobility traces from the real world, we discover that human interaction is heteroge ..."
Abstract

Cited by 284 (31 self)
 Add to MetaCart
(Show Context)
In this paper we seek to improve our understanding of human mobility in terms of social structures, and to use these structures in the design of forwarding algorithms for Pocket Switched Networks (PSNs). Taking human mobility traces from the real world, we discover that human interaction is heterogeneous both in terms of hubs (popular individuals) and groups or communities. We propose a social based forwarding algorithm, BUBBLE, which is shown empirically to improve the forwarding efficiency significantly compared to oblivious forwarding schemes and to PROPHET algorithm. We also show how this algorithm can be implemented in a distributed way, which demonstrates that it is applicable in the decentralised environment of PSNs.
Statistical properties of community structure in large social and information networks
"... A large body of work has been devoted to identifying community structure in networks. A community is often though of as a set of nodes that has more connections between its members than to the remainder of the network. In this paper, we characterize as a function of size the statistical and structur ..."
Abstract

Cited by 246 (14 self)
 Add to MetaCart
(Show Context)
A large body of work has been devoted to identifying community structure in networks. A community is often though of as a set of nodes that has more connections between its members than to the remainder of the network. In this paper, we characterize as a function of size the statistical and structural properties of such sets of nodes. We define the network community profile plot, which characterizes the “best ” possible community—according to the conductance measure—over a wide range of size scales, and we study over 70 large sparse realworld networks taken from a wide range of application domains. Our results suggest a significantly more refined picture of community structure in large realworld networks than has been appreciated previously. Our most striking finding is that in nearly every network dataset we examined, we observe tight but almost trivial communities at very small scales, and at larger size scales, the best possible communities gradually “blend in ” with the rest of the network and thus become less “communitylike.” This behavior is not explained, even at a qualitative level, by any of the commonlyused network generation models. Moreover, this behavior is exactly the opposite of what one would expect based on experience with and intuition from expander graphs, from graphs that are wellembeddable in a lowdimensional structure, and from small social networks that have served as testbeds of community detection algorithms. We have found, however, that a generative model, in which new edges are added via an iterative “forest fire” burning process, is able to produce graphs exhibiting a network community structure similar to our observations.
Community structure in large networks: Natural cluster sizes and the absence of large welldefined clusters
, 2008
"... A large body of work has been devoted to defining and identifying clusters or communities in social and information networks, i.e., in graphs in which the nodes represent underlying social entities and the edges represent some sort of interaction between pairs of nodes. Most such research begins wit ..."
Abstract

Cited by 208 (17 self)
 Add to MetaCart
(Show Context)
A large body of work has been devoted to defining and identifying clusters or communities in social and information networks, i.e., in graphs in which the nodes represent underlying social entities and the edges represent some sort of interaction between pairs of nodes. Most such research begins with the premise that a community or a cluster should be thought of as a set of nodes that has more and/or better connections between its members than to the remainder of the network. In this paper, we explore from a novel perspective several questions related to identifying meaningful communities in large social and information networks, and we come to several striking conclusions. Rather than defining a procedure to extract sets of nodes from a graph and then attempt to interpret these sets as a “real ” communities, we employ approximation algorithms for the graph partitioning problem to characterize as a function of size the statistical and structural properties of partitions of graphs that could plausibly be interpreted as communities. In particular, we define the network community profile plot, which characterizes the “best ” possible community—according to the conductance measure—over a wide range of size scales. We study over 100 large realworld networks, ranging from traditional and online social networks, to technological and information networks and
Quantifying social group evolution
 Nature
, 2007
"... The rich set of interactions between individuals in the society [1,2,3,4,5,6,7] results in complex community structure, capturing highly connected circles of friends, families, or professional cliques in a social network [3,7,8,9,10]. Thanks to frequent changes in the activity and communication patt ..."
Abstract

Cited by 126 (3 self)
 Add to MetaCart
(Show Context)
The rich set of interactions between individuals in the society [1,2,3,4,5,6,7] results in complex community structure, capturing highly connected circles of friends, families, or professional cliques in a social network [3,7,8,9,10]. Thanks to frequent changes in the activity and communication patterns of individuals, the associated social and communication network is subject to constant evolution [7,11,12,13,14,15,16]. Our knowledge of the mechanisms governing the underlying community dynamics is limited, but is essential for a deeper understanding of the development and selfoptimisation of the society as a whole [17,18,19,20,21,22]. We have developed a new algorithm based on clique percolation [23,24], that allows, for the first time, to investigate the time dependence of overlapping communities on a large scale and as such, to uncover basic relationships characterising community evolution. Our focus is on networks capturing the collaboration between scientists and the calls between mobile phone users. We find that large groups persist longer if they are capable of dynamically altering their membership, suggesting that an ability to change the composition results in better adaptability. The behaviour of small groups displays the opposite tendency, the condition
Defining and Evaluating Network Communities based on Groundtruth. Extended version
, 2012
"... Abstract—Nodes in realworld networks organize into densely linked communities where edges appear with high concentration among the members of the community. Identifying such communities of nodes has proven to be a challenging task mainly due to a plethora of definitions of a community, intractabili ..."
Abstract

Cited by 112 (4 self)
 Add to MetaCart
(Show Context)
Abstract—Nodes in realworld networks organize into densely linked communities where edges appear with high concentration among the members of the community. Identifying such communities of nodes has proven to be a challenging task mainly due to a plethora of definitions of a community, intractability of algorithms, issues with evaluation and the lack of a reliable goldstandard groundtruth. In this paper we study a set of 230 large realworld social, collaboration and information networks where nodes explicitly state their group memberships. For example, in social networks nodes explicitly join various interest based social groups. We use such groups to define a reliable and robust notion of groundtruth communities. We then propose a methodology which allows us to compare and quantitatively evaluate how different structural definitions of network communities correspond to groundtruth communities. We choose 13 commonly used structural definitions of network communities and examine their sensitivity, robustness and performance in identifying the groundtruth. We show that the 13 structural definitions are heavily correlated and naturally group into four classes. We find that two of these definitions, Conductance and Triadparticipationratio, consistently give the best performance in identifying groundtruth communities. We also investigate a task of detecting communities given a single seed node. We extend the local spectral clustering algorithm into a heuristic parameterfree community detection method that easily scales to networks with more than hundred million nodes. The proposed method achieves 30 % relative improvement over current local clustering methods. I.