## Comparing community structure identification (2005)

Venue: Journal of Statistical Mechanics: Theory and Experiment

Citations: 157 (3 self)

### BibTeX

@ARTICLE{Danon05comparingcommunity,

author = {Leon Danon and Albert Díaz-Guilera and Jordi Duch and Alex Arenas},

title = {Comparing community structure identification},

journal = {Journal of Statistical Mechanics: Theory and Experiment},

year = {2005},

pages = {09008}

}

### Citations

11366 | Computers and Intractability: A Guide to the Theory of NP-Completeness
- Garey, Johnson
- 1979
Citation Context ...r science for decades. Here one looks to separate the graph into two densely connected communities of equal size, which are connected with the minimum number of links. This is an NP-complete problem (Garey and Johnson, 1979); however, several methods have been proposed to reduce the complexity of the task (Kernighan and Lin, 1970; Fiedler, 1973; Boettcher and Percus, 2001a). In real complex networks we often have no idea...

2292 | Algorithms for Clustering Data
- Jain, Dubes
- 1988
Citation Context ...be in the same community. 5.1 Hierarchical clustering Traditional methods for detecting communities in social networks have been based on “hierarchical clustering” (see for example (Scott, 2000) and (Jain and Dubes, 1988)). In general they proceed by calculating a similarity metric for each pair of vertices, representing how close the vertices are according to some property of the network. In the beginning, only vert...

2084 | Social network analysis: Methods and applications
- Wasserman, Faust
- 1994
Citation Context ...on of community must be purely topological. Social networks have been the subject of interest for sociologists for decades. For a standard text on the social science approach to network analysis see (Wasserman and Faust, 1994). The social science approach is largely (though by no means exclusively) concerned with the effect an individual player has on the network and vice versa. As a result, the local properties of netw...

1616 | The structure and function of complex networks
- Newman
- 2003
Citation Context ...bert Díaz-Guilera ∗ 1 Introduction January 31, 2005 The study of complex networks has received an enormous amount of attention from the scientific community in recent years (Barabasi and Albert, 2002; Newman, 2003; Dorogovtsev and Mendes, 2003; Strogatz, 2001; Bornholdt and Schuster, 2002; Pastor-Satorras et al., 2003). Physicists in particular have become interested in the study of networks describing the top...

1486 | A Discipline of Programming
- Dijkstra
- 1976
Citation Context ...Reichardt & Bornholdt [41] RB parameter dependent Table 1. Table summarising how the computational cost of different approaches scales with number of nodes n, number of links m and average degree 〈k〉 [42]. The labels shown here are used in Figures 2 and 3. In Figure 2 we show the sensitivity of all methods we have been able to gather. The percentage of correctly identified nodes is calculated using th...

1372 | Statistical mechanics of complex networks
- Albert, Barabasi
- 2002
Citation Context ...ch † , Alex Arenas † and Albert Díaz-Guilera ∗ 1 Introduction January 31, 2005 The study of complex networks has received an enormous amount of attention from the scientific community in recent years (Barabasi and Albert, 2002; Newman, 2003; Dorogovtsev and Mendes, 2003; Strogatz, 2001; Bornholdt and Schuster, 2002; Pastor-Satorras et al., 2003). Physicists in particular have become interested in the study of networks desc...

1203 | Modern Graph Theory
- Bollobas
- 1999
Citation Context ... there exist some links between the m components, the degeneration is no longer present, leaving one eigenvector with eigenvalue zero and m − 1 eigenvectors with eigenvalues slightly greater than zero (Bollobas, 1998). So it should be possible to find the blocks, at least approximately, by considering the eigenvalues slightly greater than zero and looking at the components of their eigenvectors. As the Laplacian m...

1099 | An efficient heuristic procedure for partitioning graphs. The Bell System Technical Journal
- Kernighan, Lin
- 1970
Citation Context ... size, which are connected with the minimum number of links. This is an NP-complete problem (Garey and Johnson, 1979); however, several methods have been proposed to reduce the complexity of the task (Kernighan and Lin, 1970; Fiedler, 1973; Boettcher and Percus, 2001a). In real complex networks we often have no idea how many communities we wish to discover, but in general it is more than two. This makes the process all t...

826 | Community structure in social and biological networks
- Girvan, Newman
- 2002
Citation Context ... we shall go on to describe, in many different contexts, including metabolic networks (Ravasz et al., 2002; Holme et al., 2003), banking networks (Boss et al., 2003) and most notably social networks (Girvan and Newman, 2002). As a result, the problem of identification of communities has been the focus of many recent efforts. Community detection in large networks is potentially very useful. Nodes belonging to a tight-kni...

823 | Finding and evaluating community structure in networks
- Newman, Girvan
Citation Context ...y identification A question that has been raised in recent years is how to evaluate a given partition of a network into communities. A simple approach that has become widely accepted was proposed in (Newman and Girvan, 2004). It is based on the intuitive idea that random networks do not exhibit community structure. Let us imagine that we have an arbitrary network, and an arbitrary partition of that network into Nc commu...

603 | Hierarchical grouping to optimize an objective function
- Ward
- 1963
Citation Context ... the intuitive idea that a random walker will get trapped for a longer time in a densely connected community. They calculate a distance measure between two nodes, and apply an agglomerative method (Ward, 1963), starting with all nodes in their own community, and joining them two by two. The main difference between this approach and the above is that at each step, the distances are recalculated. The two me...

524 | Modularity and community structure in networks
- Newman
Citation Context ...orks and many more. Although several questions have been addressed, many important ones still resist complete resolution. One such problem is the analysis of modular structure found in many networks (Newman, 2004a). Distinct modules or communities within networks can loosely be defined as subsets of nodes which are more densely linked, when compared to the rest of the network. Such communities have been obser...

515 | Partitioning sparse matrices with eigenvectors of graphs
- Pothen, Simon, et al.
- 1990
Citation Context ...nvectors, the sums of the components of each eigenvector must vanish (apart from the first, trivial eigenvector, which has all equal components). The problem studied in classic papers (Fiedler, 1973; Pothen, 1990) is a special case, where m = 2, the graph bisection problem. Here, the second eigenvector can provide a simple way to cut the graph in two. The components of the second eigenvector corresponding to ...

436 | Algebraic connectivity of graphs
- Fiedler
- 1973
Citation Context ...d with the minimum number of links. This is an NP-complete problem (Garey and Johnson, 1979); however, several methods have been proposed to reduce the complexity of the task (Kernighan and Lin, 1970; Fiedler, 1973; Boettcher and Percus, 2001a). In real complex networks we often have no idea how many communities we wish to discover, but in general it is more than two. This makes the process all the more costly....

410 | Finding community structure in very large networks
- Clauset, Newman, et al.
Citation Context ...hem. This process is repeated until a maximum value of Q is obtained. The algorithm is one of the fastest available, especially when applied using the data structure for sparse networks described in (Clauset et al., 2004). However, while also pretty good at identifying community structure, more recent approaches have achieved even more accuracy (see Sec. 9). This method has been used to study the size distributions o...

399 | Exploring complex networks
- Strogatz
- 2001
Citation Context ...on into community structure identification in complex networks. 1 Introduction The study of complex networks has received an enormous amount of attention from the scientific community in recent years [1, 2, 3, 4, 5, 6]. Physicists in particular have become interested in the study of networks describing the topologies of a wide variety of systems, such as the world wide web, social and communication networks, biochemi...

397 | Evolution of Networks: from Biological Nets to the Internet and WWW
- Dorogovtsev, Mendes
- 2003
Citation Context ...era ∗ 1 Introduction January 31, 2005 The study of complex networks has received an enormous amount of attention from the scientific community in recent years (Barabasi and Albert, 2002; Newman, 2003; Dorogovtsev and Mendes, 2003; Strogatz, 2001; Bornholdt and Schuster, 2002; Pastor-Satorras et al., 2003). Physicists in particular have become interested in the study of networks describing the topologies of a wide variety of sys...

359 | Fast algorithm for detecting community structure in networks
- Newman
Citation Context ...m values. Here we outline the approaches that have tackled this problem. 6.1 Greedy algorithm In the first attempt at optimising Q directly Newman takes a greedy optimisation (hill climbing) approach [35]. At the start of the algorithm, each node is placed into its own partition. One can then calculate the change in Q should any two partitions be joined. The algorithm proceeds by choosing the pair of ...

327 | A faster algorithm for betweenness centrality
- Brandes
- 2001
Citation Context ...r retrieval and analysis. Calculation of link betweenness is the most computer-intensive part of the algorithm. Using the fastest methods developed independently by Newman (Newman, 2001) and Brandes (Brandes, 2001), for a network of size n with m links the speed of calculating all link betweennesses still remains O(mn) for unweighted networks. Unfortunately, the calculation needs to be repeated each tim...

276 | Hierarchical organization of modularity in metabolic networks
- Ravasz, Somera, et al.
- 2002
Citation Context ... linked, when compared to the rest of the network. Such communities have been observed, using some of the methods we shall go on to describe, in many different contexts, including metabolic networks (Ravasz et al., 2002; Holme et al., 2003), banking networks (Boss et al., 2003) and most notably social networks (Girvan and Newman, 2002). As a result, the problem of identification of communities has been the focus of ...

170 | Scientific collaboration networks. I. Network construction and fundamental results
- Newman
- 2001
Citation Context ...the entire process for later retrieval and analysis. Calculation of link betweenness is the most computer-intensive part of the algorithm. Using the fastest methods developed independently by Newman (Newman, 2001) and Brandes (Brandes, 2001), for a network of size n with m links the speed of calculating all link betweennesses still remains O(mn) for unweighted networks. Unfortunately, the calculation n...

163 | Self-organization and identification of web communities
- Flake, Lawrence, et al.
- 2002
Citation Context ... potentially very useful. Nodes belonging to a tight-knit community are more than likely to have other properties in common. In the world wide web, community analysis has uncovered thematic clusters (Flake et al., 2002; Eckmann and Moses, 2002). In biochemical or neural networks, communities may be functional groups ∗ Departament de Fisica Fonamental, Universitat de Barcelona, Marti i Franques 1 08086 Barcelona, Spa...

135 | The large-scale structure of semantic networks: Statistical analyses and a model of semantic growth
- Steyvers, Tenenbaum
- 2005
Citation Context ...es. To test the method they study both undirected and directed networks, using the appropriate optimisation function for each case, and test the algorithm on the word association network reported in (Steyvers and Tenenbaum, 2001). The network has over 10000 nodes and the method is able to give qualitatively good results. 7.4 Approximate resistance networks In a development of the resistor network approach in (Newman and Girv...

112 | Computing communities in large networks using random walks
- Pons, Latapy
- 2005
Citation Context ...n the link with the highest link clustering). This time Zhou presents an algorithm to detect communities similar to hierarchical clustering algorithms described In a similar approach Latapy and Pons (Latapy and Pons, 2004) also employ the intuitive idea that a random walker will get trapped for a longer time in a densely connected community. They calculate a distance measure between two nodes, and apply an agglomera...

98 | Finding all cliques of an undirected graph
- Bron, Kerbosch
- 1973
Citation Context ... although implicitly. Self-referring definitions, while useful in characterising communities which are already known, are not the best choice while trying to find them. The Bron-Kerbosch algorithm (Bron and Kerbosch, 1973) for finding cliques in a network is very costly, running in worst case time that scales exponentially with network size. Comparative definitions, on the other hand, lend themselves much more easily ...

94 | Community Detection in Complex Networks using Extremal Optimization
- Duch, Arenas
Citation Context ...high values of Q quickly. In case of large networks it requires less computer memory than the other presented, since it doesn’t need extra data structures. 6.3 Extremal optimisation In this approach (Duch and Arenas, 2005), a heuristic search procedure based on extremal optimisation (Boettcher and Percus, 2001b) is used to find the network community configuration that has the best modularity value. The algorithm opti...

71 | Superparamagnetic clustering of data
- Blatt, Wiseman, et al.
- 1996
Citation Context ...node k to i and the distance from k to j summed over all nodes k. 8.3 Q-potts model Another interesting approach (Reichardt and Bornholdt, 2004) detects communities by mapping it to a spin system (Blatt et al., 1996). Here, each node is assigned a spin state between 1 and q, at random. The energy of the spin system is determined using a q-Potts Hamiltonian 6 The idea is that in the ground state of the system, co...

71 | Subnetwork hierarchies of biochemical pathways
- Holme, Huss, et al.
- 2003
Citation Context ...d to the rest of the network. Such communities have been observed, using some of the methods we shall go on to describe, in many different contexts, including metabolic networks (Ravasz et al., 2002; Holme et al., 2003), banking networks (Boss et al., 2003) and most notably social networks (Girvan and Newman, 2002). As a result, the problem of identification of communities has been the focus of many recent efforts....

65 | Self-similar community structure in a network of human interactions
- Guimerà, Danon, et al.
Citation Context ...general it is more than two. This makes the process all the more costly. What is more, communities may also be hierarchical, that is communities may be further divided into sub-communities and so on (Guimera et al., 2003; Gleiser and Danon, 2003; Arenas et al., 2004). In this chapter we would like to present the recent advances made in the field of community identification in networks in a clear and simple fashion. T...

60 | Modularity from Fluctuations in Random Graphs and Complex Networks
- Guimerà, Sales-Pardo, et al.
Citation Context ...lue of modularity can take negative values. One might be tempted to think that random networks will exhibit very small values of modularity. As Guimerà et al. show, this in general is not the case (Guimerà et al., 2004). It is possible to find a partition which not only has a nonzero value of modularity for random networks of finite size, but that this value is quite high, for example a network of 128 nodes and 102...

59 | Finding communities in linear time: A physics approach
- Wu, Huberman
- 2004
Citation Context ...proximate resistance networks In a development of the resistor network approach in (Newman and Girvan, 2004) Wu et al. present an approximate method, in order to reduce the computational time needed (Wu and Huberman, 2004). In this method, a pair of nodes is chosen at random to be a voltage source, V1 = 1, and a sink, V2 = 0. The authors then approximate the voltage of all other nodes in the network iteratively, avoidi...

58 | Curvature of Co-links uncovers hidden thematic layers
- Eckmann, Moses
Citation Context ...eful. Nodes belonging to a tight-knit community are more than likely to have other properties in common. In the world wide web, community analysis has uncovered thematic clusters (Flake et al., 2002; Eckmann and Moses, 2002). In biochemical or neural networks, communities may be functional groups ∗ Departament de Fisica Fonamental, Universitat de Barcelona, Marti i Franques 1 08086 Barcelona, Spain † Departament d’Enginy...

53 | Detecting fuzzy community structures in complex networks with a Potts model
- Reichardt, Bornholdt
Citation Context ...ity index is simply the square of the difference between the distance from another node k to i and the distance from k to j summed over all nodes k. 8.3 Q-potts model Another interesting approach (Reichardt and Bornholdt, 2004) detects communities by mapping it to a spin system (Blatt et al., 1996). Here, each node is assigned a spin state between 1 and q, at random. The energy of the spin system is determined using a q-Po...

50 | Detecting Network Communities: a new systematic and efficient
- Donetti, Munoz
- 2004
Citation Context ... question gives the Laplacian matrix. 7.2 Multi dimensional spectral analysis Taking further advantage of the properties of the Laplacian matrix, Donetti and Muñoz present a very nice approach in (Donetti and Muñoz, 2004). The first few non-trivial eigenvectors can be extracted sequentially at minimum cost, using the Lanczos method, which can be applied to sparse matrices at minimum computational cost (Golub and Loan...

37 | Entropy of dialogues creates coherent structures in e-mail traffic
- Eckmann, Moses, et al.
Citation Context .... The authors show that finding connected components of high curvature give a good idea of community structure. In a later effort, they go on to use the method to study communities in email dialogue (Eckmann et al., 2004). Figure 5: How clustering is related to curvature according to (Eckmann and Moses, 2002). For a node i, the shortest path distance between any of its neighbours will be either 1, if the n...

35 | Optimization with extremal dynamics
- Boettcher, Percus
- 2001
Citation Context ...mum number of links. This is an NP-complete problem (Garey and Johnson, 1979); however, several methods have been proposed to reduce the complexity of the task (Kernighan and Lin, 1970; Fiedler, 1973; Boettcher and Percus, 2001a). In real complex networks we often have no idea how many communities we wish to discover, but in general it is more than two. This makes the process all the more costly. What is more, communities m...

35 | Exploring complex networks. Nature
- Strogatz
Citation Context ...1, 2005 The study of complex networks has received an enormous amount of attention from the scientific community in recent years (Barabasi and Albert, 2002; Newman, 2003; Dorogovtsev and Mendes, 2003; Strogatz, 2001; Bornholdt and Schuster, 2002; Pastor-Satorras et al., 2003). Physicists in particular have become interested in the study of networks describing the topologies of a wide variety of systems, such as th...

34 | Community structure in jazz
- Gleiser, Danon
- 2003
Citation Context ...n two. This makes the process all the more costly. What is more, communities may also be hierarchical, that is communities may be further divided into sub-communities and so on (Guimera et al., 2003; Gleiser and Danon, 2003; Arenas et al., 2004). In this chapter we would like to present the recent advances made in the field of community identification in networks in a clear and simple fashion. To this end, the sections ...

34 | Local method for detecting communities
- Bagrow, Bollt
- 2005
Citation Context ... demonstrates the ability of hierarchical clustering methods to deal with large data sets. 5.2 L-shell method This method proposes a different take on agglomerative methods. The algorithm proposed in [34] consists of a shell of size l, starting at a node i is a subset of nodes, all within a shortest path distance of d ≤ l (L-shell) spreading outward from a starting node i . As the shell expands the to...

30 | A method to find community structure based on information centrality
- Fortunato, Latora, et al.
- 2004
Citation Context ...e random walk approach ideas have been developed further by other authors (see Sec. 8.2 and Sec. 7.4). 4.3 Information centrality Another divisive algorithm was presented by Fortunato et al. (Fortunato et al., 2004). In this paper they employ the network efficiency measure, previously proposed by Latora and Marchiori (Latora and Marchiori, 2004) to quantify how efficient a particular network G is in the context...

28 | A measure of centrality based on network efficiency
- Latora, Marchiori
Citation Context ...entrality Another divisive algorithm was presented by Fortunato et al. (Fortunato et al., 2004). In this paper they employ the network efficiency measure, previously proposed by Latora and Marchiori (Latora and Marchiori, 2004) to quantify how efficient a particular network G is in the context of information exchange. Once a particular link is removed from G, its efficiency is reduced by a measurable amount C I , or inform...

20 | Community analysis in social networks
- Arenas, Danon, et al.
- 2004
Citation Context ... general it is more than two. This makes the process all the more costly. What is more, communities may also be hierarchical, that is communities may be further divided into sub-communities and so on [19, 20, 21]. In this chapter we would like to present the recent advances made in the field of community identification in networks in a clear and simple fashion. To this end, the sections are organised as follo...

16 | Extremal optimization for graph partitioning
- Boettcher, Percus
- 2001
Citation Context ...mum number of links. This is an NP-complete problem (Garey and Johnson, 1979); however, several methods have been proposed to reduce the complexity of the task (Kernighan and Lin, 1970; Fiedler, 1973; Boettcher and Percus, 2001a). In real complex networks we often have no idea how many communities we wish to discover, but in general it is more than two. This makes the process all the more costly. What is more, communities m...

13 | Network Brownian motion: A new method to measure vertex-vertex proximity and to identify communities and subcommunities
- Zhou, Lipowsky
- 2004
Citation Context ...tances are different, d < d′. 8.2 Random walk based methods In a set of papers, Zhou and collaborators develop a methodology for community detection based on random walks (Zhou, 2003b; Zhou, 2003a; Zhou and Lipowsky, 2004). Apart from a method for finding communities, Zhou also presents a definition of what a community is. Also worthy of note is that the method is applicable to both directed and undirected networks. I...

12 | Distance, dissimilarity index, and network community structure
- Zhou

9 | Distance, dissimilarity index and network community structure
- Zhou
- 2003
Citation Context ...se the method to study communities in email dialogue [48]. 8.2 Random walk based methods In a set of papers, Zhou and collaborators develop a methodology for community detection based on random walks [49, 50, 51]. Apart from a method for finding communities, Zhou also presents a definition of what a community is. Also worthy of note is that the method is applicable to both directed and undirected networks. In...

8 | Somera A L, Mongru D
- Ravasz
Citation Context ...e rest of the network. Such communities have been observed in different kinds of networks, most notably in social networks, but also in networks of other origin such as metabolic or economic networks [8, 9, 10, 11]. As a result, the problem of identification of communities has been the focus of many recent efforts. Community detection in large networks is potentially very useful. Nodes belonging to a tight-knit...

4 | Communities detection in large networks
- Capocci, Servedio, et al.
Citation Context ...well (see Sec. 9). In the comparison section, we use the aliases DMCS and DMCA for Single Angular and Complete Angular analyses respectively. 7.3 Constrained optimisation This method, described in [45] is based on the spectral properties of the simple adjacency matrix as opposed to the Laplacian. The authors recast the costly problem of extracting eigenvectors of an N × N matrix into a constrained ...

2 | A local method for detecting communities. cond-mat/0412482
- Bagrow, Bollt
- 2004
Citation Context ...demonstrates the ability of hierarchical clustering methods to deal with large data sets. 5.2 L-shell method This method proposes a different take on agglomerative methods. The algorithm proposed in (Bagrow and Bollt, 2004) consists of a shell of size l, starting at a node i is a subset of nodes, all within a shortest path distance of d ≤ l (L-shell) spreading outward from a starting node i . As the shell expands the t...

2 | The network topology of the interbank market. cond-mat/0309582
- Boss, Elsinger, et al.
- 2003
Citation Context ...unities have been observed, using some of the methods we shall go on to describe, in many different contexts, including metabolic networks (Ravasz et al., 2002; Holme et al., 2003), banking networks (Boss et al., 2003) and most notably social networks (Girvan and Newman, 2002). As a result, the problem of identification of communities has been the focus of many recent efforts. Community detection in large networks...