Comparing community structure identification
Journal of Statistical Mechanics: Theory and Experiment, 2005
A measure of betweenness centrality based on random walks
Social Networks, 2005
Cited by 280 (0 self)
Betweenness is a measure of the centrality of a node in a network, and is normally calculated as the fraction of shortest paths between node pairs that pass through the node of interest. Betweenness is, in some sense, a measure of the influence a node has over the spread of information through the network. By counting only shortest paths, however, the conventional definition implicitly assumes that information spreads only along those shortest paths. Here we propose a betweenness measure that relaxes this assumption, including contributions from essentially all paths between nodes, not just the shortest, although it still gives more weight to short paths. The measure is based on random walks, counting how often a node is traversed by a random walk between two other nodes. We show how our measure can be calculated using matrix methods, and give some examples of its application to particular networks.
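The conventional shortest-path definition described in this abstract can be illustrated with a minimal Python sketch. It covers only the conventional measure, not the random-walk measure the paper proposes. The function names, the adjacency-dict input format, and the choice to leave the score unnormalized (a sum over unordered node pairs) are assumptions made for illustration; the graph is assumed to be undirected and unweighted.

```python
from collections import deque
from itertools import combinations


def bfs_counts(adj, source):
    """Return (dist, sigma): hop distance from source and number of shortest paths."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:            # first time w is reached
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:   # u is a predecessor of w on a shortest path
                sigma[w] += sigma[u]
    return dist, sigma


def shortest_path_betweenness(adj):
    """For each node v, sum over pairs (s, t) the fraction of shortest s-t paths through v."""
    info = {s: bfs_counts(adj, s) for s in adj}
    score = {v: 0.0 for v in adj}
    for s, t in combinations(adj, 2):
        dist_s, sigma_s = info[s]
        if t not in dist_s:              # s and t are disconnected
            continue
        dist_t, sigma_t = info[t]
        for v in adj:
            if v in (s, t) or v not in dist_s or v not in dist_t:
                continue
            # v lies on a shortest s-t path iff the two distances add up exactly
            if dist_s[v] + dist_t[v] == dist_s[t]:
                score[v] += sigma_s[v] * sigma_t[v] / sigma_s[t]
    return score


# Hypothetical 4-node "diamond": two equally short routes between opposite corners.
diamond = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a", "d"], "d": ["b", "c"]}
print(shortest_path_betweenness(diamond))
```

In the diamond each node carries half of the shortest paths for the one opposite pair it sits between, so every node scores 0.5. This all-pairs version favors clarity over speed; faster algorithms exist for large graphs.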
Email as spectroscopy: Automated discovery of community structure within organizations
2003
Cited by 200 (7 self)
We describe a methodology for the automatic identification of communities of practice from email logs within an organization. We use a betweenness centrality algorithm that can rapidly find communities within a graph representing information flows. We apply this algorithm to an email corpus of nearly one million messages collected over a two-month span, and show that the method is effective at identifying true communities, both formal and informal, within these scale-free graphs. This approach also enables the identification of leadership roles within the communities. These studies are complemented by a qualitative evaluation of the results in the field.
The Internet AS-Level Topology: Three Data Sources and One Definitive Metric
Cited by 107 (15 self)
We calculate an extensive set of characteristics for Internet AS topologies extracted from the three data sources most frequently used by the research community: traceroutes, BGP, and WHOIS. We discover that traceroute and BGP topologies are similar to one another but differ substantially from the WHOIS topology. Among the widely considered metrics, we find that the joint degree distribution appears to fundamentally characterize Internet AS topologies as well as narrowly define values for other important metrics. We discuss the interplay between the specifics of the three data collection mechanisms and the resulting topology views. In particular, we show how the data collection peculiarities explain differences in the resulting joint degree distributions of the respective topologies. Finally, we release to the community the input topology datasets, along with the scripts and output of our calculations. This supplement should enable researchers to validate their models against real data and to make more informed selection of topology data sources for their specific needs.
Extracting social networks and contact information from email and the web
In Proceedings of CEAS-1, 2004
Cited by 100 (4 self)
We present an end-to-end system that extracts a user’s social network and its members’ contact information given the user’s email inbox. The system identifies unique people in email, finds their Web presence, and automatically fills the fields of a contact address book using conditional random fields, a type of probabilistic model well-suited for such information extraction tasks. By recursively calling itself on new people discovered on the Web, the system builds a social network with multiple degrees of separation from the user. Additionally, a set of expertise-describing keywords is extracted and associated with each person. We outline the collection of statistical and learning components that enable this system, and present experimental results on the real email of two users; we also present results with a simple method of learning transfer, and discuss the capabilities of the system for address-book population, expert-finding, and social network analysis.
Visual Unrolling of Network Evolution and the Analysis of Dynamic Discourse
2002
Cited by 83 (7 self)
A new method for visualizing the class of incrementally evolving networks is presented. In addition to the intermediate states of the network, it conveys the nature of the change between them by unrolling the dynamics of the network. Each modification is shown in a separate layer of a three-dimensional representation, where the stack of layers corresponds to a time line of the evolution. We focus on discourse networks as the driving application, but our method extends to any type of network evolving in similar ways.
On Variants of Shortest-Path Betweenness Centrality and their Generic Computation
Social Networks, 2008
Cited by 75 (1 self)
Betweenness centrality based on shortest paths is a standard measure of control utilized in numerous studies and implemented in all relevant software tools for network analysis. In this paper, a number of variants are reviewed, placed into context, and shown to be computable with simple variants of the algorithm commonly used for the standard case. Key words: Betweenness centrality, algorithms, valued networks, load centrality
GPS: A Graph Processing System
Cited by 63 (3 self)
GPS (for Graph Processing System) is a complete open-source system we developed for scalable, fault-tolerant, and easy-to-program execution of algorithms on extremely large graphs. GPS is similar to Google’s proprietary Pregel system [MAB+11], with some useful additional functionality described in the paper. In distributed graph processing systems like GPS and Pregel, graph partitioning is the problem of deciding which vertices of the graph are assigned to which compute nodes. In addition to presenting the GPS system itself, we describe how we have used GPS to study the effects of different graph partitioning schemes. We present our experiments on the performance of GPS under different static partitioning schemes (assigning vertices to workers “intelligently” before the computation starts) and with GPS’s dynamic repartitioning feature, which reassigns vertices to different compute nodes during the computation by observing their message sending patterns.
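The partitioning problem stated in this abstract can be made concrete with a small Python sketch. It is not GPS code, and the abstract does not name the schemes the authors evaluated. The two schemes below (modulo assignment and contiguous blocks) and all function names are hypothetical, chosen only to show how a static assignment of vertices to workers changes the number of edges that cross workers.

```python
def modulo_partition(vertices, num_workers):
    """Assign integer vertex ids to workers by id modulo the worker count (assumed scheme)."""
    return {v: v % num_workers for v in vertices}


def block_partition(vertices, num_workers):
    """Assign contiguous runs of sorted vertex ids to the same worker (assumed scheme)."""
    ordered = sorted(vertices)
    chunk = -(-len(ordered) // num_workers)  # ceiling division
    return {v: i // chunk for i, v in enumerate(ordered)}


def count_cut_edges(edges, assignment):
    """Count edges whose endpoints live on different workers."""
    return sum(1 for u, v in edges if assignment[u] != assignment[v])


# Hypothetical input: a ring of six vertices split across two workers.
ring = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
for name, scheme in [("modulo", modulo_partition), ("block", block_partition)]:
    assignment = scheme(range(6), 2)
    print(name, count_cut_edges(ring, assignment))
```

On this ring the modulo scheme cuts all six edges while the block scheme cuts only two. In a message-passing system each cut edge generally means cross-worker traffic, which is why the choice of scheme can matter.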
An Experimental Study of Graph Connectivity for Unsupervised Word Sense Disambiguation
IEEE TPAMI
Cited by 62 (15 self)
Word sense disambiguation (WSD), the task of identifying the intended meanings (senses) of words in context, has been a longstanding research objective for natural language processing. In this paper we are concerned with graph-based algorithms for large-scale WSD. Under this framework, finding the right sense for a given word amounts to identifying the most “important” node among the set of graph nodes representing its senses. We introduce a graph-based WSD algorithm which has few parameters and does not require sense annotated data for training. Using this algorithm, we investigate several measures of graph connectivity with the aim of identifying those best suited for WSD. We also examine how the chosen lexicon and its connectivity influences WSD performance. We report results on standard data sets, and show that our graph-based approach performs comparably to the state of the art.
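The idea of choosing the most “important” sense node can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' algorithm. Node degree stands in for whichever connectivity measure is used, and the sense labels, the toy graph, and the function names are hypothetical.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map from a list of (node, node) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def pick_sense(adj, candidate_senses):
    """Return the candidate sense node with the highest degree (assumed importance measure)."""
    return max(candidate_senses, key=lambda sense: len(adj.get(sense, ())))


# Hypothetical sense graph for "bank" in a context mentioning money and loans.
edges = [
    ("bank#finance", "money#1"),
    ("bank#finance", "loan#1"),
    ("bank#river", "water#1"),
]
adj = build_adjacency(edges)
print(pick_sense(adj, ["bank#river", "bank#finance"]))
```

Here the finance sense has two neighbours and the river sense one, so the finance sense is selected. Swapping the `key` function for another centrality score would change the measure without changing the selection step.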