Better Approximation of Betweenness Centrality
Cited by 35 (1 self)
Estimating the importance or centrality of the nodes in large networks has recently attracted increased interest. Betweenness is one of the most important centrality indices; it essentially counts the number of shortest paths going through a node. Betweenness has been used in diverse applications, e.g., social network analysis or route planning. Since exact computation is prohibitive for large networks, approximation algorithms are important. In this paper, we propose a framework for unbiased approximation of betweenness that generalizes a previous approach by Brandes. Our best new schemes yield significantly better approximation than before for many real-world inputs. In particular, we also get good approximations for the betweenness of unimportant nodes.
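As a rough illustration of the kind of sampling the abstract refers to, the sketch below estimates betweenness from a uniform random sample of source nodes ("pivots") and rescales the accumulated dependencies by n/k. It is only a baseline in the spirit of the earlier approach the abstract mentions, not the paper's new schemes; the function names, the unweighted adjacency-list input, and the ordered-pair counting convention are assumptions made for this example.

```python
import random
from collections import deque


def single_source_dependencies(adj, s):
    """Dependency of source s on every node (unweighted graph, BFS-based)."""
    sigma = {v: 0 for v in adj}    # number of shortest s-v paths
    dist = {v: -1 for v in adj}    # hop distance from s, -1 = unreached
    preds = {v: [] for v in adj}   # predecessors on shortest paths
    sigma[s], dist[s] = 1, 0
    order = []                     # nodes in non-decreasing distance from s
    queue = deque([s])
    while queue:
        v = queue.popleft()
        order.append(v)
        for w in adj[v]:
            if dist[w] < 0:
                dist[w] = dist[v] + 1
                queue.append(w)
            if dist[w] == dist[v] + 1:
                sigma[w] += sigma[v]
                preds[w].append(v)
    delta = {v: 0.0 for v in adj}
    for w in reversed(order):      # accumulate from the farthest nodes back
        for v in preds[w]:
            delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
    delta[s] = 0.0                 # a source does not lie "between" its own pairs
    return delta


def estimate_betweenness(adj, num_pivots, seed=0):
    """Estimate betweenness from num_pivots sources sampled without replacement.

    Counts ordered (source, target) pairs, so on an undirected graph the
    values are twice the unordered-pair betweenness.
    """
    rng = random.Random(seed)
    nodes = list(adj)
    pivots = rng.sample(nodes, num_pivots)
    scale = len(nodes) / num_pivots
    estimate = {v: 0.0 for v in adj}
    for s in pivots:
        for v, d in single_source_dependencies(adj, s).items():
            estimate[v] += scale * d
    return estimate
```

With num_pivots equal to the number of nodes, every source is used once and the estimate coincides with the exact value; smaller samples trade accuracy for running time.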
Efficient search ranking in social networks
 in CIKM, 2007
Cited by 20 (1 self)
In social networks such as Orkut (www.orkut.com), a large portion of the user queries refer to names of other people. Indeed, more than 50% of the queries in Orkut are about names of other users, with an average of 1.8 terms per query. Further, the users usually search for people with whom they maintain relationships in the network. These relationships can be modelled as edges in a friendship graph, a graph in which the nodes represent the users. In this context, search ranking can be modelled as a function that depends on the distances among users in the graph, more specifically, on shortest paths in the friendship graph. However, application of this idea to ranking is not straightforward because the large size of modern social networks (tens of millions of users) prevents efficient computation of shortest paths at query time. We overcome this by designing a ranking formula that strikes a balance between producing good results and reducing query processing time. Using data from the Orkut social network, which includes over 40 million users, we show that our ranking, augmented by this new signal, produces high quality results while keeping query processing time small.
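The abstract does not state the ranking formula, so the following is only a hypothetical sketch of the general idea: combine a text-match score with a boost that decays with friendship distance, and keep query time small by exploring the graph only up to a small depth. The names bounded_distances and rank_candidates, the 1/d boost, and the default depth of 2 are all assumptions made for this example, not the paper's method.

```python
from collections import deque


def bounded_distances(friends, source, max_depth):
    """Hop distances from source, exploring at most max_depth levels."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == max_depth:
            continue               # do not expand beyond the depth limit
        for v in friends.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def rank_candidates(friends, searcher, candidates, max_depth=2):
    """Sort (user, text_score) pairs by text score plus a distance boost.

    The boost 1/d is a made-up signal for illustration; users farther than
    max_depth hops (or the searcher themselves) get no boost.
    """
    dist = bounded_distances(friends, searcher, max_depth)

    def score(item):
        user, text_score = item
        d = dist.get(user)
        boost = 1.0 / d if d else 0.0
        return text_score + boost

    return sorted(candidates, key=score, reverse=True)
```

Limiting the search depth bounds the work per query by the size of the searcher's neighbourhood rather than the size of the whole network, which is one simple way to trade result quality against query time.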
Discovering Correlated Spatio-Temporal Changes in Evolving Graphs
 Under consideration for publication in Knowledge and Information Systems, 2007
Cited by 13 (3 self)
Graphs provide powerful abstractions of relational data, and are widely used in fields such as network management, web page analysis and sociology. While many graph representations of data describe dynamic and time-evolving relationships, most graph mining work treats graphs as static entities. Our focus in this paper is to discover regions of a graph that are evolving in a similar manner. To discover regions of correlated spatio-temporal change in graphs, we propose an algorithm called cSTAG. Whereas most clustering techniques are designed to find clusters that optimise a single distance measure, cSTAG addresses the problem of finding clusters that optimise both temporal and spatial distance measures simultaneously. We show the effectiveness of cSTAG using a quantitative analysis of accuracy on synthetic data sets, as well as demonstrating its utility on two large, real-life data sets: one is the routing topology of the Internet, and the other is the dynamic graph of files accessed together on the 1998 World Cup official website.
Graph clustering with network structure indices
 2007
Cited by 11 (0 self)
Graph clustering has become ubiquitous in the study of relational data sets. We examine two simple algorithms: a new graphical adaptation of the k-medoids algorithm and the Girvan-Newman method based on edge betweenness centrality. We show that they can be effective at discovering the latent groups or communities that are defined by the link structure of a graph. However, both approaches rely on prohibitively expensive computations, given the size of modern relational data sets. Network structure indices (NSIs) are a proven technique for indexing network structure and efficiently finding short paths. We show how incorporating NSIs into these graph clustering algorithms can overcome these complexity limitations. We also present promising quantitative and qualitative evaluations of the modified algorithms on synthetic and real data sets.
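To make the cost the abstract refers to concrete, here is a minimal sketch of k-medoids on a graph using exact hop distances from breadth-first search. It is a plain baseline, assuming an unweighted adjacency-list graph and distinct caller-chosen initial medoids; it does not use a network structure index, and the function names are invented for this example. The all-pairs distance table built in the first line of graph_k_medoids is the expensive step that an index for approximate distances would be meant to replace.

```python
from collections import deque

INF = float("inf")


def hop_distances(adj, source):
    """Exact hop distances from source via breadth-first search."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def assign(adj, dist, medoids):
    """Assign every node to its nearest medoid (ties go to the earlier medoid)."""
    clusters = {m: [] for m in medoids}
    for u in adj:
        nearest = min(medoids, key=lambda m: dist[m].get(u, INF))
        clusters[nearest].append(u)
    return clusters


def graph_k_medoids(adj, initial_medoids, max_iters=20):
    """Alternate assignment and medoid update until the medoid set is stable."""
    dist = {u: hop_distances(adj, u) for u in adj}   # all-pairs: the costly part
    medoids = list(initial_medoids)
    for _ in range(max_iters):
        clusters = assign(adj, dist, medoids)
        # New medoid of a cluster: the member with the smallest total distance
        # to the other members.
        new_medoids = [
            min(members, key=lambda c: sum(dist[c].get(u, INF) for u in members))
            for members in clusters.values()
        ]
        if set(new_medoids) == set(medoids):
            break
        medoids = new_medoids
    return assign(adj, dist, medoids)
```

On a graph made of two triangles joined by a single edge, starting from one medoid in each triangle, the procedure separates the two triangles.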
Approximate shortest distance computing: A query-dependent local landmark scheme
 In ICDE, 2012
Cited by 6 (1 self)
Shortest distance query between two nodes is a fundamental operation in large-scale networks. Most existing methods in the literature take a landmark embedding approach, which selects a set of graph nodes as landmarks and computes the shortest distances from each landmark to all nodes as an embedding. To handle a shortest distance query between two nodes, the precomputed distances from the landmarks to the query nodes are used to compute an approximate shortest distance based on the triangle inequality. In this paper, we analyze the factors that affect the accuracy of the distance estimation in the landmark embedding approach. In particular, we find that a globally selected, query-independent landmark set plus the triangulation-based distance estimation introduces a large relative error, especially for nearby query nodes. To address this issue, we propose a query-dependent local landmark scheme, which identifies a local landmark close to the specific query nodes and provides a more accurate distance estimation than the traditional global landmark approach. Specifically, a local landmark is defined as the least common ancestor of the two query nodes in the shortest path tree rooted at a global landmark. We propose efficient local landmark indexing and retrieval techniques, which are crucial to achieve low offline indexing complexity and online query complexity. Two optimization techniques on graph compression and graph online search are also proposed, with the goal of further reducing index size and improving query accuracy. Our experimental results on large-scale social networks and road networks demonstrate that the local landmark scheme reduces the shortest distance estimation error significantly when compared with global landmark embedding.
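The difference between the global and local estimates can be seen in a small sketch. It assumes an unweighted, undirected graph, a single global landmark, and a naive parent-pointer walk to find the least common ancestor; the paper's indexing and retrieval techniques are not reproduced here, and all function names are made up for the example. Because tree paths from the root are shortest paths, the distance from an ancestor c to a descendant a in the tree is depth[a] - depth[c].

```python
from collections import deque


def shortest_path_tree(adj, landmark):
    """BFS tree rooted at the landmark: parent pointers and depths (= distances)."""
    parent = {landmark: None}
    depth = {landmark: 0}
    queue = deque([landmark])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in depth:
                parent[v] = u
                depth[v] = depth[u] + 1
                queue.append(v)
    return parent, depth


def least_common_ancestor(parent, depth, a, b):
    """Walk both nodes up the tree until they meet (naive, O(depth))."""
    while depth[a] > depth[b]:
        a = parent[a]
    while depth[b] > depth[a]:
        b = parent[b]
    while a != b:
        a, b = parent[a], parent[b]
    return a


def global_estimate(depth, a, b):
    """Triangle-inequality upper bound through the global landmark."""
    return depth[a] + depth[b]


def local_estimate(parent, depth, a, b):
    """Upper bound through the least common ancestor in the landmark's tree."""
    c = least_common_ancestor(parent, depth, a, b)
    return depth[a] + depth[b] - 2 * depth[c]
```

For two nearby nodes that hang off the same branch far from the landmark, the global estimate pays for the whole detour to the root, whereas the local estimate only pays for the path through their common ancestor, so it is never larger.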
Querying Shortest Path Distance with Bounded Errors in Large Graphs
Cited by 3 (2 self)
Shortest paths and shortest path distances are important primary queries in a large graph. In this paper, we propose a new approach to answer shortest path and shortest path distance queries efficiently with an error bound. The error bound is controlled by a user-specified parameter, and online query efficiency is achieved through offline preprocessing. In the offline preprocessing, we take a reference node embedding approach which computes the single-source shortest paths from each reference node to all the other nodes. To guarantee the user-specified error bound, we design a novel coverage-based reference node selection strategy, and show that selecting the optimal set of reference nodes is NP-hard. We propose a greedy selection algorithm which exploits the submodular property of the formulated objective function, and use a graph partitioning-based heuristic to further reduce the offline computational complexity of reference node embedding. In the online query answering, we use the precomputed distances to provide a lower bound and an upper bound of the true shortest path distance based on the triangle inequality. In addition, we propose a linear algorithm which computes the approximate shortest path between two nodes within the error bound. We perform extensive experimental evaluation on a large-scale road network and a social network and demonstrate the effectiveness and efficiency of our proposed methods.
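The triangle-inequality bounds mentioned in the abstract can be sketched as follows, assuming an unweighted, undirected, connected graph and a reference set handed in by the caller; the coverage-based selection strategy and the greedy algorithm are not reproduced, and the function names are illustrative only. For a reference node r, d(s, t) is at most d(r, s) + d(r, t) and at least |d(r, s) - d(r, t)|, so the tightest bounds over all reference nodes are the minimum of the former and the maximum of the latter.

```python
from collections import deque


def hop_distances(adj, source):
    """Single-source shortest path (hop) distances via breadth-first search."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def build_embedding(adj, reference_nodes):
    """Offline step: distances from every reference node to all other nodes."""
    return {r: hop_distances(adj, r) for r in reference_nodes}


def distance_bounds(embedding, s, t):
    """Online step: (lower, upper) bounds on d(s, t) from the stored distances."""
    lower = max(abs(d[s] - d[t]) for d in embedding.values())
    upper = min(d[s] + d[t] for d in embedding.values())
    return lower, upper
```

When a reference node happens to lie on a shortest path between the two query nodes, the upper bound is exact; adding reference nodes can only tighten the interval.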
Chapter X SUPPORTING A USER IN HIS ANNOTATION AND BROWSING ACTIVITIES IN FOLKSONOMIES
In this chapter we present a new approach to supporting users in annotating and browsing resources referred to by a folksonomy. Our approach proposes two hierarchical structures and two related algorithms to arrange groups of semantically related tags in a hierarchy; this allows users to visualize tags of interest at the desired semantic granularity and then helps them to find the tags that best express their information needs. We first illustrate the technical characteristics of our approach; then we describe the prototype implementing it; after this, we illustrate various experiments that test its performance; finally, we compare it with other related approaches already proposed in the literature.
Exploitation of semantic relationships and hierarchical data structures to support a user in his annotation and browsing
 In Information Systems (journal homepage: www.elsevier.com/locate/infosys)