Results 1 - 5 of 5
The link-prediction problem for social networks
- J. American Society for Information Science and Technology
Cited by 269 (4 self)
Given a snapshot of a social network, can we infer which new interactions among its members are likely to occur in the near future? We formalize this question as the link-prediction problem, and we develop approaches to link prediction based on measures for analyzing the “proximity” of nodes in a network. Experiments on large co-authorship networks suggest that information about future interactions can be extracted from network topology alone, and that fairly subtle measures for detecting node proximity can outperform more direct measures.
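As a rough illustration of what a topology-only proximity score can look like, the sketch below ranks unlinked node pairs by their number of shared neighbors. The choice of common neighbors is an assumption made here for illustration; the abstract does not say which measures the paper evaluates, and the toy edge list and the function name `common_neighbor_scores` are hypothetical.

```python
from itertools import combinations


def common_neighbor_scores(edges):
    """Score each non-adjacent node pair by its number of shared neighbors."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    scores = {}
    for u, v in combinations(sorted(neighbors), 2):
        if v in neighbors[u]:
            continue  # already linked in the snapshot, nothing to predict
        shared = len(neighbors[u] & neighbors[v])
        if shared:
            scores[(u, v)] = shared
    return scores


# Hypothetical toy co-authorship snapshot (not data from the paper).
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
scores = common_neighbor_scores(edges)

# Highest-scoring pairs are the predicted future links; ties break by name.
ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

Pairs with a higher score would be predicted as more likely to form a link in the next snapshot; pairs with no shared neighbor are left unscored.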
Information Diffusion through Blogspace
- In WWW ’04
, 2004
Cited by 162 (4 self)
We study the dynamics of information propagation in environments of low-overhead personal publishing, using a large collection of weblogs over time as our example domain. We characterize and model this collection at two levels. First, we present a macroscopic characterization of topic propagation through our corpus, formalizing the notion of long-running "chatter" topics consisting recursively of "spike" topics generated by outside world events, or more rarely, by resonances within the community. Second, we present a microscopic characterization of propagation from individual to individual, drawing on the theory of infectious diseases to model the flow. We propose, validate, and employ an algorithm to induce the underlying propagation network from a sequence of posts, and report on the results.
The Discoverability of the Web
- In Proc. WWW
, 2007
Cited by 9 (1 self)
Previous studies have highlighted the high arrival rate of new content on the web. We study the extent to which this new content can be efficiently discovered by a crawler. Our study has two parts. First, we study the inherent difficulty of the discovery problem using a maximum cover formulation, under an assumption of perfect estimates of likely sources of links to new content. Second, we relax this assumption and study a more realistic setting in which algorithms must use historical statistics to estimate which pages are most likely to yield links to new content. We recommend a simple algorithm that performs comparably to all approaches we consider. We measure the overhead of discovering new content, defined as the average number of fetches required to discover one new page. We show first that with perfect foreknowledge of where to explore for links to new content, it is possible to discover 90% of all new content with under 3% overhead, and 100% of new content with 9% overhead. But actual algorithms, which do not have access to perfect foreknowledge, face a more difficult task: one quarter of new content is simply not amenable to efficient discovery. Of the remaining three quarters, 80% of new content during a given week may be discovered with 160% overhead if content is recrawled fully on a monthly basis.
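To make the maximum cover formulation and the overhead measure concrete, here is a minimal sketch under the perfect-foreknowledge assumption: each known page maps to the set of new pages it links to, and pages are chosen greedily by how much uncovered new content they reveal. The greedy rule is the standard approximation for maximum cover and is not claimed to be the paper's algorithm; the page names and the function `greedy_cover` are hypothetical.

```python
def greedy_cover(links_to_new, target_fraction=1.0):
    """Pick pages to fetch until the target fraction of new pages is covered.

    links_to_new maps each known page to the set of new pages it links to
    (assumed to be known exactly, i.e. perfect foreknowledge).
    """
    all_new = set().union(*links_to_new.values())
    goal = target_fraction * len(all_new)
    covered, fetched = set(), []
    while len(covered) < goal:
        # Choose the page revealing the most not-yet-covered new pages;
        # sorting makes tie-breaking deterministic.
        best = max(sorted(links_to_new),
                   key=lambda p: len(links_to_new[p] - covered))
        gain = links_to_new[best] - covered
        if not gain:
            break  # no remaining page adds anything new
        fetched.append(best)
        covered |= gain
    return fetched, covered


# Hypothetical toy crawl frontier (not data from the paper).
links_to_new = {
    "hub1": {"n1", "n2", "n3"},
    "hub2": {"n3", "n4"},
    "hub3": {"n4"},
}
fetched, covered = greedy_cover(links_to_new)

# Overhead as defined in the abstract: fetches per new page discovered.
overhead = len(fetched) / len(covered)
```

In this toy case two fetches reveal four new pages, so the overhead is 0.5 fetches per new page, or 50% in the abstract's percentage terms.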
An Algorithmic Approach to Social Networks
- PhD thesis, MIT Computer Science and Artificial Intelligence Laboratory
, 2005
Wayfinding in Social Networks
With the recent explosion of popularity of commercial social-networking sites like Facebook and MySpace, the size of social networks that can be studied scientifically has passed from the scale traditionally studied by sociologists and anthropologists to the scale of networks more typically studied by computer scientists. In this chapter, I will highlight a recent line of computational research into the modeling and analysis of the small-world phenomenon (the observation that typical pairs of people in a social network are connected by very short chains of intermediate friends) and the ability of members of a large social network to collectively find efficient routes to reach individuals in the network. I will survey several recent mathematical models of social networks that account for these phenomena, with an emphasis both on provable properties of these social-network models and on the empirical validation of the models against real large-scale social-network data.

