Results 1 - 10
of
178
A Random Graph Model for Massive Graphs
, 2000
"... We propose a random graph model which is a special case of sparse random graphs with given degree sequences. This model involves only a small number of parameters, called logsize and log-log growth rate. These parameters capture some universal characteristics of massive graphs. Furthermore, from the ..."
Abstract
-
Cited by 281 (25 self)
- Add to MetaCart
We propose a random graph model which is a special case of sparse random graphs with given degree sequences. This model involves only a small number of parameters, called logsize and log-log growth rate. These parameters capture some universal characteristics of massive graphs. Furthermore, from these parameters, various properties of the graph can be derived. For example, for certain ranges of the parameters, we will compute the expected distribution of the sizes of the connected components which almost surely occur with high probability. We will illustrate the consistency of our model with the behavior of some massive graphs derived from data in telecommunications. We will also discuss the threshold function, the giant component, and the evolution of random graphs in this model.
Graphs over Time: Densification Laws, Shrinking Diameters and Possible Explanations
, 2005
"... How do real graphs evolve over time? What are “normal” growth patterns in social, technological, and information networks? Many studies have discovered patterns in static graphs, identifying properties in a single snapshot of a large network, or in a very small number of snapshots; these include hea ..."
Abstract
-
Cited by 196 (31 self)
- Add to MetaCart
How do real graphs evolve over time? What are “normal” growth patterns in social, technological, and information networks? Many studies have discovered patterns in static graphs, identifying properties in a single snapshot of a large network, or in a very small number of snapshots; these include heavy tails for in- and out-degree distributions, communities, small-world phenomena, and others. However, given the lack of information about network evolution over long periods, it has been hard to convert these findings into statements about trends over time. Here we study a wide range of real graphs, and we observe some surprising phenomena. First, most of these graphs densify over time, with the number of edges growing superlinearly in the number of nodes. Second, the average distance between nodes often shrinks over time, in contrast to the conventional wisdom that such distance parameters should increase slowly as a function of the number of nodes (like O(log n) orO(log(log n)). Existing graph generation models do not exhibit these types of behavior, even at a qualitative level. We provide a new graph generator, based on a “forest fire” spreading process, that has a simple, intuitive justification, requires very few parameters (like the “flammability” of nodes), and produces graphs exhibiting the full range of properties observed both in prior work and in the present study.
A Brief History of Generative Models for Power Law and Lognormal Distributions
- INTERNET MATHEMATICS
"... Recently, I became interested in a current debate over whether file size distributions are best modelled by a power law distribution or a a lognormal distribution. In trying ..."
Abstract
-
Cited by 193 (7 self)
- Add to MetaCart
Recently, I became interested in a current debate over whether file size distributions are best modelled by a power law distribution or a a lognormal distribution. In trying
Measurement and Analysis of Online Social Networks
- In Proceedings of the 5th ACM/USENIX Internet Measurement Conference (IMC’07
, 2007
"... Online social networking sites like Orkut, YouTube, and Flickr are among the most popular sites on the Internet. Users of these sites form a social network, which provides a powerful means of sharing, organizing, and finding content and contacts. The popularity of these sites provides an opportunity ..."
Abstract
-
Cited by 185 (12 self)
- Add to MetaCart
Online social networking sites like Orkut, YouTube, and Flickr are among the most popular sites on the Internet. Users of these sites form a social network, which provides a powerful means of sharing, organizing, and finding content and contacts. The popularity of these sites provides an opportunity to study the characteristics of online social network graphs at large scale. Understanding these graphs is important, both to improve current systems and to design new applications of online social networks. This paper presents a large-scale measurement study and analysis of the structure of multiple online social networks. We examine data gathered from four popular online social networks: Flickr, YouTube, LiveJournal, and Orkut. We crawled the publicly accessible user links on each site, obtaining a large portion of each social network’s graph. Our data set contains over 11.3 million users and 328 million links. We believe that this is the first study to examine multiple online social networks at scale. Our results confirm the power-law, small-world, and scalefree properties of online social networks. We observe that the indegree of user nodes tends to match the outdegree; that the networks contain a densely connected core of high-degree nodes; and that this core links small groups of strongly clustered, low-degree nodes at the fringes of the network. Finally, we discuss the implications of these structural properties for the design of social network based systems.
Stochastic Models for the Web Graph
"... The web may be viewed as a directed graph each of whose vertices is a static HTML web page, and each of whose edges corresponds to a hyperlink from one web page to another. In this paper we propose and analyze random graph models inspired by a series of empirical observations on the web. Our graph m ..."
Abstract
-
Cited by 183 (9 self)
- Add to MetaCart
The web may be viewed as a directed graph each of whose vertices is a static HTML web page, and each of whose edges corresponds to a hyperlink from one web page to another. In this paper we propose and analyze random graph models inspired by a series of empirical observations on the web. Our graph models differ from the traditional Gn;p models in two ways: 1. Independently chosen edges do not result in the statistics (degree distributions, clique multitudes) observed on the web. Thus, edges in our model are statistically dependent on each other. 2. Our model introduces new vertices in the graph as time evolves. This captures the fact that the web is changing with time. Our results are two fold: we show that graphs generated using our model exhibit the statistics observed on the web graph, and additionally, that natural graph models proposed earlier do not exhibit them. This remains true even when these earlier models are generalized to account for the arrival of vertices over time. In particular, the sparse random graphs in our models exhibit properties that do not arise in far denser random graphs generated by Erdos-R'enyi models. 1
Random Walks in Peer-to-Peer Networks
, 2004
"... We quantify the effectiveness of random walks for searching and construction of unstructured peer-to-peer (P2P) networks. For searching, we argue that random walks achieve improvement over flooding in the case of clustered overlay topologies and in the case of re-issuing the same request several tim ..."
Abstract
-
Cited by 142 (2 self)
- Add to MetaCart
We quantify the effectiveness of random walks for searching and construction of unstructured peer-to-peer (P2P) networks. For searching, we argue that random walks achieve improvement over flooding in the case of clustered overlay topologies and in the case of re-issuing the same request several times. For construction, we argue that an expander can be maintained dynamically with constant operations per addition. The key technical ingredient of our approach is a deep result of stochastic processes indicating that samples taken from consecutive steps of a random walk can achieve statistical properties similar to independent sampling (if the second eigenvalue of the transition matrix is bounded away from 1, which translates to good expansion of the network; such connectivity is desired, and believed to hold, in every reasonable network and network model). This property has been previously used in complexity theory for construction of pseudorandom number generators. We reveal another facet of this theory and translate savings in random bits to savings in processing overhead.
Self-Organization and Identification of Web Communities
- IEEE Computer
, 2002
"... Despite the decentralized and unorganized nature of the web, we show that the web self-organizes such that communities of highly related pages can be efficiently identified based purely on connectivity. ..."
Abstract
-
Cited by 121 (0 self)
- Add to MetaCart
Despite the decentralized and unorganized nature of the web, we show that the web self-organizes such that communities of highly related pages can be efficiently identified based purely on connectivity.
Connected Components in Random Graphs with Given Expected Degree Sequences
- ANNALS OF COMBINATORICS
"... ..."
Searching the Web
- ACM TRANSACTIONS ON INTERNET TECHNOLOGY
, 2001
"... We offer an overview of current Web search engine design. After introducing a generic search engine architecture, we examine each engine component in turn. We cover crawling, local Web page storage, indexing, and the use of link analysis for boosting search performance. The most common design and im ..."
Abstract
-
Cited by 108 (1 self)
- Add to MetaCart
We offer an overview of current Web search engine design. After introducing a generic search engine architecture, we examine each engine component in turn. We cover crawling, local Web page storage, indexing, and the use of link analysis for boosting search performance. The most common design and implementation techniques for each of these components are presented. For this presentation we draw from the literature and from our own experimental search engine testbed. Emphasis is on introducing the fundamental concepts and the results of several performance analyses we conducted to compare different designs.

