Results 1 - 10 of 160
Evolution of networks
Adv. Phys., 2002
Cited by 268 (2 self)
We review the recent fast progress in statistical physics of evolving networks. Interest has focused mainly on the structural properties of random complex networks in communications, biology, social sciences and economics. A number of giant artificial networks of such a kind came into existence recently. This opens a wide field for the study of their topology, evolution, and complex processes occurring in them. Such networks possess a rich set of scaling properties. A number of them are scale-free and show striking resilience against random breakdowns. In spite of the large sizes of these networks, the distances between most of their vertices are short, a feature known as the “small-world” effect. We discuss how growing networks self-organize into scale-free structures and the role of the mechanism of preferential linking. We consider the topological and structural properties of evolving networks, and percolation in these networks. We present a number of models demonstrating the main features of evolving networks and discuss current approaches for their simulation and analytical study. Applications of the general results to particular networks in Nature are discussed. We demonstrate the generic connections of the network growth processes with the general problems
What's New on the Web? The Evolution of the Web from a Search Engine Perspective
2004
Cited by 161 (15 self)
We seek to gain improved insight into how Web search engines should cope with the evolving Web, in an attempt to provide users with the most up-to-date results possible. For this purpose we collected weekly snapshots of some 150 Web sites over the course of one year, and measured the evolution of content and link structure. Our measurements focus on aspects of potential interest to search engine designers: the evolution of link structure over time, the rate of creation of new pages and new distinct content on the Web, and the rate of change of the content of existing pages under search-centric measures of degree of change.
Deeper inside pagerank
Internet Mathematics, 2004
Cited by 142 (4 self)
This paper serves as a companion or extension to the “Inside PageRank” paper by Bianchini et al. [Bianchini et al. 03]. It is a comprehensive survey of all issues associated with PageRank, covering the basic PageRank model, available and recommended solution methods, storage issues, existence, uniqueness, and convergence properties, possible alterations to the basic model, suggested alternatives to the traditional solution methods, sensitivity and conditioning, and finally the updating problem. We introduce a few new results, provide an extensive reference list, and speculate about exciting areas of future research.
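The basic PageRank model mentioned in this abstract can be illustrated with a plain power iteration. This is a minimal sketch rather than the survey's recommended solver: the function name `pagerank`, the damping factor of 0.85, uniform teleportation, the uniform treatment of dangling pages, and the toy graph are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=1000):
    """Power iteration for PageRank on a dict {page: [pages it links to]}."""
    pages = sorted(set(links) | {q for outs in links.values() for q in outs})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(max_iter):
        # Assumption: rank held by dangling pages (no out-links) is spread uniformly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {p: base for p in pages}
        for p in pages:
            outs = links.get(p, [])
            for q in outs:
                # Each page splits its rank evenly over its out-links.
                new_rank[q] += damping * rank[p] / len(outs)
        delta = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if delta < tol:  # stop once the L1 change between iterations is tiny
            break
    return rank


# Hypothetical four-page graph: "c" is linked from every other page.
toy_links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
toy_rank = pagerank(toy_links)
```

On this toy graph the page `c` ends up with the highest score, and `d`, which nothing links to, keeps only the teleportation share `(1 - damping) / n`.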
Exploiting the Block Structure of the Web for Computing PageRank
2003
Cited by 129 (5 self)
The web link graph has a nested block structure: the vast majority of hyperlinks link pages on a host to other pages on the same host, and many of those that do not still link pages within the same domain. We show how to exploit this structure to speed up the computation of PageRank by a 3-stage algorithm whereby (1) the local PageRanks of pages for each host are computed independently using the link structure of that host, (2) these local PageRanks are then weighted by the "importance" of the corresponding host, and (3) the standard PageRank algorithm is then run using as its starting vector the weighted concatenation of the local PageRanks. Empirically, this algorithm speeds up the computation of PageRank by a factor of 2 in realistic scenarios. Further, we develop a variant of this algorithm that efficiently computes many different "personalized" PageRanks, and a variant that efficiently recomputes PageRank after node updates.
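The three stages described in this abstract can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the names `pagerank`, `block_pagerank` and `host_of` are hypothetical, and because the abstract does not say how host "importance" is computed, stage 2 here assumes a PageRank over the host graph with cross-host link counts as edge weights.

```python
from collections import defaultdict


def pagerank(out_weights, nodes, start=None, damping=0.85, tol=1e-10, max_iter=1000):
    """Weighted power iteration; returns (rank dict, iterations used)."""
    n = len(nodes)
    rank = dict(start) if start else {v: 1.0 / n for v in nodes}
    for it in range(1, max_iter + 1):
        # Nodes without out-links spread their rank uniformly (assumption).
        dangling = sum(rank[v] for v in nodes if not out_weights.get(v))
        new = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            outs = out_weights.get(v, {})
            total = sum(outs.values())
            for w, weight in outs.items():
                new[w] += damping * rank[v] * weight / total
        delta = sum(abs(new[v] - rank[v]) for v in nodes)
        rank = new
        if delta < tol:
            return rank, it
    return rank, max_iter


def block_pagerank(edges, host_of):
    """edges: list of (src, dst) pages; host_of: {page: host} covering every page."""
    pages = sorted(host_of)
    full = defaultdict(lambda: defaultdict(float))
    for s, d in edges:
        full[s][d] += 1.0
    # Stage 1: local PageRank per host, using only links inside that host.
    by_host = defaultdict(list)
    for p in pages:
        by_host[host_of[p]].append(p)
    local = {}
    for host, members in by_host.items():
        inner = {p: {q: w for q, w in full[p].items() if host_of[q] == host}
                 for p in members}
        ranks, _ = pagerank(inner, members)
        local.update(ranks)
    # Stage 2: host importance (assumption: PageRank on the host graph,
    # weighted by the number of cross-host links).
    host_graph = defaultdict(lambda: defaultdict(float))
    for s, d in edges:
        if host_of[s] != host_of[d]:
            host_graph[host_of[s]][host_of[d]] += 1.0
    host_rank, _ = pagerank(host_graph, sorted(by_host))
    # Stage 3: standard PageRank started from the weighted concatenation.
    start = {p: local[p] * host_rank[host_of[p]] for p in pages}
    return pagerank(full, pages, start=start)


# Hypothetical two-host example.
demo_edges = [("a1", "a2"), ("a2", "a1"), ("a2", "a3"), ("a3", "a1"),
              ("b1", "b2"), ("b2", "b1"), ("a3", "b1"), ("b2", "a1")]
demo_hosts = {"a1": "A", "a2": "A", "a3": "A", "b1": "B", "b2": "B"}
demo_rank, demo_iters = block_pagerank(demo_edges, demo_hosts)
```

Stage 3 converges to the same vector as a uniform start, since only the starting point changes; any saving comes from starting closer to the fixed point, and this sketch makes no claim about how large that saving is.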
Statistical properties of community structure in large social and information networks
Cited by 120 (10 self)
A large body of work has been devoted to identifying community structure in networks. A community is often thought of as a set of nodes that has more connections between its members than to the remainder of the network. In this paper, we characterize as a function of size the statistical and structural properties of such sets of nodes. We define the network community profile plot, which characterizes the “best” possible community (according to the conductance measure) over a wide range of size scales, and we study over 70 large sparse real-world networks taken from a wide range of application domains. Our results suggest a significantly more refined picture of community structure in large real-world networks than has been appreciated previously. Our most striking finding is that in nearly every network dataset we examined, we observe tight but almost trivial communities at very small scales, and at larger size scales, the best possible communities gradually “blend in” with the rest of the network and thus become less “community-like.” This behavior is not explained, even at a qualitative level, by any of the commonly used network generation models. Moreover, this behavior is exactly the opposite of what one would expect based on experience with and intuition from expander graphs, from graphs that are well-embeddable in a low-dimensional structure, and from small social networks that have served as testbeds of community detection algorithms. We have found, however, that a generative model, in which new edges are added via an iterative “forest fire” burning process, is able to produce graphs exhibiting a network community structure similar to our observations.
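The conductance measure used to score communities can be computed directly for a given node set. The sketch below assumes the standard definition (edges leaving the set divided by the smaller of the two sides' total degree) on an undirected, unweighted graph; the function name `conductance` and the example graph `two_triangles` are hypothetical.

```python
def conductance(adj, community):
    """Conductance of a node set in an undirected graph {node: set of neighbours}."""
    inside = set(community)
    # Edges with exactly one endpoint inside the set.
    cut = sum(1 for u in inside for v in adj[u] if v not in inside)
    # Volume = sum of degrees on each side of the cut.
    vol_in = sum(len(adj[u]) for u in inside)
    vol_out = sum(len(adj[u]) for u in adj if u not in inside)
    denom = min(vol_in, vol_out)
    if denom == 0:
        raise ValueError("conductance is undefined when one side has no edges")
    return cut / denom


# Two triangles {0, 1, 2} and {3, 4, 5} joined by the single edge 2-3.
two_triangles = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
print(conductance(two_triangles, {0, 1, 2}))  # 1 cut edge / volume 7
```

Lower values indicate a more community-like set: one triangle scores 1/7, while a single node such as `{0}` scores 1.0 because all of its edges leave the set.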
Graph evolution: Densification and shrinking diameters
ACM TKDD, 2007
Cited by 117 (13 self)
How do real graphs evolve over time? What are “normal” growth patterns in social, technological, and information networks? Many studies have discovered patterns in static graphs, identifying properties in a single snapshot of a large network, or in a very small number of snapshots; these include heavy tails for in- and out-degree distributions, communities, small-world phenomena, and others. However, given the lack of information about network evolution over long periods, it has been hard to convert these findings into statements about trends over time. Here we study a wide range of real graphs, and we observe some surprising phenomena. First, most of these graphs densify over time, with the number of edges growing superlinearly in the number of nodes. Second, the average distance between nodes often shrinks over time, in contrast to the conventional wisdom that such distance parameters should increase slowly as a function of the number of nodes (like O(log n) or O(log(log n))). Existing graph generation models do not exhibit these types of behavior, even at a qualitative level. We provide a new graph generator, based on a “forest fire” spreading process, that has a simple, intuitive justification, requires very few parameters (like the “flammability” of nodes), and produces graphs exhibiting the full range of properties observed both in prior work and in the present study. We also notice that the “forest fire” model exhibits a sharp transition between sparse graphs and graphs that are densifying. Graphs with decreasing distance between the nodes are generated around this transition point. Last, we analyze the connection between the temporal evolution of the degree distribution and densification of a graph. We find that the two are fundamentally related. We also observe that real networks exhibit this type of r
ANF: A Fast and Scalable Tool for Data Mining in Massive Graphs
International Conference on Knowledge Discovery and Data Mining, 2002
Cited by 93 (19 self)
Graphs are an increasingly important data source, with such important graphs as the Internet and the Web. Other familiar graphs include CAD circuits, phone records, gene sequences, city streets, social networks and academic citations. Any kind of relationship, such as actors appearing in movies, can be represented as a graph. This work presents a data mining tool, called ANF, that can quickly answer a number of interesting questions on graph-represented data, such as the following. How robust is the Internet to failures? What are the most influential database papers? Are there gender differences in movie appearance patterns? At its core, ANF is based on a fast and memory-efficient approach for approximating the complete "neighbourhood function" for a graph. For the Internet graph (268K nodes), ANF's highly accurate approximation is more than 700 times faster than the exact computation. This reduces the running time from nearly a day to a matter of a minute or two, allowing users to perform ad hoc drill-down tasks and to repeatedly answer questions about changing data sources. To enable this drill-down, ANF employs new techniques for approximating neighbourhood-type functions for graphs with distinguished nodes and/or edges. When compared to the best existing approximation, ANF's approach is both faster and more accurate, given the same resources. Additionally, unlike previous approaches, ANF scales gracefully to handle disk-resident graphs. Finally, we present some of our results from mining large graphs using ANF.
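For orientation, the exact computation that ANF is compared against can be sketched with one breadth-first search per node. This is not ANF's approximation scheme, only the quantity it approximates, and it assumes the convention that N(h) counts ordered pairs (u, v) with v reachable from u in at most h hops, including u = v. The name `neighbourhood_function` and the path graph are hypothetical.

```python
from collections import deque


def neighbourhood_function(adj, max_h):
    """Return [N(0), ..., N(max_h)] for a graph {node: list of neighbours}.

    N(h) counts ordered pairs (u, v) with v within h hops of u (u == v included).
    """
    counts = [0] * (max_h + 1)
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            if dist[u] == max_h:
                continue  # do not explore beyond the hop limit
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        for d in dist.values():
            counts[d] += 1
    # Turn per-distance counts into cumulative counts.
    for h in range(1, max_h + 1):
        counts[h] += counts[h - 1]
    return counts


# Undirected path 0 - 1 - 2 - 3.
path_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(neighbourhood_function(path_graph, 3))  # [4, 10, 14, 16]
```

Running one search per node costs time proportional to nodes times edges, which is why an approximation becomes attractive on graphs with hundreds of thousands of nodes.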
Random Evolution in Massive Graphs
2001
Cited by 89 (7 self)
Many massive graphs (such as WWW graphs and Call graphs) share certain universal characteristics which can be described by the so-called "power law". In this paper, we will first briefly survey the history and previous work on power law graphs. Then we will give four evolution models for generating power law graphs by adding one node/edge at a time. We will show that for any given edge density and desired distributions for in-degrees and out-degrees (not necessarily the same, but adhering to certain general conditions), the resulting graph will almost surely satisfy the power law and the in/out-degree conditions. We will show that our most general directed and undirected models include nearly all known models as special cases. In addition, we consider another crucial aspect of massive graphs, called "scale-free" in the sense that the frequency of sampling (w.r.t. the growth rate) is independent of the parameter of the resulting power law graphs. We will show that our evolution models generate scale-free power law graphs.
Parallel crawlers
In Proceedings of the 11th International Conference on World Wide Web, 2002
Cited by 86 (3 self)
In this paper we study how we can design an effective parallel crawler. As the size of the Web grows, it becomes imperative to parallelize a crawling process, in order to finish downloading pages in a reasonable amount of time. We first propose multiple architectures for a parallel crawler and identify fundamental issues related to parallel crawling. Based on this understanding, we then propose metrics to evaluate a parallel crawler, and compare the proposed architectures using 40 million pages collected from the Web. Our results clarify the relative merits of each architecture and provide a good guideline on when to adopt which architecture.