A simple conceptual model for the Internet topology. Global Telecommunications Conference (2001)

by S L Tauro, C Palmer, G Siganos, M Faloutsos

Results 1 - 10 of 46

Measurement and Analysis of Online Social Networks

by Alan Mislove, Massimiliano Marcon, Krishna P. Gummadi, Peter Druschel, Bobby Bhattacharjee - In Proceedings of the 5th ACM/USENIX Internet Measurement Conference (IMC’07), 2007
Cited by 185 (12 self)
Online social networking sites like Orkut, YouTube, and Flickr are among the most popular sites on the Internet. Users of these sites form a social network, which provides a powerful means of sharing, organizing, and finding content and contacts. The popularity of these sites provides an opportunity to study the characteristics of online social network graphs at large scale. Understanding these graphs is important, both to improve current systems and to design new applications of online social networks. This paper presents a large-scale measurement study and analysis of the structure of multiple online social networks. We examine data gathered from four popular online social networks: Flickr, YouTube, LiveJournal, and Orkut. We crawled the publicly accessible user links on each site, obtaining a large portion of each social network’s graph. Our data set contains over 11.3 million users and 328 million links. We believe that this is the first study to examine multiple online social networks at scale. Our results confirm the power-law, small-world, and scale-free properties of online social networks. We observe that the indegree of user nodes tends to match the outdegree; that the networks contain a densely connected core of high-degree nodes; and that this core links small groups of strongly clustered, low-degree nodes at the fringes of the network. Finally, we discuss the implications of these structural properties for the design of social network based systems.

R-MAT: A recursive model for graph mining

by Deepayan Chakrabarti, Yiping Zhan, Christos Faloutsos - In Fourth SIAM International Conference on Data Mining (SDM’04), 2004
Cited by 90 (13 self)
How does a ‘normal’ computer (or social) network look like? How can we spot ‘abnormal’ sub-networks in the Internet, or web graph? The answer to such questions is vital for outlier detection (terrorist networks, or illegal money-laundering rings), forecasting, and simulations (“how will a computer virus spread?”). The heart of the problem is finding the properties of real graphs that seem to persist over multiple disciplines. We list such “laws” and, more importantly, we propose a simple, parsimonious model, the “recursive matrix” (R-MAT) model, which can quickly generate realistic graphs, capturing the essence of each graph in only a few parameters. Contrary to existing generators, our model can trivially generate weighted, directed and bipartite graphs; it subsumes the celebrated Erdős-Rényi model as a special case; it can match the power law behaviors, as well as the deviations from them (like the “winner does not take it all” model of Pennock et al. [21]). We present results on multiple, large real graphs, where we show that our parameter fitting algorithm (AutoMAT-fast) fits them very well.
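
The recursive-matrix idea in this abstract can be illustrated with a minimal sketch. It assumes the usual formulation in which the adjacency matrix is split into four quadrants and each edge descends into one quadrant per level; the function name `rmat_edges`, the parameter names `a`, `b`, `c`, `d`, and the default values are illustrative assumptions, not taken from this listing. Duplicate edges are kept, and the parameter-fitting step (AutoMAT-fast) is not shown.

```python
import random


def rmat_edges(scale, n_edges, a=0.45, b=0.15, c=0.15, d=0.25, seed=0):
    """Generate directed edges on 2**scale nodes by recursive quadrant choice.

    a, b, c, d are the (assumed) probabilities of descending into the
    top-left, top-right, bottom-left and bottom-right quadrant.
    """
    if abs(a + b + c + d - 1.0) > 1e-9:
        raise ValueError("quadrant probabilities must sum to 1")
    rng = random.Random(seed)
    edges = []
    for _ in range(n_edges):
        row = col = 0
        for _ in range(scale):
            r = rng.random()
            # Choose one quadrant of the current sub-matrix.
            if r < a:
                bit_row, bit_col = 0, 0
            elif r < a + b:
                bit_row, bit_col = 0, 1
            elif r < a + b + c:
                bit_row, bit_col = 1, 0
            else:
                bit_row, bit_col = 1, 1
            # Each level fixes one more bit of the source and target index.
            row = (row << 1) | bit_row
            col = (col << 1) | bit_col
        edges.append((row, col))
    return edges
```

Calling `rmat_edges(4, 100)` yields 100 (source, target) pairs over 16 nodes. Setting all four probabilities to 0.25 makes every cell equally likely, which is one way to read the abstract's remark that the model subsumes uniform random graphs.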

Power-Laws and the AS-level Internet Topology

by Georgos Siganos, Michalis Faloutsos, Petros Faloutsos, Christos Faloutsos - IEEE/ACM Transactions on Networking , 2003
Cited by 77 (8 self)
In this paper, we study and characterize the topology of the Internet at the Autonomous System level. First, we show that the topology can be described efficiently with power-laws. The elegance and simplicity of the power-laws provide a novel perspective into the seemingly uncontrolled Internet structure. Second, we show that power-laws appear consistently over the last 5 years. We also observe that the power-laws hold even in the most recent and more complete topology [10] with correlation coefficient above 99% for the degree power-law. In addition, we study the evolution of the power-law exponents over the 5 year interval and observe a variation for the degree based power-law of less than 10%. Third, we provide relationships between the exponents and other topological metrics.
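
A rough sketch of how a power-law exponent and its correlation coefficient might be estimated from a degree sequence follows. It fits a least-squares line to the raw degree frequencies on log-log axes; this is an assumption made for illustration, and the paper's own definition of the degree power-law may differ (for instance, it may use a cumulative distribution). The function name `loglog_fit` is hypothetical.

```python
import math
from collections import Counter


def loglog_fit(degrees):
    """Fit log(frequency) against log(degree); return (slope, correlation)."""
    freq = Counter(d for d in degrees if d > 0)
    xs = [math.log(d) for d in freq]
    ys = [math.log(f) for f in freq.values()]
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two distinct positive degrees")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("degenerate data: no variation to fit")
    slope = sxy / sxx                    # estimated power-law exponent
    corr = sxy / math.sqrt(sxx * syy)    # Pearson correlation in log-log space
    return slope, corr
```

A correlation with absolute value close to 1 indicates that the points lie near a straight line on log-log axes, which is the sense in which the abstract reports a coefficient above 99%.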

ANF: A Fast and Scalable Tool for Data Mining in Massive Graphs

by Christopher R. Palmer, Phillip B. Gibbons, Christos Faloutsos - International Conference on Knowledge Discovery and Data Mining, 2002
Cited by 73 (15 self)
Graphs are an increasingly important data source, with such important graphs as the Internet and the Web. Other familiar graphs include CAD circuits, phone records, gene sequences, city streets, social networks and academic citations. Any kind of relationship, such as actors appearing in movies, can be represented as a graph. This work presents a data mining tool, called ANF, that can quickly answer a number of interesting questions on graph-represented data, such as the following. How robust is the Internet to failures? What are the most influential database papers? Are there gender differences in movie appearance patterns? At its core, ANF is based on a fast and memory-efficient approach for approximating the complete "neighbourhood function" for a graph. For the Internet graph (268K nodes), ANF's highly-accurate approximation is more than 700 times faster than the exact computation. This reduces the running time from nearly a day to a matter of a minute or two, allowing users to perform ad hoc drill-down tasks and to repeatedly answer questions about changing data sources. To enable this drill-down, ANF employs new techniques for approximating neighbourhood-type functions for graphs with distinguished nodes and/or edges. When compared to the best existing approximation, ANF's approach is both faster and more accurate, given the same resources. Additionally, unlike previous approaches, ANF scales gracefully to handle disk resident graphs. Finally, we present some of our results from mining large graphs using ANF.
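
To make the "neighbourhood function" concrete, here is a sketch of the exact computation by breadth-first search, which is the expensive baseline that an approximation such as ANF is meant to replace. It is not ANF's algorithm. The convention of counting ordered pairs, including each node paired with itself, is an assumption, and the function name `neighbourhood_function` is hypothetical.

```python
from collections import deque


def neighbourhood_function(adj, max_h):
    """Return [N(0), ..., N(max_h)], where N(h) counts ordered pairs (u, v)
    with dist(u, v) <= h. `adj` maps each node to an iterable of neighbours."""
    per_distance = [0] * (max_h + 1)
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            if dist[u] == max_h:
                continue  # do not expand beyond the requested radius
            for v in adj.get(u, ()):
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        for d in dist.values():
            per_distance[d] += 1
    # Turn per-distance counts into cumulative counts.
    result, total = [], 0
    for count in per_distance:
        total += count
        result.append(total)
    return result
```

For the three-node path `{0: [1], 1: [0, 2], 2: [1]}` and `max_h=2` this returns `[3, 7, 9]`. One search per node is what makes the exact computation costly on large graphs.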

Statistical properties of community structure in large social and information networks

by Kevin J. Lang, Anirban Dasgupta, Michael W. Mahoney
Cited by 65 (6 self)
A large body of work has been devoted to identifying community structure in networks. A community is often thought of as a set of nodes that has more connections between its members than to the remainder of the network. In this paper, we characterize as a function of size the statistical and structural properties of such sets of nodes. We define the network community profile plot, which characterizes the “best” possible community, according to the conductance measure, over a wide range of size scales, and we study over 70 large sparse real-world networks taken from a wide range of application domains. Our results suggest a significantly more refined picture of community structure in large real-world networks than has been appreciated previously. Our most striking finding is that in nearly every network dataset we examined, we observe tight but almost trivial communities at very small scales, and at larger size scales, the best possible communities gradually “blend in” with the rest of the network and thus become less “community-like.” This behavior is not explained, even at a qualitative level, by any of the commonly-used network generation models. Moreover, this behavior is exactly the opposite of what one would expect based on experience with and intuition from expander graphs, from graphs that are well-embeddable in a low-dimensional structure, and from small social networks that have served as testbeds of community detection algorithms. We have found, however, that a generative model, in which new edges are added via an iterative “forest fire” burning process, is able to produce graphs exhibiting a network community structure similar to our observations.
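
The conductance measure mentioned in this abstract can be sketched as follows, assuming the standard definition: the number of edges leaving a node set divided by the smaller of the two total degrees (volumes) on either side of the cut. The function name `conductance` and the adjacency-dictionary input format are assumptions for illustration; the graph is taken to be undirected with every edge listed in both directions.

```python
def conductance(adj, subset):
    """Conductance of `subset` in an undirected graph.

    `adj` maps each node to its neighbours, with each edge present in
    both directions. Lower values indicate a more community-like set.
    """
    inside = set(subset)
    cut = 0          # edge endpoints in `inside` whose other end is outside
    vol_inside = 0   # sum of degrees of nodes in `inside`
    vol_outside = 0  # sum of degrees of all other nodes
    for u, neighbours in adj.items():
        for v in neighbours:
            if u in inside:
                vol_inside += 1
                if v not in inside:
                    cut += 1
            else:
                vol_outside += 1
    smaller = min(vol_inside, vol_outside)
    if smaller == 0:
        raise ValueError("conductance is undefined when one side has no edges")
    return cut / smaller
```

For two triangles joined by a single edge, taking one triangle as the subset gives a conductance of 1/7: one edge crosses the cut and each side has volume 7.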

Graph evolution: Densification and shrinking diameters

by Jure Leskovec, Jon Kleinberg, Christos Faloutsos - ACM TKDD , 2007
Cited by 63 (9 self)
How do real graphs evolve over time? What are “normal” growth patterns in social, technological, and information networks? Many studies have discovered patterns in static graphs, identifying properties in a single snapshot of a large network, or in a very small number of snapshots; these include heavy tails for in- and out-degree distributions, communities, small-world phenomena, and others. However, given the lack of information about network evolution over long periods, it has been hard to convert these findings into statements about trends over time. Here we study a wide range of real graphs, and we observe some surprising phenomena. First, most of these graphs densify over time, with the number of edges growing super-linearly in the number of nodes. Second, the average distance between nodes often shrinks over time, in contrast to the conventional wisdom that such distance parameters should increase slowly as a function of the number of nodes (like O(log n) or O(log(log n))). Existing graph generation models do not exhibit these types of behavior, even at a qualitative level. We provide a new graph generator, based on a “forest fire” spreading process, that has a simple, intuitive justification, requires very few parameters (like the “flammability” of nodes), and produces graphs exhibiting the full range of properties observed both in prior work and in the present study. We also notice that the “forest fire” model exhibits a sharp transition between sparse graphs and graphs that are densifying. Graphs with decreasing distance between the nodes are generated around this transition point. Last, we analyze the connection between the temporal evolution of the degree distribution and densification of a graph. We find that the two are fundamentally related. We also observe that real networks exhibit this type of r

The Internet AS-Level Topology: Three Data Sources and One Definitive Metric

by Priya Mahadevan, Dmitri Krioukov, Marina Fomenkov, Bradley Huffaker, Xenofontas Dimitropoulos, kc claffy, Amin Vahdat
Cited by 54 (11 self)
We calculate an extensive set of characteristics for Internet AS topologies extracted from the three data sources most frequently used by the research community: traceroutes, BGP, and WHOIS. We discover that traceroute and BGP topologies are similar to one another but differ substantially from the WHOIS topology. Among the widely considered metrics, we find that the joint degree distribution appears to fundamentally characterize Internet AS topologies as well as narrowly define values for other important metrics. We discuss the interplay between the specifics of the three data collection mechanisms and the resulting topology views. In particular, we show how the data collection peculiarities explain differences in the resulting joint degree distributions of the respective topologies. Finally, we release to the community the input topology datasets, along with the scripts and output of our calculations. This supplement should enable researchers to validate their models against real data and to make more informed selection of topology data sources for their specific needs.
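
As a small illustration of the joint degree distribution discussed in this abstract, the sketch below counts, for each pair of degrees (k1, k2), how many edges connect a node of degree k1 to a node of degree k2. It assumes a simple undirected graph given as a list of edges with each edge listed once; the function name `joint_degree_counts` is hypothetical, and normalising the counts into a probability distribution is left out.

```python
from collections import Counter


def joint_degree_counts(edges):
    """Map each unordered degree pair (k1, k2), k1 <= k2, to its edge count."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    counts = Counter()
    for u, v in edges:
        # Sort so that (k1, k2) and (k2, k1) land in the same bucket.
        k1, k2 = sorted((degree[u], degree[v]))
        counts[(k1, k2)] += 1
    return counts
```

For a star with one hub and three leaves, `joint_degree_counts([(0, 1), (0, 2), (0, 3)])` contains the single entry `(1, 3)` with count 3, since every edge joins a degree-1 node to the degree-3 hub.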

Graph mining: Laws, generators, and algorithms

by Deepayan Chakrabarti, Christos Faloutsos - ACM COMPUTING SURVEYS , 2006
Cited by 49 (7 self)
How does the Web look? How could we tell an abnormal social network from a normal one? These and similar questions are important in many fields where the data can intuitively be cast as a graph; examples range from computer networks to sociology to biology and many more. Indeed, any M : N relation in database terminology can be represented as a graph. A lot of these questions boil down to the following: "How can we generate synthetic but realistic graphs?" To answer this, we must first understand what patterns are common in real-world graphs and can thus be considered a mark of normality/realism. This survey gives an overview of the incredible variety of work that has been done on these problems. One of our main contributions is the integration of points of view from physics, mathematics, sociology, and computer science. Further, we briefly describe recent advances on some related and interesting graph problems.

Planetary-Scale Views on a Large Instant-Messaging Network

by Jure Leskovec, Eric Horvitz
Cited by 43 (3 self)
We present a study of anonymized data capturing a month of high-level communication activities within the whole of the Microsoft Messenger instant-messaging system. We examine characteristics and patterns that emerge from the collective dynamics of large numbers of people, rather than the actions and characteristics of individuals. The dataset contains summary properties of 30 billion conversations among 240 million people. From the data, we construct a communication graph with 180 million nodes and 1.3 billion undirected edges, creating the largest social network constructed and analyzed to date. We report on multiple aspects of the dataset and synthesized graph. We find that the graph is well-connected and robust to node removal. We investigate on a planetary scale the oft-cited report that people are separated by “six degrees of separation” and find that the average path length among Messenger users is 6.6. We also find that people tend to communicate more with each other when they have similar age, language, and location, and that cross-gender conversations are both more frequent and of longer duration than conversations with the same gender.

On the Curvature of the Internet and its usage for Overlay Construction and Distance Estimation

by Yuval Shavitt, Tomer Tankel, 2004
Cited by 42 (1 self)
It was noted in recent years that the Internet structure resembles a star with a highly connected core and long stretched tendrils. In this work we present a new quantity, the Internet geometric curvature, that captures the above observation by a single number. We embed the Internet distance metric in a hyperbolic space with an optimal curvature and achieve an accuracy better than achieved before for the Euclidean space. This proves our hypothesis regarding the Internet curvature. We demonstrate the strength of our embedding with two applications: selecting the closest server and building an application level multicast tree.