R-MAT: A recursive model for graph mining
In Fourth SIAM International Conference on Data Mining (SDM ’04), 2004
Cited by 225 (19 self)
What does a ‘normal’ computer (or social) network look like? How can we spot ‘abnormal’ subnetworks in the Internet, or web graph? The answer to such questions is vital for outlier detection (terrorist networks, or illegal money-laundering rings), forecasting, and simulations (“how will a computer virus spread?”). The heart of the problem is finding the properties of real graphs that seem to persist over multiple disciplines. We list such “laws” and, more importantly, we propose a simple, parsimonious model, the “recursive matrix” (R-MAT) model, which can quickly generate realistic graphs, capturing the essence of each graph in only a few parameters. Contrary to existing generators, our model can trivially generate weighted, directed and bipartite graphs; it subsumes the celebrated Erdős-Rényi model as a special case; it can match the power law behaviors, as well as the deviations from them (like the “winner does not take it all” model of Pennock et al. [21]). We present results on multiple, large real graphs, where we show that our parameter fitting algorithm (AutoMAT-fast) fits them very well.
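The abstract above does not spell out the generation procedure. As the recursive matrix idea is usually described, each edge is placed by repeatedly choosing one of the four quadrants of the adjacency matrix with probabilities a, b, c and d until a single cell is reached. The following is a minimal sketch under that assumption; the function name `rmat_edges`, the `scale` parameter and the default probabilities are illustrative choices, not values taken from the paper.

```python
import random


def rmat_edges(scale, n_edges, a=0.45, b=0.15, c=0.15, d=0.25, seed=0):
    """Sketch of recursive-matrix edge placement on 2**scale nodes.

    a, b, c, d are the probabilities of descending into the top-left,
    top-right, bottom-left and bottom-right quadrant (illustrative defaults).
    Duplicate edges are possible and are not removed here.
    """
    if abs(a + b + c + d - 1.0) > 1e-9:
        raise ValueError("quadrant probabilities must sum to 1")
    rng = random.Random(seed)
    edges = []
    for _ in range(n_edges):
        row = col = 0
        for _ in range(scale):
            r = rng.random()
            row <<= 1
            col <<= 1
            if r < a:
                pass              # top-left quadrant: both new bits stay 0
            elif r < a + b:
                col |= 1          # top-right quadrant
            elif r < a + b + c:
                row |= 1          # bottom-left quadrant
            else:
                row |= 1          # bottom-right quadrant
                col |= 1
        edges.append((row, col))
    return edges
```

With a = b = c = d = 0.25 every cell is equally likely, which is how the uniform random-graph case arises in this sketch; skewed values concentrate edges in a few rows and columns.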
Random Walks in Peer-to-Peer Networks
2004
Cited by 211 (2 self)
We quantify the effectiveness of random walks for searching and construction of unstructured peer-to-peer (P2P) networks. For searching, we argue that random walks achieve improvement over flooding in the case of clustered overlay topologies and in the case of re-issuing the same request several times. For construction, we argue that an expander can be maintained dynamically with constant operations per addition. The key technical ingredient of our approach is a deep result of stochastic processes indicating that samples taken from consecutive steps of a random walk can achieve statistical properties similar to independent sampling (if the second eigenvalue of the transition matrix is bounded away from 1, which translates to good expansion of the network; such connectivity is desired, and believed to hold, in every reasonable network and network model). This property has been previously used in complexity theory for construction of pseudorandom number generators. We reveal another facet of this theory and translate savings in random bits to savings in processing overhead.
The Internet AS-Level Topology: Three Data Sources and One Definitive Metric
Cited by 100 (15 self)
We calculate an extensive set of characteristics for Internet AS topologies extracted from the three data sources most frequently used by the research community: traceroutes, BGP, and WHOIS. We discover that traceroute and BGP topologies are similar to one another but differ substantially from the WHOIS topology. Among the widely considered metrics, we find that the joint degree distribution appears to fundamentally characterize Internet AS topologies as well as narrowly define values for other important metrics. We discuss the interplay between the specifics of the three data collection mechanisms and the resulting topology views. In particular, we show how the data collection peculiarities explain differences in the resulting joint degree distributions of the respective topologies. Finally, we release to the community the input topology datasets, along with the scripts and output of our calculations. This supplement should enable researchers to validate their models against real data and to make more informed selection of topology data sources for their specific needs.
On Certain Connectivity Properties of the Internet Topology
In Proc. 35th ACM Symp. on Theory of Computing, 2003
Cited by 77 (3 self)
We show that random graphs in the preferential connectivity model have constant conductance, and hence have worst-case routing congestion that scales logarithmically with the number of nodes. Another immediate implication is constant spectral gap between the first and second eigenvalues of the random walk matrix associated with these graphs. We also show that the expected frugality (overpayment in the Vickrey-Clarke-Groves mechanism for shortest paths) of a random graph is bounded by a small constant.
Conductance and Congestion in Power Law Graphs
2003
Cited by 64 (4 self)
It has been observed that the degrees of the topologies of several communication networks follow heavy-tailed statistics. What is the impact of such heavy-tailed statistics on the performance of basic communication tasks that a network is presumed to support? How does performance scale with the size of the network? We study routing in families of sparse random graphs whose degrees follow heavy-tailed distributions. Instantiations of such random graphs have been proposed as models for the topology of the Internet at the level of Autonomous Systems as well as at the level of routers. Let n be the number of nodes. Suppose that for each pair of nodes with degrees d_u and d_v we have O(d_u d_v) units of demand. Thus the total demand is O(n^2). We argue analytically and experimentally that in the considered random graph model such demand patterns can be routed so that the flow through each link is at most O . This is to be compared with a bound # that holds for arbitrary graphs. Similar results were previously known for sparse random regular graphs, a.k.a. "expander graphs." The significance is that Internet-like topologies, which grow in a dynamic, decentralized fashion and appear highly inhomogeneous, can support routing with performance characteristics comparable to those of their regular counterparts, at least under the assumption of uniform demand and capacities. Our proof uses approximation algorithms for multicommodity flow and establishes strong bounds of a generalization of "expansion," namely "conductance." Besides routing, our bounds on conductance have further implications, most notably on the gap between first and second eigenvalues of the stochastic normalization of the adjacency matrix of the graph.
Network Topologies: Inference, Modelling and Generation
IEEE Communications Surveys & Tutorials
Cited by 37 (12 self)
Accurate measurement, inference and modelling techniques are fundamental to Internet topology research. Spatial analysis of the Internet is needed to develop network planning, optimal routing algorithms and failure detection measures. A first step towards achieving such goals is the availability of network topologies at different levels of granularity, facilitating realistic simulations of new Internet systems. The main objective of this survey is to familiarize the reader with research on network topology over the past decade. We study techniques for inference, modelling and generation of the Internet topology at both router and administrative level. We also compare the mathematical models assigned to various topologies and the generation tools based on them. We conclude with a look at emerging areas of research and potential future research directions.
k-core decomposition of Internet graphs: hierarchies, self-similarity and measurement biases
Networks and Heterogeneous Media, 2008
Cited by 28 (1 self)
We consider the k-core decomposition of network models and Internet graphs at the autonomous system (AS) level. The k-core analysis allows one to characterize networks beyond the degree distribution and uncover structural properties and hierarchies due to the specific architecture of the system. We compare the k-core structure obtained for AS graphs with those of several network models and discuss the differences and similarities with the real Internet architecture. The presence of biases and the incompleteness of the real maps are discussed and their effect on the k-core analysis is assessed with numerical experiments simulating biased exploration on a wide range of network models. We find that the k-core analysis provides an interesting characterization of the fluctuations and incompleteness of maps as well as information helping to discriminate the original underlying structure.
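For readers unfamiliar with the decomposition named in this abstract: the k-core of a graph is the largest subgraph in which every node has at least k neighbors, and a node's core number is the largest k for which it belongs to the k-core. The sketch below computes core numbers by repeatedly peeling off a minimum-degree node. It is a simple quadratic-time illustration, not the implementation used in the paper, and the function name `core_numbers` and the adjacency-dictionary input format are assumptions made here.

```python
def core_numbers(adj):
    """Return {node: core number} for an undirected graph.

    adj maps each node to the set of its neighbors; it is assumed to be
    symmetric, with every node present as a key and no self-loops.
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Peel a node of minimum remaining degree.
        v = min(remaining, key=degree.__getitem__)
        # The core number never decreases along the peeling order.
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for w in adj[v]:
            if w in remaining:
                degree[w] -= 1
    return core
```

For a triangle with one pendant node attached, the pendant node gets core number 1 and the three triangle nodes get core number 2.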
Spectral Analysis of Signed Graphs for Clustering, Prediction and Visualization
Cited by 26 (0 self)
We study the application of spectral clustering, prediction and visualization methods to graphs with negatively weighted edges. We show that several characteristic matrices of graphs can be extended to graphs with positively and negatively weighted edges, giving signed spectral clustering methods, signed graph kernels and network visualization methods that apply to signed graphs. In particular, we review a signed variant of the graph Laplacian. We derive our results by considering random walks, graph clustering, graph drawing and electrical networks, showing that they all result in the same formalism for handling negatively weighted edges. We illustrate our methods using examples from social networks with negative edges and bipartite rating graphs.
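The abstract does not give the definition, but the signed Laplacian is commonly written as D − A, where A is the signed adjacency matrix and D is the diagonal matrix of absolute degrees (row sums of |A|). A minimal pure-Python sketch under that assumption follows; the function names are illustrative, and the input is assumed to be a symmetric matrix with a zero diagonal.

```python
def signed_laplacian(A):
    """Signed Laplacian D - A, with D[i][i] = sum_j |A[i][j]|.

    A is a symmetric list-of-lists adjacency matrix with zero diagonal;
    entries may be negative.
    """
    n = len(A)
    L = [[-A[i][j] for j in range(n)] for i in range(n)]
    for i in range(n):
        L[i][i] = sum(abs(x) for x in A[i])  # absolute degree on the diagonal
    return L


def quadratic_form(L, x):
    """Compute x^T L x."""
    n = len(L)
    return sum(x[i] * L[i][j] * x[j] for i in range(n) for j in range(n))
```

Using absolute degrees keeps the quadratic form non-negative: x^T L x equals the sum over edges of |a_ij| (x_i − sign(a_ij) x_j)^2, so a negative edge is penalized when its endpoints receive the same value rather than opposite ones.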
Complex network measurements: Estimating the relevance of observed properties
In INFOCOM 2008. 27th IEEE International Conference on Computer Communications, Joint Conference of the IEEE Computer and Communications Societies, 2008
Cited by 23 (2 self)
Complex networks, modeled as large graphs, have received much attention in recent years. However, data on such networks is only available through intricate measurement procedures. Until recently, most studies assumed that these procedures eventually lead to samples large enough to be representative of the whole, at least concerning some key properties. This has crucial impact on network modeling and simulation, which rely on these properties. Recent contributions proved that this approach may be misleading, but no solution has been proposed. We provide here the first practical way to distinguish between cases where it is indeed misleading, and cases where the observed properties may be trusted. It consists of studying how the properties of interest evolve when the sample grows, and in particular whether they reach a steady state or not.
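The idea of tracking a property as the sample grows can be illustrated with a small sketch. It records the average degree after every `step` observed edges and then applies a crude stability check to the last few values. The choice of average degree as the property, the function names, and the window and tolerance are all assumptions made for illustration; the paper's own criterion is not stated in the abstract.

```python
def average_degree_curve(edges, step):
    """Average degree of the graph seen so far, sampled every `step` edges.

    edges is a sequence of (u, v) pairs in measurement order; the graph is
    treated as undirected, and self-loops are assumed absent.
    """
    seen_nodes = set()
    seen_edges = set()
    curve = []
    for i, (u, v) in enumerate(edges, start=1):
        seen_nodes.update((u, v))
        seen_edges.add((min(u, v), max(u, v)))  # ignore direction and repeats
        if i % step == 0:
            curve.append(2 * len(seen_edges) / len(seen_nodes))
    return curve


def looks_steady(curve, window=3, rel_tol=0.05):
    """Heuristic: the last `window` values vary by at most rel_tol of the last one."""
    if len(curve) < window:
        return False
    tail = curve[-window:]
    return max(tail) - min(tail) <= rel_tol * max(abs(tail[-1]), 1e-12)
```

A curve that is still drifting at the largest available sample size suggests the observed value reflects the measurement rather than the underlying network.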
Dynamic analysis of the Autonomous System graph
In IPS 2004, International Workshop on Inter-domain Performance and Simulation, 2004
Cited by 18 (4 self)
In this paper we investigate to what extent the information provided by BGP routing tables about the graph of the Autonomous Systems (ASes) can be used to understand dynamic phenomena occurring in the network. First, we classify the time scales at which such an analysis can be performed and, consequently, the kinds of phenomena that could be anticipated. Second, we improve cutting-edge technologies used to analyze the structure of the network, most notably spectral methods for graph clustering, in order to be able to analyze a whole sequence of consecutive snapshots that capture the temporal evolution of the network. Finally, we use such tools to analyze the data collected by the Oregon RouteViews project [20] during the last few years. We confirm stable properties of the AS graph, find major trends and notice that events occurring on a smaller timeframe, like worm attacks, misconfigurations, outages, DDoS attacks, etc. seem to have a very diverse degree of impact on the AS graph structure, which suggests that these techniques could be used to distinguish some of them.