## On Unbiased Sampling for Unstructured Peer-to-Peer Networks (2006)

Venue: Proc. ACM IMC

Citations: 49 (6 self)

### BibTeX

@INPROCEEDINGS{Stutzbach06onunbiased,

author = {Daniel Stutzbach and Reza Rejaie},

title = {On Unbiased Sampling for Unstructured Peer-to-Peer Networks},

booktitle = {in Proc. ACM IMC},

year = {2006},

pages = {27--40}

}

### Abstract

This paper addresses the difficult problem of selecting representative samples of peer properties (e.g., degree, link bandwidth, number of files shared) in unstructured peer-to-peer systems. Due to the large size and dynamic nature of these systems, measuring the quantities of interest on every peer is often prohibitively expensive, while sampling provides a natural means for estimating system-wide behavior efficiently. However, commonly used sampling techniques for measuring peer-to-peer systems tend to introduce considerable bias for two reasons. First, the dynamic nature of peers can bias results towards short-lived peers, much as naively sampling flows in a router can lead to bias towards short-lived flows. Second, the heterogeneous nature of the overlay topology can lead to bias towards high-degree peers. We present a detailed examination of the ways that the behavior of peer-to-peer systems can introduce bias and suggest the Metropolized Random Walk with Backtracking (MRWB) as a viable and promising technique for collecting nearly unbiased samples. We conduct an extensive simulation study to demonstrate that the proposed technique works well for a wide variety of common peer-to-peer network conditions. Using the Gnutella network, we empirically show that our implementation of the MRWB technique yields more accurate samples than commonly used sampling techniques. Furthermore, we provide insights into the causes of the observed differences. The tool we have developed, ion-sampler, selects peer addresses uniformly at random using the MRWB technique. These addresses may then be used as input to another measurement tool to collect data on a particular property.
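The degree bias mentioned in the abstract, and the way a Metropolis-Hastings acceptance rule can remove it, can be illustrated with a minimal sketch. It is not the ion-sampler implementation: it assumes a small static "wheel" overlay (one hub plus a ring of peers), it omits backtracking and peer churn entirely, and every name in it (`build_wheel`, `simple_step`, `metropolized_step`, `hub_fraction`) is hypothetical. The acceptance probability `min(1, deg(current) / deg(candidate))` is the standard Metropolis-Hastings choice for a uniform target over the nodes of an undirected graph.

```python
import random


def build_wheel(rim_size):
    """Toy overlay (an assumption): node 0 is a hub linked to every rim node,
    and the rim nodes 1..rim_size form a ring."""
    graph = {0: list(range(1, rim_size + 1))}
    for i in range(1, rim_size + 1):
        left = rim_size if i == 1 else i - 1
        right = 1 if i == rim_size else i + 1
        graph[i] = [0, left, right]
    return graph


def simple_step(graph, current, rng):
    """Plain random walk: always move to a uniformly chosen neighbor."""
    return rng.choice(graph[current])


def metropolized_step(graph, current, rng):
    """Metropolis-Hastings step targeting the uniform distribution over nodes.

    A uniformly chosen neighbor is accepted with probability
    min(1, deg(current) / deg(candidate)); otherwise the walk stays put.
    """
    candidate = rng.choice(graph[current])
    accept = min(1.0, len(graph[current]) / len(graph[candidate]))
    return candidate if rng.random() < accept else current


def hub_fraction(step, walks=5000, hops=30, rim_size=8, seed=1):
    """Fraction of independent walks, all started at the hub, that end at the hub."""
    graph = build_wheel(rim_size)
    rng = random.Random(seed)
    hits = 0
    for _walk in range(walks):
        node = 0
        for _hop in range(hops):
            node = step(graph, node, rng)
        hits += node == 0
    return hits / walks


if __name__ == "__main__":
    # With 8 rim nodes the hub has degree 8 and each rim node degree 3, so the
    # plain walk should end at the hub about 8/32 of the time, while a uniform
    # sampler over the 9 nodes should end there about 1/9 of the time.
    print("simple walk:      ", hub_fraction(simple_step))
    print("metropolized walk:", hub_fraction(metropolized_step))
```

In this toy setting the plain walk over-samples the high-degree hub, while the Metropolized walk ends at the hub at roughly the uniform rate. Handling peers that depart mid-walk, which the paper's backtracking addresses, would need additional logic not shown here.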

### Citations

2923 | A scalable content-addressable network
- Ratnasamy, Francis, et al.
- 2001
Citation Context ... where peers select neighbors through a predominantly random process. Most popular P2P systems in use today belong to this unstructured category. For structured P2P systems such as Chord [35] and CAN =-=[30]-=-, knowledge of the structure significantly facilitates unbiased sampling as we discuss in Section 7. The main contributions of this paper are (i) a detailed examination of the ways that the topologica... |

2493 | Equation of state calculations by fast computing machines
- Metropolis, Rosenbluth, et al.
- 1953
Citation Context ...and (ii) an in-depth exploration of the applicability of a sampling technique called the Metropolized Random Walk with Backtracking (MRWB), representing a variation of the Metropolis–Hastings method =-=[8,14,28]-=-. Our study indicates that MRWB results in nearly unbiased samples under a wide variety of commonly encountered peer-to-peer network conditions. The technique assumes that the P2P system provides some... |

1347 | Monte Carlo sampling methods using Markov chains and their applications
- Hastings
- 1970
Citation Context ...and (ii) an in-depth exploration of the applicability of a sampling technique called the Metropolized Random Walk with Backtracking (MRWB), representing a variation of the Metropolis–Hastings method =-=[8,14,28]-=-. Our study indicates that MRWB results in nearly unbiased samples under a wide variety of commonly encountered peer-to-peer network conditions. The technique assumes that the P2P system provides some... |

615 | Chord: a scalable peer-to-peer lookup protocol for internet applications
- Stoica, Morris, et al.
- 2003
Citation Context ... P2P systems, where peers select neighbors through a predominantly random process. Most popular P2P systems in use today belong to this unstructured category. For structured P2P systems such as Chord =-=[35]-=- and CAN [30], knowledge of the structure significantly facilitates unbiased sampling as we discuss in Section 7. The main contributions of this paper are (i) a detailed examination of the ways that t... |

475 | Understanding the Metropolis-Hastings algorithm
- Chib, Greenberg
- 1995
Citation Context ...and (ii) an in-depth exploration of the applicability of a sampling technique called the Metropolized Random Walk with Backtracking (MRWB), representing a variation of the Metropolis–Hastings method =-=[8,14,28]-=-. Our study indicates that MRWB results in nearly unbiased samples under a wide variety of commonly encountered peer-to-peer network conditions. The technique assumes that the P2P system provides some... |

424 | Measurement, modeling, and analysis of a peer-to-peer file-sharing workload
- Gummadi, Dunn, et al.
- 2003
Citation Context ...ion xm) that features many short sessions coupled with a few very long sessions. Some prior measurement studies of peer-to-peer systems have suggested that session lengths follow a Pareto distribution =-=[6, 12, 34]-=-. One difficulty with this model is that xm is a lower bound on the session length, and fits of xm to empirical data are often unreasonably high (i.e., placing a lower bound significantly higher than ... |

379 | Handling Churn in a DHT
- Rhea, Geels, et al.
- 2004
Citation Context ...istribution is a one-parameter distribution (rate λ) that features sessions relatively close together in length. It has been used in many prior simulation and analysis studies of peer-to-peer systems =-=[24,25,31]-=-. Pareto: The Pareto (or power-law) distribution is a two-parameter distribution (shape α, location xm) that features many short sessions coupled with a few very long sessions. Some prior measurement s... |

328 | Graphs over time: densification laws, shrinking diameters and possible explanations
- Leskovec, Kleinberg, et al.
- 2005
Citation Context ...rticularly concerned if some nodes are preferred over others. Some studies [11,43] additionally use random walks as a component of their overlay-construction algorithm. Recent work by Leskovec et al. =-=[23]-=- discusses the evolution of graphs over time and focuses on empirically observed properties such as densification (i.e., networks become denser over time) and shrinking diameter (i.e., as networks gr... |

323 | Analyzing peer-to-peer traffic across large networks
- Sen, Wang
- 2004
Citation Context ...ion xm) that features many short sessions coupled with a few very long sessions. Some prior measurement studies of peer-to-peer systems have suggested that session lengths follow a Pareto distribution =-=[6, 12, 34]-=-. One difficulty with this model is that xm is a lower bound on the session length, and fits of xm to empirical data are often unreasonably high (i.e., placing a lower bound significantly higher than ... |

296 | Understanding Availability
- Bhagwan, Savage, et al.
- 2003
Citation Context ...nontrivial when the structure of the peer-to-peer system changes underneath the measurements. First-generation measurement studies of P2P systems typically relied on ad-hoc sampling techniques (e.g., =-=[4, 33]-=-) and provided valuable information concerning basic system behavior. However, lacking any critical assessment of the quality of these sampling techniques, the measurements resulting from these studie... |

268 | Random walks on graphs: a survey
- Lovász
- 1993
Citation Context ...es, and temporal causes of sampling bias have received little attention in past measurement studies of the Web. Several properties of random walks on graphs have been extensively studied analytically =-=[26]-=-, such as the access time, cover time, and mixing time. While these properties have many useful applications, they are only well-defined for static graphs. To our knowledge the application of random w... |

241 | Dissecting bittorrent: Five months in a torrent’s lifetime
- Izal, Biersack, et al.
- 2004
Citation Context ...etween occurrences of the same peer at different times. This formulation is appropriate if peer session lengths are exponentially distributed (i.e., memoryless). However, existing measurement studies =-=[16, 29, 33, 39]-=- show session lengths are heavily skewed, with many peers being present for just a short time (a few minutes) while other peers remain in the system for a very long time (i.e., longer than ∆). As a co... |

223 | The Bittorrent P2P file-sharing system: Measurements and analysis
- Pouwelse, Garbacki, et al.
- 2005
Citation Context ...etween occurrences of the same peer at different times. This formulation is appropriate if peer session lengths are exponentially distributed (i.e., memoryless). However, existing measurement studies =-=[16, 29, 33, 39]-=- show session lengths are heavily skewed, with many peers being present for just a short time (a few minutes) while other peers remain in the system for a very long time (i.e., longer than ∆). As a co... |

211 | Analysis of the evolution of peer-to-peer systems
- Liben-Nowell, Balakrishnan, et al.
- 2002
Citation Context ...istribution is a one-parameter distribution (rate λ) that features sessions relatively close together in length. It has been used in many prior simulation and analysis studies of peer-to-peer systems =-=[24,25,31]-=-. Pareto: The Pareto (or power-law) distribution is a two-parameter distribution (shape α, location xm) that features many short sessions coupled with a few very long sessions. Some prior measurement s... |

194 | A probabilistic proof of an asymptotic formula for the number of labelled regular graphs
- Bollobás
- 1980
Citation Context ...of the different meanings of graph sampling to place our work in the context of other research on sampling graphs. Sampling from a class of graphs has been well studied in the graph theory literature =-=[5, 17]-=-, where the main objective is to prove that for a class of graphs sharing some property (e.g., same node degree distribution), a given random algorithm is capable of generating all graphs in the class... |

187 | Random walks in peer-to-peer networks
- Gkantsidis, Mihail, et al.
- 2004
Citation Context ...fined for static graphs. To our knowledge the application of random walks as a method of selecting nodes uniformly at random from a dynamically changing graph has not been studied. A number of papers =-=[7,11,27,43]-=- have made use of random walks as a basis for searching unstructured P2P networks. However, searching simply requires locating a certain piece of data anywhere along the walk, and is not particularly ... |

172 | Estimating latency between arbitrary internet end hosts
- King
- 2002
Citation Context ...tor that models peer arrivals, departures, latencies, and neighbor connections. We now describe our simulation environment. The latencies between peers are modeled using values from the King data set =-=[13]-=-. Peers learn about one another using one of several peer discovery mechanisms described below. Peers have a target minimum number of connections (i.e., degree) that they attempt to maintain at all ti... |

128 | Measuring and Analyzing the Characteristics of Napster and Gnutella
- Saroiu, Gummadi, et al.
- 2003
Citation Context ...nontrivial when the structure of the peer-to-peer system changes underneath the measurements. First-generation measurement studies of P2P systems typically relied on ad-hoc sampling techniques (e.g., =-=[4, 33]-=-) and provided valuable information concerning basic system behavior. However, lacking any critical assessment of the quality of these sampling techniques, the measurements resulting from these studie... |

118 | Estimating Latency between Arbitrary Internet End Hosts
- Gummadi, Saroiu, et al.
- 2002
Citation Context ...tor that models peer arrivals, departures, latencies, and neighbor connections. We now describe our simulation environment. The latencies between peers are modeled using values from the King data set =-=[40]-=-. Peers learn about one another using one of several peer discovery mechanisms described below. Peers have a target minimum number of connections (i.e., degree) that they attempt to maintain at all ti... |

115 | Respondent-driven sampling: A new approach to the study of hidden populations
- Heckathorn
- 1997
Citation Context ...accurate estimates of the number of peers in an unstructured P2P network that have a certain property can also be viewed as a problem in studying the sizes of hidden populations. Following Heckathorn =-=[29]-=-, a population is called “hidden” if (i) there exists no sampling frame, so the size and boundary of the population are unknown, and (ii) there exist strong privacy concerns, to the point that public ... |

109 | On Near-Uniform URL Sampling
- Henzinger, Heydon, et al.
- 2000
Citation Context ...router-level topology changes at a much slower rate than the overlay topology of P2P networks. Another closely related problem is selecting Web pages uniformly at random from the set of all Web pages =-=[3,15,32]-=-. Web pages naturally form a graph, with hyper-links forming edges between pages. Unlike peer-to-peer networks, the Web graph is directed and only outgoing links are easily discovered. Much of the wor... |

96 | Characterizing Churn in Peer-to-Peer Networks
- Stutzbach, Rejaie
Citation Context ...etween occurrences of the same peer at different times. This formulation is appropriate if peer session lengths are exponentially distributed (i.e., memoryless). However, existing measurement studies =-=[16, 29, 33, 39]-=- show session lengths are heavily skewed, with many peers being present for just a short time (a few minutes) while other peers remain in the system for a very long time (i.e., longer than ∆). As a co... |

94 | Characterizing Unstructured Overlay Topologies
- Stutzbach, Rejaie, et al.
- 2005
Citation Context ...th lengths Barabási–Albert: Graphs with extreme degree distributions, also known as power-law or scale-free graphs Gnutella: Snapshots of the Gnutella ultrapeer topology, captured in our earlier work =-=[41]-=- To make the results more comparable, the number of vertices (|V| = 161,680) and edges (|E| = 1,946,596) in each graph are approximately the same. Table 1 presents the results of the goodness-of-... |

82 | Subnets of scale-free networks are not scale-free: sampling properties of networks
- Stumpf
- 2005
Citation Context ... graph. Others have used sampling to extract information about graphs (e.g., selecting representative subgraphs from a large, intractable graph) while maintaining properties of the original structure =-=[19,20,36]-=-. Sampling is also frequently used as a component of efficient, randomized algorithms [42]. However, these studies assume complete knowledge of the graphs in question. Our problem is quite different i... |

79 | A performance vs. cost framework for evaluating DHT design tradeoffs under churn
- Li, Stribling, et al.
- 2005
Citation Context ...istribution is a one-parameter distribution (rate λ) that features sessions relatively close together in length. It has been used in many prior simulation and analysis studies of peer-to-peer systems =-=[24,25,31]-=-. Pareto: The Pareto (or power-law) distribution is a two-parameter distribution (shape α, location xm) that features many short sessions coupled with a few very long sessions. Some prior measurement s... |

79 | Targeted sampling: Options for the study of hidden populations
- Watters, Biernacki
- 1989
Citation Context ...s in the social and statistical sciences for studying hidden populations include snowball sampling or other forms of chain referral techniques [30], key informant sampling [31], and targeted sampling =-=[32]-=-. More recently, Heckathorn [29] (see also [33], [34]) proposed respondent-driven sampling, a snowball-type method for sampling and estimation in hidden populations. In respondent-driven sampling, resp... |

77 | Approximating Aggregate Queries about Web Pages via Random Walks
- Bar-Yossef, Berg, et al.
Citation Context ...router-level topology changes at a much slower rate than the overlay topology of P2P networks. Another closely related problem is selecting Web pages uniformly at random from the set of all Web pages =-=[3,15,32]-=-. Web pages naturally form a graph, with hyper-links forming edges between pages. Unlike peer-to-peer networks, the Web graph is directed and only outgoing links are easily discovered. Much of the wor... |

77 | Friendships that last: Peer lifespan and its role in p2p protocols, in Web content caching and distribution: proceedings of the 8th international workshop
- Bustamante, Qiao
- 2003
Citation Context ...ion xm) that features many short sessions coupled with a few very long sessions. Some prior measurement studies of peer-to-peer systems have suggested that session lengths follow a Pareto distribution =-=[6, 12, 34]-=-. One difficulty with this model is that xm is a lower bound on the session length, and fits of xm to empirical data are often unreasonably high (i.e., placing a lower bound significantly higher than ... |

74 | Random sampling from a search engine’s index
- Bar-Yossef, Gurevich
Citation Context ...lgorithm into a tool called ion-sampler. While sampling techniques based on the original Metropolis–Hastings method have been considered earlier (e.g., see Awan et al. [9] and Bar-Yossef and Gurevich =-=[10]-=-), we show that in the context of unstructured P2P systems, our modification of the basic Metropolis–Hastings method results in nearly unbiased samples under a wide variety of commonly encountered pee... |

74 | Snowball Sampling
- Goodman
- 1961
Citation Context ...keeping track of various peer properties. Proposed methods in the social and statistical sciences for studying hidden populations include snowball sampling or other forms of chain referral techniques =-=[30]-=-, key informant sampling [31], and targeted sampling [32]. More recently, Heckathorn [29] (see also [33], [34]) proposed respondent-driven sampling, a snowball-type method for sampling and estimation i... |

71 | Random Walks on Graphs: A Survey. Combinatorics, Paul Erdős is Eighty
- Lovász
- 1993
Citation Context ...graphs A popular technique for exploring connectivity structures consists of performing random walks on graphs. Several properties of random walks on graphs have been extensively studied analytically =-=[24]-=-, such as the access time, cover time, and mixing time. While these properties have many useful applications, they are, in general, only well-defined for static graphs. To our knowledge the applicatio... |

63 | Sampling and estimation in hidden populations using respondent-driven sampling. Sociological Methodology 34
- Salganik, Heckathorn
- 2004
Citation Context ...udying hidden populations include snowball sampling or other forms of chain referral techniques [30], key informant sampling [31], and targeted sampling [32]. More recently, Heckathorn [29] (see also =-=[33]-=-, [34]) proposed respondent-driven sampling, a snowball-type method for sampling and estimation in hidden populations. In respondent-driven sampling, respondents are not selected from a sampling frame ... |

52 | Improving lookup performance over a widely-deployed DHT
- Stutzbach, Rejaie
- 2006
Citation Context ...effective, as long as there is little variation in the amount of identifier space that each peer is responsible for. We made use of this sampling technique in our study of the widely-deployed Kad DHT =-=[38]-=-. 8 Conclusions and Future Work This paper explores the problem of sampling representative peer properties in large and dynamic unstructured P2P systems. We show that the topological and temporal prop... |

47 | On lifetime-based node failure and stochastic resilience of decentralized peer-to-peer networks
- Leonard, Rai, et al.
- 2005
Citation Context ...wer bound significantly higher than the median session length reported by other measurement studies). In their insightful analytical study of churn in peer-to-peer systems, Leonard, Rai, and Loguinov =-=[22]-=- instead suggest using a shifted Pareto distribution (shape α, scale β) with α ≈ 2. We use this shifted Pareto distribution, holding α fixed and varying the scale parameter β. We examine two different... |

42 | Methods for sampling pages uniformly from the world wide web
- Rusmevichientong, Pennock, et al.
- 2001
Citation Context ...router-level topology changes at a much slower rate than the overlay topology of P2P networks. Another closely related problem is selecting Web pages uniformly at random from the set of all Web pages =-=[3,15,32]-=-. Web pages naturally form a graph, with hyper-links forming edges between pages. Unlike peer-to-peer networks, the Web graph is directed and only outgoing links are easily discovered. Much of the wor... |

39 | Capturing Accurate Snapshots of the Gnutella Network
- Stutzbach, Rejaie
- 2005
Citation Context ...ally, in Section 6 we describe the implementation of the ion-sampler tool based on MRWB and empirically evaluate its accuracy through comparison with complete snapshots of Gnutella taken with Cruiser =-=[37]-=-, as well as compare it with results from previously used, more ad-hoc, sampling techniques. Section 7 discusses some important questions such as how many samples to collect, when sampling with a know... |

38 | Fast uniform generation of regular graphs
- Jerrum, Sinclair
- 1990
Citation Context ...of the different meanings of graph sampling to place our work in the context of other research on sampling graphs. Sampling from a class of graphs has been well studied in the graph theory literature =-=[5, 17]-=-, where the main objective is to prove that for a class of graphs sharing some property (e.g., same node degree distribution), a given random algorithm is capable of generating all graphs in the class... |

37 | Sampling regular graphs and peer-to-peer network
- Cooper, Dyer, et al.
Citation Context ... objective is to prove that for a class of graphs sharing some property (e.g., same node degree distribution), a given random algorithm is capable of generating all graphs in the class. Cooper et al. =-=[9]-=- used this approach to show that their algorithm for overlay construction generates graphs with good properties. Our objective is quite different; instead of sampling a graph from a class of graphs ou... |

36 | On heterogeneous overlay construction and random node selection in unstructured p2p networks
- Vishnumurthy, Francis
- 2006
Citation Context ...fined for static graphs. To our knowledge the application of random walks as a method of selecting nodes uniformly at random from a dynamically changing graph has not been studied. A number of papers =-=[7,11,27,43]-=- have made use of random walks as a basis for searching unstructured P2P networks. However, searching simply requires locating a certain piece of data anywhere along the walk, and is not particularly ... |

29 | A survey of models of the web graph
- Bonato
- 2004
Citation Context ...., see [5]). Hence, the dynamic graph models proposed in [35] are not appropriate for our purpose, and neither are the evolving graph models specifically designed to describe the Web graph (e.g., see =-=[36]-=- and references therein). III. SAMPLING WITH DYNAMICS We develop a formal and general model of a P2P system as follows. If we take an instantaneous snapshot of the system at time t, we can view the ov... |

26 | Handling churn in a
- Rhea, Geels, et al.
- 2004
Citation Context ...e time required to query peers, rather than the speed of global variations relative to the length of the walk. While the median session length reported by measurement studies varies considerably (see =-=[42]-=- for a summary), none report a median below 1 minute and two studies report a median session length of one hour [3], [4]. In summary, these results demonstrate that MRWB can gracefully tolerate peer d... |

25 | On the bias of traceroute sampling
- Achlioptas, Clauset, et al.
- 2005
Citation Context ...o ours is sampling Internet routers by running traceroute from a few hosts to many destinations for the purpose of discovering the Internet’s router-level topology. Using simulation [21] and analysis =-=[1]-=-, research has shown that traceroute measurements can result in measurement bias in the sense that the obtained samples support the inference of power law-type degree distributions irrespective of the... |

18 | Evaluating sampling techniques for large dynamic graphs
- Rasti, Torkjazi, et al.
- 2008
Citation Context ... While our focus is on P2P networks, many of our results apply to any large, dynamic, undirected graph where nodes may be queried for a list of their neighbors. Building on our earlier formulation in =-=[40]-=-, the basic problem in sampling P2P networks concerns the selection of representative samples of peer properties such as peer degree, link bandwidth, or the number of files shared. To measure peer pro... |

17 | Distributed uniform sampling in unstructured peer-to-peer networks
- Awan, Ferreira, et al.
- 2006
Citation Context ...networks and their properties are by and large inconsistent with the design and usage of actual P2P networks. Hence, the graph models proposed in [23] are not appropriate for our purpose. Awan et al. =-=[2]-=- also address the problem of gathering uniform samples from peer-to-peer networks. They examine several techniques, including the Metropolis–Hastings method, but only evaluate the techniques over stat... |

17 | Variance Estimation, Design Effects, and Sample Size Calculations for Respondent-Driven Sampling
- Salganik, Matthew
Citation Context ... hidden populations include snowball sampling or other forms of chain referral techniques [30], key informant sampling [31], and targeted sampling [32]. More recently, Heckathorn [29] (see also [33], =-=[34]-=-) proposed respondent-driven sampling, a snowball-type method for sampling and estimation in hidden populations. In respondent-driven sampling, respondents are not selected from a sampling frame but fr... |

16 | Reducing large internet topologies for faster simulations
- Krishnamurthy, Faloutsos, et al.
- 2005
Citation Context ... graph. Others have used sampling to extract information about graphs (e.g., selecting representative subgraphs from a large, intractable graph) while maintaining properties of the original structure =-=[19,20,36]-=-. Sampling is also frequently used as a component of efficient, randomized algorithms [42]. However, these studies assume complete knowledge of the graphs in question. Our problem is quite different i... |

15 | Bootstrapping in Gnutella: a measurement study
- Karbhari, Ammar, et al.
- 2004
Citation Context ...make contact for 45 minutes, the rendezvous point removes it from the list of known peers. History: Many P2P applications connect to the network using addresses they learned during a previous session =-=[18]-=-. A large fraction of these addresses will timeout, but typically enough of the peers will still be active to avoid the need to contact a centralized rendezvous point. As tracking the re-appearance of... |

8 | Random Sampling
- Tsay, Lovejoy, et al.
- 1999
Citation Context ...ative subgraphs from a large, intractable graph) while maintaining properties of the original structure [19,20,36]. Sampling is also frequently used as a component of efficient, randomized algorithms =-=[42]-=-. However, these studies assume complete knowledge of the graphs in question. Our problem is quite different in that we do not know the graphs in advance. A closely related problem to ours is sampling... |

8 | Graphs over Time
- Leskovec, Kleinberg, et al.
Citation Context ... is still at an early stage and is largely concerned with generative models that are capable of reproducing certain observed properties of evolving graphs. For example, recent work by Leskovec et al. =-=[35]-=- focuses on empirically observed properties such as densification (i.e., networks become denser over time) and shrinking diameter (i.e., as networks grow, their diameter decreases) and on new graph ge... |

7 | Sampling internet topologies: How small can we go
- Krishnamurthy, Sun, et al.
- 2003
Citation Context ... graph. Others have used sampling to extract information about graphs (e.g., selecting representative subgraphs from a large, intractable graph) while maintaining properties of the original structure =-=[19,20,36]-=-. Sampling is also frequently used as a component of efficient, randomized algorithms [42]. However, these studies assume complete knowledge of the graphs in question. Our problem is quite different i... |