## Measurement-based analysis, modeling, and synthesis of the Internet delay space (2006)

Citations: 16 (1 self)

### BibTeX

@MISC{Zhang06measurement-basedanalysis,

author = {Bo Zhang and T. S. Eugene Ng and Animesh N and Rudolf Riedi and Peter Druschel and Guohui Wang},

title = {Measurement-based analysis, modeling, and synthesis of the Internet delay space},

year = {2006}

}

### Abstract

Understanding the characteristics of the Internet delay space (i.e., the all-pairs set of static round-trip propagation delays among edge networks in the Internet) is important for the design of global-scale distributed systems. For instance, algorithms used in overlay networks are often sensitive to violations of the triangle inequality and to the growth properties within the Internet delay space. Since designers of distributed systems often rely on simulation and emulation to study design alternatives, they need a realistic model of the Internet delay space. Our analysis shows that existing models do not adequately capture important properties of the Internet delay space. In this paper, we analyze measured delays among thousands of Internet edge networks and identify key properties that are important for distributed system design. Furthermore, we derive a simple model of the Internet delay space based on our analytical findings. This model preserves the relevant metrics far better than existing models, allows for a compact representation, and can be used to synthesize delay data for simulations and emulations at a scale where direct measurement and storage are impractical.
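Two of the properties named in the abstract can be checked directly on a measured delay matrix. The following is a minimal sketch, not the authors' analysis code: it assumes a small symmetric matrix of round-trip delays in milliseconds, counts a node triple as a triangle inequality violation when its longest edge exceeds the sum of the other two, and evaluates the growth ratio B_i(2r) / B_i(r) from the growth-restricted condition quoted in the citation contexts. The function names, the sample matrix, and the choice to exclude node i from its own ball are all illustrative assumptions.

```python
from itertools import combinations


def triangle_violation_fraction(delays):
    """Fraction of node triples whose longest edge exceeds the sum of the other two."""
    n = len(delays)
    violations = 0
    triples = 0
    for i, j, k in combinations(range(n), 3):
        a, b, c = delays[i][j], delays[j][k], delays[i][k]
        triples += 1
        longest = max(a, b, c)
        # A violation: the direct path is longer than the detour via the third node.
        if longest > (a + b + c) - longest:
            violations += 1
    return violations / triples if triples else 0.0


def growth_ratio(delays, i, r):
    """Ratio B_i(2r) / B_i(r), counting nodes other than i (an assumption)."""
    ball_r = sum(1 for j, d in enumerate(delays[i]) if j != i and d <= r)
    ball_2r = sum(1 for j, d in enumerate(delays[i]) if j != i and d <= 2 * r)
    # The ratio is undefined when no node lies within r; report infinity in that case.
    return ball_2r / ball_r if ball_r else float("inf")


# Hypothetical 4-node delay matrix (ms); the 0-2 delay is deliberately inflated.
delays = [
    [0, 10, 50, 12],
    [10, 0, 15, 20],
    [50, 15, 0, 30],
    [12, 20, 30, 0],
]

print(triangle_violation_fraction(delays))  # 0.5: triples (0,1,2) and (0,2,3) violate
print(growth_ratio(delays, 0, 10))          # 2.0: one node within 10 ms, two within 20 ms
```

Enumerating all triples costs O(N^3), so for matrices with thousands of nodes a random sample of triples would be the more practical choice.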

### Citations

3460 | Chord: A scalable peer-to-peer lookup service for internet applications
- Stoica, Morris, et al.
- 2001
Citation Context ...ion is a hard problem, but efficient algorithms have been identified to solve the problem for growth-restricted metric spaces [16]. These algorithms are used, for instance, in Tapestry [49] and Chord [41] to select overlay neighbors. In a growth-restricted metric space, if the number of nodes with a delay of at most r from some node i is $B_i(r)$, then $B_i(2r) \le c \cdot B_i(r)$, where c is a constant. We charac...

1247 | On power-law relationships of the Internet topology
- Faloutsos, Faloutsos, et al.
Citation Context ...actice. To its credit, P2PSim provides a 1740×1740 delay space matrix, which is not a trivial amount of data to obtain. The second approach is to start with a statistical network topology model (e.g. [45, 48, 8, 10, 18]) and assign artificial link delays to the topology. The delay space is then modeled by the all-pair shortest-path delays within the topology. The properties of such delay models, however, tend to dif...

680 | How to model an Internetwork
- Zegura, Calvert, et al.
- 1996
Citation Context ...actice. To its credit, P2PSim provides a 1740×1740 delay space matrix, which is not a trivial amount of data to obtain. The second approach is to start with a statistical network topology model (e.g. [45, 48, 8, 10, 18]) and assign artificial link delays to the topology. The delay space is then modeled by the all-pair shortest-path delays within the topology. The properties of such delay models, however, tend to dif...

639 | Routing of multipoint connections
- Waxman
- 1988
Citation Context ...actice. To its credit, P2PSim provides a 1740×1740 delay space matrix, which is not a trivial amount of data to obtain. The second approach is to start with a statistical network topology model (e.g. [45, 48, 8, 10, 18]) and assign artificial link delays to the topology. The delay space is then modeled by the all-pair shortest-path delays within the topology. The properties of such delay models, however, tend to dif...

595 | Scalable application layer multicast
- Banerjee, Bhattacharjee, et al.
Citation Context ...ality, since Internet routing may not be optimal with respect to delay. Unfortunately, many distributed nearest neighbor selection algorithms rely on the assumption that the triangle inequality holds [33, 16, 44]. Thus, it is important to characterize the frequency and severity of the violations in the Internet delay space. 3.2 Analysis Results We now present an analysis of the measured delay data with respec...

554 | Kademlia: A peer-to-peer information system based on the xor metric
- Maymounkov, Mazieres
- 2002
Citation Context ...ce trends sometimes only show at scale. 6.2 Structured Overlay Networks Structured overlay networks like Chord, Kademlia and Pastry use proximity neighbor selection (PNS) to choose overlay neighbors [41, 23, 5]. PNS has been shown to effectively reduce routing stretch, query delay and network load, and to increase overlay robustness to failures and even to certain security attacks [37]. Here, we show the im...

526 | Predicting Internet network distance with coordinates-based approaches
- Ng, Zhang
Citation Context ...hat is robust to missing data. Then, we do a principal component analysis on the 5D Euclidean coordinates to get the first 2 principal components. Several techniques exist to compute the 5D embedding [24, 7, 35, 6, 19, 43]. Here, we use a slightly modified version of the Vivaldi [7] method that avoids the missing measurements. We use 32 neighbors per node in Vivaldi. Figure 2(b) displays the scatter plots of the first ...

468 | Vivaldi: A Decentralized Network Coordinate System
- Dabek, Cox, et al.
Citation Context ...hat is robust to missing data. Then, we do a principal component analysis on the 5D Euclidean coordinates to get the first 2 principal components. Several techniques exist to compute the 5D embedding [24, 7, 35, 6, 19, 43]. Here, we use a slightly modified version of the Vivaldi [7] method that avoids the missing measurements. We use 32 neighbors per node in Vivaldi. Figure 2(b) displays the scatter plots of the first ...

265 | The end-to-end effects of internet path selection
- Savage, Collins, et al.
- 1999
Citation Context ...cteristics. Some of the delay space properties reported in this paper have been observed in previous work. For example, triangle inequality violations and routing inefficiencies have been observed in [34] and [24]. Some of the characteristics of delay distributions and their implications for global clustering have been observed in Skitter. However, many of the observations made in this paper are new. ...

246 | The impact of DHT routing geometry on resilience and proximity
- Gummadi, Gummadi, et al.
- 2003
Citation Context ...astry employ proximity neighbor selection (PNS) to reduce the expected delay stretch S, i.e., the ratio of the delay of an overlay route over the direct routing delay averaged over all pairs of nodes [14, 4, 30, 5]. We choose to include the D(k) metric because analysis has shown that in Tapestry and Pastry, the expected delay stretch S in the overlay can be predicted based on the function D(k) [5]. Triangle i...

196 | iPlane: An information plane for distributed services
- Madhyastha, Isdal, et al.
- 2006
Citation Context ...ructural model of the Internet, using BGP tables, traceroute, ping and other measurements to capture the coarse-grained (e.g., AS-level) topology of the Internet and the associated static link delays [22]. Given such a model, the delay for a given pair of IP addresses can be estimated by adding the link delays on the predicted route through the topology. If the topology model captures the coarse-grain...

162 | PIC: Practical Internet coordinates for distance estimation
- Costa, Castro, et al.
- 2004
Citation Context ...hat is robust to missing data. Then, we do a principal component analysis on the 5D Euclidean coordinates to get the first 2 principal components. Several techniques exist to compute the 5D embedding [24, 7, 35, 6, 19, 43]. Here, we use a slightly modified version of the Vivaldi [7] method that avoids the missing measurements. We use 32 neighbors per node in Vivaldi. Figure 2(b) displays the scatter plots of the first ...

162 | King: Estimating latency between arbitrary Internet end hosts
- Gummadi, Saroiu, et al.
- 2002
Citation Context ...l. Currently, two approaches are used to obtain a delay model. The first approach, adopted for instance by the P2PSim simulator [25], is to collect actual delay measurements using a tool such as King [13]. However, due to limitations of the measurement methodology and the quadratic time requirement for measuring a delay matrix, measured data tends to be incomplete and there are limits to the size of a...

161 | Virtual Landmarks for the Internet
- Tang, Crovella

157 | A Better Model for Generating Test Networks
- Doar

153 | A first-principles approach to understanding the internet’s router-level topology
- Li, Alderson, et al.
- 2004

149 | Finding nearest neighbors in growth-restricted metrics
- Karger, Ruhl
- 2002
Citation Context ...ant because they can influence the load balance of delay-optimized overlay networks, and the effectiveness of server placement policies and caching strategies. Having realistic growth characteristics [16] in the delay space is equally important, because the effectiveness of certain distributed algorithms depends on them. Many distributed systems are also sensitive to the inefficiency of IP routing wit...

145 | Lighthouses for scalable distributed location
- Pias, Crowcroft, et al.
- 2003
Citation Context ...low dimensional Euclidean embedding of the delay space to enhance the completeness and scalability of the delay space representation. Many approaches for computing such an embedding have been studied [24, 7, 35, 6, 19, 43, 36, 26]. We have not considered the impact of using different computation methods or using different embedding objective functions. This represents another area for future work. 8. CONCLUSIONS To the best of...

140 | Inet-3.0: Internet topology generator
- Winick, Jamin
- 2002
Citation Context ...mples. 2.2 Topology Model Delay Spaces We also generate delay matrices based on existing topology models and compare them against the measured Internet delay space. The two generators we use are Inet [46] and GT-ITM [48]. The Inet generator creates a topology that has power-law node degree distribution properties. The GT-ITM generator is used to generate a topology based on the Transit-Stub model. We ...

140 | Meridian: A Lightweight Network Location Service without Virtual Coordinates - Wong, Slivkins, et al.

133 | Routing algorithms for dhts: Some open questions
- Ratnasamy, Shenker, et al.
- 2002
Citation Context ...astry employ proximity neighbor selection (PNS) to reduce the expected delay stretch S, i.e., the ratio of the delay of an overlay route over the direct routing delay averaged over all pairs of nodes [14, 4, 30, 5]. We choose to include the D(k) metric because analysis has shown that in Tapestry and Pastry, the expected delay stretch S in the overlay can be predicted based on the function D(k) [5]. Triangle i...

133 | Big-bang simulation for embedding network distances in Euclidean space
- Shavitt, Tankel

129 | Exploiting network proximity in peer-to-peer overlay networks
- Castro, Druschel, et al.
- 2002
Citation Context ...astry employ proximity neighbor selection (PNS) to reduce the expected delay stretch S, i.e., the ratio of the delay of an overlay route over the direct routing delay averaged over all pairs of nodes [14, 4, 30, 5]. We choose to include the D(k) metric because analysis has shown that in Tapestry and Pastry, the expected delay stretch S in the overlay can be predicted based on the function D(k) [5]. Triangle i...

113 | Constructing Internet Coordinate System Based on Delay Measurement
- Lim, Hou, et al.

111 | Efficient topology-aware overlay network
- Waldvogel, Rinaldi
- 2003
Citation Context ...ality, since Internet routing may not be optimal with respect to delay. Unfortunately, many distributed nearest neighbor selection algorithms rely on the assumption that the triangle inequality holds [33, 16, 44]. Thus, it is important to characterize the frequency and severity of the violations in the Internet delay space. 3.2 Analysis Results We now present an analysis of the measured delay data with respec...

98 | Security for structured peer-to-peer overlay networks
- Castro, Druschel, et al.
Citation Context ... above can have an impact on application performance metrics whose relevance is more immediately apparent. Effectiveness of PNS on Eclipse Attacks - In recent work on defenses against Eclipse attacks [3] on structured overlay networks, Singh et al. [37] argue that PNS alone is a weak defense. While earlier work has shown that PNS is effective against Eclipse attacks based on simulations with a GT-ITM...

90 | On the Accuracy of Embeddings for Internet Coordinate Systems
- Lua, Griffin, et al.
- 2005
Citation Context ...umptions: • A low-dimensional Euclidean embedding can model the input delay data with reasonable accuracy, ignoring triangle inequality violations and local clustering properties. Some recent studies [21, 17] have shown that Euclidean embedding has difficulties in predicting pairwise Internet delays very accurately. Note, however, that we do not aim at predicting pairwise delays, we only use the Euclidean...

81 | OASIS: Anycast for any service
- Freedman, Lakshminarayanan, et al.
- 2006
Citation Context ...hey employ. Server selection redirects clients to an appropriate server, based on factors such as the location of the client, network conditions, and server load. A number of server selection systems [47, 11, 7] have been proposed and studied. In this section, the performance of Meridian [47], Vivaldi [7] and random server selection is evaluated using four different delay spaces: measured data, DS², Inet a...

81 | Statistical Inference and Simulation for Spatial Point Processes
- Møller, Waagepetersen
Citation Context ...e relative intensities, one can synthesize an artificial map of a certain size by generating random points in each hyper-cube according to the intensities using an inhomogeneous Poisson point process [20, 31]. Indeed, this simple method can mimic the point distribution of the original map and generate a realistic overall delay distribution and global clustering structure. However, this method ignores t...

77 | Triangulation and embedding using small sets of beacons
- Kleinberg, Slivkins, et al.
- 2009
Citation Context ...nd use Euclidean distances to model the delays in the delay space. Such a Euclidean map has a scalable O(N) representation. Although several techniques exist to compute a Euclidean embedding robustly [24, 7, 35, 6, 19, 43, 40, 39], and previous studies have shown that an Internet delay space can be overall well approximated by a Euclidean embedding with as little as 5 dimensions, such an embedding tends to inflate the small va...

55 | On the Curvature of the Internet and its usage for Overlay Construction and Distance Estimation
- Shavitt, Tankel
Citation Context ...low dimensional Euclidean embedding of the delay space to enhance the completeness and scalability of the delay space representation. Many approaches for computing such an embedding have been studied [24, 7, 35, 6, 19, 43, 36, 26]. We have not considered the impact of using different computation methods or using different embedding objective functions. This represents another area for future work. 8. CONCLUSIONS To the best of...

55 | Eclipse attacks on overlay networks: Threats and defenses
- Singh, Ngan, et al.
- 2006
Citation Context ...lay neighbors [41, 23, 5]. PNS has been shown to effectively reduce routing stretch, query delay and network load, and to increase overlay robustness to failures and even to certain security attacks [37]. Here, we show the importance of using a good delay space to evaluate the effectiveness of PNS. To eliminate the influence of a particular PNS implementation, we assume in our simulations that the ov...

52 | Beehive: O(1) lookup performance for power-law query distributions in peer-to-peer overlays - Ramasubramanian, Sirer - 2004

47 | A study of internet round-trip delay
- Acharya, Saltz
- 1996
Citation Context ...mation incurs an additional constant storage overhead for the model. With these statistics, the delay between node i and j is then computed from the model as follows. Draw a pseudo-random number ρ in [0,1] based on the IDs of i and j. Let the Euclidean distance between i and j be $l_{ij}$ and the cluster-cluster group be g. Based on $P^{\text{Type-1}}_{g,l_{ij}}$, $P^{\text{Type-2}}_{g,l_{ij}}$, $P^{\text{Type-1\&2}}_{g,l_{ij}}$, and using ρ as a ran...

45 | Proximity neighbor selection in tree-based structured peer-to-peer overlays
- Castro, Druschel, et al.
- 2003
Citation Context ...al cluster heads will become clear in subsequent sections. Local clustering is relevant, for instance, to the in-degree and thus the load balance among nodes in delay-optimized overlay networks (e.g. [5]). For example, dense local clustering can lead to an overlay node having an unexpectedly high number of neighbors and can potentially create a load imbalance in the overlay. Growth metrics - Distribu...

43 | Asymptotically efficient approaches to fault-tolerance in peer-to-peer networks
- Hildrum, Kubiatowicz
- 2004
Citation Context ...d overlay networks, Singh et al. [37] argue that PNS alone is a weak defense. While earlier work has shown that PNS is effective against Eclipse attacks based on simulations with a GT-ITM delay model [15], Singh et al. demonstrate that the defense breaks down when using measured delay data as a basis for simulations. Moreover, they show that the effectiveness of the defense diminishes with increasing ...

35 | A course on point processes
- Reiss
- 1993
Citation Context ...e relative intensities, one can synthesize an artificial map of a certain size by generating random points in each hyper-cube according to the intensities using an inhomogeneous Poisson point process [20, 31]. Indeed, this simple method can mimic the point distribution of the original map and generate a realistic overall delay distribution and global clustering structure. However, this method ignores t...

31 | Wide area redirection of dynamic content by Internet data centers
- RANJAN, KARRER, et al.
Citation Context ...igure 1: Nearest neighbor directed graph analysis technique. global clustering structure is, for instance, relevant to the placement of large data centers and web request redirection algorithms (e.g. [29]). Our algorithm to determine the global clustering works as follows. Given N nodes in the measured input data, it first treats each node as a singleton cluster. The algorithm then iteratively finds t...

30 | On suitability of Euclidean embedding of Internet hosts
- Lee, Zhang, et al.
- 2006
Citation Context ...Internet delay space can be overall well approximated by a Euclidean embedding with as little as 5 dimensions, such an embedding tends to inflate the small values (< 10ms) in the delay space too much [17]. In order to create a model that also preserves small values, we first use the Vivaldi algorithm to create a 5D Euclidean embedding of the measured delay space, then we explicitly adjust the Euclidea...

30 | Distributed Approaches to Triangulation and Embedding
- Slivkins
- 2004
Citation Context ...nd use Euclidean distances to model the delays in the delay space. Such a Euclidean map has a scalable O(N) representation. Although several techniques exist to compute a Euclidean embedding robustly [24, 7, 35, 6, 19, 43, 40, 39], and previous studies have shown that an Internet delay space can be overall well approximated by a Euclidean embedding with as little as 5 dimensions, such an embedding tends to inflate the small va...

16 | Tapestry: An infrastructure for wide-area fault-tolerant location and routing
- Zhao, Kubiatowicz, et al.
- 2001
Citation Context ...neighbor selection is a hard problem, but efficient algorithms have been identified to solve the problem for growth-restricted metric spaces [16]. These algorithms are used, for instance, in Tapestry [49] and Chord [41] to select overlay neighbors. In a growth-restricted metric space, if the number of nodes with a delay of at most r from some node i is $B_i(r)$, then $B_i(2r) \le c \cdot B_i(r)$, where c is a cons...

14 | Meridian: A lightweight network location service without virtual coordinates
- Wong, Slivkins, et al.
- 2005
Citation Context ...hey employ. Server selection redirects clients to an appropriate server, based on factors such as the location of the client, network conditions, and server load. A number of server selection systems [47, 11, 7] have been proposed and studied. In this section, the performance of Meridian [47], Vivaldi [7] and random server selection is evaluated using four different delay spaces: measured data, DS², Inet a...

1 | Beehive: O(1) lookup performance for power-law query distributions in peer-to-peer overlays
- Ramasubramanian, Sirer
- 2004
Citation Context ... to expose important trends. Performance of Proactive Replication - The benefits of proactive replication in structured overlays to reduce overlay lookup hops and latency has been explored by Beehive [28]. We experimented with a simple prototype that does proactive replication based on the number of desired replicas of an object. The placement of replicas is biased towards nodes that share a longer pr...

1 | Internet Routing Policies and Round-Trip - Zheng, Luo, et al. - 1994