## Constructing Internet Coordinate System Based on Delay Measurement (2003)

Citations: 114 (3 self)

### BibTeX

@MISC{Lim03constructinginternet,

author = {Hyuk Lim and Jennifer C. Hou and Chong-Ho Choi},

title = {Constructing Internet Coordinate System Based on Delay Measurement},

year = {2003}

}

### Abstract

In this paper, we consider the problem of how to represent the locations of Internet hosts in a Cartesian coordinate system to facilitate estimation of the network distance between two arbitrary Internet hosts. We envision an infrastructure that consists of beacon nodes and provides the service of estimating the network distance between two hosts without direct delay measurement. We show that the principal component analysis (PCA) technique can effectively extract topological information from delay measurements between beacon hosts. Based on PCA, we devise a transformation method that projects the distance data space into a new coordinate system of (much) smaller dimensions. The transformation retains as much topological information as possible and yet enables end hosts to easily determine their locations in the coordinate system. The resulting new coordinate system is termed the Internet Coordinate System (ICS). Compared to existing work (e.g., IDMaps [1] and GNP [2]), ICS incurs smaller computation overhead in calculating the coordinates of hosts and smaller measurement overhead (required for end hosts to measure their distances to beacon hosts). Finally, we show via experimentation with real-life data sets that ICS is robust and accurate, regardless of the number of beacon nodes (as long as it exceeds a certain threshold) and the complexity of the network topology.
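The projection described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a symmetric beacon-to-beacon delay matrix, applies the SVD directly to that matrix without mean-centering, and fits a single least-squares scale factor so that Euclidean distances between beacon coordinates approximate the measured delays. The function names `fit_projection`, `locate`, and `estimate_delay` are hypothetical and chosen only for this example.

```python
import numpy as np


def fit_projection(beacon_delays, k):
    """Derive a k-dimensional basis and a scale from an m x m beacon delay matrix.

    Assumption: column i of the matrix is beacon i's vector of delays to all beacons.
    """
    D = np.asarray(beacon_delays, dtype=float)
    # Left singular vectors give the principal directions of the delay vectors.
    U, _, _ = np.linalg.svd(D)
    basis = U[:, :k]                       # m x k, first k principal directions
    coords = D.T @ basis                   # row i: unscaled coordinates of beacon i
    # Pairwise Euclidean distances between the unscaled beacon coordinates.
    diff = coords[:, None, :] - coords[None, :, :]
    est = np.sqrt((diff ** 2).sum(axis=-1))
    upper = np.triu_indices(D.shape[0], k=1)
    # Assumed fitting rule: least-squares scale minimizing sum((scale*est - D)^2).
    scale = (est[upper] @ D[upper]) / (est[upper] @ est[upper])
    return basis, scale


def locate(delays_to_beacons, basis, scale):
    """Map a host's measured delays to the m beacons onto k coordinates."""
    return scale * (np.asarray(delays_to_beacons, dtype=float) @ basis)


def estimate_delay(coord_a, coord_b):
    """Estimate the delay between two hosts as the Euclidean distance."""
    return float(np.linalg.norm(np.asarray(coord_a) - np.asarray(coord_b)))
```

In this sketch a host needs only its own delay measurements to the beacons plus the shared `basis` and `scale`. Computing its coordinates is then a single matrix-vector product, which is consistent with the low computation overhead the abstract claims relative to iterative optimization.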

### Citations

8940 | Introduction to Algorithms - Cormen - 2001
Citation Context ...oordinate system, the network distance from the host to a host can be estimated without direct measurement by computing a distance metric function , (i.e., ). The generalized distance metric function [8] is defined as Some of the most important metrics are the Manhattan distance , the Euclidean distance , and the Chebyshev distance . In particular, it has been shown that can be expressed as Note that... |

2292 | Algorithms for Clustering Data - Jain, Dubes - 1988
Citation Context ...he one hand, if the distances among hosts that are available to serve as beacon nodes can be measured, a clustering algorithm can be applied to group hosts that are close to one another into clusters [18]. Each host is initially assigned to its own cluster, and pairs of neighboring clusters are repeatedly merged into a single cluster until clusters remain. The median node in each cluster is selected a... |

2240 | Principal Component Analysis - Jolliffe - 1986
Citation Context ...o two-dimensional space. The dimensionality depends not on the dimension of the distance matrix but on the network topology, and can be much smaller than . We apply principal component analysis (PCA) [11], [13], [14] to reduce the dimension of the distance matrix while retaining as much topological information as possible. In a nutshell, PCA transforms a data set that consists of a large number of (po... |

1614 | A simplex method for function minimization - Nelder, Mead - 1964
Citation Context ...lowing cost function: where is the measured distance between host and the th beacon nodes, and is the coordinate of the host . GNP tackles both optimization problems using the Simplex Downhill method [10]. Unfortunately, the Simplex Downhill method only gives a local minimum that is close to the starting value and does not guarantee that the result is unique in the case that the cost functions... |

714 | How to model an internetwork - Zegura, Calvert, et al. - 1996
Citation Context ...-D. VI. EMPIRICAL STUDY To validate the effectiveness of ICS in inferring the Internet topology, we conduct experiments using both an empirical data set (NLANR) [16] and a synthetic data set (GT-ITM) [19]. As discussed in Section IV-B, the NLANR data set contains real delay data measured by ping. The GT-ITM data set, on the other hand, is obtained using the GT-ITM topology generator [19] and the ns-2 ... |

595 | End-to-End Routing Behavior in the Internet - Paxson - 1997 |

542 | Predicting Internet Network Distance with Coordinates-Based Approaches - Ng, Zhang - 2002
Citation Context ... topology is to enable estimation of the network distance between arbitrary hosts without direct measurement between these hosts. Several approaches have been proposed, among which IDMaps [2] and GNP [3] may have received the most attention. Both assume a common architecture that consists of a small number of well-positioned infrastructure nodes (called beacon nodes in this paper). Every beacon node ... |

303 | Topologically-aware overlay construction and server selection - Ratnasamy, Handley, et al.
Citation Context ... heavily on the number and placement of beacon nodes. If the number of beacon nodes is small, the estimation performance may not be good. In order to extract topological information, Ratnasamy et al. [9] proposed a binning scheme. A bin is defined as the list of beacon nodes in the order of increasing delay. The bin of a host indicates the relative distances to all the beacon nodes. For example, if t... |

162 | Virtual landmarks for the Internet - Tang, Crovella - 2003
Citation Context ...te that for a coordinate based approach, violation of the triangle inequality by network distance measurements may degrade the performance of the distance estimation. Fortunately it has been shown in [6] that violation of the triangle inequality is not particularly frequent across various measurement data sets. III. RELATED WORK A. Methods in the Distance Data Space Several methods have b... |

150 | Locating nearby copies of replicated Internet servers - Guyton, Schwartz - 1995
Citation Context ...istance to a large number of hosts (such as servers). One important issue in realizing these measurement architectures is how to represent the location of a host. IDMaps and Hotz’s triangulation [4], [5], for example, use the original distances to beacon nodes to represent the location of a host, while GNP [3] and Lighthouse [7] transform the original distance data space into a Cartesian coordinate s... |

144 | Lighthouses for scalable distributed locations - Pias, Crowcroft, et al. - 2003
Citation Context ...o represent the location of a host. IDMaps and Hotz’s triangulation [4], [5], for example, use the original distances to beacon nodes to represent the location of a host, while GNP [3] and Lighthouse [7] transform the original distance data space into a Cartesian coordinate system and use coordinates in the coordinate system to represent the location. As will be discussed in Section III, the major a... |

125 | An architecture for a global Internet host distance estimation service - Francis, Jamin, et al. - 1999
Citation Context ...ting network topology is to enable estimation of the network distance between arbitrary hosts without direct measurement between these hosts. Several approaches have been proposed, among which IDMaps [2] and GNP [3] may have received the most attention. Both assume a common architecture that consists of a small number of well-positioned infrastructure nodes (called beacon nodes in this paper). Every ... |

94 | Applied Linear Algebra - Noble, Daniel - 1988
Citation Context ...s, and ’s are the singular values of in the decreasing order (i.e., if ). Note that . This means that the eigenvectors of make up with the associated (real nonnegative) eigenvalues of the diagonal of [12]. Similarly, . The columns of the matrix are the principal components and the orthogonal basis of the new su... |

75 | Principal Component Analysis for Clustering Gene Expression Data - Yeung, Ruzzo
Citation Context ...dimensional space. The dimensionality depends not on the dimension of the distance matrix but on the network topology, and can be much smaller than . We apply principal component analysis (PCA) [11], [13], [14] to reduce the dimension of the distance matrix while retaining as much topological information as possible. In a nutshell, PCA transforms a data set that consists of a large number of (possibly... |

71 | Automatic Choice of Dimensionality for PCA - Minka - 2001
Citation Context ...a -dimensional coordinate system is how to determine the adequate degree, , of dimensions in the coordinate system. This problem has not been extensively studied, and is usually application-dependent [15]. One of the commonly adopted criteria is the cumulative percentage of variation that selected principal components contribute to [11]. The percentage, , of variation accounted for by the first princi... |

59 | Adaptive Dimension Reduction for Clustering High Dimensional Data - Ding, He, et al.
Citation Context ...ional space. The dimensionality depends not on the dimension of the distance matrix but on the network topology, and can be much smaller than . We apply principal component analysis (PCA) [11], [13], [14] to reduce the dimension of the distance matrix while retaining as much topological information as possible. In a nutshell, PCA transforms a data set that consists of a large number of (possibly) corr... |

53 | End-to-End Routing Behavior - Paxson - 1996
Citation Context ...the Internet to coordinates in a coordinate system of smaller dimensions and still retain as much topological information as possible, we apply PCA to two real-life data sets: • NPD-Routes-2 data set [16]: contains Internet route measurements obtained by traceroute. The measurements were made between 33 Internet hosts in the Network Probe Daemon (NPD) framework from November 3, 1995, to December 21, 1... |

44 | Routing information organization to support scalable interdomain routing with heterogeneous path requirements - Hotz - 1994 |

16 | The ns Manual. http://www.isi.edu/nsnam/ns/ns-documentation.html - Fall, Varadhan - 2002 |

1 | Simulator ns-2. Available: http://www.isi.edu/nsnam/ns/ns-documentation - 1996
Citation Context ...ssed in Section IV-B, the NLANR data set contains real delay data measured by ping. The GT-ITM data set, on the other hand, is obtained using the GT-ITM topology generator [19] and the ns-2 simulator [20]. The quality of a coordinate system can be affected by several factors such as the number and distribution of beacon nodes and the complexity of the network topology. With the use of the GT-ITM topol... |