The Architecture of the Remos System
Cited by 32 (9 self)
Remos provides resource information to distributed applications. Its design goals of scalability, flexibility, and portability are achieved through an architecture that allows components to be positioned across the network, each collecting information about its local network. To collect information from different types of networks and from hosts on those networks, Remos provides several collectors that use different technologies, such as SNMP or benchmarking. By matching the appropriate collector to each particular network environment and by providing an architecture for distributing the output of these collectors across all querying environments, Remos collects appropriately detailed information at each site and distributes this information where needed in a scalable manner. Prediction services are integrated at the user level, allowing history-based data collected across the network to be used to generate the predictions needed by a particular user. Remos has been implemented and tested in a variety of networks and is in use in a number of different environments.
Probabilistic Fault Localization in Communication Systems Using Belief Networks
IEEE/ACM Transactions on Networking, 2004
Cited by 20 (4 self)
We apply Bayesian reasoning techniques to perform fault localization in complex communication systems while using dynamic, ambiguous, uncertain, or incorrect information about the system structure and state. We introduce adaptations of two Bayesian reasoning techniques for polytrees: iterative belief updating and iterative most probable explanation. We show that these approximate schemes can be applied to belief networks of arbitrary shape and overcome the inherent exponential complexity associated with exact Bayesian reasoning. We show through simulation that our approximate schemes are almost optimally accurate, can identify multiple simultaneous faults in an event-driven manner, and incorporate both positive and negative information into the reasoning process. We show that fault localization through iterative belief updating is resilient to noise in the observed symptoms and prove that Bayesian reasoning can now be used in practice to provide effective fault localization. Index Terms: fault localization, probabilistic inference, root cause diagnosis.
Design, Implementation, and Evaluation of the Remos Network Monitoring System
2003
Cited by 19 (4 self)
Remos provides resource information to distributed applications. Its design goals of scalability, flexibility, and portability are achieved through an architecture that allows components to be positioned across the network, each collecting information about its local network. To collect information from different types of networks, Remos provides several Collectors that use different technologies, including SNMP and benchmarking. By matching the Collector to the particular network environment and by providing an architecture for distributing the output of these Collectors across all querying environments, Remos collects appropriately detailed information at each site and distributes this information where needed in a scalable manner. Remos has been implemented and tested in a variety of networks and is in use in a number of different environments.
How to Resolve IP Aliases
2004
Cited by 17 (1 self)
To construct accurate Internet maps, traceroute-based mapping efforts must group interface IP addresses into routers, a task known as alias resolution. In this paper, we introduce two new alias resolution approaches based on inference to handle addresses that cannot be resolved by existing methods based on probe measurements. The first decodes the DNS names assigned by the ISP to recognize the name fragments that identify a router. The second infers aliases from the graph of linked IP addresses and requires no additional measurement traffic. We then experiment with feasible combinations of these techniques and existing ones by resolving aliases during the mapping of PlanetLab, a large wide-area overlay, and UUnet, a large ISP. We find that these techniques have complementary strengths and weaknesses and are best used in concert. The DNS and graph inference methods provide information where existing probe methods fail and are less dependent on router implementation choices. The existing probe methods can be made more effective in practice by using multiple vantage points and taking advantage of implementation synergies.
Multi-Resolution State Retrieval in Sensor Networks
2003
Cited by 15 (2 self)
Large-scale dense sensor networks require mechanisms to extract topology information that can be used for various aspects of sensor network management. It is critical for any topology discovery algorithm in dense networks not only to adhere to the resource constraints of bandwidth and energy but also to provide several views of the network. Due to factors of density, redundancy and failures it may not be possible or practical to get a complete view of the topology. In this paper, we describe a distributed parameterized algorithm for Sensor Topology Retrieval at Multiple Resolutions (STREAM), which makes a tradeoff between topology details and resources expended. The algorithm retrieves network state at multiple resolutions at a proportionate communication cost. We also define various classes of topology queries and show how the parameters in the algorithm can be used to support queries specific to sensor networks. We show that topology determined at different resolutions is sufficient for approximating different network properties. We also show that STREAM can be used for general-purpose multi-resolution information retrieval in sensor networks.
Physical Topology Discovery for Large Multi-Subnet Networks
In Proc. IEEE Infocom, 2003
Cited by 14 (0 self)
Knowledge of the up-to-date physical (i.e., layer-2) topology of an Ethernet network is crucial to a number of critical network management tasks, including reactive and proactive resource management, event correlation, and root-cause analysis. Given the dynamic nature of today's IP networks, keeping track of topology information manually is a daunting (if not impossible) task. Thus, effective algorithms for automatically discovering physical network topology are necessary. In this paper, we propose the first complete algorithmic solution for discovering the physical topology of a large, heterogeneous Ethernet network comprising multiple subnets as well as (possibly) dumb or uncooperative network elements. Our algorithms rely on standard SNMP MIB information that is widely supported in modern IP networks and require no modifications to the operating system software running on elements or hosts. Furthermore, we formally demonstrate that our solution is complete for the given MIB data; that is, if the MIB information is sufficient to uniquely identify the network topology then our algorithm is guaranteed to recover it. To the best of our knowledge, ours is the first solution to provide such a strong completeness guarantee.
Nondeterministic Queries in a Relational Grid Information Service
In Proceedings of ACM/IEEE SC 2003 (Supercomputing 2003), 2003
Cited by 12 (2 self)
In this paper, we describe RGIS, the nondeterministic query extension, and its implementation. We also present a performance evaluation of our implementation, populating our database with networks as large as five million hosts using our GridG grid generator tool. The evaluation shows that a meaningful tradeoff between query processing time and result set size is possible using nondeterministic queries, and that we can use that tradeoff to control the running time of a query largely independent of query complexity.
Static and Dynamic Analysis of the Internet's Susceptibility to Faults and Attacks
In Proceedings of IEEE INFOCOM, 2003
Cited by 12 (1 self)
We analyze the susceptibility of the Internet to random faults, malicious attacks, and mixtures of faults and attacks. We analyze actual Internet data, as well as simulated data created with network models. The network models generalize previous research, and allow generation of graphs ranging from uniform to preferential, and from static to dynamic. We introduce new metrics for analyzing the connectivity and performance of networks which improve upon metrics used in earlier research. Previous research has shown that preferential networks like the Internet are more robust to random failures than uniform networks. We find that preferential networks, including the Internet, are more robust only when more than 95% of failures are random faults, and robustness is measured with average diameter. The advantage of preferential networks disappears with alternative metrics, and when a small fraction of faults are attacks. We also identify dynamic characteristics of the Internet which can be used to create improved network models. Such models should allow more accurate analysis for the future Internet, for example facilitating the design of network protocols with optimal performance in the future, or predicting future attack and fault tolerance. We find that the Internet is becoming more preferential as it evolves. The average diameter has been stable or even decreasing as the number of nodes has been increasing. The Internet is becoming more robust to random failures over time, but has also become more vulnerable to attacks.
Enabling Network Measurement Portability Through a Hierarchy of Characteristics
2003
Cited by 11 (4 self)
crucial so that adaptive applications can make use of Grid environments. Although a large number of systems and tools have been developed to provide such measurement services, the diversity of Grid resources and lack of central control prevent the development of a single monitoring system that can be deployed to answer every application's resource queries for connections between any pair of machines it can use. We propose a standard for representing network entities and measurements of their properties. Our standard enables the exchange of measurements and will allow applications to function even in environments without the particular measurement system for which they were developed. We present an overview of our measurement representation and evaluate its usefulness. We have used the measurement hierarchy to store and exchange measurement data between several systems, and we discuss its usefulness in comparing the output of several measurement tools.
Ethernet Topology Discovery without Network Assistance
In ICNP, 2004
Cited by 10 (1 self)
This work addresses the problem of Layer 2 topology discovery. Current techniques concentrate on using SNMP to query information from Ethernet switches. In contrast, we present a technique that infers the Ethernet (Layer 2) topology without assistance from the network elements by injecting suitable probe packets from the end-systems and observing where they are delivered. We describe the algorithm, formally characterize its correctness and completeness, and present our implementation and experimental results. Performance results show that, although originally aimed at the home and small office, the techniques scale to much larger networks.

