Results 1 - 8 of 8
Diagnosing Network-Wide Traffic Anomalies
In ACM SIGCOMM, 2004
Cited by 358 (18 self)
Anomalies are unusual and significant changes in a network's traffic levels, which can often span multiple links. Diagnosing anomalies is critical for both network operators and end users. It is a difficult problem because one must extract and interpret anomalous patterns from large amounts of high-dimensional, noisy data.
Real-time Problem Determination in Distributed Systems using Active Probing
In Proceedings of 2004 IEEE/IFIP Network Operations and Management Symposium (NOMS 2004), Seoul, Korea, 2004
Cited by 36 (5 self)
We describe algorithms and an architecture for a real-time problem determination system that uses online selection of most-informative measurements, an approach called herein active probing. Probes are end-to-end test transactions which gather information about system components. Active probing allows probes to be selected and sent on-demand, in response to one's belief about the state of the system. At each step the most informative next probe is computed and sent. As probe results are received, belief about the system state is updated using probabilistic inference. This process continues until the problem is diagnosed. We demonstrate through both analysis and simulation that the active probing scheme greatly reduces both the number of probes and the time needed for localizing the problem when compared with non-active probing schemes.
Keywords: self-managing networks, real-time monitoring and problem determination, end-to-end response time measurements, AI techniques/probabilistic inference, information theory
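The select-probe, observe, update-belief loop described in this abstract can be sketched as follows. This is only an illustrative sketch, not the authors' system: it assumes exactly one faulty component, noise-free probes (a probe fails if and only if the faulty component lies on its path), and a uniform prior. The names `entropy`, `update`, `expected_entropy` and `active_probing` are hypothetical.

```python
import math

def entropy(belief):
    """Shannon entropy (bits) of a dict mapping hypothesis -> probability."""
    return -sum(p * math.log2(p) for p in belief.values() if p > 0)

def update(belief, path, failed):
    """Bayes update for a noise-free probe (assumption): keep only the
    hypotheses consistent with the observed outcome, then renormalize."""
    kept = {h: p for h, p in belief.items() if (h in path) == failed}
    total = sum(kept.values())
    return {h: p / total for h, p in kept.items()}

def expected_entropy(belief, path):
    """Expected posterior entropy if the probe covering `path` were sent."""
    result = 0.0
    for failed in (True, False):
        p_outcome = sum(p for h, p in belief.items() if (h in path) == failed)
        if p_outcome > 0:
            result += p_outcome * entropy(update(belief, path, failed))
    return result

def active_probing(probes, components, true_fault):
    """Send the most informative probe at each step until the fault is
    localized. `true_fault` stands in for the real system being probed."""
    belief = {c: 1.0 / len(components) for c in components}  # uniform prior
    sent = []
    while max(belief.values()) < 1.0:
        name = min(probes, key=lambda n: expected_entropy(belief, probes[n]))
        gain = entropy(belief) - expected_entropy(belief, probes[name])
        if gain < 1e-12:
            break  # no available probe can reduce uncertainty further
        failed = true_fault in probes[name]  # simulated probe outcome
        belief = update(belief, probes[name], failed)
        sent.append(name)
    return belief, sent

probes = {"p1": {"a", "b"}, "p2": {"a", "c"}, "p3": {"a", "b", "c", "d"}}
belief, sent = active_probing(probes, ["a", "b", "c", "d"], "c")
```

In this toy run the probe `p3` covers every component, so it carries no information and is never sent; two probes suffice to isolate the fault.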
Intelligent probing: A cost-effective approach to fault diagnosis in computer networks
IBM Systems Journal, 2002
Cited by 13 (0 self)
We consider the use of probing technology for fault diagnosis in computer networks. Probes are test transactions that can be actively selected and sent through the network. The use of probing technology for cost-effective diagnosis requires addressing two issues: a planning phase in which the probes are selected, followed by a diagnosis phase in which problem determination is performed using the results of the probes. The planning phase requires selecting a small but effective subset of all the possible probes. The diagnosis phase requires making inferences about the state of the network from the probe results in an environment of noise and uncertainty. This work addresses the probing problem using methods from artificial intelligence; we call the resulting approach intelligent probing. The probes are selected by reasoning about the interactions between the probe paths. Although finding the optimal probe set is prohibitively expensive for large networks, we implement algorithms which find near-optimal probe sets in linear time. In the diagnosis phase we use a Bayesian network approach and use a local-inference approximation scheme that avoids the intractability of exact inference for large networks. Our results show that the quality of this approximate inference “degrades gracefully” under increasing uncertainty and increases as the quality of the probe set increases.
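The planning phase (choosing a small probe subset by reasoning about how probe paths overlap) can be illustrated with a simple greedy heuristic. This is a sketch under stated assumptions, not the linear-time algorithm the abstract refers to: it assumes at most one fault, noise-free probes, and unique component names, and it repeatedly picks the probe that splits the current groups of indistinguishable states the most. The names `refine` and `greedy_probe_set` are hypothetical.

```python
def refine(groups, path):
    """Split every group of indistinguishable states into the states on the
    probe's path and the rest."""
    refined = []
    for group in groups:
        inside = group & path
        outside = group - path
        refined.extend(part for part in (inside, outside) if part)
    return refined

def greedy_probe_set(probes, components):
    """Greedily choose probes until every single-fault state and the
    fault-free state (represented by None) can be told apart."""
    groups = [frozenset(components) | {None}]
    chosen = []
    remaining = dict(probes)
    while remaining and len(groups) < len(components) + 1:
        best = max(remaining, key=lambda n: len(refine(groups, remaining[n])))
        new_groups = refine(groups, remaining[best])
        if len(new_groups) == len(groups):
            break  # no remaining probe separates any further states
        groups = new_groups
        chosen.append(best)
        del remaining[best]
    return chosen, groups

probes = {
    "p1": frozenset({"a", "b"}),
    "p2": frozenset({"b", "c"}),
    "p3": frozenset({"a", "b", "c"}),
    "p4": frozenset({"a"}),
}
chosen, groups = greedy_probe_set(probes, ["a", "b", "c"])
```

Here two of the four candidate probes are enough to distinguish all four states. The greedy choice re-scores every remaining probe at each step, so it is not linear in the number of probes.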
Efficient fault diagnosis using probing
In AAAI Spring Symposium on Information Refinement and Revision for Decision Making, 2002
Cited by 7 (0 self)
In this paper, we address the problem of efficient diagnosis in real-time systems capable of online information gathering, such as sending "probes" (i.e., test transactions, such as "traceroute" or "ping") in order to identify network faults and evaluate performance of distributed computer systems. We use a Bayesian network to model probabilistic relations between the problems (faults, performance degradation) and symptoms (probe outcomes). Due to intractability of exact probabilistic inference in large systems, we investigated approximation techniques, such as a local-inference scheme called mini-buckets (Dechter & Rish 1997). Our empirical study demonstrates advantages of local approximations for large diagnostic problems: the approximation is very efficient and "degrades gracefully" with noise; also, the approximation error gets smaller on networks with higher confidence (probability) of the exact diagnosis. Since the accuracy of diagnosis depends on how much information the probes can provide about the system states, the second part of our work is focused on the probe selection task. Small probe sets are desirable in order to minimize the costs imposed by probing, such as additional network load and data management requirements. Our results show that, although finding the optimal collection of probes is expensive for large networks, efficient approximation algorithms can be used to find a nearly optimal set.
Multi-fault Diagnosis in Dynamic Systems
In Proceedings of the 9th IFIP/IEEE International Symposium on Integrated Network Management (IM 2005, Poster CD), 2005
Cited by 2 (0 self)
In this paper, we address the problem of diagnosing multiple faults in dynamically changing systems. Currently used popular techniques such as codebook and active probing suffer from limitations imposed by their "static", non-temporal nature, and single-fault assumptions. We propose a very simple, linear-time multi-fault scheme, capable of tracking system changes and diagnosing multiple faults much more accurately than previously used approaches. We provide empirical results demonstrating the advantages of our approach and analyze the effect of test set quality on the diagnostic accuracy.
Problem Diagnosis in Distributed Systems using Active Probing
As distributed systems continue to grow in size and complexity, scalable and cost-effective techniques are needed for performing tasks such as problem determination and fault diagnosis. We address these tasks using probes, or end-to-end test transactions, which gather information about system components (e.g., using IBM's EPP technology). Effective probing requires minimizing the cost of probing while maximizing the diagnostic accuracy of the probe set. In this paper we introduce an information-theoretic approach to optimal probe set selection, and combine it with real-time probabilistic diagnosis using Bayesian networks. We show that selecting a pre-planned optimal probe set is NP-hard, but there exist polynomial-time approximation algorithms that perform well. Finally, the main contribution of the paper is a novel approach called active probing that allows adaptive, on-demand selection of most-informative probes, based on the current state of diagnosis. We demonstrate through both analysis and simulation that the active probing scheme can greatly reduce the number of probes (by almost 70% in one of our practical applications) and the time needed for localizing the problem when compared with a non-active probing scheme.
Strategies for Problem Determination using Probing
As distributed systems continue to grow in size and complexity, scalable and cost-efficient techniques are needed for performing tasks such as problem determination and fault diagnosis. In this paper, we address these tasks using probes, or test transactions, which replace traditional “passive” event-correlation techniques with a more active, real-time information-gathering approach. We provide a theoretical foundation and a set of practical techniques for implementing efficient probing strategies; the main issue is minimizing the cost of probing while maximizing the diagnostic accuracy of the probe set. We show that finding an optimal probe set is NP-hard and devise polynomial-time approximation algorithms that demonstrate excellent empirical performance, even on large networks. We also implement an active, online probing strategy that yields a significant reduction in the probe set size.
Accuracy vs. Efficiency Tradeoffs in Probabilistic Diagnosis
This paper studies the accuracy/efficiency tradeoff in probabilistic diagnosis formulated as finding the most-likely explanation (MPE) in a Bayesian network. Our work is motivated by a practical problem of efficient real-time fault diagnosis in computer networks using test transactions, or probes, sent through the network. The key efficiency issues include both the cost of probing (e.g., the number of probes), and the computational complexity of diagnosis, while the diagnostic accuracy is crucial for maintaining high levels of network performance. Herein, we derive a lower bound on the diagnostic accuracy that provides necessary conditions for the number of probes needed to achieve an asymptotically error-free diagnosis as the network size increases, given prior fault probabilities and a certain level of noise in probe outcomes. Since the exact MPE diagnosis is generally intractable in large networks, we investigate next the accuracy/efficiency tradeoffs for very simple and efficient local approximation techniques, based on variable-elimination (the mini-bucket scheme). Our empirical studies show that these approximations "degrade gracefully" with noise and often yield an optimal solution when noise is low enough, and our initial theoretical analysis explains this behavior for the simplest (greedy) approximation. These encouraging results suggest the applicability of such approximations to certain almost-deterministic diagnostic problems that often arise in practical applications.
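To make the MPE formulation concrete, the sketch below finds the most likely fault assignment by brute-force enumeration. It is not the mini-bucket approximation studied in the paper, and its model is an assumption made for illustration: components fail independently with the given prior probabilities, a probe should fail if and only if some component on its path is faulty, and each observed outcome is flipped with probability `noise`. The names `likelihood` and `mpe` are hypothetical.

```python
from itertools import product

def likelihood(faults, probes, outcomes, noise):
    """P(outcomes | faults) under the assumed symmetric-noise probe model."""
    prob = 1.0
    for path, failed in zip(probes, outcomes):
        expected = any(faults[c] for c in path)
        prob *= (1 - noise) if failed == expected else noise
    return prob

def mpe(priors, probes, outcomes, noise):
    """Exact MPE by enumerating all 2^n fault assignments, which is only
    feasible for small n and shows why approximations are needed."""
    best, best_score = None, -1.0
    for assignment in product((False, True), repeat=len(priors)):
        prior = 1.0
        for faulty, p in zip(assignment, priors):
            prior *= p if faulty else 1 - p
        score = prior * likelihood(assignment, probes, outcomes, noise)
        if score > best_score:
            best, best_score = assignment, score
    return best

# Three components; probe paths are given as tuples of component indices.
priors = [0.1, 0.1, 0.1]
probes = [(0, 1), (1, 2), (0,)]
outcomes = [True, True, False]  # True means the probe failed
diagnosis = mpe(priors, probes, outcomes, noise=0.05)
```

With two failed probes that share only component 1, and a passing probe through component 0, the single fault at component 1 is the most likely explanation.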