Building a Better Mousetrap, 2007
Cited by 4 (0 self)
Routers in the network core are unable to maintain detailed statistics for every packet; thus, traffic statistics are often based on packet sampling, which reduces accuracy. Because tracking large (“heavy-hitter”) traffic flows is important both for pricing and for traffic engineering, much attention has focused on maintaining accurate statistics for such flows, often at the expense of small-volume flows. Eradicating these smaller flows makes it difficult to observe communication structure, which is sometimes more important than maintaining statistics about flow sizes. This paper presents FlexSample, a sampling framework that allows network operators to get the best of both worlds: for a fixed sampling budget, FlexSample can capture significantly more small-volume flows for only a small increase in relative error of large traffic flows. FlexSample uses a fast, lightweight counter array that provides a coarse estimate of the size (“class”) of each traffic flow; a router then can sample at different rates according to the class of the traffic using any existing sampling strategy. Given a fixed sampling rate and a target fraction of sampled packets to allocate across traffic classes, FlexSample computes packet sampling rates for each class that achieve these allocations online. Through analysis and trace-based experiments, we find that FlexSample can extract more communication structure, and can capture at least 50% more mouse flows, than strategies that do not perform class-dependent packet sampling. We also show how FlexSample can be used to capture unique flows for specific applications.
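The class-dependent sampling idea described in this abstract can be illustrated with a rough sketch. Everything below is an assumption made for illustration and is not taken from the paper: the names `CoarseClassifier` and `class_sampling_rates`, the use of a CRC32 hash to index the counter array, the threshold-based class boundaries, and the rate formula (overall budget times target allocation divided by the class's share of traffic, capped at 1).

```python
import zlib


class CoarseClassifier:
    """Hypothetical counter array giving a coarse size class per flow."""

    def __init__(self, size, thresholds):
        self.counters = [0] * size
        # Ascending packet-count boundaries; class 0 is the smallest flows.
        self.thresholds = thresholds

    def observe(self, flow_key: bytes) -> int:
        """Count one packet and return the flow's current class index."""
        slot = zlib.crc32(flow_key) % len(self.counters)
        self.counters[slot] += 1
        count = self.counters[slot]
        # Flows that collide in the same slot share a counter, so the
        # class is only an estimate and can be too high.
        return sum(count > t for t in self.thresholds)


def class_sampling_rates(class_share, target_alloc, budget):
    """Per-class sampling rates under the assumed formula.

    class_share[c]  : fraction of all packets that fall in class c
    target_alloc[c] : desired fraction of sampled packets from class c
    budget          : overall packet sampling rate
    """
    rates = {}
    for cls, share in class_share.items():
        if share <= 0:
            rates[cls] = 0.0  # no traffic observed in this class
        else:
            rates[cls] = min(1.0, budget * target_alloc[cls] / share)
    return rates


shares = {"mouse": 0.2, "elephant": 0.8}
targets = {"mouse": 0.5, "elephant": 0.5}
rates = class_sampling_rates(shares, targets, budget=0.01)
```

With these illustrative numbers the mouse class is sampled at a higher rate than the elephant class, while the share-weighted average of the two rates stays at the overall budget.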
A Programmable Architecture for Scalable and Real-time Network Traffic Measurements
Cited by 2 (0 self)
Accurate and real-time traffic measurement is becoming increasingly critical for a large variety of applications including accounting, bandwidth provisioning and security analysis. Existing network measurement techniques, however, have major difficulty dealing with the large number of flows in today’s high-speed networks and offer limited scalability with increasing link speeds. Consequently, the current state-of-the-art solutions have to resort to conservative sampling of the traffic stream and/or accounting for only a few frequent flows, approaches that often fail to provide accurate estimates of traffic features. In this paper, we present a novel hardware-software codesigned solution that is programmable and adaptable to runtime situations offering high throughput that can easily
An Analysis of Packet Sampling in the Frequency Domain
Cited by 2 (1 self)
Packet sampling techniques introduce measurement errors that should be carefully handled in order to correctly characterize the network behavior. In the literature several works have studied the statistical properties of packet sampling and the way it should be inverted to recover the original network measurements. Here we take the new direction of studying the spectral properties of packet sampled traffic. A novel technique to model the impact of packet sampling is proposed based on a theoretical analysis of network traffic in the frequency domain. Moreover, a real-time algorithm is also presented to detect the spectrum portion of the network traffic that can be restored once packet sampling has been applied. Preliminary experimental results are reported to validate the proposed approach.
Lightweight, High-Resolution Monitoring for Troubleshooting Production Systems Abhishek Kumar
Cited by 2 (0 self)
Production systems are commonly plagued by intermittent problems that are difficult to diagnose. This paper describes a new diagnostic tool, called Chopstix, that continuously collects profiles of low-level OS events (e.g., scheduling, L2 cache misses, CPU utilization, I/O operations, page allocation, locking) at the granularity of executables, procedures and instructions. Chopstix then reconstructs these events offline for analysis. We have used Chopstix to diagnose several elusive problems in a large-scale production system, thereby reducing these intermittent problems to reproducible bugs that can be debugged using standard techniques. The key to Chopstix is an approximate data collection strategy that incurs very low overhead. An evaluation shows Chopstix requires under 1% of the CPU, under 256KB of RAM, and under 16MB of disk space per day to collect a rich set of system-wide data.
Revisiting the Case for a Minimalist Approach for Network Flow Monitoring
Cited by 1 (0 self)
Network management applications require accurate estimates of a wide range of flow-level traffic metrics. Given the inadequacy of current packet-sampling-based solutions, several application-specific monitoring algorithms have emerged. While these provide better accuracy for the specific applications they target, they increase router complexity and require vendors to commit to hardware primitives without knowing how useful they will be to meet the needs of future applications. In this paper, we show using trace-driven evaluations that such complexity and early commitment may not be necessary. We revisit the case for a “minimalist” approach in which a small number of simple yet generic router primitives collect flow-level data from which different traffic metrics can be estimated. We demonstrate the feasibility and promise of such a minimalist approach using flow sampling and sample-and-hold as sampling primitives and configuring these in a network-wide coordinated fashion using cSamp. We show that this proposal yields better accuracy across a collection of application-level metrics than dividing the same memory resources across metric-specific algorithms. Moreover, because a minimalist approach enables late binding to what application-level metrics are important, it better insulates router implementations and deployments from changing monitoring needs.
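As a rough illustration of one of the primitives named in this abstract, the sketch below shows a simplified, packet-counting variant of sample-and-hold: a flow that is not yet tracked is admitted with probability `p` per packet, and once admitted every later packet of that flow is counted. The function name, the per-packet (rather than per-byte) sampling, and the unbounded table are assumptions for illustration; this is not the configuration evaluated in the paper.

```python
import random


def sample_and_hold(packets, p, rng=None):
    """Simplified sample-and-hold over a stream of flow identifiers.

    packets : iterable of hashable flow keys, one entry per packet
    p       : probability of admitting an untracked flow on each packet
    rng     : optional random.Random instance for reproducibility
    """
    rng = rng or random.Random()
    table = {}
    for flow in packets:
        if flow in table:
            # Flow is already held: count every subsequent packet.
            table[flow] += 1
        elif rng.random() < p:
            # Flow was sampled: create an entry and start counting.
            table[flow] = 1
    return table


stream = ["a"] * 1000 + ["b"] * 3 + ["c"] * 2
held = sample_and_hold(stream, p=0.01, rng=random.Random(7))
```

Large flows are very likely to be admitted early and then counted almost exactly, whereas small flows are usually missed, which is why the abstract pairs this primitive with flow sampling.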
Boosting the Scalability of Botnet Detection Using Adaptive Traffic Sampling
Cited by 1 (0 self)
Botnets pose a serious threat to the health of the Internet. Most current network-based botnet detection systems require deep packet inspection (DPI) to detect bots. Because DPI is a computationally costly process, such detection systems cannot handle the large volumes of traffic typical of large enterprise and ISP networks. In this paper we propose a system that aims to efficiently and effectively identify a small number of suspicious hosts that are likely bots. Their traffic can then be forwarded to DPI-based botnet detection systems for fine-grained inspection and accurate botnet detection. By using a novel adaptive packet sampling algorithm and a scalable spatial-temporal flow correlation approach, our system is able to substantially reduce the volume of network traffic that goes through DPI, thereby boosting the scalability of existing botnet detection systems. We implemented a proof-of-concept version of our system, and evaluated it using real-world legitimate and botnet-related network traces. Our experimental results are very promising and suggest that our approach can enable the deployment of botnet-detection systems in large, high-speed networks.
The Eternal Sunshine of the Sketch Data Structure
In the past years there has been significant research on developing compact data structures for summarizing large data streams. A family of such data structures is the so-called sketches. Sketches bear similarities to the well-known Bloom filters [2] and employ hashing techniques to approximate the count associated with an arbitrary key in a data stream using fixed memory resources. One limitation of sketches is that when used for summarizing long data streams, they gradually saturate, resulting in a potentially large error on estimated key counts. In this work, we introduce two techniques to address this problem based on the observation that real-world data streams often have many transient keys that appear for short time periods and do not re-appear later on. After entering the data structure, these keys contribute to hashing collisions and thus reduce the estimation accuracy of sketches. Our techniques use a limited amount of additional memory to detect transient keys and to periodically remove their hashed values from the sketch. In this manner the number of keys hashed into a sketch decreases, and as a result the frequency of hashing collisions and the estimation error are reduced. Our first technique in effect slows down the saturation process of a sketch, whereas our second technique completely prevents a sketch from saturating. (The phrase “eternal sunshine” in the title reflects that our techniques mitigate or halt the saturation process of a sketch.) We demonstrate the performance improvements of our techniques analytically as well as experimentally. Our evaluation results using real network traffic traces show a reduction in the collision rate ranging between 26.1% and 98.2% and even higher savings in terms of estimation accuracy compared to a state-of-the-art sketch data structure. To our knowledge this is the first work to look into the problem of improving the accuracy of sketches by mitigating their saturation process.
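To make the saturation problem concrete, the following minimal sketch implements a Count-Min sketch, one well-known member of the sketch family. It is chosen here only as an illustration; the abstract does not say which sketch the authors build on, and the transient-key removal techniques they propose are not implemented. The class name, the salted BLAKE2b hashing, and the parameter names are assumptions.

```python
import hashlib


class CountMinSketch:
    """Fixed-memory approximate counter: depth rows of width counters."""

    def __init__(self, width, depth):
        self.width = width
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row, key: bytes) -> int:
        # A different salt per row gives an independent-looking hash.
        digest = hashlib.blake2b(
            key, digest_size=8, salt=row.to_bytes(8, "little")
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, key: bytes, count: int = 1) -> None:
        for r, row in enumerate(self.rows):
            row[self._index(r, key)] += count

    def estimate(self, key: bytes) -> int:
        # Collisions only ever add to a counter, so the minimum over
        # rows is the tightest available upper bound on the true count.
        return min(row[self._index(r, key)] for r, row in enumerate(self.rows))


sketch = CountMinSketch(width=64, depth=4)
for _ in range(5):
    sketch.add(b"10.0.0.1")
```

Because counters are never decremented, every additional key can only raise the estimates of other keys, so error accumulates as the stream grows. Removing the contributions of keys that will not reappear, as the abstract proposes, targets exactly this effect.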
A Framework for Efficient Class-based Sampling
With an increasing requirement for network monitoring tools to classify traffic and track security threats, newer and more efficient ways are needed for collecting traffic statistics and monitoring network flows. However, traditional solutions based on random packet sampling treat all flows as equal and therefore do not provide the flexibility required for these applications. In this paper, we propose a novel architecture called CLAMP that provides an efficient framework to implement size-based sampling. At the heart of CLAMP is a novel data structure called Composite Bloom filter (CBF) that consists of a set of Bloom filters that work together to encapsulate various class definitions. In comparison to previous approaches that implement simple size-based sampling, our architecture requires substantially lower memory (up to 80x) and results in higher flow coverage (up to 8x more flows) under specific configurations.
A Case for a RISC Architecture for Network Flow Monitoring
Several network management applications require high fidelity estimates of flow-level metrics. Given the inadequacy of current packet sampling based solutions, many proposals for application-specific monitoring algorithms have emerged. While these provide better accuracy, they increase router complexity and require router vendors to commit to hardware primitives without knowing how useful they will be to future monitoring applications. We argue that such complexity is unnecessary and build a case for a “RISC” approach for flow monitoring, in which generic collection primitives on routers provide data from which traffic metrics can be computed using separate, offline devices. We demonstrate one such RISC approach by combining two well-known primitives: flow sampling and sample-and-hold. We show that allocating a router’s memory resources to these generic primitives can provide similar or better accuracy on metrics of interest than dividing the resources among several metric-specific algorithms. Moreover, this approach better insulates router implementations from changing monitoring needs.
Self-Tuning the Parameter of Adaptive Non-Linear Sampling Method for Flow Statistics
Flow statistics is a basic task of passive measurement and has been widely used to characterize the state of the network. Adaptive Non-Linear Sampling (ANLS) is one of the most accurate and memory-efficient flow statistics methods proposed recently. This paper studies the parameter setting problem for ANLS. A parameter self-tuning algorithm is proposed in this paper, which enlarges the parameter to an equilibrium tuning point and renormalizes the counter when the counter overflows. It is demonstrated that the estimation error of ANLS with the parameter self-tuning algorithm is improved by about 89 times for a real trace, 70 times for a Pareto traffic scenario and 370 times for exponential traffic, given the same memory size.
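The role of the tunable parameter can be seen in a small sketch of a non-linear counter. The abstract does not give the ANLS formulas, so the estimator used below, f(c) = ((1 + a)^c - 1) / a with update probability 1 / (f(c + 1) - f(c)), is an assumption based on how this style of counter is commonly described. The function names are hypothetical, and the self-tuning and renormalization steps from the paper are not implemented.

```python
import random


def estimate(c, a):
    """Assumed estimator: flow size represented by counter value c."""
    return ((1 + a) ** c - 1) / a


def update_probability(c, a):
    """Probability of incrementing the counter on the next packet.

    Chosen so that the expected growth of estimate() per packet is 1,
    which keeps the estimate unbiased under the assumed form.
    """
    return 1.0 / (estimate(c + 1, a) - estimate(c, a))


def count_packets(n, a, rng):
    """Feed n packets of one flow through the counter; return counter c."""
    c = 0
    for _ in range(n):
        if rng.random() < update_probability(c, a):
            c += 1
    return c


rng = random.Random(1)
c = count_packets(10_000, a=0.05, rng=rng)
approx_size = estimate(c, a=0.05)
```

A larger `a` lets a counter of fixed width represent larger flows but makes each increment coarser, which is the trade-off a self-tuning scheme would have to balance.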

