Results 1 - 10 of 158
Self-Similarity in World Wide Web Traffic: Evidence and Possible Causes
1996
Cited by 1023 (22 self)
Recently the notion of self-similarity has been shown to apply to wide-area and local-area network traffic. In this paper we examine the mechanisms that give rise to the self-similarity of network traffic. We present a hypothesized explanation for the possible self-similarity of traffic by using a particular subset of wide area traffic: traffic due to the World Wide Web (WWW). Using an extensive set of traces of actual user executions of NCSA Mosaic, reflecting over half a million requests for WWW documents, we examine the dependence structure of WWW traffic. While our measurements are not conclusive, we show evidence that WWW traffic exhibits behavior that is consistent with self-similar traffic models. Then we show that the self-similarity in such traffic can be explained based on the underlying distributions of WWW document sizes, the effects of caching and user preference in file transfer, the effect of user "think time", and the superimposition of many such transfers in a local area network. To do this we rely on empirically measured distributions both from our traces and from data independently collected at over thirty WWW sites.
Self-Similarity Through High-Variability: Statistical Analysis of Ethernet LAN Traffic at the Source Level
IEEE/ACM Transactions on Networking, 1997
Cited by 550 (24 self)
A number of recent empirical studies of traffic measurements from a variety of working packet networks have convincingly demonstrated that actual network traffic is self-similar or long-range dependent in nature (i.e., bursty over a wide range of time scales) -- in sharp contrast to commonly made traffic modeling assumptions. In this paper, we provide a plausible physical explanation for the occurrence of self-similarity in LAN traffic. Our explanation is based on new convergence results for processes that exhibit high variability (i.e., infinite variance) and is supported by detailed statistical analyses of real-time traffic measurements from Ethernet LAN's at the level of individual sources. This paper is an extended version of [53] and differs from it in significant ways. In particular, we develop here the mathematical results concerning the superposition of strictly alternating ON/OFF sources. Our key mathematical result states that the superposition of many ON/OFF sources (also k...
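The superposition idea in this abstract can be illustrated with a small simulation. The sketch below is not the paper's construction or analysis: it assumes discrete time slots, Pareto-distributed ON and OFF period lengths with a hypothetical tail index of 1.4, and sources that emit one unit per slot while ON. All function names are illustrative.

```python
import random


def pareto_duration(rng, alpha, xm=1.0):
    """Inverse-transform sample from a Pareto(xm, alpha) distribution."""
    # rng.random() is in [0, 1), so the denominator is never zero.
    return xm / (1.0 - rng.random()) ** (1.0 / alpha)


def on_off_source(rng, n_slots, alpha_on, alpha_off):
    """Return a 0/1 list of length n_slots for one strictly alternating source."""
    slots = []
    is_on = rng.random() < 0.5  # random initial state (an assumption)
    while len(slots) < n_slots:
        alpha = alpha_on if is_on else alpha_off
        length = max(1, int(pareto_duration(rng, alpha)))
        # Cap the period so a huge heavy-tailed draw cannot exhaust memory.
        length = min(length, n_slots - len(slots))
        slots.extend([1 if is_on else 0] * length)
        is_on = not is_on
    return slots


def aggregate(n_sources, n_slots, alpha_on=1.4, alpha_off=1.4, seed=0):
    """Sum n_sources independent ON/OFF sources slot by slot."""
    rng = random.Random(seed)
    total = [0] * n_slots
    for _ in range(n_sources):
        source = on_off_source(rng, n_slots, alpha_on, alpha_off)
        total = [t + s for t, s in zip(total, source)]
    return total


def variance_at_scale(series, m):
    """Population variance of the series averaged over non-overlapping blocks of size m."""
    blocks = [sum(series[i:i + m]) / m for i in range(0, len(series) - m + 1, m)]
    mean = sum(blocks) / len(blocks)
    return sum((b - mean) ** 2 for b in blocks) / len(blocks)
```

Calling `aggregate(50, 10000)` and comparing `variance_at_scale` at several block sizes gives a rough variance-time check: for long-range dependent traffic the variance of the block means is expected to decay more slowly than 1/m, although a short simulated trace is only suggestive.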
Self-Similarity and Heavy Tails: Structural Modeling of Network Traffic
1996
Cited by 128 (13 self)
High-resolution traffic measurements from modern communications networks provide unique opportunities for developing and validating mathematical models for aggregate traffic. To exploit these opportunities, we emphasize the need for structural models that take into account specific physical features of the underlying communication network structure. This approach is in sharp contrast to the traditional black box modeling methodology from time series analysis that ignores, in general, specific physical structures. We demonstrate, in particular, how the proposed structural modeling approach provides a direct link between the observed self-similarity characteristic of measured aggregate network traffic, and the strong empirical evidence in favor of heavy-tailed, infinite variance phenomena at the level of individual network connections.
Heavy-Tailed Phenomena in Satisfiability and Constraint Satisfaction Problems
J. of Autom. Reasoning, 2000
Cited by 125 (26 self)
We study the runtime distributions of backtrack procedures for propositional satisfiability and constraint satisfaction. Such procedures often exhibit a large variability in performance. Our study reveals some intriguing properties of such distributions: They are often characterized by very long tails or “heavy tails”. We will show that these distributions are best characterized by a general class of distributions that can have infinite moments (i.e., an infinite mean, variance, etc.). Such nonstandard distributions have recently been observed in areas as diverse as economics, statistical physics, and geophysics. They are closely related to fractal phenomena, whose study was introduced by Mandelbrot. We also show how random restarts can effectively eliminate heavy-tailed behavior. Furthermore, for harder problem instances, we observe long tails on the left-hand side of the distribution, which is indicative of a non-negligible fraction of relatively short, successful runs. A rapid restart strategy eliminates heavy-tailed behavior and takes advantage of short runs, significantly reducing expected solution time. We demonstrate speedups of up to two orders of magnitude on SAT and CSP encodings of hard problems in planning, scheduling, and circuit synthesis. Key words: satisfiability, constraint satisfaction, heavy tails, backtracking
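Why restarts can tame a heavy tail can be seen in a toy calculation. The sketch below assumes, purely for illustration, that a single run's time T follows a Pareto distribution with scale `xm` and tail index `alpha` (for `alpha` at or below 1 the mean of T is infinite), and that every run is abandoned at a fixed cutoff. These modeling choices are assumptions, not the distributions or the restart policy measured in the paper. By a renewal argument the expected total time is E[min(T, cutoff)] / P(T <= cutoff), which is finite for any cutoff above `xm`.

```python
import math


def expected_time_with_restarts(alpha, cutoff, xm=1.0):
    """Expected total time when Pareto(xm, alpha) runs are restarted at a fixed cutoff."""
    if cutoff <= xm:
        raise ValueError("cutoff must exceed xm, otherwise no run can finish")
    # Probability that a single run finishes before the cutoff.
    p_success = 1.0 - (xm / cutoff) ** alpha
    # E[min(T, cutoff)] = xm + integral from xm to cutoff of (xm / t)^alpha dt.
    if alpha == 1.0:
        tail = xm * math.log(cutoff / xm)
    else:
        tail = xm ** alpha * (cutoff ** (1.0 - alpha) - xm ** (1.0 - alpha)) / (1.0 - alpha)
    expected_run_cost = xm + tail
    # The number of runs is geometric with success probability p_success.
    return expected_run_cost / p_success
```

For example, `expected_time_with_restarts(0.5, 4.0)` evaluates to 6.0 even though the mean of a single unrestarted run is infinite under this assumed model.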
Heavy-Tailed Probability Distributions in the World Wide Web
In A Practical Guide to Heavy Tails: Statistical Techniques and Applications, 1998
Cited by 117 (10 self)
The explosion of the World Wide Web as a medium for information dissemination has made it important to understand its characteristics, in particular the distribution of its file sizes. This paper presents evidence that a number of file size distributions in the Web exhibit heavy tails, including files requested by users, files transmitted through the network, transmission durations of files, and files stored on servers. In addition, we argue that because of the presence of caching in the Web, the size distribution of transmitted files is primarily determined by the distribution of files available in the Web, and is relatively insensitive to the distribution of files requested by users. Finally, we discuss some of the implications of heavy-tailed transmission durations and relate these results to self-similarity in network traffic.
Estimation of Tail-Related Risk Measures for Heteroscedastic Financial Time Series: an Extreme Value Approach
Journal of Empirical Finance, 1998
Cited by 72 (2 self)
We propose a method for estimating VaR and related risk measures describing the tail of the conditional distribution of a heteroscedastic financial return series. Our approach combines pseudo-maximum-likelihood fitting of GARCH models to estimate the current volatility and extreme value theory (EVT) for estimating the tail of the innovation distribution of the GARCH model. We use our method to estimate conditional quantiles (VaR) and conditional expected shortfalls (the expected size of a return exceeding VaR), this being an alternative measure of tail risk with better theoretical properties than the quantile. Using backtesting of historical daily return series we show that our procedure gives better one-day estimates than methods which ignore the heavy tails of the innovations or the stochastic nature of the volatility. With the help of our fitted models we adopt a Monte Carlo approach to estimating the conditional quantiles of returns over multiple-day horizons and find that t...
Statistical analysis of CCSN/SS7 traffic data from working CCS subnetworks
IEEE JSAC, 1994
Cited by 69 (6 self)
In this paper we report on an ongoing statistical analysis of actual CCSN traffic data. The data consist of approximately 170 million signaling messages collected from a variety of different working CCS subnetworks. The key findings from our analysis concern: (1) the characteristics of both the telephone call arrival process and the signaling message arrival process, (2) the tail behavior of the call holding time distribution, and (3) the observed performance of the CCSN with respect to a variety of performance and reliability measurements.
Explaining World Wide Web Traffic Self-Similarity
1995
Cited by 68 (2 self)
Recently the notion of self-similarity has been shown to apply to wide-area and local-area network traffic. In this paper we examine the mechanisms that give rise to self-similar network traffic. We present an explanation for traffic self-similarity by using a particular subset of wide area traffic: traffic due to the World Wide Web (WWW). Using an extensive set of traces of actual user executions of NCSA Mosaic, reflecting over half a million requests for WWW documents, we show evidence that WWW traffic is self-similar. Then we show that the self-similarity in such traffic can be explained based on the underlying distributions of WWW document sizes, the effects of caching and user preference in file transfer, the effect of user "think time", and the superimposition of many such transfers in a local area network. To do this we rely on empirically measured distributions both from our traces and from data independently collected at over thirty WWW sites.
Estimating the Heavy Tail Index from Scaling Properties
Methodology and Computing in Applied Probability, 1999
Cited by 44 (0 self)
This paper deals with the estimation of the tail index α for empirical heavy-tailed distributions, such as have been encountered in telecommunication systems. We present a method (called the "scaling estimator") based on the scaling properties of sums of heavy-tailed random variables. It has the advantages of being nonparametric, of being easy to apply, of yielding a single value, and of being relatively accurate on synthetic datasets. Since the method relies on the scaling of sums, it measures a property that is often one of the most important effects of heavy-tailed behavior. Most importantly, we present evidence that the scaling estimator appears to increase in accuracy as the size of the dataset grows. It is thus particularly suited for large datasets, as are increasingly encountered in measurements of telecommunications and computing systems.
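For orientation on what estimating a tail index involves, here is a minimal sketch of the classical Hill estimator. This is a different, widely used technique and not the scaling estimator the abstract describes, whose details are not given here. The choice of `k` (how many of the largest observations to use) is left to the caller, and the helper `pareto_quantiles` is a hypothetical convenience for building a deterministic test sample.

```python
import math


def hill_estimator(samples, k):
    """Hill estimate of the tail index from the k largest positive observations."""
    if not 0 < k < len(samples):
        raise ValueError("k must satisfy 0 < k < len(samples)")
    ordered = sorted(samples, reverse=True)
    threshold = ordered[k]  # the (k+1)-th largest value
    if threshold <= 0:
        raise ValueError("the threshold observation must be positive")
    # Sum of log-ratios of the k largest values to the threshold.
    log_excess = sum(math.log(x / threshold) for x in ordered[:k])
    return k / log_excess


def pareto_quantiles(alpha, n):
    """Deterministic Pareto(1, alpha) sample built from evenly spaced survival probabilities."""
    return [((i + 0.5) / n) ** (-1.0 / alpha) for i in range(n)]
```

On an exact Pareto sample such as `pareto_quantiles(1.5, 10000)`, the estimate with `k = 1000` should land close to 1.5. On real data the result usually depends noticeably on `k`, so a single value should be read with care.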
Statistics for near independence in multivariate extreme values
1996
Cited by 39 (2 self)
We propose a multivariate extreme value threshold model for joint tail estimation which overcomes the problems encountered with existing techniques when the variables are near independence. We examine inference under the model and develop tests for independence of extremes of the marginal variables, both when the thresholds are fixed, and when they increase with the sample size. Motivated by results obtained from this model, we give a new and widely applicable characterisation of dependence in the joint tail which includes existing models as special cases. A new parameter which governs the form of dependence is of fundamental importance to this characterisation. By estimating this parameter, we develop a diagnostic test which assesses the applicability of bivariate extreme value joint tail models. The methods are demonstrated through simulation and by analysing two previously published data sets.

