On the Need for Time Series Data Mining Benchmarks: A Survey and Empirical Demonstration
 SIGKDD'02
, 2002
Abstract

... mining time series data. Literally hundreds of papers have introduced new algorithms to index, classify, cluster and segment time series. In this work we make the following claim. Much of this work has very little utility because the contribution made (speed in the case of indexing, accuracy in the case of classification and clustering, model accuracy in the case of segmentation) offers an amount of "improvement" that would have been completely dwarfed by the variance that would have been observed by testing on many real world datasets, or the variance that would have been observed by changing minor (unstated) implementation details. To illustrate our point
Aggregation Bias: An Empirical Demonstration
Abstract
In order to assess the extent of the bias that exists in ecological correlations, census data from the Public Use Sample and Census Summary Tapes for the Chicago Standard Consolidated Area were used, and a comparison was made of correlation and regression coefficients at the individual versus tract level. The assumption of larger aggregate-level coefficients was not consistently upheld. Although it was most frequently the case that aggregate coefficients were larger than individual-level coefficients, at both the tract and county levels aggregate coefficients were observed that were smaller than or identical to individual-level coefficients. An acceptable two-variable aggregate regression equation was estimated for the prediction of current fertility at the tract level. The equation was acceptable in the sense that the aggregate regression coefficients were relatively close to the individual-level coefficients. "One cannot argue on principle against the statement that individual correlation and ecological correlation will usually not coincide, but neither can one rest with such an observation." Scheuch (1966: 152) Any discussion of aggregation bias necessarily begins with
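The individual-versus-tract comparison described in the abstract above can be illustrated with a small sketch. The data below are invented toy values, not census data, and the names `pearson`, `individual_r`, and `tract_r` are hypothetical helpers introduced only for this illustration. The sketch shows one case in which the correlation of tract means exceeds the correlation across individuals; as the abstract notes, the opposite can also occur.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Toy data (assumption): three "tracts", each holding (x, y) pairs
# for three individuals. Within each tract x and y move in opposite
# directions, while the tract means rise together.
TRACTS = {
    "A": [(1, 3), (2, 2), (3, 1)],
    "B": [(4, 6), (5, 5), (6, 4)],
    "C": [(7, 9), (8, 8), (9, 7)],
}

def individual_r(tracts):
    """Correlation computed over every individual, ignoring tracts."""
    pairs = [p for members in tracts.values() for p in members]
    return pearson([x for x, _ in pairs], [y for _, y in pairs])

def tract_r(tracts):
    """Ecological correlation: computed over the tract means only."""
    mean_x = [sum(x for x, _ in m) / len(m) for m in tracts.values()]
    mean_y = [sum(y for _, y in m) / len(m) for m in tracts.values()]
    return pearson(mean_x, mean_y)

if __name__ == "__main__":
    print(f"individual-level r = {individual_r(TRACTS):.2f}")
    print(f"tract-level r      = {tract_r(TRACTS):.2f}")
```

Averaging within tracts removes the within-tract variation, which is why the two coefficients need not agree.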
Power-law distributions in empirical data
 ISSN 0036-1445. doi: 10.1137/070710111. URL http://dx.doi.org/10.1137/070710111
, 2009
Abstract

Power-law distributions occur in many situations of scientific interest and have significant consequences for our understanding of natural and man-made phenomena. Unfortunately, the empirical detection and characterization of power laws is made difficult by the large fluctuations that occur
Modeling TCP Throughput: A Simple Model and its Empirical Validation
, 1998
Abstract

retransmit events. Our measurements demonstrate that our model is able to more accurately predict TCP throughput and is accurate over a wider range of loss rates. This material is based upon work supported by the National Science Foundation under grants NCR9508274, NCR9523807 and CDA9502639. Any
Loopy belief propagation for approximate inference: An empirical study. In:
 Proceedings of Uncertainty in AI,
, 1999
Abstract

Recently, researchers have demonstrated that "loopy belief propagation" (the use of Pearl's polytree algorithm in a Bayesian network with loops) can perform well in the context of error-correcting codes. The most dramatic instance of this is the near Shannon
RADAR: an in-building RF-based user location and tracking system
, 2000
Abstract

and processing signal strength information at multiple base stations positioned to provide overlapping coverage in the area of interest. It employs techniques that combine empirical measurements with signal propagation modeling to enable location-aware services and applications. We present concrete experimental
Signal recovery from random measurements via Orthogonal Matching Pursuit
 IEEE TRANS. INFORM. THEORY
, 2007
Abstract

This technical report demonstrates theoretically and empirically that a greedy algorithm called Orthogonal Matching Pursuit (OMP) can reliably recover a signal with m nonzero entries in dimension d given O(m ln d) random linear measurements of that signal. This is a massive improvement over
Image denoising using a scale mixture of Gaussians in the wavelet domain
 IEEE TRANS IMAGE PROCESSING
, 2003
Abstract

vector and a hidden positive scalar multiplier. The latter modulates the local variance of the coefficients in the neighborhood, and is thus able to account for the empirically observed correlation between the coefficient amplitudes. Under this model, the Bayesian least squares estimate of each
K.B.: Multi-Interval Discretization of Continuous-Valued Attributes for Classification Learning. In:
 IJCAI.
, 1993
Abstract

formally derive a criterion based on the minimum description length principle for deciding the partitioning of intervals. We demonstrate via empirical evaluation on several real-world data sets that better decision trees are obtained using the new multi-interval algorithm.
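The excerpt above does not state the criterion itself. As a hedged sketch, the following uses the form in which the minimum description length stopping rule for this method is commonly stated: a candidate cut is accepted only when its information gain exceeds (log2(N - 1) + delta) / N, where delta = log2(3^k - 2) - (k Ent(S) - k1 Ent(S1) - k2 Ent(S2)). The function names `entropy` and `mdl_accepts_cut` are hypothetical, and the class labels in the usage lines are invented toy values.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Class entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def mdl_accepts_cut(left, right):
    """Return True if splitting into `left` and `right` passes the
    MDL-based test as commonly stated (an assumption; the excerpt
    above does not give the formula).

    `left` and `right` are the class labels falling on each side of
    a candidate cut point; both must be non-empty.
    """
    whole = left + right
    n = len(whole)
    ent_s, ent_l, ent_r = entropy(whole), entropy(left), entropy(right)

    # Information gain of the cut.
    gain = ent_s - (len(left) / n) * ent_l - (len(right) / n) * ent_r

    # Number of distinct classes in the whole set and on each side.
    k, k1, k2 = len(set(whole)), len(set(left)), len(set(right))
    delta = log2(3 ** k - 2) - (k * ent_s - k1 * ent_l - k2 * ent_r)

    return gain > (log2(n - 1) + delta) / n

if __name__ == "__main__":
    # A cut that separates the classes cleanly is accepted.
    print(mdl_accepts_cut(["a"] * 10, ["b"] * 10))
    # A cut that leaves both sides equally mixed is rejected.
    print(mdl_accepts_cut(["a", "b"] * 5, ["a", "b"] * 5))
```

In a full discretizer this test would be applied recursively to each resulting interval, stopping where no cut is accepted; that recursion is omitted here.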
Inducing Features of Random Fields
 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
, 1997
Abstract

the Kullback-Leibler divergence between the model and the empirical distribution of the training data. A greedy algorithm determines how features are incrementally added to the field and an iterative scaling algorithm is used to estimate the optimal values of the weights. The random field models and techniques
