Results 1-10 of 763,007
On the Need for Time Series Data Mining Benchmarks: A Survey and Empirical Demonstration
 SIGKDD'02
, 2002
Abstract

Cited by 324 (59 self)
... mining time series data. Literally hundreds of papers have introduced new algorithms to index, classify, cluster and segment time series. In this work we make the following claim. Much of this work has very little utility because the contribution made (speed in the case of indexing, accuracy in the case of classification and clustering, model accuracy in the case of segmentation) offers an amount of "improvement" that would have been completely dwarfed by the variance that would have been observed by testing on many real-world datasets, or the variance that would have been observed by changing minor (unstated) implementation details. To illustrate our point
AGGREGATION BIAS An Empirical Demonstration
Abstract
In order to assess the extent of the bias that exists in ecological correlations, census data from the Public Use Sample and Census Summary Tapes for the Chicago Standard Consolidated Area were used, and a comparison was made of correlation and regression coefficients at the individual versus tract level. The assumption of larger aggregate-level coefficients was not consistently upheld. Although it was most frequently the case that aggregate coefficients were larger than individual-level coefficients, at both the tract and county levels aggregate coefficients were observed that were smaller than or identical to individual-level coefficients. An acceptable two-variable aggregate regression equation was estimated for the prediction of current fertility at the tract level. The equation was acceptable in the sense that the aggregate regression coefficients were relatively close to the individual-level coefficients. "One cannot argue on principle against the statement that individual correlation and ecological correlation will usually not coincide, but neither can one rest with such an observation." Scheuch (1966: 152) Any discussion of aggregation bias necessarily begins with
Power-law distributions in empirical data
ISSN 0036-1445. doi: 10.1137/070710111. URL http://dx.doi.org/10.1137/070710111
, 2009
Abstract

Cited by 601 (7 self)
Power-law distributions occur in many situations of scientific interest and have significant consequences for our understanding of natural and man-made phenomena. Unfortunately, the empirical detection and characterization of power laws is made difficult by the large fluctuations that occur in the t ...
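The continuous maximum-likelihood exponent estimate associated with this line of work has a simple closed form, alpha-hat = 1 + n / sum_i ln(x_i / x_min). A minimal sketch (function and variable names are illustrative, not from the paper):

```python
import math
import random

def powerlaw_alpha_mle(xs, xmin):
    """Continuous maximum-likelihood estimate of the power-law exponent
    alpha for the sample values at or above the lower cutoff xmin."""
    tail = [x for x in xs if x >= xmin]
    n = len(tail)
    return 1.0 + n / sum(math.log(x / xmin) for x in tail)

# Synthetic power-law data via inverse-transform sampling:
# for p(x) ~ x^-alpha on [xmin, inf), X = xmin * U^(-1/(alpha-1)).
random.seed(0)
alpha_true, xmin = 2.5, 1.0
sample = [xmin * (1 - random.random()) ** (-1 / (alpha_true - 1))
          for _ in range(50_000)]
print(powerlaw_alpha_mle(sample, xmin))  # close to 2.5
```

With 50,000 samples the estimator's standard error is roughly (alpha - 1) / sqrt(n), so the recovered exponent sits very near the true value.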
Modeling TCP Throughput: A Simple Model and its Empirical Validation
, 1998
"... In this paper we develop a simple analytic characterization of the steady state throughput, as a function of loss rate and round trip time for a bulk transfer TCP flow, i.e., a flow with an unlimited amount of data to send. Unlike the models in [6, 7, 10], our model captures not only the behavior of ..."
Abstract

Cited by 1339 (36 self)
retransmit events. Our measurements demonstrate that our model is able to more accurately predict TCP throughput and is accurate over a wider range of loss rates. This material is based upon work supported by the National Science Foundation under grants NCR-9508274, NCR-9523807 and CDA-9502639.
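The steady-state model this paper is known for is usually quoted in a closed form combining fast-retransmit and timeout behaviour. A hedged sketch of that widely cited approximation (parameter names are illustrative; b is packets acknowledged per ACK, t0 the retransmission timeout in seconds):

```python
import math

def tcp_throughput(p, rtt, t0, mss=1460, b=2, wmax=None):
    """Approximate steady-state bulk-transfer TCP throughput in bytes/s,
    given loss-event rate p, round-trip time rtt (s), and timeout t0 (s).
    A sketch of the standard approximation, not the authors' code."""
    if p <= 0:
        raise ValueError("model assumes a nonzero loss rate")
    denom = (rtt * math.sqrt(2 * b * p / 3)
             + t0 * min(1.0, 3 * math.sqrt(3 * b * p / 8)) * p * (1 + 32 * p * p))
    rate = mss / denom
    if wmax is not None:                 # receiver-window cap, if known
        rate = min(rate, wmax * mss / rtt)
    return rate

# 1% loss, 100 ms RTT, 500 ms timeout -> roughly 114 KB/s
print(round(tcp_throughput(p=0.01, rtt=0.1, t0=0.5)))
```

At low loss rates the timeout term is negligible and the formula reduces to the familiar MSS / (RTT * sqrt(2bp/3)) inverse-square-root law.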
Loopy belief propagation for approximate inference: An empirical study. In:
 Proceedings of Uncertainty in AI,
, 1999
Abstract

Cited by 674 (15 self)
Recently, researchers have demonstrated that "loopy belief propagation" (the use of Pearl's polytree algorithm in a Bayesian network with loops) can perform well in the context of error-correcting codes. The most dramatic instance of this is the near Shannon-limit performance ...
RADAR: an in-building RF-based user location and tracking system
, 2000
"... The proliferation of mobile computing devices and local-area wireless networks has fostered a growing interest in location-aware systems and services. In this paper we present RADAR, a radio-frequency (RF) based system for locating and tracking users inside buildings. RADAR operates by recording and ..."
Abstract

Cited by 2038 (14 self)
and processing signal strength information at multiple base stations positioned to provide overlapping coverage in the area of interest. It employs techniques that combine empirical measurements with signal propagation modeling to enable location-aware services and applications. We present concrete experimental
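The empirical (fingerprinting) half of the method can be illustrated as nearest-neighbour search in signal space: compare an observed received-signal-strength (RSS) vector against a site survey and average the k closest surveyed positions. A sketch only; the survey data below is invented for illustration:

```python
import math

def locate(fingerprints, rss, k=3):
    """Estimate a position from an observed RSS vector by averaging the
    positions of the k survey fingerprints closest in signal space.
    fingerprints: list of ((x, y), (rss_bs1, rss_bs2, ...)) pairs."""
    ranked = sorted(fingerprints, key=lambda f: math.dist(f[1], rss))[:k]
    xs = [pos[0] for pos, _ in ranked]
    ys = [pos[1] for pos, _ in ranked]
    return (sum(xs) / k, sum(ys) / k)

# Hypothetical survey: positions (metres) and dBm readings at 3 base stations.
survey = [((0, 0), (-40, -70, -80)),
          ((0, 5), (-50, -60, -75)),
          ((5, 0), (-45, -72, -60)),
          ((5, 5), (-55, -58, -58))]
print(locate(survey, (-49, -61, -74), k=1))  # nearest fingerprint: (0.0, 5.0)
```

The paper's modeling side would replace the measured survey with RSS values predicted from a signal-propagation model; the lookup step stays the same.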
Signal recovery from random measurements via Orthogonal Matching Pursuit
 IEEE TRANS. INFORM. THEORY
, 2007
Abstract

Cited by 802 (9 self)
This technical report demonstrates theoretically and empirically that a greedy algorithm called Orthogonal Matching Pursuit (OMP) can reliably recover a signal with m nonzero entries in dimension d given O(m ln d) random linear measurements of that signal. This is a massive improvement over previous ...
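The greedy algorithm itself is short: repeatedly select the column of the measurement matrix most correlated with the residual, then re-fit the selected columns by least squares. A NumPy sketch of the algorithm the abstract analyses, not the authors' implementation:

```python
import numpy as np

def omp(A, y, m):
    """Orthogonal Matching Pursuit: recover an m-sparse x from y = A @ x."""
    residual, support = y.copy(), []
    for _ in range(m):
        # Column most correlated with the current residual.
        j = int(np.argmax(np.abs(A.T @ residual)))
        support.append(j)
        # Re-fit all selected columns by least squares.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x = np.zeros(A.shape[1])
    x[support] = coef
    return x

rng = np.random.default_rng(0)
d, m, n = 256, 4, 128            # n comfortably above the m ln d bound
A = rng.standard_normal((n, d)) / np.sqrt(n)
x_true = np.zeros(d)
x_true[rng.choice(d, size=m, replace=False)] = rng.standard_normal(m)
x_hat = omp(A, A @ x_true, m)
print(np.linalg.norm(x_hat - x_true))  # tiny reconstruction error
```

Because the least-squares refit makes the residual orthogonal to every selected column, no column is ever chosen twice.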
Image denoising using a scale mixture of Gaussians in the wavelet domain
 IEEE TRANS IMAGE PROCESSING
, 2003
"... We describe a method for removing noise from digital images, based on a statistical model of the coefficients of an overcomplete multiscale oriented basis. Neighborhoods of coefficients at adjacent positions and scales are modeled as the product of two independent random variables: a Gaussian vecto ..."
Abstract

Cited by 512 (17 self)
vector and a hidden positive scalar multiplier. The latter modulates the local variance of the coefficients in the neighborhood, and is thus able to account for the empirically observed correlation between the coefficient amplitudes. Under this model, the Bayesian least squares estimate of each
K.B.: Multi-Interval Discretization of Continuous-Valued Attributes for Classification Learning. In:
 IJCAI.
, 1993
"... Since most real-world applications of classification learning involve continuous-valued attributes, properly addressing the discretization process is an important problem. This paper addresses the use of the entropy minimization heuristic for discretizing the range of a continuous-valued a ..."
Abstract

Cited by 831 (7 self)
formally derive a criterion based on the minimum description length principle for deciding the partitioning of intervals. We demonstrate via empirical evaluation on several real-world data sets that better decision trees are obtained using the new multi-interval algorithm.
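The entropy-minimization heuristic at the core of this method can be sketched as a search for the cut point that minimizes the weighted class entropy of the induced two-way partition (the paper's MDL stopping criterion is omitted here for brevity; names are illustrative):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a class-label multiset."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def best_entropy_split(values, labels):
    """Return (cut_point, weighted_entropy) for the binary split of a
    continuous attribute that minimizes class entropy."""
    pairs = sorted(zip(values, labels))
    xs = [v for v, _ in pairs]
    ys = [l for _, l in pairs]
    n = len(ys)
    best = None
    for i in range(1, n):
        if xs[i] == xs[i - 1]:
            continue                      # only cut between distinct values
        cut = (xs[i] + xs[i - 1]) / 2
        e = i / n * entropy(ys[:i]) + (n - i) / n * entropy(ys[i:])
        if best is None or e < best[1]:
            best = (cut, e)
    return best

vals = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
labs = ["a", "a", "a", "b", "b", "b"]
print(best_entropy_split(vals, labs))  # cut falls at 6.5, between the classes
```

The multi-interval version in the paper applies this split recursively to each half, stopping when the MDL criterion says a further cut does not pay for its description length.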
Inducing Features of Random Fields
 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
, 1997
"... We present a technique for constructing random fields from a set of training samples. The learning paradigm builds increasingly complex fields by allowing potential functions, or features, that are supported by increasingly large subgraphs. Each feature has a weight that is trained by minimizing the ..."
Abstract

Cited by 669 (10 self)
the Kullback-Leibler divergence between the model and the empirical distribution of the training data. A greedy algorithm determines how features are incrementally added to the field and an iterative scaling algorithm is used to estimate the optimal values of the weights. The random field models and techniques