Optimal workload-based weighted wavelet synopses
In ICDT, 2005
Cited by 15 (3 self)
Abstract. In recent years wavelets were shown to be effective data synopses. We are concerned with the problem of efficiently finding wavelet synopses for massive data sets in situations where information about the query workload is available. We present linear-time, I/O-optimal algorithms for building optimal workload-based wavelet synopses for point queries. The synopses are based on a novel construction of weighted inner products and use weighted wavelets that are adapted to those products. The synopses are optimal in the sense that the subset of retained coefficients is the best possible for the bases in use with respect to either the mean-squared absolute or relative errors. For the latter, this is the first optimal wavelet synopsis even for the regular, non-workload-based case. Experimental results demonstrate the advantage obtained by the new optimal wavelet synopses, as well as the robustness of the synopses to deviations in the actual query workload.
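As a point of reference for the regular, non-workload-based case with mean-squared absolute error, the sketch below uses the plain orthonormal Haar basis and keeps the B largest-magnitude coefficients, then checks by brute force that no other subset of B coefficients does better. This is the standard thresholding baseline, not the weighted construction described in the abstract; the function names, the sample data, and the power-of-two input length are assumptions made for illustration.

```python
import math
from itertools import combinations

SQRT2 = math.sqrt(2.0)


def haar_orthonormal(data):
    """Orthonormal Haar transform: [scaling coefficient, details coarse to fine].

    Assumes len(data) is a power of two.
    """
    values = list(data)
    details = []
    while len(values) > 1:
        pairs = [(values[i], values[i + 1]) for i in range(0, len(values), 2)]
        # Finer-level details are appended after the coarser ones.
        details = [(a - b) / SQRT2 for a, b in pairs] + details
        values = [(a + b) / SQRT2 for a, b in pairs]
    return values + details


def haar_orthonormal_inverse(coeffs):
    """Invert haar_orthonormal, expanding one resolution level per pass."""
    values = [coeffs[0]]
    pos = 1
    while pos < len(coeffs):
        diffs = coeffs[pos:pos + len(values)]
        values = [v for s, d in zip(values, diffs)
                  for v in ((s + d) / SQRT2, (s - d) / SQRT2)]
        pos += len(diffs)
    return values


def squared_error(data, coeffs, kept):
    """Sum of squared errors when only the coefficients in `kept` are retained."""
    synopsis = [c if i in kept else 0.0 for i, c in enumerate(coeffs)]
    estimate = haar_orthonormal_inverse(synopsis)
    return sum((x - y) ** 2 for x, y in zip(data, estimate))


def top_b(coeffs, b):
    """Indices of the b largest-magnitude coefficients."""
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    return set(order[:b])


# Illustrative data only; not taken from the paper.
data = [2.0, 2.0, 0.0, 2.0, 3.0, 5.0, 4.0, 4.0]
coeffs = haar_orthonormal(data)
greedy_error = squared_error(data, coeffs, top_b(coeffs, 3))
brute_error = min(
    squared_error(data, coeffs, set(subset))
    for subset in combinations(range(len(coeffs)), 3)
)
```

Because the basis is orthonormal, the squared error equals the sum of squares of the dropped coefficients, so `greedy_error` and `brute_error` agree up to floating-point rounding. With a workload-weighted error this equality no longer holds for the plain Haar basis, which is the gap the weighted inner products in the abstract are meant to address.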
How far will you walk to find your shortcut: Space Efficient Synopsis Construction Algorithms
2005
In this paper we consider the wavelet synopsis construction problem without the restriction that we only choose a subset of coefficients of the original data. We provide the first near-optimal algorithm. We arrive at this algorithm by considering space-efficient algorithms for the restricted version of the problem. In this context we improve previous algorithms by almost a linear factor and reduce the required space to almost linear. Our techniques also extend to histogram construction, and improve the space-running time tradeoffs for V-Opt and range query histograms. We believe the idea applies to a broad range of dynamic programs and demonstrate it by showing improvements in a knapsack-like setting seen in the construction of Extended Wavelets.
Wavelet Synopsis for Data Streams: Minimizing Non-Euclidean Error (Research Track Paper)
We consider the wavelet synopsis construction problem for data streams where, given n numbers, we wish to estimate the data by constructing a synopsis whose size, say B, is much smaller than n. The B numbers are chosen to minimize a suitable error between the original data and the estimate derived from the synopsis. Several good one-pass wavelet construction streaming algorithms minimizing the ℓ2 error exist. For other error measures, the problem is less understood. We provide the first one-pass small-space streaming algorithms with provable error guarantees (additive approximation) for minimizing a variety of non-Euclidean error measures, including all weighted ℓp (including ℓ∞) and relative error ℓp metrics. Several previous works proposed solutions (for weighted ℓ2, ℓ∞ and maximum relative error) in which the B synopsis coefficients are restricted to be wavelet coefficients of the data. This restriction yields suboptimal solutions on even fairly simple examples. Other lines of research, such as probabilistic synopses, imposed restrictions on how the synopsis was arrived at. To the best of our knowledge this paper is the first to address the general problem, without any restriction on how the synopsis is arrived at, and to provide the first streaming algorithms with guaranteed performance for these classes of error measures.
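The claim that restricting the synopsis to wavelet coefficients of the data can be suboptimal is easy to see on a small case. The sketch below is an illustration under stated assumptions, not the paper's streaming algorithm: it uses the unnormalized Haar transform (pairwise averages and half-differences), a four-element data vector chosen here for the demonstration, B = 1, and the ℓ∞ (maximum absolute) error. All function names are hypothetical.

```python
from itertools import combinations


def haar_transform(data):
    """Unnormalized Haar transform: [overall average, details coarse to fine].

    Assumes len(data) is a power of two.
    """
    values = list(data)
    details = []
    while len(values) > 1:
        pairs = [(values[i], values[i + 1]) for i in range(0, len(values), 2)]
        details = [(a - b) / 2 for a, b in pairs] + details
        values = [(a + b) / 2 for a, b in pairs]
    return values + details


def haar_inverse(coeffs):
    """Reconstruct the data vector from unnormalized Haar coefficients."""
    values = [coeffs[0]]
    pos = 1
    while pos < len(coeffs):
        diffs = coeffs[pos:pos + len(values)]
        values = [v for a, d in zip(values, diffs) for v in (a + d, a - d)]
        pos += len(diffs)
    return values


def max_abs_error(data, synopsis):
    """l-infinity error between the data and the synopsis reconstruction."""
    estimate = haar_inverse(synopsis)
    return max(abs(x - y) for x, y in zip(data, estimate))


def best_restricted_error(data, b):
    """Best l-infinity error when the b stored values must be the data's own coefficients."""
    coeffs = haar_transform(data)
    best = float("inf")
    for kept in combinations(range(len(coeffs)), b):
        synopsis = [c if i in kept else 0.0 for i, c in enumerate(coeffs)]
        best = min(best, max_abs_error(data, synopsis))
    return best


# Illustrative data; its Haar coefficients are [4.0, -1.5, -1.5, -0.5].
data = [1.0, 4.0, 5.0, 6.0]
restricted = best_restricted_error(data, 1)                   # 3.0, keeping the average 4.0
unrestricted = max_abs_error(data, [3.5, 0.0, 0.0, 0.0])      # 2.5, storing 3.5 instead
```

Storing 3.5 in the average slot, a value that is not a wavelet coefficient of the data, lowers the maximum error from 3.0 to 2.5 with the same budget of one stored number.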