## Similarity search over time series data using wavelets (2002)

Venue: | In ICDE |

Citations: | 61 - 0 self |

### BibTeX

@INPROCEEDINGS{Popivanov02similaritysearch,
    author = {Ivan Popivanov and Ren{\'e}e J. Miller},
    title = {Similarity search over time series data using wavelets},
    booktitle = {ICDE},
    year = {2002},
    pages = {212--221}
}

### Abstract

We consider the use of wavelet transformations as a dimensionality reduction technique to permit efficient similarity search over high-dimensional time-series data. While numerous transformations have been proposed and studied, the only wavelet that has been shown to be effective for this application is the Haar wavelet. In this work, we observe that a large class of wavelet transformations (not only orthonormal wavelets but also bi-orthonormal wavelets) can be used to support similarity search. This class includes the most popular and most effective wavelets being used in image compression. We present a detailed performance study of the effects of using different wavelets on the performance of similarity search for time-series data. We include several wavelets that outperform both the Haar wavelet and the best known non-wavelet transformations for this application. To ensure our results are usable by an application engineer, we also show how to configure an indexing strategy for the best performing transformations. Finally, we identify classes of data that can be indexed efficiently using these wavelet transformations.
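To make the dimensionality-reduction idea concrete, here is a minimal Python sketch. It is not the authors' implementation: it uses only the Haar wavelet (the paper studies a much larger family), and the function names `haar_transform`, `reduce_features`, and `euclidean` are hypothetical. It illustrates the property an index relies on: because the transform is orthonormal, the distance between truncated coefficient vectors never exceeds the distance between the original series, so filtering on the reduced features cannot cause false dismissals.

```python
import math


def haar_transform(series):
    """Orthonormal Haar wavelet transform (illustrative sketch).

    Assumes the input length is a power of two. Coefficients are ordered
    from coarsest (overall scaled average) to finest detail.
    """
    n = len(series)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    coeffs = [float(v) for v in series]
    length = n
    while length > 1:
        half = length // 2
        # Scaled pairwise sums form the coarse part, scaled pairwise
        # differences form the detail part; dividing by sqrt(2) keeps
        # each step an orthonormal change of basis.
        approx = [(coeffs[2 * i] + coeffs[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        detail = [(coeffs[2 * i] - coeffs[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        coeffs[:length] = approx + detail
        length = half
    return coeffs


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def reduce_features(series, k):
    """Keep only the first k (coarsest) Haar coefficients as index features."""
    return haar_transform(series)[:k]


x = [2.0, 4.0, 6.0, 8.0, 7.0, 5.0, 3.0, 1.0]
y = [1.0, 3.0, 6.0, 9.0, 8.0, 4.0, 2.0, 2.0]
true_distance = euclidean(x, y)
feature_distance = euclidean(reduce_features(x, 2), reduce_features(y, 2))
# The feature-space distance is a lower bound on the true distance.
print(feature_distance <= true_distance)
```

In a filter-and-refine search, candidates whose feature distance already exceeds the query radius can be discarded safely; the remaining candidates are then checked against the full series.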

### Citations

2775 |
Introduction to Statistical Pattern Recognition, 2nd edition
- Fukunaga
- 1990
Citation Context ...arantee that an F-index does not result in any false dismissals (Equation (1) of Section 2). Fukunaga includes a proof that the Euclidean distance is preserved for the class of orthonormal transforms [13]. The Haar wavelet as well as many other wavelets belong to the class of orthonormal wavelets. However, many wavelet transforms used in practice are not orthonormal. Indeed, the majority of wavelets u... |

1008 | The R*-tree: An efficient and robust access method for points and rectangles - Beckmann, Kriegel, et al. - 1990 |

527 | M-tree: An efficient access method for similarity search in metric spaces
- Ciaccia, Patella, et al.
- 1997
Citation Context ...o perform similar experiments using other multi-dimensional indices. Collosi and Nascimento have benchmarked a set of promising high-dimensional index structures [8]. The SR-tree [17] and the M-tree [7] are among the structures that clearly outperform the other competitive techniques in their experimental set. We ran an experiment to compare the R*-tree and the SR-tree, an implementation of which is... |

522 | The X-tree: An index structure for high-dimensional data
- Berchtold, Keim, et al.
- 1996
Citation Context ...ll also affect performance. The use of too many features will render the index search less effective as the performance of even the best multidimensional index strategies decreases in high dimensions [32, 3, 17, 7, 5]. Within this framework, we address the following questions. 1. Which transformations are effective for similarity search over time series? The original work by Agrawal et al., as well as subsequent re... |

511 | A Quantitative Analysis and Performance Study for Similarity-Search Methods in High-Dimensional Spaces
- Weber, Schek, et al.
- 1998
Citation Context ... almost all multidimensional indexing methods. Furthermore, a common rule of thumb in indexing is that if more than 20% of the data needs to be retrieved using the index, then a linear scan is better [31]. Hence, for pink noise data, an index that performed efficiently for 32 dimensions (at least) would be required. Our experiment (and conclusion) relied on “pure” synthetically generated pink noise. H... |

489 | Wavelets and Subband Coding
- Vetterli, Kovačević
- 1995
Citation Context ...orm (DCT) are efficient forms of the Fourier transform often used in applications. Wavelets can be thought of as a generalization of this idea to a much larger family of functions than sine and cosine [9, 29]. Mathematically, a “wavelet” denotes a function $\psi_{j,k}$ defined on the real numbers $\mathbb{R}$, which includes an integer translation by $k$, also called a shift, and a dyadic dilation (a product by the powers of ... |

432 | subsequence matching in timeseries databases
- Faloutsos, Manolopoulos
- 1994
Citation Context ...in this framework, we address the following questions. 1. Which transformations are effective for similarity search over time series? The original work by Agrawal et al., as well as subsequent research [10, 25], used the Discrete Fourier Transform (DFT) for feature extraction. Later on, the Singular Value Decomposition (SVD) transform was suggested as a very accurate (high precision), but computationally ex... |

385 | The SR-tree: an index structure for high-dimensional nearest neighbor queries
- Katayama, Satoh
Citation Context ...ll also affect performance. The use of too many features will render the index search less effective as the performance of even the best multidimensional index strategies decreases in high dimensions [32, 3, 17, 7, 5]. Within this framework, we address the following questions. 1. Which transformations are effective for similarity search over time series? The original work by Agrawal et al., as well as subsequent re... |

306 | Similarity indexing with the SS-tree
- White, Jain
- 1996
Citation Context ...ll also affect performance. The use of too many features will render the index search less effective as the performance of even the best multidimensional index strategies decreases in high dimensions [32, 3, 17, 7, 5]. Within this framework, we address the following questions. 1. Which transformations are effective for similarity search over time series? The original work by Agrawal et al., as well as subsequent re... |

236 | Locally adaptive dimensionality reduction for indexing large time series databases - Chakrabarti, Keogh, et al. |

212 | Efficient time series matching by wavelets
- Chan, Fu
- 1999
Citation Context ...raction. Later on, the Singular Value Decomposition (SVD) transform was suggested as a very accurate (high precision), but computationally expensive, alternative [33]. More recently, the Haar Wavelet [6, 34] and other similar techniques [19, 20, 5, 35] have been used to improve various aspects of the similarity search process. The incorporation of new and better transformations has not yet taken advantage... |

182 | Approximate query processing using wavelets
- Chakrabarti, Garofalakis, et al.
Citation Context ...and others have been exploited extensively for managing images [14, 15, and others] and for a variety of data compression applications including selectivity estimate [22], approximate query answering [4, 30] and clustering [27]. However, their use as a scalable dimensionality reduction technique for time-series data has not yet been fully appreciated. In the next section, we will show how these propertie... |

182 | Efficient retrieval of similar time sequences under time warping - Yi, Jagadish, et al. - 1998 |

174 | Approximate Computation of Multidimensional Aggregates of Sparse Data Using Wavelets
- Vitter, Wang
- 1999
Citation Context ...and others have been exploited extensively for managing images [14, 15, and others] and for a variety of data compression applications including selectivity estimate [22], approximate query answering [4, 30] and clustering [27]. However, their use as a scalable dimensionality reduction technique for time-series data has not yet been fully appreciated. In the next section, we will show how these propertie... |

164 | Dimensionality reduction for fast similarity search in large time series databases
- Keogh, Chakrabarti, et al.
- 2001
Citation Context ... Decomposition (SVD) transform was suggested as a very accurate (high precision), but computationally expensive, alternative [33]. More recently, the Haar Wavelet [6, 34] and other similar techniques [19, 20, 5, 35] have been used to improve various aspects of the similarity search process. The incorporation of new and better transformations has not yet taken advantage of the rapidly growing suite of sophisticate... |

136 | Similarity-based queries for time series data
- Rafiei, Mendelzon
- 1997
Citation Context ...ject $x$ and $y$) must satisfy the following lower bounding lemma or contractive property: $D_{\text{feature}}(T(x), T(y)) \le D_{\text{object}}(x, y)$ (1). Rafiei and Mendelzon showed how to handle moving averages in an F-Index [24]. In follow-on work, they suggested the use of the symmetric property of the DFT to increase the precision of the distance measure without increasing the number of features stored in the index [25]. Time ... |

101 | The hybrid tree: An index structure for high dimensional feature spaces
- Chakrabarti, Mehrotra
- 1999

101 | Dimensionality reduction for similarity searching in dynamic databases - Kanth, Agrawal, et al. - 1998 |

94 | Content-based image indexing and searching using Daubechies’ wavelets - Wang, Wiederhold, et al. - 1997 |

86 | Dynamic maintenance of wavelet-based histograms
- Matias, Vitter, et al.
- 2000
Citation Context ...y other methods. These properties and others have been exploited extensively for managing images [14, 15, and others] and for a variety of data compression applications including selectivity estimate [22], approximate query answering [4, 30] and clustering [27]. However, their use as a scalable dimensionality reduction technique for time-series data has not yet been fully appreciated. In the next sect... |

77 | Laws: Minutes from an Infinite - Schroeder, Fractals, et al. - 1991 |

67 | The Fastest Fourier Transform in the West
- Frigo, Johnson
- 1997
Citation Context ...g the fastest implementations of specific wavelets, did permit us to experiment with a large suite of dozens of wavelet functions. For comparison with DFT, we used one of the best known implementations [12]. Because of this choice, comparison of CPU times for the transformations would be very biased and unreliable. While the complexity of DFT is $O(n \log n)$ compared to $O(n)$ for wavelets, we actually observed... |

57 | Efficient Retrieval of Similar Time Sequences Using DFT
- Rafiei, Mendelzon
- 1998
Citation Context ...in this framework, we address the following questions. 1. Which transformations are effective for similarity search over time series? The original work by Agrawal et al., as well as subsequent research [10, 25], used the Discrete Fourier Transform (DFT) for feature extraction. Later on, the Singular Value Decomposition (SVD) transform was suggested as a very accurate (high precision), but computationally ex... |

55 | Wavelet-based image indexing techniques with partial sketch retrieval capability - Wang, Wiederhold, et al. - 1997 |

48 | A comparison of DFT and DWT based similarity search in time-series databases
- Wu, Agrawal, et al.
- 2000
Citation Context ...raction. Later on, the Singular Value Decomposition (SVD) transform was suggested as a very accurate (high precision), but computationally expensive, alternative [33]. More recently, the Haar Wavelet [6, 34] and other similar techniques [19, 20, 5, 35] have been used to improve various aspects of the similarity search process. The incorporation of new and better transformations has not yet taken advantage... |

27 | Efficient retrieval for browsing large image databases
- Wu, Agrawal, et al.
- 1996
Citation Context ...ourier Transform (DFT) for feature extraction. Later on, the Singular Value Decomposition (SVD) transform was suggested as a very accurate (high precision), but computationally expensive, alternative [33]. More recently, the Haar Wavelet [6, 34] and other similar techniques [19, 20, 5, 35] have been used to improve various aspects of the similarity search process. The incorporation of new and better tr... |

24 | The Haar wavelet transform in the time series similarity paradigm
- Struzik, Siebes
Citation Context ...onsidered [24, 19, 36]. Chan and Fu used the simplest wavelet, the Haar wavelet, and showed performance improvements over DFT [6]. Struzik and Siebes have also applied the Haar wavelet in this domain [28]. The Piecewise Aggregate Approximation (PAA) transform, which is similar to the Haar transform, has also been used for similarity search [19]. This work also extended the F-index framework to support... |

20 |
Hierarchyscan: A hierarchical similarity search algorithm for databases of long sequences
- Li, Yu, et al.
- 1996
Citation Context ...ion of the signal of different scales, which correspond to basis functions of different length. Hence, the wavelet transform is hierarchical and allows much finer tuning for a variety of applications [21]. • Unlike the Fourier transform, wavelet transforms have an infinite set of possible basis functions. Thus, they provide access to information that can be obscured by other methods. These properties ... |

13 |
Wavelets and their application to computer graphics
- Fournier
- 1995
Citation Context ...idimensional index structure, namely Norbert Beckmann’s Version 2 implementation of the R*-tree [2]. For the wavelet transformations, we used the “Imager Wavelet Library” which is a research library [11]. This library, while perhaps not containing the fastest implementations of specific wavelets, did permit us to experiment with a large suite of dozens of wavelet functions. For comparison with DFT, we ... |

12 |
WaveCluster: A Wavelet based Clustering Approach for Spatial Data in Very Large Database
- Sheikholeslami, Chatterjee, et al.
- 1999
Citation Context ...ploited extensively for managing images [14, 15, and others] and for a variety of data compression applications including selectivity estimate [22], approximate query answering [4, 30] and clustering [27]. However, their use as a scalable dimensionality reduction technique for time-series data has not yet been fully appreciated. In the next section, we will show how these properties can be exploited t... |

2 |
Efficient similarity search in sequence databases
- Agrawal, Faloutsos, et al.
- 1993
Citation Context ...ng (ICDE’02) 1063-6382/02 $17.00 © 2002 IEEE Renée J. Miller University of Toronto, ON, Canada miller@cs.toronto.edu tance metric, most often Euclidean distance or relatives of the Euclidean distance [1]. Other distance metrics, including the Lp norms, may also be used [35]. Because of the high dimensionality of most time series, the direct indexing of time series is prohibitive. As a result dimension... |

2 |
Benchmarking access structures for high-dimensional multimedia data
- Collosi, Nascimento
- 2000
Citation Context ... on the index structure, we decided to perform similar experiments using other multi-dimensional indices. Collosi and Nascimento have benchmarked a set of promising high-dimensional index structures [8]. The SR-tree [17] and the M-tree [7] are among the structures that clearly outperform the other competitive techniques in their experimental set. We ran an experiment to compare the R*-tree and the S... |

1 |
Wavelets and their applications in databases. http://www.informatik.unikonstanz.de/ keim/TutorialNotes
- Keim
- 2001
Citation Context ...avelet transforms are as powerful and versatile as the Fourier transform, yet without some of the limitations of the latter. Wavelets have numerous properties that can be exploited in data management [18]. We briefly present a few of the most commonly cited properties. • Some wavelet transforms have compact support. This means that the basis functions are non-zero only on a finite interval. What this m... |

1 |
Similarity search over time series data using wavelets
- Popivanov, Miller
- 2001
Citation Context ...let function), how does the filter length affect the precision of the query? 4. Are wavelets effective for different data classes? Additional experiments are reported in the full version of the paper [23]. 4.1 Experimental Setup All the experiments use the same multidimensional index structure, namely Norbert Beckmann’s Version 2 implementation of the R*-tree [2]. For the wavelet transformations, we ... |

1 |
Fast time sequence indexing for arbitrary Lp norms
- Yi, Faloutsos
- 2000
Citation Context ...tance metric, most often Euclidean distance or relatives of the Euclidean distance [1]. Other distance metrics, including the Lp norms, may also be used [35]. Because of the high dimensionality of most time series, the direct indexing of time series is prohibitive. As a result dimensionality reduction appears to be the most promising method for overcoming... |