## TSA-tree: A wavelet-based approach to improve the efficiency of multi-level surprise and trend queries on time-series data (2000)

Venue: | IN SSDBM |

Citations: | 43 - 0 self |

### BibTeX

@INPROCEEDINGS{Shahabi00tsa-tree:a,
    author = {Cyrus Shahabi and Xiaoming Tian and Wugang Zhao},
    title = {TSA-tree: A wavelet-based approach to improve the efficiency of multi-level surprise and trend queries on time-series data},
    booktitle = {IN SSDBM},
    year = {2000},
    pages = {55--68},
    publisher = {}
}

### Abstract

We introduce a novel wavelet-based tree structure, termed TSA-tree, which improves the efficiency of multi-level trend and surprise queries on time-sequence data. With the explosion of scientific observation data (some conceptualized as time sequences), we face the challenge of efficiently storing, retrieving, and analyzing this data. Frequent queries on such data sets seek trends (e.g., global warming) or surprises (e.g., an undersea volcanic eruption) within the original time series. The challenge, however, is that these trend and surprise queries are needed at different levels of abstraction (e.g., within the last week, last month, last year, or last decade). To support these multi-level trend and surprise queries, a huge subset of the raw data sometimes needs to be retrieved and processed. To

### Citations

8521 |
Introduction to Algorithms
- Cormen, Stein, et al.
- 2001
Citation Context ...true. Lemma 4.8 provides us with a very desirable property: the error is summable with respect to each node. By this property, the optimization problem can be reduced to the Fractional Knapsack problem =-=[11]-=- where a greedy algorithm with O(n) complexity results in the optimal solution. However, we need to define the value and cost for each node of OTSA-tree. The value is directly proportional to the n... |
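
The reduction mentioned in this excerpt can be illustrated with a generic greedy fractional-knapsack sketch. This is not the paper's OTSA-tree procedure: the node names, values, and costs below are hypothetical, and the excerpt is cut off before it defines how value and cost are actually assigned. The sketch sorts by value-to-cost ratio, so it runs in O(n log n); reaching the O(n) bound quoted above would need a selection-based variant that is not shown here.

```python
def fractional_knapsack(items, budget):
    """Greedy fractional knapsack.

    items  -- list of (name, value, cost) tuples with cost > 0 (hypothetical nodes)
    budget -- total cost that may be spent (e.g. a storage budget)
    Returns (total_value, {name: fraction_taken}).
    """
    chosen = {}
    total = 0.0
    remaining = float(budget)
    # Take items in decreasing order of value per unit cost.
    for name, value, cost in sorted(items, key=lambda it: it[1] / it[2], reverse=True):
        if remaining <= 0:
            break
        fraction = min(1.0, remaining / cost)  # whole item, or the part that still fits
        chosen[name] = fraction
        total += value * fraction
        remaining -= cost * fraction
    return total, chosen


# Hypothetical nodes: (name, value, cost)
nodes = [("A", 60.0, 10.0), ("B", 100.0, 20.0), ("C", 120.0, 30.0)]
total, chosen = fractional_knapsack(nodes, budget=50.0)
print(total, chosen)
```

Because items may be taken fractionally, the greedy choice by ratio is optimal, which is what makes a summable per-node error convenient for this kind of reduction.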

2362 | A theory for multiresolution signal decomposition: The wavelet representation
- Mallat
- 1989
Citation Context ... be very useful in several different areas, such as sub-band filtering, quadratic mirror filters, and pyramid schemes in the area of signal and image processing, for the collections of references see =-=[5, 9, 10, 12, 23]-=-. We can use wavelet transform to obtain multi-level trends and surprises in a uniform scheme. Towards this end, we introduce an operation termed split to generate a multi-level tree, where each node ... |

1574 | Orthonormal bases of compactly supported wavelets. Communications on Pure and Applied Mathematics
- Daubechies
- 1988
Citation Context ... be very useful in several different areas, such as sub-band filtering, quadratic mirror filters, and pyramid schemes in the area of signal and image processing, for the collections of references see =-=[5, 9, 10, 12, 23]-=-. We can use wavelet transform to obtain multi-level trends and surprises in a uniform scheme. Towards this end, we introduce an operation termed split to generate a multi-level tree, where each node ... |

691 |
An Introduction to Wavelets
- Chui
- 1992
Citation Context ... be very useful in several different areas, such as sub-band filtering, quadratic mirror filters, and pyramid schemes in the area of signal and image processing, for the collections of references see =-=[5, 9, 10, 12, 23]-=-. We can use wavelet transform to obtain multi-level trends and surprises in a uniform scheme. Towards this end, we introduce an operation termed split to generate a multi-level tree, where each node ... |

414 | Efficient similarity search in sequence databases
- Agrawal, Faloutsos
- 1993
Citation Context ...ing nodes and/or coefficients with less energy. The resulting condensed OTSA-tree, hence, becomes a competitor to other techniques discussed in the literature such as Discrete Fourier Transform (DFT) =-=[2, 3]-=- and Singular Value Decomposition (SVD) [20]. Therefore, we conducted comprehensive experiments to compare our results with those techniques. Briefly, we outperformed SVD and DFT in both performance and... |

375 | The Lifting Scheme: A Construction Of Second Generation Wavelets
- Sweldens
Citation Context ...hin our time-series datablade. Consequently, TSA-trees can be constructed (and updated) automatically, given time-series sequences as inputs. Next, we will use the second generation wavelet transform =-=[29, 30]-=-, to better adapt to day, week and month scheme instead of using down/up-sampling by 2. Finally, so far we only studied trends and surprises on temporal data (i.e., time-series). However, with our app... |

330 |
Introduction to Wavelets and Wavelet Transforms
- Burrus, Gopinath
- 1988

254 | Algorithms for Mining Distance-Based Outliers
- Knorr, Ng
- 1998
Citation Context ...inspection of plots of the original data. Such plots help one define the nature of the data. The plot of the data against time is often sufficient to identify very extreme observations, i.e., outliers =-=[1, 18, 19, 4]-=-. However, outlier detection techniques are expensive compared with the wavelet transform and cannot capture multi-level surprises. In this paper, we also try to provide an efficient way fo... |

233 |
Outliers in statistical data
- Barnett, Lewis
- 1994
Citation Context ...inspection of plots of the original data. Such plots help one define the nature of the data. The plot of the data against time is often sufficient to identify very extreme observations, i.e., outliers =-=[1, 18, 19, 4]-=-. However, outlier detection techniques are expensive compared with the wavelet transform and cannot capture multi-level surprises. In this paper, we also try to provide an efficient way fo... |

203 | Efficient Time Series Matching by Wavelets
- Chan, Fu
Citation Context ...or similarity searching; meanwhile, a query processing algorithm that uses the underlying R-tree index of a multidimensional data set is provided to answer similarity queries efficiently. Chan et al. =-=[7]-=- use the wavelet transform instead of the Fourier transform and also keep the first few coefficients for similarity searching. The shortcoming of these methods is that they drop the surprises. 3. A Tree... |

202 |
The analysis of Time-Series: An introduction
- Chatfield
Citation Context ...w on our future work. 2. Related work. Traditional methods of time series analysis are mainly concerned with decomposing a series into a trend, a seasonal variation, and other “irregular” fluctuations =-=[8, 14]-=-: $Y_t = T_t + S_t + Z_t$, where $T_t$ is the “trend” component, $S_t$ is the “seasonal” component, and $Z_t$ is the “irregular” or “random” component. In this model, “trend” is loosely defined as “long term change i... |

195 |
Zur Theorie der orthogonalen Funktionensysteme
- Haar
- 1910
Citation Context ... is called wavelet synthetic filters, denoted as $H_s$, $G_s$. They are uniquely determined by the wavelet transform. For example, for the Haar wavelet, the simplest and most popular wavelet given by Haar =-=[15]-=-, the wavelet analysis filters associated with the Haar wavelet are: $H_a = (1/\sqrt{2},\ 1/\sqrt{2})$, $G_a = (-1/\sqrt{2},\ 1/\sqrt{2})$. The wavelet synthetic filters associated with the Haar wavelet are: $H_s = (1/\sqrt{2},\ 1/\sqrt{2})$, $G_s$ = (... |
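
A minimal sketch of how the Haar analysis filters quoted above act on a sequence, assuming the usual filter-then-downsample-by-2 scheme over non-overlapping pairs. The function names `haar_split` and `haar_merge` are hypothetical and are not taken from the paper, and the sign and ordering convention of the high-pass output is an assumption read off the reconstructed $G_a$.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_split(x):
    """One level of Haar analysis on an even-length sequence.

    Returns (trend, surprise): low-pass and high-pass outputs,
    each half the length of the input.
    """
    if len(x) % 2 != 0:
        raise ValueError("input length must be even")
    pairs = range(0, len(x), 2)
    trend = [(x[i] + x[i + 1]) / SQRT2 for i in pairs]       # H_a = (1/sqrt2, 1/sqrt2)
    surprise = [(-x[i] + x[i + 1]) / SQRT2 for i in pairs]   # G_a = (-1/sqrt2, 1/sqrt2)
    return trend, surprise


def haar_merge(trend, surprise):
    """Inverse of haar_split: rebuild the original sequence."""
    x = []
    for a, d in zip(trend, surprise):
        x.append((a - d) / SQRT2)
        x.append((a + d) / SQRT2)
    return x


signal = [2.0, 4.0, 6.0, 8.0]
trend, surprise = haar_split(signal)
restored = haar_merge(trend, surprise)
print(trend, surprise, restored)
```

Applying `haar_split` again to the trend output gives the next coarser level, which is one plausible way to read the multi-level tree the excerpts describe.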

171 | WaveCluster: A multi-resolution clustering approach for very large spatial databases - Sheikholeslami, Chatterjee, et al. - 1998 |

149 | The lifting scheme: A new philosophy in biorthogonal wavelet constructions
- Sweldens
- 1995
Citation Context ...hin our time-series datablade. Consequently, TSA-trees can be constructed (and updated) automatically, given time-series sequences as inputs. Next, we will use the second generation wavelet transform =-=[29, 30]-=-, to better adapt to day, week and month scheme instead of using down/up-sampling by 2. Finally, so far we only studied trends and surprises on temporal data (i.e., time-series). However, with our app... |

137 | On similarity-based queries for time series data
- Rafiei
- 1999
Citation Context ...to add new coefficients at the end of each TSA-tree node. In the database literature, much work has been done on querying time series data. Most of the effort, however, has been on similarity queries =-=[2, 3, 6, 13, 26, 25, 28]-=-. Some of these works can be considered as a complement to our study, to perform similarity match on trends and surprises (i.e., TSA-tree nodes). With data mining applications, it is often necessary t... |

100 |
Wavelets: A tutorial of theory and applications
- Chui
- 121

100 | Efficiently Supporting Ad Hoc Queries in Large Datasets of Time Sequences
- Korn, Jagadish, et al.
- 1997
Citation Context ...gy. The resulting condensed OTSA-tree, hence, becomes a competitor to other techniques discussed in the literature such as Discrete Fourier Transform (DFT) [2, 3] and Singular Value Decomposition (SVD) =-=[20]-=-. Therefore, we conducted comprehensive experiments to compare our results with those techniques. Briefly, we outperformed SVD and DFT in both performance and accuracy for our application. This is bec... |

84 | A linear method for deviation detection in large databases
- Arning, Agrawal, et al.
- 1996
Citation Context ...inspection of plots of the original data. Such plots help one define the nature of the data. The plot of the data against time is often sufficient to identify very extreme observations, i.e., outliers =-=[1, 18, 19, 4]-=-. However, outlier detection techniques are expensive compared with the wavelet transform and cannot capture multi-level surprises. In this paper, we also try to provide an efficient way fo... |

77 | Approximate Queries and Representations for Large Data Sequences
- Shatkay, Zdonik
- 1996
Citation Context ... cannot handle multi-level trends, i.e., how to deal with “short term” trends as well as “long term” trends. The database community typically uses a curve-fitting method to obtain trends and movement. In =-=[21, 28]-=-, straight lines are used to approximate the time series, and a divide-and-conquer technique is employed to segment it before approximation, which is time-consuming and loses surprise information. Qu et al.... |

59 | Finding intensional knowledge of distance-based outliers
- Knorr, Ng
- 1999

50 |
Technical analysis of stock trends
- Edwards, Magee
- 1969
Citation Context ...so that $\sum_i a_i = 1$, and then the operation is often referred to as a moving average. Moving average is discussed in detail by Kendall [17], and it is widely used in stock data analysis (for example, see =-=[27]-=-). However, model (1) requires pre-defined fitting functions for $T_t$ and $S_t$, and also assumes $Z_t$ is a stationary time series, which in some practical cases (sharp changes) is infeasible, and models (1)... |

38 |
On Autoregressive Time-Series
- Kendall
- 1944
Citation Context ...ns and estimate the local mean, we should clearly choose the weights so that $\sum_i a_i = 1$, and then the operation is often referred to as a moving average. Moving average is discussed in detail by Kendall =-=[17]-=-, and it is widely used in stock data analysis (for example, see [27]). However, model (1) requires pre-defined fitting functions for $T_t$ and $S_t$, and also assumes $Z_t$ is a stationary time series, which ... |
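
The moving average described in this excerpt can be sketched as a weighted sliding window whose weights sum to 1. The function name `moving_average`, the window alignment (only positions where the window fits entirely), and the sample data are assumptions made for illustration; the excerpt does not specify them.

```python
def moving_average(series, weights):
    """Weighted moving average over every full window of the series.

    weights -- the coefficients a_i; they must sum to 1 so that the
               output estimates the local mean.
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    k = len(weights)
    return [
        sum(w * series[i + j] for j, w in enumerate(weights))
        for i in range(len(series) - k + 1)
    ]


# Hypothetical data: an equal-weight window of length 3.
smoothed = moving_average([1.0, 2.0, 3.0, 4.0, 5.0], [1 / 3, 1 / 3, 1 / 3])
print(smoothed)
```

A single fixed window length captures one level of smoothing only, which is the limitation the surrounding excerpts contrast with multi-level trends.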

36 | Supporting fast search in time series for movement patterns in multiple scales
- Qu, Wang, et al.
- 1998
Citation Context ...straight lines are used to approximate the time series, and a divide-and-conquer technique is employed to segment it before approximation, which is time-consuming and loses surprise information. Qu et al. =-=[24]-=- use line-fitting to obtain multi-scale movement. In their study, a pattern is defined as a regular expression of letters, where each letter describes a movement direction and covers a specified leng... |

23 |
MALM: A framework for mining sequence database at multiple abstraction levels
- Li, Yu, et al.
- 1998
Citation Context ... cannot handle multi-level trends, i.e., how to deal with “short term” trends as well as “long term” trends. The database community typically uses a curve-fitting method to obtain trends and movement. In =-=[21, 28]-=-, straight lines are used to approximate the time series, and a divide-and-conquer technique is employed to segment it before approximation, which is time-consuming and loses surprise information. Qu et al.... |

6 |
Querying shapes of histories. VLDB
- Agrawal, Psaila, et al.
- 1995
Citation Context ...to add new coefficients at the end of each TSA-tree node. In the database literature, much work has been done on querying time series data. Most of the effort, however, has been on similarity queries =-=[2, 3, 6, 13, 26, 25, 28]-=-. Some of these works can be considered as a complement to our study, to perform similarity match on trends and surprises (i.e., TSA-tree nodes). With data mining applications, it is often necessary t... |

5 |
Introduction to Statistical Time Series. Wiley series in probability and statistics
- Fuller
- 1996
Citation Context ...w on our future work. 2. Related work. Traditional methods of time series analysis are mainly concerned with decomposing a series into a trend, a seasonal variation, and other “irregular” fluctuations =-=[8, 14]-=-: $Y_t = T_t + S_t + Z_t$, where $T_t$ is the “trend” component, $S_t$ is the “seasonal” component, and $Z_t$ is the “irregular” or “random” component. In this model, “trend” is loosely defined as “long term change i... |

4 |
Fast subsequence matching in time-series database
- Faloutsos, Ranganathan, Manolopoulos
- 1993
Citation Context ...to add new coefficients at the end of each TSA-tree node. In the database literature, much work has been done on querying time series data. Most of the effort, however, has been on similarity queries =-=[2, 3, 6, 13, 26, 25, 28]-=-. Some of these works can be considered as a complement to our study, to perform similarity match on trends and surprises (i.e., TSA-tree nodes). With data mining applications, it is often necessary t... |

4 |
Similarity-Based Query for Time
- Rafiei, Mendelzon
- 1996

1 | Implicit-explicit methods for time-dependent partial differential equations - Ascher, Ruuth, Wetton - 1995 |

1 | Efficient time series matching by wavelets. ICDE - Chan, Fu - 1999 |

1 | 2D TSA-tree: A Wavelet-Based Approach to Improve the Efficiency of Multi-Level Spatial Data Mining - Shahabi, Safar, Hajj - 2001 |