## HOT SAX: Efficiently finding the most unusual time series subsequence (2005)

### Download Links

- [www.cs.cuhk.hk]
- [www.cse.cuhk.edu.hk]
- DBLP

### Other Repositories/Bibliography

Citations: | 108 - 5 self |

### Citations

10600 | Introduction to algorithms
- Cormen, Leiserson, et al.
- 1995
Citation Context ...o smaller sub-problems, which can be solved and admissibly recombined. Depending on the exact definitions, such techniques are variously called dynamic programming, divide and conquer, bottom-up, etc =-=[3]-=-. Unfortunately, as we show below, such ideas are unlikely to help us efficiently find discords. Imagine that we break a time series T into two sections, A and B, and that we find the discords for bot... |

325 | On the need for time series data mining benchmarks: A survey and empirical demonstration.
- Keogh, Kasetty
- 2002
Citation Context ...ering. 1. INTRODUCTION The previous decade has seen hundreds of papers on time series similarity search, which is the task of finding a time series that is most similar to a particular query sequence =-=[5]-=-. In this work, we pose the new problem of finding the sequence that is least similar to all other sequences. We call such sequences time series discords. Figure 1 gives a visual intuition of a time s... |

315 | A symbolic representation of time series, with implications for streaming algorithms.
- Lin, Keogh, et al.
- 2003
Citation Context ...ues for approximating the perfect ordering returned by the hypothetical Magic heuristics, we must briefly review the Symbolic Aggregate ApproXimation (SAX) representation of time series introduced in =-=[10]-=-. While there are at least 200 different symbolic approximations of time series in the literature, SAX is unique in that it is the only one that allows both dimensionality reduction and lower bounding ... |

185 | Exact discovery of time series motifs.
- Mueen, Keogh, et al.
- 2009
Citation Context ...r future work. 2. RELATED WORK AND BACKGROUND Our review of related work is exceptionally brief because we are considering a new problem. Most real valued time series problems such as motif discovery =-=[2]-=-, longest common subsequence matching, sequence averaging, segmentation, indexing [5], etc. have approximate or exact analogues in the discrete world, and have been addressed by the text processing or... |

185 | Distance-based outliers: Algorithms and applications.
- Knorr, Ng, et al.
- 2000
Citation Context ... [5], so one might imagine that such a representation would be useful for the task at hand. We could simply project our time series into n-dimensional space and use existing outlier detection methods =-=[7]-=-. The problem with this idea is the unintuitive fact that discords do not necessarily live in sparse areas of n-dimensional space (Conversely, repeated patterns do not necessarily live in dense parts ... |

165 | Fast algorithms for sorting and searching strings.
- Bentley, Sedgewick
- 1997
Citation Context ...ntly visiting). However, if we want to know the location of the other occurrence, we must visit the trie. Surprisingly, both data structures can be created in time and space linear in the length of T =-=[1]-=-. In fact, if we take advantage of the fact that we only need $\lceil \log_2(\alpha) \rceil$ bits for each SAX symbol, then both data structures are significantly smaller than the raw time series data they were derived fr... |

145 | Towards parameter-free data mining
- Keogh, Lonardi, et al.
- 2004
Citation Context ...vely recent introduction, SAX has become an important tool in the time series data mining toolbox. It has been used to find time series motifs [2], to mine rules in health data, for anomaly detection =-=[6]-=-, to extract features from a hepatitis database, for visualization [8][11], and a host of other data mining tasks. 4.1 A Brief Review of SAX A time series C of length n can be represented in a w-dimens... |

68 | Compressed text databases with efficient query algorithms based on the compressed suffix array. - Sadakane - 2000 |

50 | Visually Mining and Monitoring Massive Time Series.
- Lin, Keogh, et al.
- 2004
Citation Context ...ries data mining toolbox. It has been used to find time series motifs [2], to mine rules in health data, for anomaly detection [6], to extract features from a hepatitis database, for visualization [8]=-=[11]-=-, and a host of other data mining tasks. 4.1 A Brief Review of SAX A time series C of length n can be represented in a w-dimensional space by a vector $\bar{C} = \bar{c}_1, \ldots, \bar{c}_w$. The $i$th element of $\bar{C}$ is calcula... |

34 | Time-series bitmaps: A practical visualization tool for working with large time series databases.
- Kumar, Lolla, et al.
- 2005
Citation Context ... series data mining toolbox. It has been used to find time series motifs [2], to mine rules in health data, for anomaly detection [6], to extract features from a hepatitis database, for visualization =-=[8]-=-[11], and a host of other data mining tasks. 4.1 A Brief Review of SAX A time series C of length n can be represented in a w-dimensional space by a vector $\bar{C} = \bar{c}_1, \ldots, \bar{c}_w$. The $i$th element of $\bar{C}$ is cal... |
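
The context above breaks off before it says how each element of the reduced vector is computed. As a rough illustration only, the sketch below assumes the usual piecewise aggregate approximation, in which each of the w elements is the mean of one equal-length segment of C. The function name `paa` and the restriction that w divides n are assumptions made for this example, not details taken from the snippet.

```python
def paa(series, w):
    """Reduce a length-n series to w segment means.

    Assumption for this sketch: w divides n exactly, so every
    segment has the same number of points.
    """
    n = len(series)
    if w <= 0 or n % w != 0:
        raise ValueError("this sketch assumes w divides len(series)")
    seg = n // w  # number of points per segment
    # The i-th output element is the mean of the i-th segment.
    return [sum(series[i * seg:(i + 1) * seg]) / seg for i in range(w)]
```

For instance, `paa([1, 2, 3, 4, 5, 6], 3)` averages the pairs (1, 2), (3, 4) and (5, 6).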

13 | Motif Discovery Algorithm from Motion Data.
- Tanaka, Uehara
- 2004
Citation Context ...gs of the Fifth IEEE International Conference on Data Mining (ICDM’05) 1550-4786/05 $20.00 © 2005 IEEE bioinformatics community and increasingly understood in the time series data mining community [2]=-=[13]-=-. We will therefore use the definition of non-self matches to define time series discords: Definition 6. Time Series Discord: Given a time series T, the subsequence D of length n beginning at position... |
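
Definition 6 is cut off in the context above. The sketch below assumes the common reading of it: the discord is the subsequence whose distance to its nearest non-self match is largest, where two subsequences starting at p and q are non-self matches when |p - q| >= n. Euclidean distance and the names `euclidean` and `brute_force_discord` are assumptions for this example. It is a plain quadratic search, not the paper's heuristic ordering.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def brute_force_discord(T, n):
    """Return (location, distance) of the length-n discord of T.

    Assumed definition: the subsequence with the largest distance
    to its nearest non-self match (start positions at least n apart).
    """
    m = len(T)
    best_dist, best_loc = -1.0, None
    for p in range(m - n + 1):
        nearest = math.inf
        for q in range(m - n + 1):
            if abs(p - q) >= n:  # skip trivial (overlapping) matches
                nearest = min(nearest, euclidean(T[p:p + n], T[q:q + n]))
        # Keep the candidate whose nearest neighbour is farthest away.
        if nearest != math.inf and nearest > best_dist:
            best_dist, best_loc = nearest, p
    return best_loc, best_dist
```

On a short periodic series with a single spike, the returned location falls on a subsequence that contains the spike. Ties are broken in favour of the earliest position.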

3 | Distinguishing string selection problems, Information and Computation 185: pp 41–55
- Wang, S, et al.
- 2003
Citation Context ...er, time series discords do not appear to have a discrete version. Note that the superficially similar sounding Furthest (Sub)String Problem requires us to build a string, not to find one in the data =-=[9]-=-. 2.1 Notation For concreteness, we begin with a definition of our data type of interest, time series: Definition 1. Time Series: A time series $T = t_1, \ldots, t_m$ is an ordered set of m real-valued variabl... |