## Rule discovery from time series (1997)

Venue: In Proceedings of the 1997 ACM SIGKDD International Conference, ACM SIGKDD

Citations: 143 - 0 self

### BibTeX

@INPROCEEDINGS{Das97rulediscovery,

author = {Gautam Das and King-ip Lin and Heikki Mannila and Gopal Renganathan and Padhraic Smyth},

title = {Rule discovery from time series},

booktitle = {In Proceedings of the 1997 ACM SIGKDD International Conference, ACM SIGKDD},

year = {1997}

}

### Abstract

We consider the problem of finding rules relating patterns in a time series to other patterns in that series, or patterns in one series to patterns in another series. A simple example is a rule such as "a period of low telephone call activity is usually followed by a sharp rise in call volume". Examples of rules relating two or more time series are "if the Microsoft stock price goes up and Intel falls, then IBM goes up the next day," and "if Microsoft goes up strongly for one day, then declines strongly on the next day, and on the same days Intel stays about level, then IBM stays about level." Our emphasis is on the discovery of local patterns in multivariate time series, in contrast to traditional time series analysis, which largely focuses on global models. Thus, we search for rules whose conditions refer to patterns in time series. However, we do not want to define beforehand which patterns are to be used; rather, we want the patterns to be formed from the data in the context of rule discovery. We describe adaptive methods for finding rules of the above type from time-series data. The methods are based on discretizing the sequence by methods resembling vector quantization. We first form subsequences by sliding a window through the time series, and then cluster these subsequences by using a suitable measure of time-series similarity. The discretized version of the time series is obtained by taking the cluster identifiers corresponding to the subsequences. Once the time series is discretized, we use simple rule-finding methods to obtain rules from the sequence. We present empirical results on the behavior of the method.
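The pipeline summarized in the abstract (slide a window, cluster the subsequences, replace each subsequence by its cluster identifier, then look for rules over the resulting symbol sequence) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the `radius` parameter, the first-fit greedy clustering, the Euclidean distance, and the rule form "if symbol a occurs, then symbol b occurs within T steps" with confidence counted as shown are all choices made here for the example.

```python
from collections import Counter
from math import dist  # Euclidean distance, Python 3.8+


def sliding_windows(series, w):
    """All contiguous subsequences of length w, in order of start position."""
    return [tuple(series[i:i + w]) for i in range(len(series) - w + 1)]


def greedy_cluster(windows, radius):
    """First-fit greedy clustering (an assumed stand-in for the paper's method).

    Each window joins the first existing center within `radius`;
    otherwise it becomes a new center.
    """
    centers, labels = [], []
    for win in windows:
        for idx, center in enumerate(centers):
            if dist(win, center) <= radius:
                labels.append(idx)
                break
        else:  # no center was close enough
            centers.append(win)
            labels.append(len(centers) - 1)
    return centers, labels


def discretize(series, w, radius):
    """Replace each window by its cluster identifier."""
    centers, labels = greedy_cluster(sliding_windows(series, w), radius)
    return labels, centers


def simple_rules(symbols, T, min_conf):
    """Rules (a, b, confidence): 'if a occurs, b occurs within the next T steps'.

    Confidence is assumed to be (# occurrences of a followed by b within T)
    divided by (# occurrences of a).
    """
    freq = Counter(symbols)
    follow = Counter()
    for i, a in enumerate(symbols):
        for b in set(symbols[i + 1:i + 1 + T]):
            follow[(a, b)] += 1
    rules = [(a, b, cnt / freq[a]) for (a, b), cnt in follow.items()
             if cnt / freq[a] >= min_conf]
    return sorted(rules, key=lambda r: -r[2])


# Usage on a toy periodic series (illustrative data only).
series = [0, 0, 1, 1, 0, 0, 1, 1, 0, 0]
labels, centers = discretize(series, w=2, radius=0.1)
rules = simple_rules(labels, T=1, min_conf=0.9)
```

With first-fit assignment, two windows in the same cluster can be up to `2 * radius` apart, so `radius` only loosely bounds the cluster diameter; any other clustering method or time-series distance could be substituted without changing the rest of the sketch.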

### Citations

2429 | Mining association rules between sets of items in large databases - Agrawal, Imielinski, et al. - 1993 |

2143 |
Algorithms for Clustering Data
- Jain, Dubes
- 1988
Citation Context ...he clustering. Recall that w is one of the parameters to the system; it is used to define the set W(s). In principle, any clustering algorithm can be used to cluster the subsequences in W(s); see (Jain & Dubes 1988; Kaufman & Rousseeuw 1990) for overviews. We have experimented with the following two methods. The first method is a greedy method for producing clusters with at most a given diameter. Treat each sub... |

1648 |
Vector Quantization and Signal Compression
- Gersho, Gray
- 1992
Citation Context ...as "primitives" in a more abstract representation of a signal is not a new idea. Essentially the same method is used in the well-known vector quantization (VQ) method of data compression (see, e.g., (Gersho & Gray 1992)). VQ is based on the notion of replacing local windows (of size w) (of signals or images) by pattern centroids determined by an algorithm quite similar to k-means clustering. For data compression, o... |

1323 |
Finding Groups in Data: An Introduction to Cluster Analysis
- Kaufman, Rousseeuw
- 1998
Citation Context ...ll that w is one of the parameters to the system; it is used to define the set W(s). In principle, any clustering algorithm can be used to cluster the subsequences in W(s); see (Jain & Dubes 1988; Kaufman & Rousseeuw 1990) for overviews. We have experimented with the following two methods. The first method is a greedy method for producing clusters with at most a given diameter. Treat each subsequence in W(s) as a poi... |

1174 | Mining Sequential Patterns
- Agrawal, Srikant
- 1995
Citation Context ...erns. Thus our rule discovery method aims at finding local relationships from the series, in the spirit of association rules, sequential patterns, or episode rules (Agrawal, Imielinski, & Swami 1993; Agrawal & Srikant 1995; Mannila, Toivonen, & Verkamo 1997). Unlike traditional time series modeling, we do not seek a global model for the time series, instead searching for local patterns in a relatively non-parametric ma... |

490 | Beyond market basket: generalizing association rules to correlations - Brin, Motwani, et al. |

472 |
Fast discovery of association rules
- Agrawal, Mannila, et al.
- 1996
Citation Context ...etters in the alphabet and m is the number of different possibilities for T. 1 Note that this differs from the usage of frequency or support for association rules (Agrawal, Imielinski, & Swami 1993; Agrawal et al. 1996), where frequency is defined as the fraction of objects that satisfy the left- and right-hand sides of the rule. Informative Rules The above method produces lots of rules, with varying confidences. Fo... |

413 | Efficient similarity search in sequence databases - Agrawal, Faloutsos, et al. - 1993 |

300 | Discovery of frequent episodes in event sequences. Data Mining and Knowledge Discovery - Mannila, Toivonen, et al. - 1997 |

204 | Finding Interesting Rules from Large Sets of Discovered Association Rules
- Klemettinen, Mannila, et al.
- 1994
Citation Context ...ith varying confidences. For interactive knowledge discovery, a good strategy is to allow the user to browse through rule sets and provide tools for the selection of interesting rules (Kloesgen 1995; Klemettinen et al. 1994; Brin, Motwani, & Silverstein 1997). Nonetheless, no single significance criterion can probably suffice to select the most valuable rules. Still, the user needs some guidance in determining which rul... |

168 |
Using dynamic time warping to find patterns in time series
- Berndt, Clifford
- 1994
Citation Context ...e standard deviation of the sequence) forcing the mean to be 0 and the variance 1. Recently, more sophisticated time series distance measures have been investigated, such as the dynamic time warping (Berndt & Clifford 1994) measure, the longest common subsequence measure (Das, Gunopulos, & Mannila 1997; Bollobás et al. 1997), and various probabilistic distance measures (Keogh & Smyth 1997). Due to space limitations we... |

168 |
Discovery, analysis, and presentation of strong rules. Knowledge Discovery in Databases
- Piatetsky-Shapiro
- 1991
Citation Context ...ill, the user needs some guidance in determining which rules have a confidence that differs substantially from the expected. There are a variety of metrics which can be used to rank rules (e.g., see (Piatetsky-Shapiro 1991) for a general overview of such methods). Here we use the J-measure for rule-ranking (Smyth & Goodman 1991; 1992) defined as: $J(B_T; A) = p(A) \cdot \big[\, p(B_T \mid A) \log \frac{p(B_T \mid A)}{p(B_T)} + (1 - p(B_T \mid A))$ ... |

137 | On similarity-based queries for time series data
- Rafiei, Mendelzon
- 1999
Citation Context ...adings in the Pacific. There has been a lot of interest in querying time series on the basis of similarity (see, e.g., (Agrawal, Faloutsos, & Swami 1993; Shatkay & Zdonik 1996; Agrawal et al. 1995; Rafiei & Mendelzon 1997; Yazdani & Ozsoyoglu 1996)). In this paper we are interested in finding rules relating the behavior of patterns within a sequence over time, or the relationship between two or more sequences over tim... |

107 | Querying shapes of histories
- Agrawal, Psaila, et al.
- 1995
Citation Context ...urface temperature readings in the Pacific. There has been a lot of interest in querying time series on the basis of similarity (see, e.g., (Agrawal, Faloutsos, & Swami 1993; Shatkay & Zdonik 1996; Agrawal et al. 1995; Rafiei & Mendelzon 1997; Yazdani & Ozsoyoglu 1996)). In this paper we are interested in finding rules relating the behavior of patterns within a sequence over time, or the relationship between two o... |

100 | A Probabilistic Approach to Fast Pattern matching in Time Series Databases
- Keogh, Smyth
- 1997
Citation Context ...the dynamic time warping (Berndt & Clifford 1994) measure, the longest common subsequence measure (Das, Gunopulos, & Mannila 1997; Bollobás et al. 1997), and various probabilistic distance measures (Keogh & Smyth 1997). Due to space limitations we omit the details of their use but note that the results below can be easily generalized to handle any such distance measures. Clustering methods The first step in the di... |

91 |
An Information Theoretic Approach to Rule Induction from Databases
- Smyth, Goodman
- 1992
Citation Context ... The second term is well-known as the cross-entropy, namely the information gained (or degree of surprise) in going from a prior probability $p(B_T)$ to a posterior probability $p(B_T \mid A)$. As shown in (Smyth & Goodman 1992) the product of the two terms (the J-measure above) has unique properties as a rule information measure and is in a certain sense a special case of Shannon's mutual information. From a practical view... |

87 | Finding Similar Time Series - Das, Gunopulos, et al. - 1997 |

78 |
Rule induction using information theory
- Smyth, Goodman
- 1990
Citation Context ...tingness is estimated informativeness, i.e., whether the rule gives additional information about the sequences. We can assign a measure of informativeness to the discovered rules using the J-measure (Smyth & Goodman 1991; 1992). One can also run the method for several choices of the parameters and let the user browse the different rule sets. The running time of the method is small enough so that this is feasible. An ... |

77 | Approximate Queries and Representations for Large Data Sequences
- Shatkay, Zdonik
- 1996
Citation Context ...Europe, and daily sea-surface temperature readings in the Pacific. There has been a lot of interest in querying time series on the basis of similarity (see, e.g., (Agrawal, Faloutsos, & Swami 1993; Shatkay & Zdonik 1996; Agrawal et al. 1995; Rafiei & Mendelzon 1997; Yazdani & Ozsoyoglu 1996)). In this paper we are interested in finding rules relating the behavior of patterns within a sequence over time, or the relat... |

33 | Time-Series Similarity Problems and Well-Separated Geometric Sets
- Bollobás, Das, et al.
Citation Context ...icated time series distance measures have been investigated, such as the dynamic time warping (Berndt & Clifford 1994) measure, the longest common subsequence measure (Das, Gunopulos, & Mannila 1997; Bollobás et al. 1997), and various probabilistic distance measures (Keogh & Smyth 1997). Due to space limitations we omit the details of their use but note that the results below can be easily generalized to handle any s... |

18 |
Efficient discovery of interesting statements in databases
- Kloesgen
- 1995
Citation Context ...ots of rules, with varying confidences. For interactive knowledge discovery, a good strategy is to allow the user to browse through rule sets and provide tools for the selection of interesting rules (Kloesgen 1995; Klemettinen et al. 1994; Brin, Motwani, & Silverstein 1997). Nonetheless, no single significance criterion can probably suffice to select the most valuable rules. Still, the user needs some guidance... |

1 | Finding Groups in Data: An Introduction to Cluster Analysis - Kaufman, Rousseeuw - 1990 |

1 |
Efficient discovery of interesting statements in databases
- Kloesgen
- 1995
Citation Context ...ots of rules, with varying confidences. For interactive knowledge discovery, a good strategy is to allow the user to browse through rule sets and provide tools for the selection of interesting rules (Kloesgen 1995; Klemettinen et al. 1994; Brin, Motwani, & Silverstein 1997). Nonetheless, no single significance criterion can probably suffice to select the most valuable rules. Still, the user needs some guidance... |

1 | Discovery of frequent episodes in event sequences. Data Mining and Knowledge Discovery 1(3):259 - Mannila, Toivonen, et al. - 1997 |