Results 1-10 of 44
Time series knowledge mining
, 2006
Cited by 20 (2 self)
An important goal of knowledge discovery is the search for patterns in data that can help explain the underlying process that generated the data. The patterns are required to be new, useful, and understandable to humans. In this work we present a new method for the understandable description of local temporal relationships in multivariate data, called Time Series Knowledge Mining (TSKM). We define the Time Series Knowledge Representation (TSKR) as a new language for expressing temporal knowledge. The patterns have a hierarchical structure in which each level corresponds to a single temporal concept. On the lowest level, intervals are used to represent duration. Overlapping parts of intervals represent coincidence on the next level. Several such blocks of intervals are connected with a partial order relation on the highest level. Each pattern element consists of a semiotic triple to connect syntactic and semantic information with pragmatics. The patterns are very compact, but offer details for each element on demand. In comparison with related approaches, the TSKR is shown to have advantages in robustness, expressivity, and comprehensibility. Efficient algorithms for the discovery of the patterns are proposed. The search for coincidence as well as partial order can be formulated as variants of the well-known frequent itemset problem. One of the best-known algorithms for this problem is therefore adapted for our purposes. Human interaction is used during the mining to analyze and validate partial results as early as possible and guide further processing steps. The efficacy of the methods is demonstrated using several data sets. In an application to sports medicine the results were recognized as valid and useful by an expert in the field.
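The idea that overlapping parts of intervals represent coincidence can be illustrated with a small sweep over interval boundaries. This is only a sketch under stated assumptions: the function name `coincidence_segments`, the `(label, start, end)` tuple layout, and the rule of keeping stretches with at least two active labels are all hypothetical choices for illustration, not the TSKM discovery algorithm, which the abstract describes as a frequent-itemset variant.

```python
def coincidence_segments(intervals):
    """Find stretches of time during which two or more labeled intervals overlap.

    intervals: list of (label, start, end) tuples with start < end
               (hypothetical layout, chosen for this sketch).
    Returns a list of (start, end, frozenset_of_labels).
    """
    # Every interval boundary is a point where the set of active labels may change.
    points = sorted({t for _, s, e in intervals for t in (s, e)})
    segments = []
    for left, right in zip(points, points[1:]):
        # Labels whose interval covers the whole elementary stretch [left, right].
        active = frozenset(lab for lab, s, e in intervals if s <= left and right <= e)
        if len(active) < 2:
            continue  # a single interval is duration, not coincidence
        if segments and segments[-1][1] == left and segments[-1][2] == active:
            # Same label set continues: extend the previous segment.
            segments[-1] = (segments[-1][0], right, active)
        else:
            segments.append((left, right, active))
    return segments


example = [("A", 0, 10), ("B", 4, 12), ("C", 8, 15)]
for start, end, labels in coincidence_segments(example):
    print(start, end, sorted(labels))
```

For the three example intervals, the sweep reports A and B coinciding on [4, 8], all three on [8, 10], and B and C on [10, 12].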
A better tool than Allen’s relations for expressing temporal knowledge in interval data
, 2006
Cited by 15 (1 self)
Temporal patterns composed of symbolic intervals are commonly formulated with Allen’s interval relations originating in temporal reasoning. We show that this representation has severe disadvantages for knowledge discovery. The patterns are not robust, in the sense that small disturbances of interval boundaries lead to different patterns for similar situations. The representation is ambiguous since the same pattern can have quantitatively widely varying appearances. For all but very simple cases the patterns are not understandable because the textual descriptions are lengthy and unstructured. We present the Time Series Knowledge Representation (TSKR), a new hierarchical language for interval patterns to express the temporal concepts of coincidence and partial order. We demonstrate the superiority of this novel form of representing temporal knowledge over Allen’s relations for data mining. Results on a real data set support our claims and show a successful application.
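The robustness complaint can be made concrete with a minimal sketch that classifies two intervals by Allen's thirteen relations. The function name `allen_relation` and the `(start, end)` tuple layout are assumptions made for this example; the relation names follow the standard terminology for Allen's interval algebra.

```python
def allen_relation(a, b):
    """Return the name of Allen's relation of interval a to interval b.

    Intervals are (start, end) tuples with start < end (assumed layout).
    """
    (a_s, a_e), (b_s, b_e) = a, b
    if a_e < b_s:
        return "before"
    if a_e == b_s:
        return "meets"
    if b_e < a_s:
        return "after"
    if b_e == a_s:
        return "met-by"
    if a_s == b_s and a_e == b_e:
        return "equals"
    if a_s == b_s:
        return "starts" if a_e < b_e else "started-by"
    if a_e == b_e:
        return "finishes" if a_s > b_s else "finished-by"
    if b_s < a_s and a_e < b_e:
        return "during"
    if a_s < b_s and b_e < a_e:
        return "contains"
    # Only partial overlap without shared endpoints remains.
    return "overlaps" if a_s < b_s else "overlapped-by"


b = (10, 20)
for end in (9, 10, 11):
    # Shifting one boundary by a single unit changes the relation each time.
    print((0, end), allen_relation((0, end), b))
```

Moving the end of the first interval from 9 to 10 to 11 yields "before", "meets", and "overlaps" in turn, which is the kind of sensitivity to small boundary disturbances the abstract describes.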
Modified Gath–Geva clustering for fuzzy segmentation of multivariate time-series
, 2005
Indexing of Compressed Time Series
 Data Mining in Time Series Databases
Cited by 13 (3 self)
We describe a procedure for identifying major minima and maxima of a time series, and present two applications of this procedure. The first application is fast compression of a series, by selecting major extrema and discarding the other points. The compression algorithm runs in linear time and takes constant memory. The second application is indexing of compressed series by their major extrema, and retrieval of series similar to a given pattern. The retrieval procedure searches for the series whose compressed representation is similar to the compressed pattern. It allows the user to control the tradeoff between the speed and accuracy of retrieval. We show the effectiveness of the compression and retrieval for stock charts, meteorological data, and electroencephalograms. Keywords. Time series, compression, fast retrieval, similarity measures.
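Compression by major extrema can be sketched as a single pass that keeps a turning point only once the series has moved away from it by some minimum amount. The abstract does not define what makes an extremum "major", so the additive threshold `delta`, the function name `major_extrema`, and the choice to always keep both endpoints are assumptions for illustration; the paper's own criterion may differ.

```python
import math


def major_extrema(series, delta):
    """Return indices of extrema whose swing is at least delta, plus both endpoints.

    One pass over the data; apart from the output list, only a few
    running values are stored.
    """
    kept = []
    lo_val, hi_val = math.inf, -math.inf
    lo_idx = hi_idx = 0
    seeking_max = True
    for i, x in enumerate(series):
        if x > hi_val:
            hi_val, hi_idx = x, i
        if x < lo_val:
            lo_val, lo_idx = x, i
        if seeking_max and x < hi_val - delta:
            # Dropped far enough below the running maximum: confirm it.
            kept.append(hi_idx)
            lo_val, lo_idx = x, i
            seeking_max = False
        elif not seeking_max and x > lo_val + delta:
            # Rose far enough above the running minimum: confirm it.
            kept.append(lo_idx)
            hi_val, hi_idx = x, i
            seeking_max = True
    endpoints = {0, len(series) - 1} if series else set()
    return sorted(set(kept) | endpoints)


series = [1, 2, 1.5, 5, 4.8, 1, 1.2, 6]
indices = major_extrema(series, delta=2)
compressed = [(i, series[i]) for i in indices]
print(compressed)
```

With `delta=2`, the small bumps at indices 1, 2, 4, and 6 are discarded and the compressed series keeps only the points at indices 0, 3, 5, and 7. A larger `delta` gives a coarser compression.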
Extracting interpretable muscle activation patterns with time series knowledge mining
 International Journal of Knowledge-Based & Intelligent Engineering Systems
, 2005
Cited by 13 (5 self)
The understanding of complex muscle coordination is an important goal in human movement science. There are numerous applications in medicine, sports, and robotics. The coordination process can be studied by observing complex, often cyclic movements, which are dynamically repeated in an almost identical manner. The muscle activation is measured using kinesiological EMG. Mining the EMG data to identify patterns that explain the interplay and coordination of muscles is a very difficult knowledge discovery task. We present the Time Series Knowledge Mining framework to discover knowledge in multivariate time series and show how it can be used to extract such temporal patterns.
A Compact and Accurate Model for Classification
 IEEE Transactions on Knowledge and Data Engineering
, 2004
Cited by 10 (6 self)
We describe and evaluate an information-theoretic algorithm for data-driven induction of classification models based on a minimal subset of available features. The relationship between input (predictive) features and the target (classification) attribute is modeled by a tree-like structure termed an information network (IN). Unlike other decision-tree models, the information network uses the same input attribute across the nodes of a given layer (level). The input attributes are selected incrementally by the algorithm to maximize a global decrease in the conditional entropy of the target attribute. We use a pre-pruning approach: when no attribute causes a statistically significant decrease in the entropy, the network construction is stopped. The algorithm is shown empirically to produce much more compact models than other methods of decision-tree learning, while preserving nearly the same level of classification accuracy.
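The incremental selection step can be sketched as a greedy loop that, at each layer, adds the attribute giving the largest drop in the conditional entropy of the target. This is a simplified illustration, not the published algorithm: the function names, the list-of-dicts data layout, and in particular the fixed `min_gain` threshold (standing in for the statistical significance test mentioned in the abstract) are assumptions.

```python
from collections import Counter
from math import log2


def conditional_entropy(rows, target, given):
    """Estimate H(target | given) in bits from a list of dict rows."""
    n = len(rows)
    joint = Counter((tuple(r[a] for a in given), r[target]) for r in rows)
    cond = Counter(tuple(r[a] for a in given) for r in rows)
    return -sum(c / n * log2(c / cond[key]) for (key, _), c in joint.items())


def select_attributes(rows, target, candidates, min_gain=0.01):
    """Greedily pick one attribute per layer while entropy drops by >= min_gain.

    min_gain is a hypothetical stand-in for a significance test (pre-pruning).
    """
    selected = []
    current = conditional_entropy(rows, target, selected)
    remaining = list(candidates)
    while remaining:
        # The same attribute is used for the whole layer, so evaluate the
        # entropy conditioned on all previously selected attributes plus one.
        best = min(
            remaining,
            key=lambda a: conditional_entropy(rows, target, selected + [a]),
        )
        entropy = conditional_entropy(rows, target, selected + [best])
        if current - entropy < min_gain:
            break  # no attribute helps enough: stop building the network
        selected.append(best)
        remaining.remove(best)
        current = entropy
    return selected


# Toy data: y copies attribute "a"; attribute "b" carries no information.
rows = [{"a": a, "b": b, "y": a} for a in (0, 1) for b in (0, 1)]
print(select_attributes(rows, "y", ["b", "a"]))
```

On the toy data only `a` is selected, because conditioning on `b` does not reduce the entropy of `y` at all.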
Discovering System Health Anomalies Using Data Mining Techniques
 Proceedings of the Joint Army Navy NASA Air Force Conference on Propulsion
, 2005
Cited by 9 (1 self)
We discuss a statistical framework that underlies envelope detection schemes as well as dynamical models based on Hidden Markov Models (HMM) that can encompass both discrete and continuous sensor measurements for use in Integrated System Health Management (ISHM) applications. The HMM allows for the rapid assimilation, analysis, and discovery of system anomalies. We motivate our work with a discussion of an aviation problem where the identification of anomalous sequences is essential for safety reasons. The data in this application are discrete and continuous sensor measurements and can be dealt with seamlessly using the methods described here to discover anomalous flights. We specifically treat the problem of discovering anomalous features in the time series that may be hidden from the sensor suite and compare those methods to standard envelope detection methods on test data designed to accentuate the differences between the two methods. Identification of these hidden anomalies is crucial to building stable, reusable, and cost-efficient systems. We also discuss a data mining framework for the analysis and discovery of anomalies in high-dimensional time series of sensor measurements that would be found in an ISHM system. We conclude with recommendations that describe the tradeoffs in building an integrated scalable platform for robust anomaly detection in ISHM applications.
Unsupervised Temporal Rule Mining with Genetic Programming and Specialized Hardware
, 2003
Cited by 8 (1 self)
Rule mining is the practice of discovering interesting and unexpected rules from large data sets. Depending on the exact problem formulation, this may be a very complicated problem. Existing methods typically make strong simplifying assumptions about the form of the rules, and limit the measure of rule quality to simple properties, such as confidence. Because confidence in itself is not a good indicator of how interesting a rule is to the user, the mined rules are typically sorted according to some secondary interestingness measure. In this paper we present a rule mining method that is based on genetic programming. Because we use specialized pattern matching hardware to evaluate each rule, our method supports a very wide range of rule formats, and can use any reasonable fitness measure. We develop a fitness measure that is well-suited for our method, and give empirical results of applying the method to synthetic and real-world data sets.
Mining hierarchical temporal patterns in multivariate time series
 Proceedings of the 27th Annual German Conference in Artificial Intelligence (KI’04)
, 2004
Cited by 8 (2 self)
The Unification-based Temporal Grammar is a temporal extension of static unification-based grammars. It defines a hierarchical temporal rule language to express complex patterns present in multivariate time series. The Temporal Data Mining Method is the accompanying framework to discover temporal knowledge based on this rule language. A semiotic hierarchy of temporal patterns, which are not given a priori, is built in a bottom-up manner from static logical descriptions of multivariate time instants. We demonstrate the methods using music data, extracting typical parts of songs.