Efficient Searches for Similar Subsequences of Different Lengths in Sequence Databases
- In ICDE, 2000
Cited by 35 (4 self)
We propose an indexing technique for fast retrieval of similar subsequences using time warping distances. A time warping distance is a more suitable similarity measure than the Euclidean distance in many applications, where sequences may be of different lengths or different sampling rates. Our indexing technique uses a disk-based suffix tree as an index structure and employs lower-bound distance functions to filter out dissimilar subsequences without false dismissals. To make the index structure compact and thus accelerate the query processing, we convert sequences of continuous values to sequences of discrete values via a categorization method and store only a subset of suffixes whose first values are different from their preceding values. The experimental results reveal that our proposed technique can be a few orders of magnitude faster than sequential scanning.
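To make the similarity measure concrete, here is a minimal sketch of a time warping distance computed by dynamic programming. The abstract does not state the exact definition, so the per-step cost |x - y| and the summed accumulation are assumptions, and the name dtw_distance is hypothetical.

```python
from math import inf


def dtw_distance(a, b):
    """Time warping distance between two non-empty numeric sequences.

    Assumed definition: per-step cost |x - y|, accumulated by summation.
    """
    n, m = len(a), len(b)
    # cost[i][j] holds the distance between the prefixes a[:i] and b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = step + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]
```

Because one element may be matched to several elements of the other sequence, sequences of different lengths can still be at distance zero, for example [1, 2, 3] and [1, 2, 2, 3].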
An Index-Based Approach for Similarity Search Supporting Time Warping in Large Sequence Databases
- In ICDE, 2001
Cited by 35 (2 self)
This paper discusses effective processing of similarity search that supports time warping in large sequence databases. Time warping enables finding sequences with similar patterns even when they are of different lengths. Previous methods for processing similarity search that supports time warping fail to employ multi-dimensional indexes without false dismissal since the time warping distance does not satisfy the triangle inequality. They have to scan the entire database and thus suffer from serious performance degradation in large databases. Another method that uses the suffix tree, which does not assume any distance function, also shows poor performance due to the large tree size. In this paper, we propose a novel method for similarity search that supports time warping. Our primary goal is to improve search performance in large databases without permitting any false dismissal. To attain this goal, we devise a new distance function D_tw-lb that consistently unde...
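The idea of filtering with a lower-bounding distance can be sketched as follows. The abstract is truncated before it defines D_tw-lb, so the bound below is only an illustration and not necessarily the paper's function. It assumes a time warping distance that sums |x - y| costs, for which the largest difference between the first, last, maximum, and minimum values of two sequences never exceeds the true distance. All function names are hypothetical.

```python
from math import inf


def dtw_distance(a, b):
    """Assumed time warping distance: summed |x - y| costs."""
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


def lower_bound(a, b):
    """Cheap bound from four features: first, last, max, and min value.

    Every warping path pairs the two first elements and the two last
    elements, and pairs each extreme value with some element of the other
    sequence, so each feature difference is at most the full distance.
    """
    feats_a = (a[0], a[-1], max(a), min(a))
    feats_b = (b[0], b[-1], max(b), min(b))
    return max(abs(x - y) for x, y in zip(feats_a, feats_b))


def range_query(database, query, eps):
    """Return all sequences within eps of the query, with no false dismissal."""
    hits = []
    for seq in database:
        if lower_bound(seq, query) > eps:
            continue  # the true distance is at least the bound, so skip safely
        if dtw_distance(seq, query) <= eps:
            hits.append(seq)
    return hits
```

Since each sequence reduces to a fixed four-value feature vector, such vectors could in principle be stored in a multi-dimensional index; the linear scan above only illustrates the filter-and-refine step.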
Density-connected sets and their application for trend detection in spatial databases
- PROC. 3RD INT. CONF. KNOWLEDGE DISCOVERY AND DATA MINING (KDD'97), 1997
Cited by 20 (3 self)
Several clustering algorithms have been proposed for class identification in spatial databases such as earth observation databases. The effectiveness of well-known algorithms such as DBSCAN, however, is somewhat limited because they do not fully exploit the richness of the different types of data contained in a spatial database. In this paper, we introduce the concept of density-connected sets and present a significantly generalized version of DBSCAN. The major properties of this algorithm are as follows: (1) any symmetric predicate can be used to define the neighborhood of an object, allowing a natural definition in the case of spatially extended objects such as polygons, and (2) the cardinality function for a set of neighboring objects may take into account the non-spatial attributes of the objects as a means of assigning application-specific weights. Density-connected sets can be used as a basis to discover trends in a spatial database. We define trends in spatial databases and show how to apply the generalized DBSCAN algorithm for the task of discovering such knowledge. To demonstrate the practical impact of our approach, we performed experiments on a geographical information system on Bavaria which is representative of a broad class of spatial databases.
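The two generalizations named in the abstract can be sketched as a DBSCAN-style procedure parameterized by a neighborhood predicate and a weight function. This is a minimal illustration under assumptions, not the authors' implementation: the names generalized_dbscan, is_neighbor, weight, and min_weight are hypothetical, and the neighborhood lookup is a plain linear scan rather than a spatial index.

```python
def generalized_dbscan(objects, is_neighbor, weight, min_weight):
    """Label each object with a cluster id (0, 1, ...) or -1 for noise.

    is_neighbor(p, q): symmetric, reflexive predicate defining neighborhoods.
    weight(p): contribution of p to the weighted cardinality of a neighborhood.
    An object is a core object if its neighborhood weighs at least min_weight.
    """
    NOISE, UNSEEN = -1, None
    labels = [UNSEEN] * len(objects)

    def neighbors(i):
        return [j for j in range(len(objects))
                if is_neighbor(objects[i], objects[j])]

    def is_core(nbrs):
        return sum(weight(objects[j]) for j in nbrs) >= min_weight

    cluster = 0
    for i in range(len(objects)):
        if labels[i] is not UNSEEN:
            continue
        nbrs = neighbors(i)
        if not is_core(nbrs):
            labels[i] = NOISE  # may be relabeled later as a border object
            continue
        labels[i] = cluster
        frontier = list(nbrs)
        while frontier:
            j = frontier.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border object reached from a core object
            if labels[j] is not UNSEEN:
                continue
            labels[j] = cluster
            j_nbrs = neighbors(j)
            if is_core(j_nbrs):
                frontier.extend(j_nbrs)  # only core objects grow the cluster
        cluster += 1
    return labels
```

With a distance threshold as the predicate and a constant weight of 1, this reduces to ordinary DBSCAN; supplying a non-spatial attribute such as population as the weight changes which neighborhoods count as dense.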
Dynamic vp-tree indexing for n-nearest neighbor search given pair-wise distances
- VLDB Journal, 2000
"... distances ..."
Time series data mining: Identifying temporal patterns for characterization and prediction of time series events
- Marquette University, 1999
Cited by 19 (6 self)
This work is dedicated to my wife, Christine, our son, Christopher, and his brother, who will arrive shortly. Acknowledgment I would like to thank Dr. Xin Feng for the encouragement, support, and direction he has provided during the past three years. His insightful suggestions, enthusiastic endorsement, and shrewd proverbs have made the completion of this research possible. They provide an example to emulate. I owe a debt of gratitude to my committee members, Drs. Naveen Bansal, Ronald Brown, George Corliss, and James Heinen, who each have helped me to expand the breadth of my research by providing me insights into their areas of expertise. I am grateful to Marquette University for its financial support of this research, and the faculty of the Electrical and Computer Engineering Department for providing a rigorous and stimulating environment that exemplifies cura personalis. I thank Mark Palmer for many interesting, insightful, and thought provoking
Segment-Based Approach for Subsequence Searches in Sequence Databases
- 2001
Cited by 18 (0 self)
This paper investigates the subsequence searching problem under time warping in sequence databases. Time warping enables finding sequences with similar changing patterns even when they are of different lengths. Our work is motivated by the observation that subsequence searches slow down quadratically as the total length of data sequences increases. To resolve this problem, we propose the Segment-Based Approach for Subsequence Searches (SBASS), which modifies the similarity measure from time warping to piecewise time warping and limits the number of possible subsequences to be compared with a query sequence. For efficient
Finding Informative Rules in Interval Sequences
- Intelligent Data Analysis, 2001
Cited by 16 (2 self)
Observing a binary feature over a period of time yields a sequence of observation intervals. To ease the access to continuous features (like time series), they are often broken down into attributed intervals, such that the attribute describes the series' behaviour within the segment (e.g. increasing, high-value, highly convex, etc.). In both cases, we obtain a sequence of interval data, in which temporal patterns and rules can be identified. A temporal pattern is defined as a set of labeled intervals together with their interval relationships described in terms of Allen's interval logic. In this paper, we consider the evaluation of such rules in order to find the most informative rules. We discuss rule semantics and outline deficiencies of the previously used rule evaluation. We apply the J-measure to rules with a modified semantics in order to better cope with different lengths of the temporal patterns. We also consider the problem of specializing temporal rules by additional attributes of the state intervals.
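For reference, the standard J-measure of a rule "if X then Y" can be computed as in the sketch below. This is the textbook form of the measure; the modified rule semantics that the abstract mentions is not described here and is not reproduced. The function name and argument names are hypothetical, and the code assumes 0 < P(Y) < 1.

```python
from math import log2


def j_measure(p_x, p_y, p_y_given_x):
    """J-measure of the rule X -> Y, in bits.

    p_x: probability that the premise X holds.
    p_y: prior probability of the conclusion Y (assumed strictly in (0, 1)).
    p_y_given_x: probability of Y when X holds.
    """
    def term(p, q):
        # Convention: 0 * log(0 / q) is taken to be 0.
        return 0.0 if p == 0 else p * log2(p / q)

    # Rule frequency times the information gained about Y by observing X.
    return p_x * (term(p_y_given_x, p_y) + term(1 - p_y_given_x, 1 - p_y))
```

A rule whose premise leaves the probability of the conclusion unchanged scores 0, while a rule that fires half the time and makes a 50/50 conclusion certain scores 0.5 bits.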
Paradigms for spatial and spatio-temporal data mining
- In Geographic Data Mining and Knowledge, 2001
Cited by 14 (2 self)
With some significant exceptions, current applications for data mining are either in those areas for which there is little accepted discovery methodology or are being used within a knowledge discovery process that does not expect authoritative results but finds the discovered rules useful nonetheless. This is in contrast to its application in the fields applicable to spatial or spatio-temporal discovery which possess a
Mining for Similarities in Aligned Time Series Using Wavelets
- 1999
Cited by 13 (0 self)
Discovery of non-obvious relationships between time series is an important problem in many domains, such as financial, sensory, and scientific data analysis. We consider data mining in aligned time series, which arise, e.g., in numerous online monitoring applications, and we are interested in finding time series that reflect the same external events. The time series can have different vertical positions, scales and overall trends, but still show related features at the same locations. The features can be short term such as small peaks and turns, or long term such as wider mountains and valleys. We propose using a wavelet transformation of a time series to produce a natural set of features for the sequence. Wavelet transformation yields features that describe properties of the sequence both at various locations and at varying time granularities. In the proposed method, these features are processed so that they are insensitive to changes in the vertical position, scaling, and overall tre...
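A rough sketch of wavelet-derived features follows. The abstract is truncated and does not name the wavelet or the normalization, so the Haar transform and the unit-norm scaling below are assumptions, and the function names are hypothetical. The sketch covers insensitivity to vertical position and to positive scaling only; it does not attempt trend removal.

```python
def haar_details(series):
    """Haar detail coefficients, coarsest level first.

    The series length must be a power of two. The overall average is
    dropped, which is what removes the dependence on vertical position.
    """
    n = len(series)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    level = list(series)
    details = []
    while len(level) > 1:
        pairs = range(0, len(level), 2)
        averages = [(level[i] + level[i + 1]) / 2 for i in pairs]
        diffs = [(level[i] - level[i + 1]) / 2 for i in pairs]
        details = diffs + details  # coarser levels end up in front
        level = averages
    return details


def shift_scale_invariant_features(series):
    """Detail coefficients divided by their Euclidean norm."""
    details = haar_details(series)
    norm = sum(d * d for d in details) ** 0.5
    return details if norm == 0 else [d / norm for d in details]
```

Each coefficient describes a difference at one location and one time granularity, so two aligned series that differ only by an offset and a positive scale factor map to the same feature vector.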
Learning first order logic time series classifiers: Rules and boosting
- Proceedings of the 4th European Conference on Principles of Data Mining and Knowledge Discovery (PKDD'00), 2000
"... Departamento de Informática ..."

