Results 1 - 3 of 3
Iterate: A conceptual clustering algorithm for data mining
- IEEE Transactions on Systems, Man and Cybernetics, 1998
Cited by 17 (0 self)
The data exploration task can be divided into three interrelated subtasks: (i) feature selection, (ii) discovery, and (iii) interpretation. This paper describes an unsupervised discovery method with biases geared toward partitioning objects into clusters that improve interpretability. The algorithm, ITERATE, employs (i) a data ordering scheme and (ii) an iterative redistribution operator to produce maximally cohesive and distinct clusters. Cohesion, or intra-class similarity, is measured in terms of the match between individual objects and their assigned cluster prototype. Distinctness, or inter-class dissimilarity, is measured by an average of the variance of the distribution match between clusters. We demonstrate that interpretability, from a problem-solving viewpoint, is addressed by the intra- and inter-class measures. Empirical results demonstrate the properties of the discovery algorithm and its applications to problem solving.
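The idea of redistributing objects against cluster prototypes can be illustrated with a minimal sketch. It is not ITERATE itself: the abstract does not give the match measure or the ordering scheme, so the frequency-based `match` score, the function names, and the toy data below are all assumptions made for illustration.

```python
from collections import Counter

def prototypes(objects, labels, k):
    """Build a prototype per cluster: per-attribute value counts plus cluster size."""
    n_attrs = len(objects[0])
    counts = [[Counter() for _ in range(n_attrs)] for _ in range(k)]
    sizes = [0] * k
    for obj, c in zip(objects, labels):
        sizes[c] += 1
        for a, value in enumerate(obj):
            counts[c][a][value] += 1
    return counts, sizes

def match(obj, counts_c, size_c):
    """Assumed match score: mean relative frequency of the object's values in the cluster."""
    if size_c == 0:
        return 0.0
    return sum(counts_c[a][v] / size_c for a, v in enumerate(obj)) / len(obj)

def redistribute(objects, labels, k, max_iter=100):
    """Reassign every object to its best-matching cluster until nothing changes."""
    labels = list(labels)
    for _ in range(max_iter):
        counts, sizes = prototypes(objects, labels, k)
        new_labels = [
            max(range(k), key=lambda c: match(obj, counts[c], sizes[c]))
            for obj in objects
        ]
        if new_labels == labels:
            break
        labels = new_labels
    return labels

# Hypothetical nominal data with one object initially placed in the wrong cluster.
objects = [("red", "round"), ("red", "round"), ("blue", "square"), ("blue", "square")]
final_labels = redistribute(objects, [0, 0, 0, 1], k=2)
```

In this toy run the misplaced third object moves to the cluster whose prototype it matches better, after which the assignment is stable.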
Mining of Stock Data: Intra- and Inter-Stock Pattern Associative Classification
- Conference on Data Mining, DMIN'06
This paper proposes a pattern-based stock data mining approach that transforms numeric stock data into symbolic sequences, carries out sequential and non-sequential association analysis, and uses the mined rules to classify or predict further price movements. Two formulations of the problem are considered: intra-stock mining, which focuses on finding frequently appearing patterns within a single stock's time series, and inter-stock mining, which discovers strong inter-relationships among several stocks. Three methods are proposed for associative classification/prediction, namely Best Confidence, Maximum Window Size, and Majority Voting; each selects the mined rule(s) used to make the final prediction. A modified Apriori algorithm is also proposed to mine the frequent symbolic sequences in intra-stock mining and the frequent symbol-sets in inter-stock mining. Various experimental results are reported.
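The three rule-selection methods named above can be sketched roughly as follows for the intra-stock case. This is only one plausible reading: the abstract does not define the methods or the modified Apriori algorithm, so the sketch mines rules by plain frequency counting, and the rule format, function names, thresholds, and sample sequence are all hypothetical.

```python
from collections import Counter, defaultdict

def mine_rules(symbols, windows=(1, 2, 3), min_support=2):
    """Return rules (pattern, predicted_symbol, confidence), one per frequent pattern.

    Plain frequency counting over sliding windows, standing in for the
    paper's modified Apriori, which the abstract does not describe.
    """
    rules = []
    for w in windows:
        followers = defaultdict(Counter)
        for i in range(len(symbols) - w):
            followers[tuple(symbols[i:i + w])][symbols[i + w]] += 1
        for pattern, nxt in followers.items():
            total = sum(nxt.values())
            if total >= min_support:
                symbol, count = nxt.most_common(1)[0]
                rules.append((pattern, symbol, count / total))
    return rules

def matching_rules(rules, recent):
    """Rules whose pattern equals the most recent symbols of the series."""
    return [r for r in rules if tuple(recent[-len(r[0]):]) == r[0]]

def best_confidence(rules, recent):
    """Assumed reading: use the matching rule with the highest confidence."""
    matches = matching_rules(rules, recent)
    return max(matches, key=lambda r: r[2])[1] if matches else None

def maximum_window_size(rules, recent):
    """Assumed reading: use the matching rule with the longest pattern."""
    matches = matching_rules(rules, recent)
    return max(matches, key=lambda r: len(r[0]))[1] if matches else None

def majority_voting(rules, recent):
    """Assumed reading: every matching rule casts one vote for its prediction."""
    matches = matching_rules(rules, recent)
    if not matches:
        return None
    return Counter(r[1] for r in matches).most_common(1)[0][0]

# Hypothetical symbolic sequence: two rises followed by a fall, repeated.
history = ["up", "up", "down"] * 3
rules = mine_rules(history)
recent = ["down", "up", "up"]
predictions = (
    best_confidence(rules, recent),
    maximum_window_size(rules, recent),
    majority_voting(rules, recent),
)
```

On this toy sequence all three selectors agree on `"down"`; on real data they can disagree, which is presumably why the paper compares them.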
A Comparative Study of Numeric and Symbolic Representation of Stock Data
Recently there has been a lot of interest in mining time series data. Stock data mining plays an important role in visualizing the behavior of financial markets. In financial data mining the data is normally represented in numeric format; however, symbolic representation is also used to evaluate the overall impact. Time series data are difficult to manipulate, but when they are treated as symbols instead of data points, interesting patterns can be discovered and mining them becomes easier. This paper presents a preliminary comparative study of numeric and symbolic representations of NSE stock data over a thirteen-year period, from Jan. 1996 to Dec. 2008. The data were first normalized, and the study was conducted on the normalized data. The Euclidean distance measure was used to establish relationships among various stocks. Three symbols [up, down, neutral] were used for the symbolic representation of the data, and distance was evaluated according to the matching pattern of these symbols. In most cases the numeric representation of stock data gave better results than the symbolic representation, but the symbolic representation provided an easier interpretation and helped to determine an overall pattern. The symbolic pattern resembles the price change pattern in the numeric representation.
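The two representations compared above can be sketched side by side. The abstract does not specify the normalization, the threshold separating "neutral" from "up" or "down", or the exact symbolic distance, so the min-max scaling, the 1% threshold, the mismatch-fraction distance, and the price series below are all assumptions chosen for illustration.

```python
import math

def normalize(prices):
    """Min-max scale a price series to [0, 1] (assumed normalization)."""
    lo, hi = min(prices), max(prices)
    if hi == lo:
        return [0.0 for _ in prices]
    return [(p - lo) / (hi - lo) for p in prices]

def euclidean(a, b):
    """Euclidean distance between two equal-length numeric series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def symbolize(prices, threshold=0.01):
    """Map each relative price change to up / down / neutral (threshold is hypothetical)."""
    symbols = []
    for prev, cur in zip(prices, prices[1:]):
        change = (cur - prev) / prev
        if change > threshold:
            symbols.append("up")
        elif change < -threshold:
            symbols.append("down")
        else:
            symbols.append("neutral")
    return symbols

def symbolic_distance(a, b):
    """Assumed matching-based distance: fraction of positions where the symbols differ."""
    return sum(1 for x, y in zip(a, b) if x != y) / len(a)

# Hypothetical closing prices for two stocks over four days.
stock_a = [100, 102, 101, 105]
stock_b = [50, 51, 51, 50]

numeric_dist = euclidean(normalize(stock_a), normalize(stock_b))
symbols_a = symbolize(stock_a)
symbols_b = symbolize(stock_b)
symbol_dist = symbolic_distance(symbols_a, symbols_b)
```

The numeric distance keeps the magnitude of each move, while the symbolic distance only records whether the two stocks moved in the same direction, which is coarser but easier to read.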

