Results 1 -
3 of
3
Supervised descriptive rule discovery: A unifying survey of contrast set, emerging pattern and subgroup mining
- Journal of Machine Learning Research
"... This paper gives a survey of contrast set mining (CSM), emerging pattern mining (EPM), and subgroup discovery (SD) in a unifying framework named supervised descriptive rule discovery. While all these research areas aim at discovering patterns in the form of rules induced from labeled data, they use ..."
Abstract
-
Cited by 10 (0 self)
- Add to MetaCart
This paper gives a survey of contrast set mining (CSM), emerging pattern mining (EPM), and subgroup discovery (SD) in a unifying framework named supervised descriptive rule discovery. While all these research areas aim at discovering patterns in the form of rules induced from labeled data, they use different terminology and task definitions, claim to have different goals, claim to use different rule learning heuristics, and use different means for selecting subsets of induced patterns. This paper contributes a novel understanding of these subareas of data mining by presenting a unified terminology, by explaining the apparent differences between the learning tasks as variants of a unique supervised descriptive rule discovery task and by exploring the apparent differences between the approaches. It also shows that various rule learning heuristics used in CSM, EPM and SD algorithms all aim at optimizing a trade off between rule coverage and precision. The commonalities (and differences) between the approaches are showcased on a selection of best known variants of CSM, EPM and SD algorithms. The paper also provides a critical survey of existing supervised descriptive rule discovery visualization methods.
Classification of Changes in Evolving Data Streams using Online Clustering Result Deviation
- Proc. Of Internatinal Workshop on Knowledge Discovery in Data Streams
, 2006
"... Abstract. Data stream analysis has attracted considerable attention over the past few years. Streaming data is often evolving over time. Capturing changes could be used for detecting an event or a phenomenon in various applications. Weather conditions, economical changes, astronomical and scientific ..."
Abstract
-
Cited by 2 (1 self)
- Add to MetaCart
Abstract. Data stream analysis has attracted considerable attention over the past few years. Streaming data is often evolving over time. Capturing changes could be used for detecting an event or a phenomenon in various applications. Weather conditions, economical changes, astronomical and scientific phenomena are among a wide range of applications. Due to the high volume and speed of data streams, it is computationally hard to capture these changes from raw data in real-time. In this paper, we propose a novel algorithm that we term as STREAM-DETECT to capture these changes in data stream distribution and/or domain using clustering result deviation. STREAM-DETECT is followed by a process of offline classification CHANGE-CLASS. This classification is concerned with the association of the history of change characteristics with the observed event or phenomenon. Experimental results show the efficiency of the proposed framework in both detecting the changes and classification accuracy. Keywords: Data streams, Change detection, Classification and Clustering. 1-
Conceptual Equivalence for Contrast Mining in Classification Learning
"... Learning often occurs through comparing. In classification learning, in order to compare data groups, most existing methods compare either raw instances or learned classification rules against each other. This paper takes a different approach, namely conceptual equivalence, that is, groups are equiv ..."
Abstract
- Add to MetaCart
Learning often occurs through comparing. In classification learning, in order to compare data groups, most existing methods compare either raw instances or learned classification rules against each other. This paper takes a different approach, namely conceptual equivalence, that is, groups are equivalent if their underlying concepts are equivalent while their instance spaces do not necessarily overlap and their rule sets do not necessarily present the same appearance. A new methodology of comparing is proposed that learns a representation of each group’s underlying concept and respectively crossexams one group’s instances by the other group’s concept representation. The innovation is five-fold. First, it is able to quantify the degree of conceptual equivalence between two groups. Second, it is able to retrace the source of discrepancy at two levels: an abstract level of underlying concepts and a specific level of instances. Third, it applies to numeric data as well as categorical data. Fourth, it circumvents direct comparisons between (possibly alarge number of) rules that demand substantial effort. Fifth, it reduces dependency on the accuracy of employed classification algorithms. Empirical evidence suggests that this new methodology is effective and yet simple to use in scenarios such as noise cleansing and concept-change learning.

